The Brent oil price fluctuates continuously, causing instability in the economy. Accurately predicting the trend of oil prices is therefore essential, as it helps investors improve profits and benefits the community at large. Because oil prices evolve over time as a time series, several sequence-based models can be used to predict them. This study proposes an efficient model named BOP-BL, based on Bidirectional Long Short-Term Memory (Bi-LSTM), for oil price prediction. The proposed framework consists of two modules. The first module has three Bi-LSTM layers, which learn useful features in both the forward and backward directions. The second module is a final fully connected layer that predicts the oil price from the features extracted by the first module. Finally, experiments are conducted on the Brent Oil Price (BOP) dataset to compare BOP-BL with three state-of-the-art time series forecasting models, namely Long Short-Term Memory (LSTM), the combination of Convolutional Neural Network and LSTM (CNN-LSTM), and the combination of CNN and Bi-LSTM (CNN-Bi-LSTM), in terms of several common error metrics: Mean Square Error (MSE), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE). The experimental results demonstrate that the BOP-BL model outperforms these methods for predicting the Brent oil price on the BOP dataset.

Nowadays, many applications of artificial intelligence in various areas such as data mining systems [

Recently, several techniques for time series prediction have been utilized in a number of price prediction problems such as stock price trend prediction [

The rest of this article is organized as follows. Section 2 summarizes related work on time series prediction, recurrent neural networks, and the combination of CNN and sequence-based models. Section 3 introduces the material and method: Section 3.1 presents the details of the BOP dataset, and Section 3.2 describes the proposed framework for predicting the oil price. Section 4 provides the experimental results of the proposed model and other methods on the experimental dataset. Section 5 presents the conclusions and future directions.

The problem of time series prediction is one of the important problems in machine learning with various practical applications such as stock price trend prediction [

The Recurrent Neural Network (RNN) is an effective approach for handling sequential or temporal data such as time series. Many previous studies have utilized the RNN and its variants, such as Long Short-Term Memory (LSTM) and Bidirectional Long Short-Term Memory (Bi-LSTM), to obtain promising results in various applications. Recently, Le et al. [

Besides, several studies have combined CNNs with sequence-based models. The idea behind these approaches is to improve the effectiveness of the model by using the CNN to extract features and reduce their dimensionality, and then using the sequence model to predict the target values. This approach improves not only the performance of the predictive models but also their running time [

The BOP dataset provides the time series of the daily oil prices from 1987 to 2019. There is a total of 8,217 observations of daily oil prices collected by the U.S. Energy Information Administration. Each observation in this dataset consists of the date and oil price. Two visualizations of the BOP dataset are shown in

According to

Based on the characteristics of the oil price dataset analyzed in the previous section, we propose an efficient framework named BOP-BL for predicting Brent oil prices. The proposed framework uses Bi-LSTM in the first module to capture information in both the forward and backward directions of the oil price time series. The training phase is summarized as follows. First, the input vectors from the BOP dataset are passed into the first Bi-LSTM layer of the first module to learn the essential features related to changes in oil price trends. In other words, the first layer captures the essential information in the forward and backward directions of the oil price time series. After that, the outputs of the first layer are passed successively through the two remaining Bi-LSTM layers of the first module to refine the useful information for the next step. Finally, the second module, a fully connected layer, reduces the feature dimensions obtained from the previous module to predict oil prices. In the testing phase, the proposed framework feeds the feature vectors of the test set into the trained model to predict oil prices. The overview of the proposed framework is illustrated in

To understand our proposed framework, the feature vector from the BOP dataset

where ^{th} step, and

where ^{th} step.
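The bidirectional pass described above can be illustrated with a minimal NumPy sketch. It uses the standard LSTM cell formulation, which may differ in notation from the equations of this study. The weight shapes, the gate ordering, the random initialization, and the choice of 60 units per direction (inferred from the 120-wide layer outputs) are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_pass(x, W, U, b):
    """Run one LSTM direction over x of shape (T, d); return hidden states of shape (T, h)."""
    T = x.shape[0]
    h_dim = U.shape[0]
    h = np.zeros(h_dim)
    c = np.zeros(h_dim)
    out = np.zeros((T, h_dim))
    for t in range(T):
        z = x[t] @ W + h @ U + b              # stacked gate pre-activations, shape (4h,)
        i, f, g, o = np.split(z, 4)           # assumed gate order: input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        out[t] = h
    return out

def init_params(rng, d, h):
    """Hypothetical random weights: input (d, 4h), recurrent (h, 4h), bias (4h,)."""
    return (rng.normal(scale=0.1, size=(d, 4 * h)),
            rng.normal(scale=0.1, size=(h, 4 * h)),
            np.zeros(4 * h))

def bilstm(x, fwd, bwd):
    """Concatenate a forward pass with a backward pass that is re-aligned in time."""
    h_fwd = lstm_pass(x, *fwd)
    h_bwd = lstm_pass(x[::-1], *bwd)[::-1]
    return np.concatenate([h_fwd, h_bwd], axis=1)

rng = np.random.default_rng(0)
T, d, h = 90, 1, 60                           # 90 time steps, 1 feature, 60 units per direction
x = rng.normal(size=(T, d))
fwd = init_params(rng, d, h)
bwd = init_params(rng, d, h)
y = bilstm(x, fwd, bwd)
print(y.shape)  # (90, 120)
```

At each time step, the first 60 columns summarize the past of the series and the last 60 columns summarize its future, which is what allows the layer to use context from both directions.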

Finally, the outputs of the first module are passed into one fully connected layer (the second module) to generate the oil price in our framework (see

No. | Layer type | Output shape | #Parameters |
---|---|---|---|
1 | Bi-LSTM | (None, 90, 120) | 29,760 |
2 | Dropout | (None, 90, 120) | 0 |
3 | Bi-LSTM | (None, 90, 120) | 86,880 |
4 | Dropout | (None, 90, 120) | 0 |
5 | Bi-LSTM | (None, 90, 120) | 86,880 |
6 | Dropout | (None, 90, 120) | 0 |
7 | Fully connected layer | (None, 1) | 121 |
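The parameter counts in the table above can be checked with the standard formulas for LSTM and dense layers. The sketch below assumes 60 LSTM units per direction (so that the concatenated output is 120 wide) and a single input feature, the daily price; neither value is stated explicitly in the text, but both reproduce the listed counts.

```python
def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias vector.
    return 4 * units * (input_dim + units + 1)

def bilstm_params(input_dim, units):
    # A bidirectional layer holds two independent LSTMs.
    return 2 * lstm_params(input_dim, units)

def dense_params(input_dim, outputs):
    # One weight per input plus one bias, for each output.
    return (input_dim + 1) * outputs

UNITS = 60  # assumption: 60 units per direction, 120 after concatenation

print(bilstm_params(1, UNITS))          # 29760 (first Bi-LSTM, one input feature)
print(bilstm_params(2 * UNITS, UNITS))  # 86880 (second and third Bi-LSTM)
print(dense_params(2 * UNITS, 1))       # 121   (fully connected layer)
```

Under these assumptions the whole network has 29,760 + 2 × 86,880 + 121 = 203,641 trainable parameters.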

This section compares the performance of the proposed method with that of state-of-the-art methods for time series prediction, namely LSTM, CNN-LSTM, and CNN-Bi-LSTM, to verify the effectiveness of the BOP-BL model for the oil price prediction problem. All methods are run in the same environment. The LSTM method uses three LSTM layers in the first module followed by one fully connected layer in the second module. The CNN-LSTM model consists of two CNN layers, two LSTM layers, and one fully connected layer. The CNN-Bi-LSTM model has the same architecture as the CNN-LSTM model but uses Bi-LSTM instead of LSTM. The Keras library is used as the framework for all experimental methods. The experiments are run on a computer with a Core i7 CPU (2.7 GHz), 16 GB of RAM, and a 940MX GPU. All models are trained for 50 epochs with a batch size of 15, using the Adam optimizer with an initial learning rate of 0.001.
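As a rough illustration of this setup, the following sketch builds a BOP-BL-like model in Keras with the stated training settings. It is not the authors' code. The 90-step window is taken from the layer output shapes, while the 60 units per direction, the dropout rate, the MSE loss, the next-day target, and the helper names `make_windows` and `build_bop_bl` are assumptions. The last Bi-LSTM layer is assumed to return only its final state so that the dense layer outputs a single price, although the layer table lists a sequence output for it.

```python
import numpy as np

WINDOW = 90            # time steps per sample, from the (None, 90, 120) shapes
UNITS = 60             # assumption: units per direction
DROPOUT = 0.2          # assumption: the dropout rate is not stated in the text
EPOCHS = 50
BATCH_SIZE = 15
LEARNING_RATE = 0.001

def make_windows(prices, window=WINDOW):
    """Turn a 1-D price series into (samples, window, 1) inputs and next-step targets."""
    prices = np.asarray(prices, dtype="float32")
    n = len(prices) - window            # requires len(prices) > window
    X = np.stack([prices[i:i + window] for i in range(n)])[..., None]
    y = prices[window:]                 # assumption: predict the value right after each window
    return X, y

def build_bop_bl():
    # Imported lazily so the windowing helper works without TensorFlow installed.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(WINDOW, 1)),
        layers.Bidirectional(layers.LSTM(UNITS, return_sequences=True)),
        layers.Dropout(DROPOUT),
        layers.Bidirectional(layers.LSTM(UNITS, return_sequences=True)),
        layers.Dropout(DROPOUT),
        layers.Bidirectional(layers.LSTM(UNITS)),  # assumption: final state only
        layers.Dropout(DROPOUT),
        layers.Dense(1),
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=LEARNING_RATE),
        loss="mse",                     # assumption: the training loss is not stated
    )
    return model

# Typical use, given a 1-D array `prices` of daily oil prices:
#   X, y = make_windows(prices)
#   model = build_bop_bl()
#   model.fit(X, y, epochs=EPOCHS, batch_size=BATCH_SIZE)
```

The LSTM baseline described above would follow the same pattern with plain `layers.LSTM` layers in place of the bidirectional wrappers.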

In addition, this study uses four performance metrics for time series prediction: MSE, RMSE, MAE, and MAPE. Each metric captures a different aspect of the prediction error. MSE measures the average of the squared errors, while RMSE, its square root, expresses the typical error magnitude in the same units as the oil price. MAE measures the average magnitude of the prediction errors. Finally, MAPE expresses the average error as a percentage of the actual value. The following equations show how to calculate the above-mentioned error metrics.

where
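For reference, the four metrics can be computed as in the sketch below, which follows their standard definitions. The function name `regression_errors` and the sample values are hypothetical, and MAPE is reported here as a percentage, which is an assumption about the convention used in this study.

```python
import numpy as np

def regression_errors(y_true, y_pred):
    """Return MSE, RMSE, MAE, and MAPE (in percent) for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    return {
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        "MAE": float(np.mean(np.abs(err))),
        "MAPE": float(100.0 * np.mean(np.abs(err / y_true))),  # undefined if y_true contains 0
    }

# Hypothetical actual and predicted prices, for illustration only.
scores = regression_errors([50.0, 60.0, 80.0], [52.0, 57.0, 80.0])
print({name: round(value, 2) for name, value in scores.items()})
# {'MSE': 4.33, 'RMSE': 2.08, 'MAE': 1.67, 'MAPE': 3.0}
```

Because RMSE is the square root of MSE, the two metrics always rank models in the same order; MAE and MAPE can rank them differently when a few large errors dominate.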

First, this study examines the stability and convergence of the BOP-BL model.

The experimental results illustrated in

Next, the experimental results for the BOP dataset are presented in

Method | MSE | RMSE | MAE | MAPE |
---|---|---|---|---|
CNN-LSTM | 14.05 | 3.75 | 2.97 | 5.42 |
CNN-Bi-LSTM | 12 | 3.46 | 2.69 | 4.84 |
LSTM | 4.54 | 2.13 | 1.69 | 2.95 |

Based on the empirical analysis from

In this study, we developed an efficient framework based on the Bi-LSTM network for predicting Brent oil prices on the BOP dataset. The proposed framework, named BOP-BL, consists of two modules. The first module uses the Bi-LSTM network to preserve the essential information of the input features in both the forward and backward directions, while the second module uses a fully connected layer to predict the oil prices. The experiments conducted on the BOP dataset indicate that the proposed framework is superior to state-of-the-art methods for time series prediction, namely LSTM, CNN-LSTM, and CNN-Bi-LSTM, in terms of several popular metrics: MSE, RMSE, MAE, and MAPE. In addition, the empirical analysis confirms that sequence-based models such as Bi-LSTM and LSTM are more effective than more complex models such as CNN-LSTM and CNN-Bi-LSTM on the BOP dataset. Therefore, BOP-BL and other models based on Bi-LSTM are recommended for predicting the oil price.

In the future, we intend to use Bayesian optimization to improve the performance of the proposed model for oil price time series prediction. In addition, we will continue to evaluate the proposed model on an extended oil price dataset and on other datasets.