Forecasting river flow is crucial for the optimal planning, management, and sustainable use of freshwater resources. Many machine learning (ML) approaches have been enhanced to improve streamflow prediction. Hybrid techniques are regarded as a viable way to improve the accuracy of univariate streamflow estimation compared with standalone approaches, and recent research has emphasised hybrid models for this purpose. Accordingly, this paper presents an updated literature review of applications of hybrid models in estimating streamflow over the last five years, summarising data preprocessing, univariate machine learning modelling strategy, advantages and disadvantages of standalone ML techniques, hybrid models, and performance metrics. This study focuses on two types of hybrid models: parameter optimisation-based hybrid models (OBH) and hybridisation of parameter optimisation-based and preprocessing-based hybrid models (HOPH). Overall, this research supports the idea that meta-heuristic approaches improve the precision of ML techniques. It is also one of the first efforts to comprehensively examine the efficiency of various meta-heuristic approaches (classified into four primary classes) hybridised with ML techniques. This study revealed that previous research applied swarm, evolutionary, physics, and hybrid metaheuristics at rates of 77%, 61%, 12%, and 12%, respectively. Finally, there is still room for improving OBH and HOPH models by examining different data pre-processing techniques and metaheuristic algorithms.

Water scarcity, high electricity demand, irrigation requirements, and industrial and residential uses are the primary issues compelling academics to estimate streamflow precisely for the effective use of freshwater resources [

The hybrid techniques combine various models for supporting (manipulating the data) and optimising the primary method [

Multiple combined techniques have been advanced and effectively used to enhance the precision of univariate streamflow prediction. According to Hajirahimi et al. [

The parameter optimisation-based hybrid models (OBH): The rationale behind OBH models is to make use of optimisation methods [

Hybridisation of parameter optimisation-based and preprocessing-based hybrid models (HOPH): It is developed by combining metaheuristic algorithms with preprocessing-based hybrid (PBH) models to seek appropriate parameters [

These hybrid models are studied in detail because they enhance predictive performance and outperform standalone methods, as demonstrated in

The univariate prediction technique is a data-driven modelling strategy that uses the same time series data as input, such as streamflow time series data [
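To make the univariate strategy concrete, the sketch below builds model inputs from lagged values of the same streamflow series, so that Q_{t} is predicted from earlier flows such as Q_{t-1} and Q_{t-2}. The function name `make_lagged_samples` and the sample flows are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def make_lagged_samples(
    flow: Sequence[float], lags: Sequence[int]
) -> Tuple[List[List[float]], List[float]]:
    """Build (inputs, targets) from one streamflow series.

    Each input row holds Q_{t-lag} for every requested lag; the target is Q_t.
    """
    max_lag = max(lags)
    inputs: List[List[float]] = []
    targets: List[float] = []
    # Start at max_lag so that every requested lag exists for time step t.
    for t in range(max_lag, len(flow)):
        inputs.append([flow[t - lag] for lag in lags])
        targets.append(flow[t])
    return inputs, targets


# Hypothetical monthly flows in m^3/s, for illustration only.
flow = [12.0, 15.0, 11.0, 9.0, 14.0, 18.0]
X, y = make_lagged_samples(flow, lags=[1, 2])
```

Each row of `X` can then be fed to any of the ML models discussed in this review, with `y` as the training target.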

Different metaheuristic optimisation methods could solve various issues for diverse application fields. The significant benefits of optimisation approaches are their capability to choose the optimum values for system hyperparameters across a wide range of operating situations [

In addition, many other review articles have introduced the uses of machine learning to predict streamflow [

Reference | Keywords | Summary |
---|---|---|

[ |
Streamflow prediction, empirical and process-based methods, data-driven models, machine learning, flood prevention, Canada | Modelling river flow in cold and ungauged regions: a review of the purposes, methods, and challenges |

[ |
Artificial intelligence, artificial neural network (ANN), support vector machine (SVM), adaptive neuro-fuzzy inference system (ANFIS), optimisation algorithms, genetic algorithms (GA), particle swarm optimisation (PSO), artificial bee colony (ABC), streamflow forecasting | A review of the hybrid artificial intelligence and optimisation modelling of hydrological streamflow forecasting |

[ |
– | Generating ensemble streamflow forecasts: a review of methods and approaches over the past 40 years |

[ |
Naturalisation methods, streamflow, human influences, impacted |
Streamflow naturalisation methods: a review |

[ |
Univariate streamflow forecasting, data-driven model, daily streamflow, wavelet decomposition, reservoir influence | Univariate streamflow forecasting using commonly used data-driven models: literature review and case study |

[ |
Gene expression programming, genetic programming, sedre stream streamflow, time series modelling | Genetic programming for streamflow forecasting: a concise review of univariate models with a case study |

Literature on streamflow estimation can be seen from numerous angles. Ibrahim et al. [

Many studies on the development and application of naturalisation procedures and the primary obstacles associated with the methods have been examined and evaluated. Naturalisation procedures are used when natural flows cannot be measured directly and must instead be calculated. Terrier et al. [

Zhenghao et al. [

All the above reviews considered climate change together with streamflow, except Zhenghao et al. [

In summary, the above review papers rarely focus on univariate streamflow prediction.

In this context, the contributions of this work are as follows:

1- Collect previous studies and present them in coherent groups to shed light on the hybrid methods currently employed in univariate streamflow forecasting, particularly meta-heuristic algorithms combined with machine learning.

2- Explain and show the outcomes of using various algorithms in prior studies, highlighting the most common and successful varieties, and providing some recommendations for future investigations.

3- Focus on how these algorithms help improve the reliability of predictions.

4- Identify potential research pathways to help academics by illuminating the available options and gaps in this field.

The current research is up to date and explores the use of OBH and HOPH for univariate streamflow forecasting with ML models only.

ML models have been utilised for decades and have gained considerable attention recently. This is because they can manage large volumes of data and capture nonlinear structures by applying complex mathematical processes [

Model type | Advantages | Disadvantages | References |
---|---|---|---|

ANN | The capability of dealing with the nonlinearity of data, high generalisation and simple implementation. | The standard ANN approach frequently exhibits deficiencies, such as slow convergence and local minimum. Since insufficient input data can result in an incorrect learning map, the performance of ANN models may suffer if there is noise in the data. | [ |

ANFIS | It can contain all reasons not included in an ideal model while eliminating specific causes considered in physically based models. | ANFIS is implicit and, as a member of the black-box family, is difficult to interpret. | [ |

SVR | The SVR employs structural risk minimisation to identify the pattern between independent and dependent values. | SVR’s accuracy is based on the appropriate choice of inputs and parameters. | [ |

RF | It solved numerous regression, classification, and clustering problems and has performed well in numerous disciplines. | It has various drawbacks, including weak performance and a lack of reliability. | [ |

GP | Genetic programming (GP) paradigm generates answers to problems by means of crossover and mutation principles. GP has the capability of self-parameter selection to extract the features necessary for tuning the model without user intervention, and it linearly displays the program. | The fundamental issue with GP is its high computing cost, which is mostly attributable to the repeated evaluation of individual fitness during the evolutionary process. | [ |

Data preprocessing plays a critical role in estimating variables by ensuring high precision and minimal processing costs throughout the training stage, where noisy and undependable data may badly influence training and lead to a flawed model [

The goal of data normalisation is to provide each ANN model with input data that is normally or nearly normally distributed and has the same range of values [
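As an illustration of bringing inputs to the same range of values, the sketch below applies min-max scaling, which is one common normalisation choice and is assumed here rather than taken from the reviewed studies. The scaling limits are fitted on the training part only so that no information leaks from the test period. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_min_max(train: Sequence[float]) -> Tuple[float, float]:
    """Return the (minimum, maximum) of the training flows."""
    lo, hi = min(train), max(train)
    if hi == lo:
        raise ValueError("a constant series cannot be min-max scaled")
    return lo, hi


def scale(values: Sequence[float], lo: float, hi: float) -> List[float]:
    """Map values to [0, 1] relative to the fitted training range."""
    return [(v - lo) / (hi - lo) for v in values]


def unscale(scaled: Sequence[float], lo: float, hi: float) -> List[float]:
    """Map scaled model outputs back to flow units."""
    return [s * (hi - lo) + lo for s in scaled]


# Hypothetical training flows in m^3/s, for illustration only.
train_flow = [10.0, 20.0, 30.0]
lo, hi = fit_min_max(train_flow)
scaled_train = scale(train_flow, lo, hi)
```

Predictions made in the scaled space are converted back with `unscale` before computing error measures in m^{3}/s.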

The step of cleaning data seeks to identify outliers and noise [
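One simple way to flag candidate outliers is the interquartile-range (IQR) rule sketched below. This rule is an assumption made for illustration; the reviewed studies may rely on other cleaning or denoising techniques. The function name `flag_outliers` and the sample flows are hypothetical.

```python
import statistics
from typing import List, Sequence


def flag_outliers(flow: Sequence[float], k: float = 1.5) -> List[int]:
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _median, q3 = statistics.quantiles(flow, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [i for i, value in enumerate(flow) if value < lower or value > upper]


# Hypothetical daily flows in m^3/s with one suspicious spike.
flow = [10.0, 12.0, 11.0, 13.0, 12.0, 95.0, 11.0, 10.0]
suspect_indices = flag_outliers(flow)
```

Flagged values still need hydrological judgement, since a genuine flood peak can look like an outlier to a purely statistical rule.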

Selecting the optimum predictors is one of the key processes in data preprocessing and evolving a viable prediction model [
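For a univariate model, predictor selection amounts to choosing which lags of the flow series to use. The sketch below ranks candidate lags by the absolute Pearson correlation between Q_{t} and Q_{t-lag}. Using linear correlation is an assumption made for simplicity; alternatives such as mutual information follow the same ranking pattern. The synthetic series and function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def rank_lags(flow: Sequence[float], candidate_lags: Sequence[int]) -> List[Tuple[int, float]]:
    """Sort candidate lags by |corr(Q_t, Q_{t-lag})|, strongest first."""
    scores = []
    for lag in candidate_lags:
        current = flow[lag:]   # Q_t
        lagged = flow[:-lag]   # Q_{t-lag}
        scores.append((lag, pearson(current, lagged)))
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Synthetic series that repeats every four steps, for illustration only.
flow = [1.0, 2.0, 4.0, 8.0] * 6
ranked = rank_lags(flow, candidate_lags=[1, 2, 3, 4])
```

For this synthetic series the lag equal to the repetition period ranks first, which mirrors why seasonal lags such as Q_{t-12} appear in many of the monthly studies reviewed here.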

Reference | Normalisation | Cleaning | Selection predictors |
---|---|---|---|

[ |
✗ | ✗ | ✓ |

[ |
✗ | ✗ | ✓ |

[ |
✓ | ✗ | ✓ |

[ |
✗ | ✗ | ✗ |

[ |
✗ | ✗ | ✗ |

[ |
✗ | ✗ | ✓ |

[ |
✗ | ✗ | ✓ |

[ |
✓ | ✗ | ✗ |

[ |
✗ | ✗ | ✓ |

[ |
✗ | ✓ | ✓ |

[ |
✗ | ✗ | ✓ |

[ |
✓ | ✓ | ✗ |

[ |
✓ | ✓ | ✗ |

[ |
✗ | ✗ | ✓ |

[ |
✓ | ✗ | ✓ |

[ |
✗ | ✗ | ✗ |

[ |
✓ | ✗ | ✗ |

[ |
✗ | ✗ | ✗ |

[ |
✓ | ✗ | ✗ |

[ |
✓ | ✗ | ✗ |

[ |
✓ | ✗ | ✗ |

[ |
✓ | ✗ | ✗ |

[ |
✗ | ✓ | ✗ |

[ |
✓ | ✗ | ✓ |

[ |
✗ | ✓ | ✗ |

[ |
✗ | ✗ | ✗ |

[ |
✓ | ✓ | ✓ |

[ |
✗ | ✗ | ✗ |

[ |
✓ | ✗ | ✗ |

[ |
✗ | ✗ | ✓ |

[ |
✗ | ✗ | ✓ |

This section discusses papers in which the combined technique incorporates two or more methods: one works as the main technique and the rest as pre- or post-processing techniques. Recently, coupled techniques have been proposed to build flexible and effective techniques and increase the precision of single algorithms’ forecasts [

Year | Reference | Location | Time scale | Variables used | Period | Models used | Best model | Measures |
---|---|---|---|---|---|---|---|---|

2021 | [ |
Egypt | Monthly | – | 1887–2000 | MLP, RNN, |
NRO-MLP | RE, |

2018 | [ |
Iran | Daily | Q_{t-1} |
3 years | ANFIS,_{R} |
ANFIS-PSO | R^{2}, RMSE, MARE, NSE, MAE |

2018 | [ |
Iran | Monthly | Q_{t-1},_{t-11},_{t-12},_{t-23},_{t-24} |
1978–2008 | GEP, GP, MLR, |
GEP-GA | RMSE, NSE |

2021 | [ |
Iran | Daily |
Q_{t-1};_{t-7};_{t-1-365} |
2005–2016 | ANFIS-PSO, ANFIS-DE, ANFIS-FFA, ANFIS ANFIS-GA, ANFIS-GWO | ANFIS-GWO | RMSE, MAE, NSE, PI, R^{2}, |

2022 | [ |
Turkey | Daily | – | 2000–2019 | LSTM, Linear Regression, |
GA-LSTM | MAD, R^{2} |

2019 | [ |
Egypt | Monthly | Q_{t-1},_{t-2},_{t-11},_{t-12},_{t-13} |
1871–2000 | SVR, MLR, ANN, SVR-GWO, ANN-GWO, MLR-GWO | SVR-GWO | RMSE, MAE, R, NSE, WI |

2020 | [ |
China | Monthly | Q_{t-1} |
1956–2010 | ELM, AR, ANN, ELM-PSO, ELM-GA, ELM-IPSO | ELM-IPSO | MAE, RMSE, RE, NSE |

2020 | [ |
China | Monthly | Q_{t-9},_{t-10} |
1940–2019 | ANN, ELM, SVM, |
ANN-CSA | RMSE, MAPE, |

2019 | [ |
Malaysia | Monthly | Q_{t-1},_{t-2},_{t-3},_{t-6},_{t-12} |
2000–2014 | ANFIS, ANFIS-DE, |
ANFIS-PSO | MAE, RMSE, WI, R^{2} |

2019 | [ |
Algeria | Monthly | (Q_{t-1},_{t-2},_{t-3},_{t-4},_{t-9},_{t-11})_{t-1},_{t-2},_{t-3},_{t-4},_{t-8},_{t-11}) |
1970–1995 | SVM, GWO–SVR, |
GWO–WSVR | RMSE, MAE, |

2021 | [ |
Taiyuan | Monthly | (Q_{t-1},_{t-2})_{t-1},_{t-2},_{t-12},_{t-13}) |
1956–2016 | GRU, ELM, LSSVM, |
Monthly IGWO-GRU | NSE, RMSE, R, |

2021 | [ |
Taiyuan | Monthly | – | 1956–2016 | LSSVM, ELM, GRU, |
ICEEWT-IGWO-GRU | NSE, RMSE, |

2020 | [ |
China | Daily | Q_{t-1},_{t-2},_{t-3},_{t-4},_{t-5},_{t-6} |
2005–2019 | ELM, ANN, SVM, ARMA, |
VMD-ELM | RMSE, MAPE, |

2020 | [ |
Iraq | Monthly | Q_{t-1},_{t-2},_{t-3},_{t-4} |
1991–2010 | ANN, SVR, RF, |
SVR-GA | ME, RMSE, MAE, MPE, MAPE, R^{2} |

2021 | [ |
Malaysia | Daily | Q_{t-1},_{t-2},_{t-3} |
835 day | MLP, MLP-PSO, MLP-SFA, MLP-GA | MLP-SFA | PBIAS, NSE, MAE, RMSE, D |

2020 | [ |
Pakistan | Monthly | Q_{t-1},_{t-2},_{t-3} |
– | M5 Regression Tree (M5RT), ANN, ANFIS, |
ANN-GA and ANFIS-GA for Kunhar Rivers and |
RMSE, MAE, R^{2} |

2020 | [ |
Vietnam | Monthly | Q_{t-1},_{t-36},_{t-24},_{t-12} |
1978–2016 | MLP, MLP-IWD, MLP-NN | MLP-IWD | RE, R^{2} |

2022 | [ |
Turkey | Monthly | – | 2000–2009 | GRU, |
GWO-GRU | RMSE, MAPE, MAE, R^{2}, STD |

2020 | [ |
India | Hourly | Q_{t-1},_{t-3},_{t-12} |
2000–2005 | CANFIS, ANN, PNN, |
CANFIS-FA | RMSE, CE, MAPE, PFC, LFC |

2022 | [ |
Turkey | Daily | – | 2009–2019 | LSTM, ARIMA, |
PSO-LSTM | RMSE, MAE, MAPE, STD, R^{2} |

2018 | [ |
Turkey | Monthly | Q_{t-1},_{t-2} |
1990–2016 | FNN, FNN-PSOGSA, |
FNN-PSOGSA | RMSE, MAE, NSE, WI |

2020 | [ |
Canada, |
Daily | Q_{t-1},_{t-2},_{t-3} |
1998–2018 | MLP, BL, MLP-PSO, |
MLP-BL | RMSE, MAE,^{2} |

2019 | [ |
Algeria | Daily | Q_{t-1} |
14 years | FFNNs, WFFNNs-GA based on the three evolutionary strategies [i.e., (MIMO), |
WFFNNs-GA model based on MISMO | RMSE, SNR, R, NSE, PFC |

2021 | [ |
Algeria | Daily | Q_{t},_{t-1} |
6 years | FFNN, ERNN, LSTM, GRU, |
GRU-PSO with ADAM | RMSE, SNR, NSE |

2018 | [ |
Northern Algeria | Daily | Q_{t-4},_{t-2},_{t-1} |
12 years | ANN, ANFIS, |
WANFIS-GA | MARE, RMSE,^{2} |

2022 | [ |
Turkey | Daily | Q_{t-1},_{t-2},_{t-3} |
1981–2010 | ANN, ANN-ALO, |
ANN-ALO | RMSE, NSE, RSR, VAF, PI, WI, MAE, MBE, MAPE, PBIAS, R^{2} |

2022 | [ |
Iraq | Monthly | Q_{t-1},_{t-2},_{t-3,}_{t-4},_{t-5} |
2010–2020 | ANN-CPSOCGSA, ANN-MPA, ANN-SMA | ANN-CPSOCGSA | MAE, RMSE, MARE, SI, MBE, R^{2} |

2022 | [ |
South |
Hourly | Q_{t-1},_{t-2},_{t-3,}_{t-4},_{t-5,}_{t-1} |
2003–2020 | GA-BART, GA-SVR, MLR | GA-BART | MAPE, RMSE, CC, NSE, TL, MAE, BIAS |

2021 | [ |
Northern |
Monthly | Q_{t-1},_{t-2},_{t-3} |
1980–2011 | ELM-PSOGWO, ELM, ELM-PSO,ELM-GWO, ELM-PSOGSA | ELM-PSOGWO | RMSE, MAE, NSE, R^{2} |

2022 | [ |
Pakistan | Monthly | Q_{t-1},_{t-11},_{t-12} |
1974–2008 | ANN-EMPA , ANN-MPA , ANN-GWO , ANN-PSO , ANN-GA | ANN-EMPA | RMSE, MAE, NSE |

2022 | [ |
Pakistan | Monthly | Q_{t-1},_{t-11},_{t-12} |
1974–2009 | ANFIS, ANFIS-DE, ANFIS-GA, ANFIS-ACO, ANFIS-PSO, ANFIS-GWO, ANFIS-GBO | ANFIS-GBO | RMSE,^{2}, NRMSE |

The key idea behind these methods is to characterise the learning process and locate ML model hyperparameters through the use of optimisation methods [
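The sketch below illustrates this idea with a basic particle swarm optimiser written from scratch. To stay short, the "model" is a toy one-lag linear predictor Q_{t} ≈ w·Q_{t-1} + b whose two coefficients stand in for the weights or hyperparameters of an ML model; the synthetic series, the bounds, and the PSO settings (inertia 0.7, acceleration coefficients 1.5) are all assumptions chosen for illustration, not values from the reviewed studies.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def pso_minimise(
    cost: Callable[[Sequence[float]], float],
    bounds: Sequence[Tuple[float, float]],
    n_particles: int = 30,
    n_iter: int = 200,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimise `cost` inside `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    inertia, c1, c2 = 0.7, 1.5, 1.5  # assumed, commonly used settings
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Pull each particle towards its own best and the swarm's best.
                vel[i][d] = (
                    inertia * vel[i][d]
                    + c1 * r1 * (pbest[i][d] - pos[i][d])
                    + c2 * r2 * (gbest[d] - pos[i][d])
                )
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


# Synthetic series generated by Q_t = 0.8 * Q_{t-1} + 2 (illustration only).
flow = [1.0]
for _ in range(19):
    flow.append(0.8 * flow[-1] + 2.0)


def lag1_rmse(params: Sequence[float]) -> float:
    """RMSE of the toy predictor Q_t ~ w * Q_{t-1} + b on the series."""
    w, b = params
    sq = [(w * flow[t - 1] + b - flow[t]) ** 2 for t in range(1, len(flow))]
    return math.sqrt(sum(sq) / len(sq))


best_params, best_rmse = pso_minimise(lag1_rmse, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

In an OBH model the cost function would instead train and validate the chosen ML model for each candidate parameter vector, which is why the number of cost evaluations drives the computing time of such hybrids.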

ANFIS was hybridised with five nature-inspired optimisation algorithms, namely the firefly algorithm (FFA), genetic algorithm (GA), grey wolf optimisation (GWO), particle swarm optimisation (PSO), and differential evolution (DE), to simulate streamflow across three prediction perspectives: the short-term (daily time scale), the intermediate-term (weekly to monthly scale), and the long-term (annual scale). Fifteen different input-output vectors, covering June 2005 to December 2016 in the southwest of Iran, were used to train the hybrid streamflow forecasting models. The results showed that on all time scales, the proposed hybrid algorithms outperformed the traditional ANFIS models. Values of R^{2}, RMSE, NSE, and RAE were improved by 12%, 10%, 18.5%, and 14.3% for the short-term, 15%, 13%, 20%, and 21.1% for the medium-term, and 10.3%, 7.5%, 10.5%, and 14% for the long-term, respectively [

Additionally, Kilinc et al. [^{3}/s, RMSE = 0.7795 m^{3}/s, MAPE = 5.2819, SD = 0.0973, and R^{2} = 0.9689.

Moreover, M5 Regression Tree (M5RT) models were compared with two hybrid approaches, ANN-GA and ANFIS-GA, for forecasting streamflow using four distinct time-lag input combinations. The study used monthly streamflow data from two rivers in Pakistan. When comparing ANFIS-GA and ANN-GA to the M5RT models, it was found that the hybrid models performed better in terms of prediction. The results demonstrated that the ANN-GA model with three historical streamflow values and a periodicity input outperformed the ANFIS-GA model in the Kunhar River (RMSE: 31.16 m^{3}/s, MAE: 18.67 m^{3}/s, and R^{2}: 0.891).

Also, a hybridisation of a gated recurrent unit (GRU) with the grey wolf optimisation (GWO) algorithm was used for estimating the streamflow of Üçtepe and Tuzla stations in Turkey from 2000 to 2009. The comparative model and linear regression were used as benchmarks for the hybrid model’s precision. Based on the findings, the GWO-GRU combined technique was superior to the standard methods in all statistical parameters except standard deviation (SD) at the Üçtepe station and the entire Tuzla station. At the Üçtepe flow measurement station (FMS), the RMSE and MAE of the single GRU method were 124.57 and 184.06 m^{3}/s, respectively, whereas those of the hybrid model were 82.93 and 85.93 m^{3}/s, respectively, resulting in improvements of around 34% in RMSE and 53% in MAE. In addition, the Tuzla station’s GWO-GRU and linear regression R^{2} values were 0.9827 and 0.9558, respectively. The study also supported the potential of the hybrid GWO-GRU model for forecasting problems [

Samui et al. [^{2} = 0.962, NSE = 0.962, RMSE = 0.061 m^{3}/s, MAE = 0.029 m^{3}/s, and MARE = 0.003.

Nguyen et al. [^{3}/s, MAE = 8.00 m^{3}/s, NSE = 0.96, CC = 0.98, and MAPE = 9.0%.

By combining optimisation methods with pre-processing-based hybrid (PBH) models, HOPH models can determine the best settings for their pre-processing stage or assess the correct weights to use when adding up the results of its decompositional elements’ predictions [
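The structure of a HOPH model can be sketched as follows: a preprocessing step splits the series into components, each component is predicted separately, and the component predictions are summed with weights that an optimiser could tune. In this sketch a trailing moving average stands in for the decomposition methods used in the literature (for example wavelet or mode decomposition), and a persistence rule stands in for the ML model; both substitutions, the function names, and the sample flows are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def split_trend_detail(series: Sequence[float], window: int) -> Tuple[List[float], List[float]]:
    """Split a series into a trailing moving-average trend and the leftover detail."""
    trend = []
    for t in range(len(series)):
        chunk = series[max(0, t - window + 1): t + 1]
        trend.append(sum(chunk) / len(chunk))
    detail = [s - tr for s, tr in zip(series, trend)]
    return trend, detail


def persistence_forecast(component: Sequence[float]) -> float:
    """Placeholder component model: the next value equals the last one."""
    return component[-1]


def combine(forecasts: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of component forecasts; an optimiser would tune the weights."""
    return sum(w * f for w, f in zip(weights, forecasts))


# Hypothetical flows in m^3/s, for illustration only.
flow = [10.0, 12.0, 9.0, 14.0, 13.0, 11.0]
trend, detail = split_trend_detail(flow, window=3)
component_forecasts = [persistence_forecast(trend), persistence_forecast(detail)]
next_flow = combine(component_forecasts, weights=[1.0, 1.0])
```

With unit weights the components simply add back to a forecast in flow units; a metaheuristic can search over the weights, the window length, or the component models' parameters using a validation error as its cost.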

Tikhamarine et al. [^{3}/s) and MAE (0.3047 m^{3}/s) values.

Zhao et al. [

When it comes to daily streamflow forecasting, Niu et al. [^{3}/s, MAPE = 63.06 m^{3}/s, CE = 12.75 and R = 0.986).

Zakhrouf et al. [^{3}/s) = 1.550, SNR = 0.066, CC = 0.998, NSE (%) = 99.565, and PFC = 0.081). Results further supported the claim that hybrid models based on comparable evolutionary methodologies estimate results more accurately than solo models.

Moreover, Zakhrouf et al. [^{2}) achieved by using the WANFIS-GA and WANN-GA models were 87.2% and 78.9%, respectively, whereas those obtained using the individual models (i.e., ANFIS and ANN) were 56% and 57%, respectively. For the peak values during the testing period (RMSE = 12.1545 m^{3}/s, MARE = 106.785%, R = 0.934, R^{2} = 0.872, and EC (%) = 87.32), the WANFIS-GA model provided an excellent match for the observed data.

Furthermore, the MLP was integrated with three physics-inspired algorithms (i.e., equilibrium optimisation (EO-MLP), nuclear reaction optimisation (NRO-MLP), and Henry gas solubility optimisation (HGSO-MLP)) to forecast the monthly streamflow. Ahmed et al. [

Azad et al. [_{R}), and PSO to predict river flow 1, 3, 5, and 7 days ahead. The data was collected from five stations, the upstream stations, including Eskandare (U1) and Ghaleh-Shahrukh (U2), and the downstream ones, including Saad-Tanzimi (D1), Pole-Zamankkhan (D2), and Cham-Aseeman (D3) in Iran for three years. Results demonstrated the PSO enhanced performance of ANFIS so that averages of R^{2}, RMSE (

Moreover, gene expression programming (GEP) was combined with the genetic algorithm (GA) to create a hybrid model for forecasting streamflow in an intermittent stream one month ahead. Results from the GEP-GA were compared to those from the classic GP, the GEP, multiple linear regression, and the GEP-linear regression models. The study used a monthly streamflow data set for a region of northwest Iran from 1978 to 2008. As shown by the results, the GEP-GA performed better than all of the reference models (RMSE = 249.6, NSE = 0.523) [

Three machine-learning approaches were employed to predict monthly streamflow in Egypt from 1871 to 2000. These methods were SVM, multilayer perceptron neural network (MLPNN), and ANN. The above methods were hybridised with the GWO algorithm. Auto-regression (AR) was used for comparative analysis purposes. The standalone models, such as SVM, MLPNN, and ANN, performed less well than the combined ones. Among all the combined techniques, the SVR-GWO technique was the best in terms of estimating streamflow, based on statistical criteria (i.e., RMSE = 2.0570 m^{3}/s, MAE = 1.2005 m^{3}/s, R = 0.9363, NSE = 0.8728, and WI = 0.9671) [

An ELM was also combined with GA, the PSO algorithm, and improved particle swarm optimisation (IPSO) to raise the accuracy of monthly streamflow predictions using one delay time. The models were trained with data from the Chaohe River, in China, from January 1956 to December 2010. This research demonstrated that the developed procedure (ELM-IPSO) had a greater prediction accuracy than the AR, ANN, ELM-GA, and ELM-PSO approaches. The ELM-IPSO also had the lowest MAE, RMSE, and RE values throughout the training and prediction stages and the highest NSE and R values, indicating that ELM-IPSO is a practical approach for predicting monthly streamflow, with MAE = 1.16 m^{3}/s, RMSE = 1.46 m^{3}/s, RE = −11.04%, NSE = 0.78, and R = 0.89 [

Feng et al. [^{3}/s, MAPE = 18.5002, R = 0.9198, and CE = 0.8457).

In another study, ANFIS was combined with particle swarm optimisation (ANFIS-PSO), the genetic algorithm (ANFIS-GA), and the differential evolution algorithm (ANFIS-DE) to predict monthly streamflow. The data was collected from one station (ID 3527410) in Malaysia from 2000 to 2014. The results suggested that integrating long antecedent data increases forecasting accuracy, as the model can capture seasonal patterns and the current trend in the time series. The model with five input variables (t_1, t_2, t_3, t_6, t_12) performed best, with a 68% improvement in prediction accuracy over the model with a single input variable (t_1). The results revealed that PSO improved the capability of the ANFIS model (RMSE = 7.96; MAE = 2.34; R^{2} = 0.998 and WI = 0.994) more than GA and DE in forecasting streamflow. Comparing evolutionary optimisation methods revealed that PSO is superior to GA and DE for optimising ANFIS functions. The precision of the ANFIS-PSO technique was greater than that of the ANFIS-GA and ANFIS-DE techniques (by 24% and 20%, respectively), and it was 25% more accurate than the non-hybrid ANFIS model. Therefore, the ANFIS-PSO model could accurately predict highly stochastic streamflow in a tropical setting [

ANN and SVR were hybridised with GA, while SVR and RF were tuned using the grid search algorithm, to simulate river flow from January 1991 to November 2010. Five input scenarios of historical data were created (1^{st} Model, 2^{nd} Model,…, 5^{th} Model) to predict streamflow in the Tigris River in Iraq. The results indicated that the SVR-GA model was the most accurate at predicting monthly river flow (i.e., ME = −14.73, RMSE = 100.78 m^{3}/s, MAE = 81.585 m^{3}/s, MPE = −214.02, MAPE = 670.30, and R^{2} = 0.96). Consequently, the suggested hybrid model can be used to improve river flow forecasting capability [

The multilayer perceptron (MLP) was hybridised with multiple metaheuristic algorithms, namely sunflower optimisation (SFA), the genetic algorithm (GA), and particle swarm optimisation (PSO), to predict daily streamflow from six lags at two stations, Jam Seyed Omar (JSO) and Muda Di Jeniang (MDJ), in Malaysia. The MLP-SFA was compared to the conventional MLP and two other hybrid MLP models (MLP-PSO, MLP-GA). The assessment yielded the following outcomes: at the MDJ station (PBIAS = 0.18, MAE = 0.29 m^{3}/s, NSE = 0.93, RMSE = 0.45 m^{3}/s, d = 0.95), and at the JSO station (PBIAS = 0.16, MAE = 0.27 m^{3}/s, NSE = 0.93, RMSE = 0.37 m^{3}/s, d = 0.94). Compared to the other models, the MLP-SFA reduced RMSE by 12% to 21% at the JSO station and 8% to 24% at the MDJ station. The results demonstrated that combining MLP with optimisation methods enhanced the accuracy of the standalone MLP model [

Meshram et al. [_{t-1}, Q_{t-2}) was superior. The comparison revealed that FNN-PSOGSA is superior to the standard FNN and FNN-PSO models, with RMSE = 24.42 m^{3}/s and MAE = 16.47 m^{3}/s, and the highest NSE = 0.652 and WI = 0.864. The findings also show that the FNN-PSOGSA model is a workable strategy for estimating streamflow and increasing forecasting accuracy.

Abdul Kareem et al. [^{2} = 0.91, RMSE = 1.07 m^{3}/s, MAE = 1.07 m^{3}/s, and MARE = 1.01.

Zhao et al. [^{3}/s, NSE = 0.675, R = 0.896, MAPE = 0.060, and qr = 97.22%), (RMSE = 147.666 m^{3}/s, NSE = 0.922, R = 0.971, MAPE = 0.049, and qr = 86.11%) was the evaluation result for Shangjingyou and Fenhe reservoir stations, respectively.

Pham et al. [^{2} = 0.80, RMSE = 73.70 m^{3}/s, RE =195.08, NSE = 0.70 and MAE = 55.01 m^{3}/s) for Thanh My station and (with R^{2} = 0.83, RMSE = 173.65 m^{3}/s, RE = 47.75, NSE = 0.784 and MAE = 123.20 m^{3}/s) for Nong Son station.

Tripura et al. [^{3}/s).

Kilinc [^{2} (0.9749) RMSE (1.2557 m^{3}/s), MAE (0.1025 m^{3}/s), MAPE (10.2574) and lowest standard deviation (−0.1541).

Mohammadi et al. [^{2} = 0.994, MAE = 3.53 m^{3}/s, and RMSE = 6.426 m^{3}/s for Brantford station in the testing stage.

Zakhrouf et al. [^{3}/s for Sidi Aich station and NSE = 0.8703, SNR = 0.3600, and RMSE = 11.074 m^{3}/s for Ponteba Defluent station.

Adnan et al. [^{3}/s, MAE = 46.59 m^{3}/s, and both a high R^{2} and NSE of 0.925, 0.919 in the test phase respectively. Furthermore, the outcomes reveal the potential of the ELM-PSOGWO model to be recommended for monthly streamflow prediction.

Ikram et al. [^{3}/s, MAE = 84.44 m^{3}/s, and the highest NSE = 0.9532. Additionally, the ANN-EMPA enhanced the RMSE, MAE, and Nash–Sutcliffe efficiency of ANN-PSO by 4.8%, 4.1%, and 0.5%, ANN-GA by 6.2%, 5.6% and 0.6%, ANN-GWO by 3.7%, 4.4% and 0.5%, and ANN-MPA by 3.2%, 7.5% and 0.3%, respectively.

Adnan et al. [^{2} (0.853), RMSE (53.97 m^{3}/s), MAE (33.69 m^{3}/s), and NSE (0.843). Likewise, Gilgit Station yielded in the test period R^{2} (0.923), RMSE (93.5 m^{3}/s), MAE (48.63 m^{3}/s), and NSE (0.915).

A continual stream of both applied and theoretical material makes it difficult to keep up with the literature. Several authors have suggested using the R-tool and VOSviewer to organise and show the results of the published studies in a transparent manner [

It displays the total number of articles published by the country, researchers, and institution.

This word cloud investigates the most often occurring and significant words in the prior investigations.

It is constructed utilising the current literature’s most frequently occurring phrases. Researchers, academics, and practitioners in a given subject may find the network structure revealed by co-occurrence analysis to be particularly useful, as it can help us understand the theoretical underpinnings of that field.

There are potentially useful connections between journals, universities, and countries. We suggest a new three-field structure in

It is vital to identify the authors who have had the most impact on the field, which orients new researchers and points them to potential consultants for their work. Similarly, knowledge of author collaboration patterns in the literature might shed light on where the discipline is headed. The most influential writers in the field are summarised in

Additionally, the 31 papers that used OBH and HOPH models in the field of univariate streamflow and are considered in this study are distributed over seven publishers, including Science Direct, Springer, MDPI, IEEE, Taylor & Francis, and Hindawi, as given in the pie chart shown in

The prediction precision is determined by comparing the measured and estimated flows. Different performance evaluations are used to assess the model’s predictive capabilities. An individual assessment may not be sufficient to determine the model(s)’ efficacy and dependability [

RMSE is the square root of the average squared deviation between the estimated and measured outputs. It is used to assess the nonlinear error and is an excellent measure of forecast accuracy [

where Y_{i} is the forecast value, X_{i} is the actual value, X̄ is the mean of the actual values, Ȳ is the mean of the predicted values, N is the total number of values, and i is the counter.
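With X_{i} the actual values, Y_{i} the forecast values, and N the number of values, the standard RMSE definition, the square root of the mean of (X_{i} − Y_{i})^{2}, can be sketched as below. The function name and the sample numbers are illustrative only.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error: sqrt(mean((X_i - Y_i)^2))."""
    n = len(actual)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(actual, predicted)) / n)


# Two errors of 3 and 4 m^3/s give sqrt((9 + 16) / 2).
example_rmse = rmse([0.0, 0.0], [3.0, 4.0])
```

Because the errors are squared before averaging, a single large miss (for example at a flood peak) raises RMSE more than several small ones.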

MAE estimates the mean of error magnitudes without regard to their direction. In other words, this is the mean absolute deviation between the predicted and actual values [
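A minimal sketch of this measure, using the standard definition of the mean of |X_{i} − Y_{i}| (the function name and sample values are illustrative):

```python
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error: mean(|X_i - Y_i|)."""
    n = len(actual)
    return sum(abs(x - y) for x, y in zip(actual, predicted)) / n


# Absolute errors of 1, 0 and 2 m^3/s average to 1 m^3/s.
example_mae = mae([1.0, 2.0, 3.0], [2.0, 2.0, 5.0])
```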

MAPE is an objective statistic utilised to assess relative error by comparing predicted and observed data. MAPE is frequently insensitive to large magnitudes but sensitive to smaller magnitudes [
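The sensitivity to small magnitudes follows from the definition, since each error is divided by the observed value. The sketch below uses the common form 100 · mean(|X_{i} − Y_{i}| / |X_{i}|), which is assumed here because the original equation is not reproduced in this text; it rejects zero observations, for which the ratio is undefined.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    if any(x == 0 for x in actual):
        raise ValueError("MAPE is undefined when an observed value is zero")
    n = len(actual)
    return 100.0 * sum(abs(x - y) / abs(x) for x, y in zip(actual, predicted)) / n


# The same 10 m^3/s error weighs ten times more at a flow of 10 than at 100.
low_flow_mape = mape([10.0], [20.0])
high_flow_mape = mape([100.0], [110.0])
```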

It represents the percentage error values between actual and anticipated values [

This measure represents the average absolute error compared to the observed record. It is also known as the relative mean error [

The R^{2} indicates the degree of correlation between the expected and measured values. The R^{2} values range from 0 to 1: 1 means a perfect relationship between the data and the line drawn through them, and 0 indicates that there is no meaningful relationship between the data and the line drawn through them [
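A sketch of R^{2} computed as the squared Pearson correlation between actual and predicted values, which matches the 0 to 1 range described above. This form is an assumption, since other definitions of R^{2} exist; the function name is illustrative.

```python
from typing import Sequence


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Squared Pearson correlation between actual (X) and predicted (Y) values."""
    n = len(actual)
    mean_x, mean_y = sum(actual) / n, sum(predicted) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(actual, predicted))
    var_x = sum((x - mean_x) ** 2 for x in actual)
    var_y = sum((y - mean_y) ** 2 for y in predicted)
    if var_x == 0 or var_y == 0:
        raise ValueError("R^2 is undefined for a constant series")
    return cov * cov / (var_x * var_y)


# Predictions that are exactly twice the observations still give R^2 = 1.
biased_but_correlated = r_squared([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

The example shows why R^{2} should not be used alone: it rewards correlation even when the predictions are systematically biased.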

NSE was developed by Nash et al. [
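The standard Nash–Sutcliffe efficiency compares the model's squared errors with the variance of the observations around their mean: NSE = 1 − Σ(X_{i} − Y_{i})^{2} / Σ(X_{i} − X̄)^{2}. A minimal sketch, with an illustrative function name and sample values:

```python
from typing import Sequence


def nse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - SSE / sum((X_i - mean(X))^2)."""
    mean_x = sum(actual) / len(actual)
    sse = sum((x - y) ** 2 for x, y in zip(actual, predicted))
    sst = sum((x - mean_x) ** 2 for x in actual)
    if sst == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - sse / sst


# Always predicting the observed mean scores exactly zero.
mean_model_nse = nse([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

A value of 1 indicates a perfect match, 0 means the model is no better than the observed mean, and negative values mean it is worse.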

The SI is a dimensionless measure of a model’s relative accuracy. The model's accuracy is considered poor if SI ⩾ 30%, acceptable if 20% < SI < 30%, good if 10% < SI < 20%, and excellent if SI < 10% [
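The thresholds above can be applied as in the sketch below. It assumes that SI denotes the scatter index, computed as RMSE divided by the mean of the observed values and expressed in percent; that definition is not stated in this text. The text also leaves the exact boundary values of 10% and 20% unassigned, so placing them in the poorer class is a further assumption.

```python
import math
from typing import Sequence


def scatter_index(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Assumed definition: 100 * RMSE / mean(actual), in percent."""
    n = len(actual)
    rmse = math.sqrt(sum((x - y) ** 2 for x, y in zip(actual, predicted)) / n)
    return 100.0 * rmse / (sum(actual) / n)


def classify_si(si_percent: float) -> str:
    """Map an SI value in percent to the accuracy classes given in the text."""
    if si_percent < 10.0:
        return "excellent"
    if si_percent < 20.0:
        return "good"
    if si_percent < 30.0:
        return "acceptable"
    return "poor"


# Hypothetical flows in m^3/s: RMSE of 1 against a mean flow of 10.
si = scatter_index([10.0, 10.0], [9.0, 11.0])
label = classify_si(25.0)
```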

Ahmed et al. [_{R}). (II) using the offered methods of research (ANFIS-GA, ANFIS-PSO, ANFIS-ACO_{R}) to anticipate other hydrological phenomena, (III) comparing the performance of the planned combined approach (ANFIS-PSO) to other well-known methods, such as ANNs, SVR, and GEP. Danandeh et al. [

Tikhamarine et al. [

Additionally, based on an analysis of previous research, hybrid models could be enhanced in the following ways:

Applying the three data preprocessing stages significantly affects a model's performance and precision. Therefore, it is suggested that greater effort should be spent on data pre-treatment methods, such as singular spectrum analysis (SSA) and empirical mode decomposition (EMD), for denoising data. Effort should also go into establishing the ideal predictor combination scenario; for this purpose, expanding the use of the mutual information method is preferable to the trial-and-error method.

The use of combined ML techniques and metaheuristic algorithms for univariate streamflow forecasts has risen significantly recently. However, there is still room for improvement in streamflow prediction.

Recently, hybrid metaheuristic algorithms, such as CPSOCGSA (i.e., combining the swarm and physics kinds), have been shown to be effective. It would be useful to extend the current findings by examining different combinations of metaheuristic algorithm types.

There is still room for improving OBH and HOPH models by examining different data pre-processing techniques and metaheuristic algorithms.

There are a couple of limitations to this research. Firstly, some studies may have been missed because they did not utilise the identified search terms in their title or abstract; however, the study covered previous studies very well. The second aspect is that the field is expanding so quickly that a timely survey is complex. The last limitation means that a snapshot of search activity in this active research direction does not reflect the fact that the algorithms are used but rather reflects the response to our research question, which is the aim of this research.

This study has reviewed the previous studies on typically used univariate approaches regarding prediction performance and accuracy. Since machine learning models are currently receiving more attention for streamflow prediction due to their simplicity and lower data needs compared to general hydrological techniques. This paper reviewed the recent univariate streamflow forecasting works for the last five years.

A variety of factors influence the effectiveness and accuracy of the prediction technique. Therefore, approaches were selected and compared based on data preprocessing, a univariate data-driven modelling strategy, a suitable timescale, and metaheuristic algorithms integrated into the model. Recent studies have demonstrated that different traditional ML techniques no longer produce the best accurate results. The features and limitations of ML techniques have several disadvantages, including a slow convergence rate and difficulty readily sliding into local minima. Hybrid models can address the limitations problems, involving a couple or more processes; the first works as the main method and the rest as pre- or post-processing. A combination of ML techniques and meta-heuristic optimisation techniques has been made. Consequently, hybrid models represent the most effective instruments for enhancing the precision of streamflow forecasts, such as a comprehensive hybrid approach that integrates preprocessing techniques and metaheuristic algorithms such as (OBH and HOPH).

This paper found that researchers employed swarm, evolutionary, physics-based, and hybrid metaheuristic algorithms in 77%, 61%, 12%, and 12% of the reviewed articles, respectively. The best-performing model was swarm-based in 58% of cases, followed by evolutionary (26%), hybrid (10%), physics-based (3%), and other models (3%).

In general, this research supports the idea that meta-heuristic approaches improve the precision of ML techniques. It is also one of the first efforts to comprehensively examine the efficiency of various meta-heuristic approaches (classified into four primary classes) hybridised with ML techniques. These results advance our knowledge of HOPH and OBH techniques in a number of ways. As there is still potential for improvement in HOPH and OBH techniques for univariate streamflow forecasting, it would be beneficial for academics to conduct further research into the role of meta-heuristic approaches and data pre-processing techniques. Lastly, the availability of reliable univariate streamflow data supports a balance between water demand and supply that achieves sustainability. Moreover, decision-makers should take the results of this study into account to gain a scientific view of current and expected research directions.

Multi-layer perceptron

Recurrent neural network

Adaptive neuro-fuzzy inference system

Gene expression programming

Genetic programming

Multi-linear regression

Long short-term memory

Support vector regression

Artificial neural network

Extreme learning machines

Auto-regression method

Gated recurrent unit

Least-squares support vector machine

Autoregressive moving average

Random forest

M5 regression tree

Coactive neuro-fuzzy inference system

Polynomial neural network

Feed-forward neural network

Bi-linear

Elman recurrent neural network

Genetic algorithm

Differential evolution

Flower pollination algorithm

Cooperation search algorithm

Cuckoo search

Shuffled complex evolution

Sine cosine algorithm

Sunflower optimisation

Whale optimisation algorithm

Grey wolf optimisation

Sparrow search algorithm

Particle swarm optimisation

Improved particle swarm optimisation

Quantum-behaved particle swarm optimisation

Wavelet transform

Improved complete ensemble empirical mode decomposition

Improved grey wolf optimiser

Intelligent water drop

Firefly algorithm

Particle swarm optimisation-gravitational search algorithm

Particle swarm optimisation-multi-verse optimiser

Equilibrium optimisation

Henry gas solubility optimisation

Nuclear reaction optimisation

Wind driven optimisation

Multi-verse optimisation

Gravitational search algorithm

Mutual information

Root mean square error

Mean absolute error

Mean absolute percentage error

Mean absolute deviation

Mean square error

Absolute error

Mean error

Relative error

Mean absolute relative error

Relative absolute error

Pearson’s correlation coefficient

Coefficient of determination

Coefficient of efficiency

Willmott index

Confidence index

Persistence index

Standard deviation

Qualification rate

Agreement index

Standard deviation

Peak flow criteria

Low flow criteria

Signal-to-noise ratio

Correlation coefficient

Ant colony optimisation for continuous domain

Variance accounted for

RMSE-observations standard deviation ratio

Artificial bee colony

Teaching-learning-based optimisation

Ant colony optimisation

Ant‑lion optimisation

Imperialist competitive algorithm

Marine predator algorithm

Slime mould algorithm

Scatter index

Mean bias error

Bayesian additive regression tree

Extended marine predators algorithm

Gradient-based optimiser

Time lag

Normalised root mean square error

Twin support vector machine

Efficient wavelet transform

Ensemble empirical mode decomposition

Asymmetric Huber loss function-based ELM

The authors thank the anonymous reviewers and journal editors, whose assistance improved the logical organisation and content quality of this paper.

The authors received no specific funding for this study.

The authors confirm contribution to the paper as follows: study conception and design: B.A.K., S.L.Z.; data collection: B.A.K., S.L.Z., N.A.-A.; analysis and interpretation of results: B.A.K., S.L.Z., N.A.-A., Y.R.M.; draft manuscript preparation: B.A.K., S.L.Z., N.A.-A. All authors reviewed the results and approved the final version of the manuscript.

Not applicable.

The authors declare that they have no conflicts of interest to report regarding the present study.