Financial time series prediction, whether for classification or regression, has been an active research topic over the last decade. While traditional machine learning algorithms have yielded mediocre results, deep learning has largely contributed to elevating prediction performance. An up-to-date review of advanced machine learning techniques for financial time series prediction is still lacking, making it challenging for finance domain experts and practitioners to determine which model is likely to perform better, what techniques and components are involved, and how such a model can be designed and implemented. This review article provides an overview of techniques, components, and frameworks for financial time series prediction, with an emphasis on state-of-the-art deep learning models in the literature from 2015 to 2023, including standalone models such as convolutional neural networks (CNN), which extract spatial dependencies within data, and long short-term memory (LSTM), which is designed to handle temporal dependencies, as well as hybrid models integrating CNN, LSTM, the attention mechanism (AM), and other techniques. For illustration and comparison purposes, models proposed in recent studies are mapped to the relevant elements of a generalized framework comprising input, output, feature extraction, prediction, and related processes. Among the state-of-the-art models, hybrid models such as CNN-LSTM and CNN-LSTM-AM have generally been reported to outperform standalone models such as the CNN-only model. Some remaining challenges are also discussed, including limited accessibility for finance domain experts, delayed prediction, neglect of domain knowledge, a lack of standards, and the inability to make real-time, high-frequency predictions.
The principal contributions of this paper are to provide a one-stop guide for both academia and industry to review, compare and summarize technologies and recent advances in this area, to facilitate smooth and informed implementation, and to highlight future research directions.

Driven by advances in big data and artificial intelligence, FinTech (financial technology) has proliferated over the last decade. One key area of FinTech concerns the prices of financial market instruments, e.g., stocks, options, foreign currency exchange rates, and cryptocurrency exchange rates. These prices are driven by market forces such as fundamental supply and demand, but they are in many cases extremely difficult to predict, as a broad range of complicated factors may affect them, including macroeconomic indicators, government regulations, interest rates, corporate earnings and profits, breaking news (e.g., a spike in COVID cases), company announcements (e.g., dividends or employee layoffs), investor sentiment, and societal behavior.

In the last decade, financial time series prediction has become a prominent topic in both academia and industry, in which a great many models and techniques have been proposed and adopted. These techniques can be categorized based on the linearity of the model (linear

Before the advent and prominence of deep learning, RF, SVM, and SVR have been extensively used and were once the most effective models [

The enormous amount of data that financial markets produce can be broadly divided into fundamental data and market data. Fundamental data consists of details regarding the financial standing of an organization, such as earnings, revenue, and other accounting indicators. Market data includes price and volume information for different financial instruments, as well as order book data showing bid and ask levels. Financial market data may be high-frequency, tick-level data recording individual trades, or data aggregated on a daily, weekly, or monthly basis. High-frequency data offers more granular insights into market dynamics, but it also presents difficulties because of noise and anomalies arising from market microstructure effects.

Financial time series prediction is a subset of time series analysis, which in general is a challenging task. In particular, financial time series prediction involves forecasting future values of financial indicators, such as stock prices, exchange rates, and commodity prices, based on historical data. Because of their inherent complexity and specific traits, these time series pose particular difficulties for forecasting. The properties of financial time series data include, but are not limited to, the following:

Non-linearity and non-stationarity: Non-linearity refers to the fact that the data cannot be represented using a straight line (i.e., linearly); non-stationarity refers to the fact that their statistical properties change over time [

Short-term and long-term dependencies: Financial time series have an intrinsic temporal nature as each record has dependencies, either short-term or long-term, on previous records. Short-term dependencies are concerned with intraday fluctuations and fast market movements, whereas long-term dependencies are concerned with trends and patterns that last weeks, months, or even years. It has been revealed that neglecting the temporal dependencies may result in poor prediction performance [

Asymmetry, fat tails, and power-law decay: Traditionally, many financial models, including some option pricing models and risk models, have relied on the assumption of normal distributions, which has been questioned by numerous researchers and financial facts in recent years. Instead, asymmetric distributions and fat-tailed behavior are often observed in financial time series, e.g., stock returns. It is empirically established that the probability distribution of price returns has a power-law decay in the tails [

Volatility: Volatility refers to the degree of variation or dispersion of a financial time series. Financial markets are known for experiencing periods of high volatility followed by relatively calmer periods. A related phenomenon is volatility clustering, in which periods of high volatility tend to be followed by similarly high volatility, and periods of low volatility by similarly low volatility [

Autocorrelation and cross-correlation: Autocorrelation in financial time series may occur, referring to the correlation of a time series’ current values with its historical values [

Leverage effects: Leverage effects describe the negative relationship between asset value and volatility. It is observed that negative shocks tend to have a larger impact on volatility than positive shocks of the same magnitude [

Behavioral and event-driven dynamics: Investor behavior, sentiment, and psychological variables, as well as market-moving events such as earnings announcements, governmental decisions, geopolitical developments, and news releases, all have an impact on financial markets [

Conventionally, effectively incorporating domain knowledge, including these statistical properties, into predictive models has been crucial for accurate financial time series analysis and forecasting. Recently, deep learning models and hybrid approaches have increasingly been developed and employed to handle these complexities and capture the underlying dynamics of financial markets.

Note that alternative data sources, which incorporate unconventional data like sentiment from social media and web traffic patterns, have grown in popularity in recent years for the task of financial time series prediction. These alternative data sources aim to provide supplementary information that may not be immediately reflected in traditional fundamental or market data.

The purpose of financial time series prediction is three-fold. Firstly, for investors, it facilitates informed investment decisions, aids portfolio optimization, and ultimately maximizes profits. Research on price prediction in the literature is sometimes extended to the form of stock selection or portfolio optimization by finding the best weights for each relevant stock [

The mainstream studies on financial time series prediction still have flaws in terms of usability, even though numerous AI researchers actively propose alternative models. Our interactions with various finance domain experts have revealed a common complaint: despite being the key customers of the technologies proposed for financial time series prediction, many finance domain experts lack in-depth knowledge of the most cutting-edge techniques and may hesitate about where to get started, rendering the selection and deployment of these models challenging. Among the many machine learning models available, which algorithms are best for a given class of problems? How do they compare with each other? What methods are involved in these models, and what do they imply? How can various techniques be mixed, matched, and integrated as components into a single model design? Currently, there is no up-to-date review of advanced machine learning techniques for the financial time series prediction task. The motivation of this review article is to answer the aforementioned questions.

This review article provides an overview of the most recent research on using deep learning methods in the context of financial time series predictions from 2015 to 2023. The principal contributions of this review article are as follows:

Reviewing, comparing, and categorizing state-of-the-art machine learning models for financial time series prediction, with an emphasis on deep learning models involving convolutional neural networks, long short-term memory, and the attention mechanism.

Providing a summary and a one-stop guide for finance experts and AI practitioners on selecting technology components and building the most advanced models, including the software framework, processes, and techniques involved in the practice.

Highlighting remaining challenges and future research directions in this area.

The rest of this review article is structured as follows.

As part of the discovery phase, we collected journal articles on applying deep learning to financial time series prediction from the Science Citation Index Expanded (SCIE) database by Web of Science, using a combination of keyword filters ("deep learning" combined with "financial time series") over the period from 2015 through 2023; 85 journal articles were identified. SCIE is used to maintain a high-quality perspective on the literature, as it tends to index only reliable sources and excludes documents that are not peer-reviewed. Only journal papers have been selected, as this type of article tends to be more comprehensive and detailed and to provide more solid contributions.

Based on a thorough examination of all the selected articles and the data involved in these research items, our initial findings are shown as pie charts in

Most of these articles dealt with the stock market, while a much smaller portion focused on other financial instruments, including foreign exchange, cryptocurrency, and options. The US and Chinese stock markets have been the most popular, which is interestingly in line with their global economic size. Some articles utilized data from multiple markets, aiming to provide a more generic and universal solution. Noticeably, daily price forecasting has been dominant in comparison with intraday forecasting (at minute- or hour-level frequencies). The latter is often seen as a more difficult task, as the data arrives at a higher frequency, which brings a larger data size and more complexity.

The financial price time series prediction task is primarily conducted in two forms. The first form is a classification problem, which can be binary (up or down) or 3-class (buy, hold, or sell), indicating the trend of the price movement; the second form is a regression problem, which focuses on forecasting the price values. The selected articles are distributed almost evenly between these two forms of prediction problems.

In this section, we go through crucial elements and techniques involved in applying deep learning to financial time series prediction in the selected publications, including the selection of input data types, feature selection and extraction methods, and key deep learning techniques for both the training and prediction processes.

In the selected publications, the input data fed into deep learning models for financial time series prediction tasks has been acquired from a broad range of data sources of various types. The selection of input data types has a significant impact on the performance of deep learning models. Major types of data include the following, ordered by the frequency of use in the selected research works:

Based on functionality, TIs can be categorized into three categories, namely (1) trend indicators, mostly based on moving averages, signaling whether the trend direction is up or down, such as SMA (simple moving average), EMA (exponential moving average), and WMA (weighted moving average); (2) momentum indicators, measuring the strength or weakness of the price movement, such as RoC (rate of change, also known as returns in financial time series prediction, including arithmetic and logarithmic returns), stochastic oscillators (e.g., %K and %D), MACD (moving average convergence divergence), RSI (relative strength index), CCI (commodity channel index), MTM (momentum index), and WR (Williams indicator); and (3) volatility indicators, measuring the spread between the high and low prices, such as VIX (volatility index) and ATR (average true range). Some of the most commonly used technical indicators in the selected publications are listed in

TI | Equation | Indication |
---|---|---|
SMA | SMA_t = (P_t + P_{t−1} + … + P_{t−n+1})/n | SMA shows the average trend over a specific sliding time window, which smooths the time series and removes the influence of anomalies. This is one of the earliest and most commonly used TIs as seen in the literature. |
RoC | Arithmetic: r_t = (P_t − P_{t−1})/P_{t−1}; logarithmic: r_t = ln(P_t/P_{t−1}) | The RoC or return is useful to measure the profit or loss of an investment over time. The key difference between the arithmetic return and the logarithmic return is that the former is discontinuous, not compounded, and computed only for one period, whereas the latter is continuous and allows compounding over multiple non-overlapping periods by summing up. |
%K and %D | %K = 100 × (C_t − L_n)/(H_n − L_n); %D = moving average of %K | The stochastic oscillator compares a financial instrument's closing price to a range of its prices over a particular period. It is commonly displayed as two lines, %K and %D. Signals are generated when the two lines cross. It is a possible sell signal when %K crosses below %D, and vice versa. |
MACD | MACD_t = EMA_12(P_t) − EMA_26(P_t) | MACD indicates the relationship between two EMAs of a financial instrument's price. It is commonly accompanied by a signal line (9-day EMA of MACD). It is a possible sell signal when the MACD line crosses below the signal line, and vice versa. |
RSI | RSI_t = 100 − 100/(1 + RS_t), where RS_t is the ratio of average gain to average loss over the lookback window | RSI indicates the velocity and magnitude of a financial instrument's recent price swings to determine if it is undervalued or overvalued, and in turn to generate buy and sell signals. |

Each TI is a transformation derived from the same original plain price data, focusing on a different aspect. In practice, researchers feed one or more TIs commonly seen in traditional technical analysis, with or without the original price data, into the deep learning models, and rarely give the rationale behind the TI selection. For instance, Reference [
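For concreteness, several of the indicators above can be derived from plain closing prices in a few lines of pandas. This is a minimal sketch; the window lengths (5 and 14) and the simple-mean smoothing of RSI are illustrative choices on our part, not prescriptions from the reviewed studies:

```python
import pandas as pd

def technical_indicators(close: pd.Series) -> pd.DataFrame:
    """Derive a few common TIs from a closing-price series."""
    out = pd.DataFrame(index=close.index)
    out["SMA_5"] = close.rolling(window=5).mean()          # trend: simple moving average
    out["EMA_5"] = close.ewm(span=5, adjust=False).mean()  # trend: exponential moving average
    out["RoC_1"] = close.pct_change()                      # momentum: one-step arithmetic return
    # Momentum: RSI over a 14-step window (simple-mean smoothing, an illustrative variant)
    delta = close.diff()
    avg_gain = delta.clip(lower=0).rolling(14).mean()
    avg_loss = (-delta.clip(upper=0)).rolling(14).mean()
    out["RSI_14"] = 100 - 100 / (1 + avg_gain / avg_loss)
    return out
```

Each resulting column is then a candidate input feature for the deep learning models discussed below.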

Some examples of input datasets used in the selected publications are shown in

Reference | Input dataset | Type | Market | Frequency | Period |
---|---|---|---|---|---|
[ | CSI 300 index | P | Chinese stock | Daily | 2005–2017 |
[ | USD/CNY | P | Forex | Daily | 2006–2020 |
[ | DOW-30 index & ETFs | P & TI | US stock | Daily | 2002–2016 |
[ | S&P 500 index | P & TI | US stock | Intraday (minute) | 2017.4.3–2017.5.2 |
[ | Top stocks in SSE index | P & S | Chinese stock | Daily | 2012–2020 |
[ | Banking stocks in Borsa Istanbul 100 index | P & S & O (Twitter and news) | Turkish stock | Daily | 2018.9–2019.9 |

Data used in ML can be thought of as an n-dimensional feature space, with each dimension representing a particular feature. Real-life raw data may contain so many features that it suffers from the curse of dimensionality, which highlights the complexity and difficulty brought about by high-dimensional data. It has been reported that feeding a large number of raw features into ML models gives rise to poor performance [

Feature selection refers to the process of reducing the number of features, including unsupervised methods like correlation analysis, and supervised ones like filters or wrappers. The core idea is to find a subset of the most relevant features for the problem to be addressed by the ML model. Evolutionary algorithms, especially the Genetic Algorithm (GA), are sometimes chosen as a feature selection technique [

Feature extraction or feature engineering is the process of transforming the raw features into a particular form that is more suitable for modeling, commonly using a dimensionality reduction algorithm, e.g., casting high-dimensional data into a lower-dimensional space. It is crucial to distinguish feature extraction from feature selection. The former uses certain functions to extract hidden or new features from the original raw features, whereas the latter is focused on selecting a subset of relevant existing features. Some feature extraction techniques used in the selected literature are as follows:
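As a minimal sketch of such a dimensionality-reduction step (PCA appears among the reviewed studies, e.g., in PCA-ANN baselines), the projection onto the top principal components can be implemented with a singular value decomposition; the number of components retained is an illustrative assumption:

```python
import numpy as np

def pca_extract(X: np.ndarray, n_components: int) -> np.ndarray:
    """Project raw features onto their top principal components (feature extraction)."""
    Xc = X - X.mean(axis=0)                       # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T               # scores in the reduced feature space
```

The returned columns are ordered by explained variance, so keeping the first few components preserves most of the structure of the original feature space.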

This section discusses key deep learning techniques commonly adopted in the selected papers on financial time series prediction, namely convolutional neural network (CNN), long short-term memory (LSTM), and attention mechanism (AM).

A CNN is comprised of one or more sets of a convolution layer and a pooling layer, followed by one or more fully connected neural networks, i.e., FC layers or dense layers. Each convolution layer and pooling layer consists of several feature maps, each of which represents particular features of the input data. In other words, unlike a traditional artificial neural network (ANN), where feature extraction is usually done manually, a CNN performs convolution and pooling operations for feature extraction before linking to the FC layer, which is typically an ANN such as a multilayer perceptron (MLP) that generates the final prediction results.

We now illustrate the layers of a standard CNN and relevant concepts in more detail.

For the output layer to generate the final output, a sigmoid function can be employed for binary classification, a softmax function for multi-class classification, and a linear activation function for regression.

An example of a CNN applied in daily stock price prediction (a classification problem), presented in [
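To make the convolution and pooling operations concrete, the following minimal NumPy sketch implements a valid 1-D convolution (cross-correlation, as used in deep learning) and non-overlapping max pooling over a univariate series; the kernel values in the usage note are illustrative:

```python
import numpy as np

def conv1d(x, kernel):
    """Valid 1-D convolution (cross-correlation, as in deep learning) of a series with a kernel."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

def max_pool1d(x, size=2):
    """Non-overlapping max pooling: keep the strongest activation in each window."""
    n = len(x) // size
    return x[:n * size].reshape(n, size).max(axis=1)
```

A feature map is then, e.g., `np.maximum(conv1d(prices, kernel), 0)` (convolution followed by ReLU), and pooling downsamples it before it reaches the FC layer.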

While CNN has shown success in a variety of disciplines, especially computer vision, applying them to financial time series prediction involves thoughtful adaptations. Practically, experimentation and a thorough understanding of the data are required for meaningful results. Some issues when working with CNN include the following:

It is worth noting a couple of variations of CNN that have been applied in financial time series prediction. The first one is the graph convolutional network (GCN), which operates on graphs [

The recurrent neural network (RNN) has been proposed for learning from sequences, where the order of data records matters. It enables information to persist in the hidden state, which acts as a "memory" remembering and utilizing preceding states, thus enabling the capture of dependencies between previous and current inputs in the data sequence. An RNN can be considered as identical repeating neural networks linked together, passing information from one to another. As shown in the corresponding figure, x_t denotes the input of the repeating neural network (NN) and h_t denotes the output. In a vanilla RNN, each repeating NN is a simple single-layer structure, typically a tanh layer.

However, vanilla RNNs suffer from the issue of gradient vanishing or gradient exploding, which renders them unsuitable for longer-term dependencies. LSTM, a special kind of RNN, has emerged to address this issue [. The cell state (C_t) flows through the entire chain of the LSTM, and the gates in each cell regulate the level of information to be remembered or discarded. Each cell comprises a forget gate (f_t), an input gate (i_t), and an output gate (o_t). The relationship between these gates and the cell state (C_t) can be calculated by

f_t = σ(W_f · [h_{t−1}, x_t] + b_f)
i_t = σ(W_i · [h_{t−1}, x_t] + b_i)
C̃_t = tanh(W_C · [h_{t−1}, x_t] + b_C)
C_t = f_t ⊙ C_{t−1} + i_t ⊙ C̃_t
o_t = σ(W_o · [h_{t−1}, x_t] + b_o)
h_t = o_t ⊙ tanh(C_t)

where σ is the sigmoid function, ⊙ denotes element-wise multiplication, the W and b terms are the weight matrices and bias vectors, and h_t is the hidden state.
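A single LSTM cell step following this gate logic can be sketched in NumPy as follows; stacking the four gate pre-activations into one weight matrix is an implementation convenience on our part, not a requirement of the reviewed models:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM cell step. W maps [h_prev; x_t] to the four gate pre-activations,
    stacked as (forget, input, candidate, output); b is the matching bias vector."""
    z = W @ np.concatenate([h_prev, x_t]) + b
    H = len(h_prev)
    f = sigmoid(z[0:H])            # forget gate f_t
    i = sigmoid(z[H:2*H])          # input gate i_t
    g = np.tanh(z[2*H:3*H])        # candidate cell state
    o = sigmoid(z[3*H:4*H])        # output gate o_t
    c = f * c_prev + i * g         # new cell state C_t
    h = o * np.tanh(c)             # new hidden state h_t
    return h, c
```

Iterating this step over a sequence of inputs yields the chain of hidden states that frameworks such as TensorFlow and PyTorch compute internally.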

Pragmatically, RNN and LSTM can be deployed using established software tools like TensorFlow, PyTorch, and Keras. However, the following issues may occur when applying RNN and LSTM in financial time series predictions:

While LSTM is suitable for long-term dependencies in time series data, it may still suffer from gradient vanishing or exploding on extraordinarily long time series. The attention mechanism (AM), which has been a success in natural language processing (NLP) and computer vision, can be applied in conjunction with RNN, LSTM, or GRU to cope with the aforementioned issues caused by overly long-term dependencies. Some successful and broadly applied AM-based models are based on Transformers, such as BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pre-trained Transformer), two of the most advanced language models that have stunned the world in the past few years [

AM is a concept derived from neuroscience and is inspired by the cognitive mechanism of human beings. When a person observes a particular item, they generally do not focus on the entire item as a whole, with a tendency to selectively pay attention to some important parts of the observed item according to their needs. The principal goal of AM is to enhance the modeling of relationships between different elements in a sequence. In essence, AM assigns various importance weights to each part of an input sequence and extracts more critical information, so that the model is capable of dealing with long sequences and complex dependencies, and thus may result in more accurate predictions without consuming more computing and storage resources. More detail regarding AM can be found in this cornerstone article [
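The importance-weighting idea can be made concrete with a minimal NumPy sketch of scaled dot-product attention, the core operation of Transformer-style models; the inputs and shapes in the usage note are illustrative:

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))   # numerically stabilized softmax
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))         # importance weights over the sequence
    return weights @ V, weights
```

Each row of `weights` sums to 1 and indicates how much each position of the input sequence contributes to the corresponding output, which is exactly the selective-focus behavior described above.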

Similar to any deep learning model, AM or Transformers can be implemented using deep learning libraries such as TensorFlow and PyTorch. In addition, Hugging Face Transformers [

To better illustrate and compare the deep learning models in the selected papers, we employ a generalized deep learning framework for financial time series prediction that consists of a general architecture; two processes, i.e., the training and prediction processes; hyperparameter tuning methods; and evaluation metrics. These are important factors that researchers or practitioners have to determine during the design and construction of their deep learning (DL) model.

The general architecture is comprised of the input, the ML model, and the output (see

The general training process and prediction process are shown in

Hyperparameters play a critical role in the performance of deep learning models in financial time series prediction [

Traditional hyperparameter tuning methods include manual search, which involves manually selecting the hyperparameters based on prior experience and intuition, e.g., [

More advanced methods have gained increasing popularity nowadays, including Bayesian optimization, which involves modeling the objective function (such as the loss function of the deep learning model) and using this model to guide the search for optimal hyperparameters; and evolutionary algorithms, such as the genetic algorithm (GA), particle swarm optimization (PSO), and artificial bee colony (ABC), which involve generating a population of hyperparameter configurations and iteratively evolving the population based on a fitness function. These methods can often converge to well-performing hyperparameters more quickly than exhaustive search. Many recent studies have employed one of these algorithms to optimize the hyperparameters of their proposed model for financial time series prediction. For instance, Chung et al. [
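As a toy illustration of the evolutionary approach, the following sketch evolves a small population over a discrete hyperparameter space; the population size, selection, crossover, and mutation settings are illustrative assumptions rather than those of any specific reviewed study:

```python
import random

def evolve(fitness, space, pop_size=8, generations=5, seed=0):
    """Minimal genetic-algorithm-style hyperparameter search (illustrative only).
    `space` maps each hyperparameter name to its candidate values;
    `fitness` scores a configuration (higher is better)."""
    rng = random.Random(seed)
    sample = lambda: {k: rng.choice(v) for k, v in space.items()}
    pop = [sample() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]                  # selection: keep the fittest half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            child = {k: rng.choice([a[k], b[k]]) for k in space}  # uniform crossover
            if rng.random() < 0.2:                                # occasional mutation
                k = rng.choice(list(space))
                child[k] = rng.choice(space[k])
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```

In practice, `fitness` would train the model with a given configuration and return a validation score, which is by far the most expensive part of the loop.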

A variety of statistical metrics have been employed in the selected publications (see the table below): accuracy, precision, recall/sensitivity, and F1-score have been used for classification, whereas MAE, MAPE, MSE, RMSE, RMSRE, R-squared (R^{2}), direction prediction accuracy (DPA), etc., have been used for regression. In addition, the training time and the prediction time of the model on CPU or GPU are sometimes also included in the comparison between models. Note that

Measure | Type | Equation |
---|---|---|
Accuracy | Classification | (TP + TN)/(TP + TN + FP + FN) |
Precision | Classification | TP/(TP + FP) |
Recall/Sensitivity | Classification | TP/(TP + FN) |
F1-score | Classification | 2 × Precision × Recall/(Precision + Recall) |
MAE | Regression | (1/n) Σ \|y_t − ŷ_t\| |
MAPE | Regression | (100/n) Σ \|(y_t − ŷ_t)/y_t\| |
MSE | Regression | (1/n) Σ (y_t − ŷ_t)² |
RMSE | Regression | √((1/n) Σ (y_t − ŷ_t)²) |
RMSRE | Regression | √((1/n) Σ ((y_t − ŷ_t)/y_t)²) |
R^{2} | Regression | 1 − Σ(y_t − ŷ_t)²/Σ(y_t − ȳ)² |
DPA | Regression | Percentage of time steps where the predicted and actual directions of change agree |
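The regression metrics in the table can be computed directly with NumPy; the DPA variant below (sign agreement of successive changes) is one common formulation and is an assumption on our part:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Common regression metrics for financial time series prediction."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    mape = np.mean(np.abs(err / y_true)) * 100
    mse = np.mean(err ** 2)
    rmse = np.sqrt(mse)
    r2 = 1 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    # DPA: fraction of steps where the predicted direction of change matches the actual one
    dpa = np.mean(np.sign(np.diff(y_pred)) == np.sign(np.diff(y_true)))
    return {"MAE": mae, "MAPE": mape, "MSE": mse, "RMSE": rmse, "R2": r2, "DPA": dpa}
```

Note that MAPE and RMSRE are undefined when a true value is zero, which is rarely an issue for price series but matters when predicting returns.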

The rest of

In this section, we discuss standalone models comprised of solely CNN or LSTM. Standalone traditional machine learning models like random forest (RF) or support vector machine (SVM), and basic deep learning models like ANN, are out of the scope of the paper.

A CNN-only model has solely CNN involved in the model structure. It can also be considered as a CNN-MLP or CNN-FC model as this model consists of CNN responsible for feature extraction and in most cases connected to a fully connected layer (often a basic ANN like MLP) for prediction. The CNN-only model has been used on the stock market, foreign exchange market, and cryptocurrency for both regression and classification problems. For example, Lin et al. [

Raw financial time series are often pre-processed or transformed to satisfy the requirement of CNN. References [

The CNN-only model has been reported to outperform traditional standalone models like SVM, linear regression, logistic regression, KNN, DT, RF [

An ensemble CNN model contains multiple parallel paths of CNNs extracting features from distinct datasets, and combines the outputs from all paths using a function (e.g., a simple or weighted average) to obtain a collective prediction.
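Such an output-combination step can be sketched as follows; the weighted-average form is one of the simple combination functions mentioned above, with the weights being an illustrative choice:

```python
import numpy as np

def ensemble_average(predictions, weights=None):
    """Combine predictions from parallel models with a (weighted) average."""
    P = np.asarray(predictions, float)          # shape: (n_models, n_samples)
    if weights is None:
        return P.mean(axis=0)                   # simple average
    w = np.asarray(weights, float)
    return (w[:, None] * P).sum(axis=0) / w.sum()  # weighted average
```

The weights can be fixed, tuned on a validation set, or learned jointly with the rest of the model.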

For instance, Gunduz et al. [

As discussed in

Note that there are numerous variations of LSTM. First, in terms of the numbers of input and output items, LSTM supports various sequence prediction configurations, including one-to-one, one-to-many, many-to-one, and many-to-many models. Second, at the cell level, the structure and functions used in each cell may vary, e.g., peephole LSTM [

There has been no consensus on how the LSTM-only model would compare with the CNN-only model, as the prediction performance has varied in different scenarios. Some researchers compared standalone models and concluded that CNN-only performed better than LSTM-only, Stacked-LSTM, Bidirectional-LSTM (BiLSTM) [

Recently, hybrid models, combining multiple deep learning or machine learning models in an organized structure, have gained increasing popularity among researchers [

The CNN-LSTM hybrid model is the most adopted one among all the selected recent research articles and has become one of the state-of-the-art methods. In CNN-LSTM, CNN is responsible for spatial feature extraction, and LSTM is responsible for handling temporal dependencies and conducting the time series prediction tasks.

In most cases, CNN is followed by LSTM in the model design, e.g., [
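To illustrate this division of labor end to end, the following NumPy sketch runs a univariate series through a valid 1-D convolution with ReLU and then an LSTM over the resulting feature sequence; the single-channel architecture and stacked-gate weight layout are illustrative simplifications, not the design of any specific reviewed model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cnn_lstm_forward(x, kernel, W, b):
    """Illustrative CNN-LSTM forward pass on a univariate series:
    Conv1D (valid) -> ReLU -> LSTM over the feature sequence -> final hidden state."""
    k = len(kernel)
    feats = np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])
    feats = np.maximum(feats, 0.0)                  # ReLU on the CNN feature map
    H = b.size // 4                                 # hidden size; W stacks the 4 gates' weights
    h, c = np.zeros(H), np.zeros(H)
    for f_t in feats:                               # LSTM consumes the extracted features in order
        z = W @ np.concatenate([h, [f_t]]) + b
        f, i, o = sigmoid(z[:H]), sigmoid(z[H:2*H]), sigmoid(z[3*H:])
        g = np.tanh(z[2*H:3*H])
        c = f * c + i * g                           # cell state update
        h = o * np.tanh(c)                          # hidden state
    return h                                        # summary vector for a dense prediction layer
```

In a real model the weights would be learned jointly by backpropagation, with a dense output layer mapping the final hidden state to a class probability or a price value.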

CNN-LSTM in most scenarios outperforms standalone models like CNN-only and LSTM-only. Reference [

Researchers have made efforts to enhance CNN-LSTM with variants of CNN or LSTM. For instance, Wang et al. [

As discussed in

Component organization (CNN, LSTM, and AM) may vary in design. Reference [

Standalone deep learning models like CNN and LSTM have proven to outperform traditional machine learning models like SVM and RF. Ensemble models with multiple CNNs in parallel have proven to be superior to the CNN-only model. The advantage of CNN is to extract spatial dependencies within data, and LSTM is capable of extracting temporal dependencies. It is natural to integrate them both in a way to take advantage of both benefits. In addition, the attention mechanism (AM) can handle extra long-term dependencies which are beyond the capabilities of LSTM. Thus, hybrid models like CNN-LSTM, LSTM-AM, and CNN-LSTM-AM have emerged recently.

Reference | Model | Type | Dataset | Evaluation metrics | Performance | Training time (s) |
---|---|---|---|---|---|---|
[ | Technical analysis | C | S&P 500 | F measure | 0.4469 | – |
 | PCA-ANN | | | | 0.4237 | – |
 | CNN-Corr | | | | 0.3928 | – |
 | CNN with 2D input | | | | 0.4914 | – |
 | CNN with 3D input | | | | 0.4837 | – |
[ | LSTM | C | SZSE 100 | Accuracy | 0.459041 | – |
 | TCN | | | | 0.640502 | – |
 | TCN-AM | | | | 0.694751 | – |
[ | CNN | R | JNJ | MAPE | 1.550 | 11 |
 | RNN | | | | 1.041 | 16 |
 | LSTM | | | | 1.301 | 18 |
 | CNN-LSTM-AM | | | | 0.921 | 31 |
[ | MLP | R | SSE Index | RMSE | 39.260 | – |
 | CNN | | | | 36.878 | – |
 | RNN | | | | 35.801 | – |
 | LSTM | | | | 34.331 | – |
 | BiLSTM | | | | 33.579 | – |
 | CNN-LSTM | | | | 32.640 | – |
 | CNN-BiLSTM | | | | 32.065 | – |
 | BiLSTM-AM | | | | 31.955 | – |
 | CNN-LSTM-AM | | | | 31.694 | – |

The training and inference time of a hybrid model on a GPU is in many scenarios longer than a standalone model. In other words, hybrid models have higher complexity than standalone models. For instance, the CNN-LSTM-AM model proposed in [

It should also be noted that there are other combinations based on CNN and LSTM, e.g., CNN-SVM [

Incorporating hybrid models into financial time series prediction tasks necessitates a mix of domain knowledge and deep learning skills. Building trustworthy and successful models for financial time series prediction requires continuous monitoring, validation of out-of-sample data, and transparency in understanding model decisions. While hybrid models are powerful, their complicated structures make them difficult to interpret. Model predictions can be attributed back to input features using techniques such as Layer-wise Relevance Propagation (LRP) and Shapley Additive Explanations (SHAP) to facilitate explainable AI [

Practically, sufficient attention should be paid to the following technical aspects when leveraging deep learning models for financial time series prediction:

As a summary,

Model | References | Merits | Demerits |
---|---|---|---|
CNN-only | [ | CNN serves as a spatial feature extraction component while reducing training complexity; outperforms traditional ML methods like ANN or SVM | Incapable of dealing with temporal dependencies in time series data |
LSTM-only | [ | Outperforms RNN; good for long-term dependencies; avoids gradient vanishing and exploding issues | Incapable of dealing with overly long-term dependencies; input data formatting is difficult to prepare |
Ensemble CNN | [ | Combines the output of multiple CNNs in parallel for collectively improved performance; outperforms CNN-only | Incapable of dealing with temporal dependencies in time series data |
CNN-LSTM | [ | Leverages the capabilities of CNN and LSTM for both spatial and temporal dependencies; outperforms standalone models | Higher complexity and lower efficiency in building and running than standalone models |
LSTM-AM | [ | Capable of handling overly long-term dependencies due to the introduction of AM; outperforms standalone models | Higher complexity and lower efficiency in building and running than standalone models |
CNN-LSTM-AM | [ | Leverages the capabilities of CNN, LSTM, and AM for spatial, temporal, and overly long-term dependencies; outperforms standalone models and CNN-LSTM | Higher complexity and lower efficiency in building and running than standalone models |

Financial time series prediction, whether for classification or regression, has been an active research topic over the last decade. While traditional machine learning algorithms have yielded mediocre results, deep learning has largely contributed to elevating prediction performance. Based on the literature between 2015 and 2023, CNN, LSTM, CNN-SVM, CNN-LSTM, and CNN-LSTM-AM are considered state-of-the-art models, which have beaten traditional machine learning methods in most scenarios. Among these models, hybrid models like CNN-LSTM and CNN-LSTM-AM have proven superior to standalone models in prediction performance, with the CNN component responsible for spatial feature extraction, the LSTM component responsible for long-term temporal dependencies, and AM handling overly long-term dependencies. This paper provides a one-stop guide for both academia and industry to review the most up-to-date deep learning techniques for financial market price prediction, and to guide potential researchers or practitioners in designing and building deep learning FinTech solutions. There are still some major challenges and gaps, which lead to future work directions.

AE: Auto encoder
AM: Attention mechanism
ANN: Artificial neural network
AR: Autoregression
ARCH: Autoregressive conditional heteroscedasticity
ARIMA: Autoregressive integrated moving average
ARMA: Autoregressive moving average
AutoML: Automated machine learning
CNN: Convolutional neural network
DL: Deep learning
DPA: Direction prediction accuracy
DQN: Deep Q-network
DT: Decision tree
EMD: Empirical mode decomposition
GCN: Graph convolutional network
GRU: Gated recurrent unit
KNN: K-nearest neighbors
LSTM: Long short-term memory
MAAPE: Mean arctangent absolute percentage error
MACD: Moving average convergence divergence
MAE: Mean absolute error
MAPE: Mean absolute percentage error
ML: Machine learning
MLP: Multi-layer perceptron
MSE: Mean squared error
NLP: Natural language processing
NN: Neural network
O: Other data
OHLC: Open, high, low, close prices
P: Plain price data
PCA: Principal component analysis
R^{2}: R squared
RF: Random forest
RMSE: Root mean squared error
RMSRE: Root mean squared relative error
RNN: Recurrent neural network
RoC: Rate of change
RSI: Relative strength index
S: Sentiment data
SMA: Simple moving average
SVM: Support vector machine
TI: Technical indicator data
WT: Wavelet transform

We acknowledge the ongoing support provided by Prof. Fethi Rabhi and the FinanceIT research group of UNSW Sydney.

This research was funded by the Natural Science Foundation of Fujian Province, China (Grant No. 2022J05291) and Xiamen Scientific Research Funding for Overseas Chinese Scholars.

The authors confirm the contribution to the paper as follows: study conception and design: W. Chen, W. Hussain; data collection: W. Chen, W. Hussain; analysis and interpretation of results (for the reviewed publications): W. Chen, F. Cauteruccio, X. Zhang; draft manuscript preparation: W. Chen, W. Hussain, X. Zhang; revised manuscript preparation: W. Chen, W. Hussain, F. Cauteruccio. All authors reviewed and approved the final version of the manuscript.

The articles surveyed in this paper can be found in mainstream indexing databases. The list of reviewed papers and relevant analysis data can be made available by the authors upon request.

The authors declare that they have no conflicts of interest to report regarding the present study.