Forecasting directional movement of Forex data using LSTM with technical and macroeconomic indicators.

Forex (foreign exchange) is a special financial market that entails both high risks and high profit opportunities for traders. It is also a very simple market since traders can profit by just predicting the direction of the exchange rate between two currencies. However, incorrect predictions in Forex may cause much higher losses than in other typical financial markets. The direction prediction requirement makes the problem quite different from other typical time-series forecasting problems. In this work, we used a popular deep learning tool called “long short-term memory” (LSTM), which has been shown to be very effective in many time-series forecasting problems, to make direction predictions in Forex. We utilized two different data sets—namely, macroeconomic data and technical indicator data—since in the financial world, fundamental and technical analysis are two main techniques, and they use those two data sets, respectively. Our proposed hybrid model, which combines two separate LSTMs corresponding to these two data sets, was found to be quite successful in experiments using real data.

Introduction.

The foreign exchange market, known as Forex or FX, is a financial market where currencies are bought and sold simultaneously. Forex is the world’s largest financial market, with a daily trading volume of more than $5 trillion. It is a decentralized market that operates 24 h a day, except for weekends, which makes it quite different from other financial markets.

The characteristics of Forex show differences compared to other markets. These differences can bring advantages to Forex traders for more profitable trading opportunities. Some of these advantages include no commissions, no middlemen, no fixed lot size, low transaction costs, high liquidity, almost instantaneous transactions, low margins/high leverage, 24-h operations, no insider trading, limited regulation, and online trading opportunities. Two types of techniques are used to predict future values for typical financial time series—fundamental analysis and technical analysis—and both can be used for Forex. The former uses macroeconomic factors while the latter uses historical data to forecast the future price or the direction of the price.

The main decision in Forex involves forecasting the directional movement between two currencies. Traders can profit from transactions with correct directional prediction and lose with incorrect prediction. Therefore, identifying directional movement is the problem addressed in this study.

We chose the Euro/US dollar (EUR/USD) pair for the analysis since it is the largest traded Forex currency pair in the world, accounting for more than 80% of the total Forex volume.

In recent years, deep learning tools, such as long short-term memory (LSTM), have become popular and have been found to be effective for many time-series forecasting problems. In general, such problems focus on determining the future values of time-series data with high accuracy. However, in direction prediction problems, accuracy cannot be defined as simply the difference between actual and predicted values. Therefore, a novel rule-based decision layer needs to be added after obtaining predictions from LSTMs.
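The gap between value accuracy and direction accuracy can be illustrated with a small sketch. The prices, the two hypothetical forecast series (`model_a`, `model_b`), and the function names below are illustrative assumptions, not data or code from this study.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def directional_accuracy(previous, actual, predicted):
    """Fraction of steps where the predicted move away from the previous
    value points the same way (up or not up) as the actual move."""
    hits = 0
    for prev, act, pred in zip(previous, actual, predicted):
        hits += (act > prev) == (pred > prev)
    return hits / len(actual)


# Hypothetical EUR/USD closes: previous[t] is the last known value,
# actual[t] is the value that was to be forecast.
previous = [1.1500, 1.1510, 1.1495, 1.1505]
actual = [1.1510, 1.1495, 1.1505, 1.1520]

# model_a stays close to the actual values but on the wrong side of `previous`.
model_a = [1.1498, 1.1512, 1.1493, 1.1503]
# model_b is farther from the actual values but always on the correct side.
model_b = [1.1530, 1.1470, 1.1530, 1.1550]

for name, forecast in (("model_a", model_a), ("model_b", model_b)):
    mae = mean_absolute_error(actual, forecast)
    acc = directional_accuracy(previous, actual, forecast)
    print(f"{name}: MAE={mae:.5f}, directional accuracy={acc:.2f}")
```

In this constructed case, the forecast with the smaller value error gets every direction wrong, which is why a value-based error measure alone is not a suitable target for direction prediction.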

In this work, we propose a hybrid model composed of a macroeconomic LSTM model and a technical LSTM model, named after the types of data they use. We first separately investigated the effects of these data on directional movement. After that, we combined the results to significantly improve prediction accuracy. The macroeconomic LSTM model utilizes several financial factors, including interest rates, Federal Reserve (FED) funds rate, inflation rates, Standard and Poor’s (S&P) 500, and Deutscher Aktien IndeX (DAX) market indexes. Each factor has important effects on the trend of the EUR/USD currency pair. This can be interpreted as a fundamental analysis of price data. The other model is the technical LSTM model, which takes advantage of technical analysis. Technical analysis is based on technical indicators that are mathematical functions used to predict future price action. The feature set in our model uses popular technical indicators such as moving average (MA), moving average convergence divergence (MACD), rate of change (ROC), momentum, relative strength index (RSI), Bollinger bands (BB), and the commodity channel index (CCI).

The contributions of this study are as follows:

A popular deep learning tool called LSTM, which is frequently used to forecast values in time-series data, is adopted to predict direction in Forex data.

Both macroeconomic and technical indicators are used as features to make predictions.

A novel hybrid model is proposed that combines two different models with smart decision rules to increase decision accuracy by eliminating transactions with weaker confidence.

The proposed model and baseline models are tested using recent real data to demonstrate that the proposed hybrid model outperforms the others.

The rest of this paper is organized as follows. In “Related work” section, related studies of the financial time-series prediction problem are thoroughly examined. “Forex preliminaries”–“Technical indicators” sections provide background information about Forex, LSTM, and the technical indicators. Then, “The data set” section presents the data set used in the experiments. “LSTM-based hybrid model using macroeconomic and technical indicators” section introduces the proposed algorithm to handle the directional movement prediction problem. Moreover, the preprocessing and postprocessing phases are also explained in detail. “Experiments” section presents the results of the experiments and the classification performances of the proposed model. “Discussion” and “Conclusion” sections discuss the experimental results and provide insight for future research directions.

Related work.

Various forecasting methods have been considered in the finance domain, including machine learning approaches (e.g., support vector machines and neural networks) and new methods such as deep learning. Unfortunately, there are not many survey papers on these methods. Cavalcante et al. (2022), Bahrammirzaee (2010), and Saad and Wunsch (1998) have provided overviews of the field. The most recent of these, by Cavalcante et al. (2022), categorized the approaches used in different financial markets. Although that study mainly introduced methods proposed for the stock market, it also discussed applications for foreign exchange markets.

There has been a great deal of work on predicting future values in stock markets using various machine learning methods. We discuss some of them below.

Selvamuthu et al. (2022) used neural networks based on Levenberg–Marquardt, scaled conjugate gradient, and Bayesian regularization for stock market prediction based on tick data and 15-min-interval data for an Indian company.

Patel et al. (2022b) developed a two-stage fusion structure to predict the future values of the stock market index for 1–10, 15, and 30 days using 10 technical indicators. In the first stage, support vector machine regression (SVR) was applied to these inputs, and the results were fed into an artificial neural network (ANN). SVR and random forest (RF) models were used in the second stage. They compared the fusion model with standalone ANN, SVR, and RF models. They reported that the fusion model significantly improved upon the standalone models.

Guresen et al. (2011) explored several ANN models for predicting stock market indexes. These models include multilayer perceptron (MLP), dynamic artificial neural network (DAN2), and hybrid neural networks with generalized autoregressive conditional heteroscedasticity (GARCH). Applying mean-square error (MSE) and mean absolute deviation (MAD), their results showed that MLP performed slightly better than DAN2 and GARCH-MLP while GARCH-DAN2 had the worst results.

Weng et al. (2022) developed a financial expert system using ensemble methods (i.e., neural network regression ensemble (NNRE), support vector regression ensemble (SVRE), boosted regression tree (BRT), and random forest regression (RFR)) to predict stock prices 1 day ahead. Market prices, technical indicators, financial news, Google Trends, and the number of unique visitors to Wikipedia pages were used as inputs. They also investigated the effect of PCA on performance. They reported that ensembles with PCA performed better than those without PCA. They also noted that BRT and RFR were the best while SVRE was the worst in terms of mean absolute percentage error.

Huang et al. (2005) examined forecasting weekly stock market movement direction using SVM. They compared SVM with linear discriminant analysis, quadratic discriminant analysis, and Elman back-propagation neural networks. They also proposed a model that combined SVM with other classifiers. They used not only the NIKKEI 225 index but also macroeconomic variables as features for the model. Their direction calculation was based on the first-order difference natural logarithmic transformation, and the directions were either increasing or decreasing. SVM outperformed the other models with an accuracy of 73% while the combined model was the best, with an accuracy of 75%.
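A direction label based on the first-order difference of log prices can be sketched as follows. This is a minimal illustration of the general idea, not the code of Huang et al.; the function name, the sample prices, and the choice to label an unchanged price as "not increasing" are assumptions.

```python
import math


def log_return_directions(prices):
    """Label each step 1 (increasing) or 0 (not increasing) from the sign of
    ln(p_t) - ln(p_{t-1}). Treating a zero difference as 0 is an assumption."""
    labels = []
    for prev, curr in zip(prices, prices[1:]):
        log_diff = math.log(curr) - math.log(prev)
        labels.append(1 if log_diff > 0 else 0)
    return labels


# Hypothetical weekly closing values of an index.
weekly_closes = [100.0, 102.0, 101.0, 101.0, 105.0]
print(log_return_directions(weekly_closes))  # [1, 0, 0, 1]
```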

Kara et al. (2011) compared the performance of ANN and SVM for predicting the direction of stock price index movement. Ten technical indicators were used as inputs for the model. They found that ANN, with an accuracy of 75.74%, performed significantly better than SVM, which had an accuracy of 71.52%.

Patel et al. (2022a) compared the performance of four classifiers (ANN, SVM, random forest, and naive Bayes) for stock price index direction using two approaches. In the first approach, they used 10 technical indicator values as inputs with different parameter settings for classifiers. Prediction accuracy fell within the range of 0.7331–0.8359. In the other approach, they represented the same 10 technical indicator results as directions (up and down), which were used as inputs for the classifiers. Using this approach, they enhanced accuracy by about 15% for all of the classifiers. Although their experiments concerned short-term prediction, the direction period was not explicitly explained.

Ballings et al. (2022) evaluated ensemble methods (random forest, AdaBoost, and kernel factory) against neural networks, logistic regression, SVM, and k-nearest neighbor for predicting 1 year ahead. They used different stock market domains in their experiments. According to the median area under the curve (AUC) scores, random forest showed the best performance, followed by SVM, kernel factory, and AdaBoost.

Hu et al. (2022) introduced an improved sine–cosine algorithm (ISCA) for optimizing the weights and biases of BPNN to predict the directions of open stock prices of the S&P 500 and Dow Jones Industrial Average indices. Using Google Trends data in addition to the opening, high, low, and closing price, as well as trading volume, in their experiments, they obtained an 86.81% hit ratio for the S&P 500 index and an 88.98% hit ratio for the Dow Jones Industrial Average Index.

Gui et al. (2022) investigated SVM for predicting stock price index direction with different parameter settings. That study also compared the result for SVM with BPNN and case-based reasoning models; multiple technical indicators were used as inputs for the models. That study found that SVM outperformed the other models with an accuracy of 57.8313% while the other models had accuracies of 54.7332% and 51.9793%, respectively.

Qiu and Song (2022) developed a genetic algorithm (GA)-based optimized ANN to predict the direction of the next day’s price in the stock market index. GA was used to optimize the initial weights and bias of the model. Two types of input sets were generated using several technical indicators of the daily price of the Nikkei 225 index and fed into the model. They obtained accuracies of 60.87% for the first set and 81.27% for the second set.

Zhong and Enke (2022) investigated three dimensionality reduction techniques applied to ANN for forecasting the daily direction of the S&P 500 Index ETF (SPY). Principal component analysis (PCA), fuzzy robust principal component analysis (FRPCA), and kernel-based principal component analysis (KPCA) were used to reduce the number of features. Their experiments indicated that ANN with PCA performed slightly better than ANN with the other two techniques.

Zhong and Enke (2022) used deep neural networks and ANNs to forecast the daily return direction of the stock market. They performed experiments on both untransformed and PCA-transformed data sets to validate the model.

In addition to classical machine learning methods, researchers have recently started to use deep learning methods to predict future stock market values. LSTM has emerged as a deep learning tool for application to time-series data, such as financial data.

Zhang et al. (2022) proposed a state-frequency memory recurrent network, which is a modification of LSTM, to forecast stock prices. By decomposing the hidden states of memory cells into multiple frequency components, they could learn the trading patterns of those frequencies. They used state-frequency components to predict future price values through nonlinear regression. They used stock prices from several sectors and performed experiments to make forecasts for 1, 3, and 5 days. They compared the results with LSTM and autoregressive integrated moving average (ARIMA) in terms of mean-square error. They obtained errors of 5.57, 17.00, and 28.90 for the different steps, which outperformed the other models.

Fulfillment et al. (2022) studied stock market forecasting in six different domains using LSTM. The study aimed to predict the next 3 h using hourly historical stock data. The model was trained to classify three classes: increasing 0–1%, increasing above 1%, and not increasing (less than 0%). The accuracy results ranged from 49.75 to 59.5%. That study also built a stock trading simulator to test the model on real-world stock trading activity. With that simulator, the model made a profit in all six stock domains, with an average of 6.89%.

Nelson et al. (2022) examined LSTM for predicting 15-min trends in stock prices using technical indicators. They used 175 technical indicators (i.e., external technical analysis library) and the open, close, minimum, maximum, and volume as inputs for the model. They compared their model with a baseline consisting of multilayer perceptron, random forest, and pseudo-random models. The accuracy of LSTM for different stocks ranged from 53 to 55.9%. They concluded that LSTM performed significantly better than the baseline models, according to the Kruskal–Wallis test.

More recently, Fischer and Krauss (2022) applied LSTM to the stock market. They investigated many different aspects of the stock market and found that LSTM was very successful for predicting future prices for that type of time-series data. They also compared LSTM with more traditional machine learning tools to show its superior performance.

Similarly, Di Persio and Honchar (2022) applied LSTM and two other traditional neural network based machine learning tools to future price prediction. They also analyzed ensemble-based solutions by combining results obtained using different tools.

In addition to traditional exchanges, many studies have also investigated Forex. Some studies of Forex based on traditional machine learning tools are discussed below.

Galeshchuk and Mukherjee (2022) investigated the performance of a convolutional neural network (CNN) for predicting the direction of change in Forex. Using the daily closing rates of EUR/USD, GBP/USD, and USD/JPY, they compared the results of CNN with their baseline models and SVM. While the baseline models and SVM had an accuracy of around 65%, their proposed CNN model had an accuracy of about 75%.

Meanwhile, Kayal (2010) investigated the use of MLP in Forex. That work used basic technical indicators as inputs.

Ghazali et al. (2009) also investigated the use of neural networks for Forex. They proposed a higher-order neural network called a dynamic ridge polynomial neural network (DRPNN). In their experiments, DRPNN performed better than a ridge polynomial neural network (RPNN) and a pi-sigma neural network (PSNN).

To predict exchange rates, Majhi et al. (2009) proposed using new ANNs, referred to as a functional link artificial neural network (FLANN) and a cascaded functional link artificial neural network (CFLANN). They demonstrated that those new networks were more robust and had lower computational costs compared to an MLP trained with back-propagation.

In what is commonly called a mark-to-market approach, market prices are increasingly being used to calibrate models to quantify risk in several sectors. The net present value of a financial institution, for example, is an important input for estimating both bankruptcy risk (e.g., Kou et al. 2022) and the likelihood that shocks will propagate throughout the financial system (Kou et al. 2022). In such a context, stock price crashes not only dramatically damage the capital market but also have medium-term adverse effects on the financial sector as a whole (Wen et al. 2022). Credit risk is a major factor in financial shocks. Therefore, a realistic appraisal of solvency needs to be an objective for banks. At the level of the individual borrower, credit scoring is a field in which machine learning methods have been used for a long time (e.g., Shen et al. 2022; Wang et al. 2022).

Deep learning methods such as LSTM are rarely used for Forex. In one recent work, Shen et al. (2022) proposed a modified deep belief network. They were able to show that deep learning approaches outperformed traditional methods.

Even though LSTM is starting to be used in financial markets, using it in Forex for direction forecasting between two currencies, as proposed in the present work, is a novel approach.

Forex preliminaries.

Forex has characteristics that are quite different from those of other financial markets (Archer 2010; Ozorhan et al. 2022). To explain Forex, we start by describing how a trade is made. Profit/loss calculations are made using the difference between the final ratio and the initial ratio of the currency pair that has been traded. If the ratio of the currency pair increases and the trader goes long, or the currency pair ratio decreases and the trader goes short, the trader will profit from that transaction when it is closed. Otherwise, the trader will not profit. For example, let us assume the EUR/USD ratio was 1.1500 when the trader started a transaction, going long with an initial amount of $10,000. When the position closes (i.e., the transaction ends) with a ratio of 1.1550, the trader will gain \(10000 * (1.1550 - 1.1500) = \$50\). When the position closes with a ratio of 1.1450, the trader will lose \(10000 * (1.1500 - 1.1450) = \$50\). Furthermore, these calculations are based on no leverage. If the trader uses a leverage value such as 10, both the loss and the gain are multiplied by 10.
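The profit/loss arithmetic of this example can be reproduced with a short sketch. The function name `position_pnl` and its parameters are hypothetical; the formula is the simplified one used in the example above (amount times ratio difference, scaled by leverage) and ignores spread, swap, and other costs.

```python
def position_pnl(amount, open_rate, close_rate, direction="long", leverage=1):
    """Simplified profit/loss of one position: amount * ratio change * leverage.
    A long position gains when the ratio rises; a short one when it falls."""
    if direction not in ("long", "short"):
        raise ValueError("direction must be 'long' or 'short'")
    sign = 1 if direction == "long" else -1
    return sign * amount * (close_rate - open_rate) * leverage


# Long EUR/USD opened at 1.1500 with $10,000, as in the example above.
print(round(position_pnl(10_000, 1.1500, 1.1550), 2))               # 50.0
print(round(position_pnl(10_000, 1.1500, 1.1450), 2))               # -50.0
# The same trades with a leverage value of 10.
print(round(position_pnl(10_000, 1.1500, 1.1550, leverage=10), 2))  # 500.0
# A short position profits when the ratio falls.
print(round(position_pnl(10_000, 1.1500, 1.1450, "short"), 2))      # 50.0
```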

Detailed definitions of commonly used concepts and terms in Forex can be found in Forex (2022), Archer (2010) and Özorhan (2022). Here, we explain only the most important ones.

Base currency, which is also called the transaction currency, is the first currency in the currency pair while quote currency is the second one in the pair. To illustrate, in the EUR/USD pair, EUR is the base currency, and USD is the quote currency.

Being long (or going long) means buying the base currency or selling the quote currency in the currency pair. Being short (or going short) means selling the base currency or buying the quote currency in the currency pair. Pip is an abbreviation for “percentage of point,” defined as the smallest amount of change occurring in the currency ratio. In general, a pip corresponds to the fourth decimal place of the ratio (i.e., 0.0001). A pipette is a fractional pip, which corresponds to the fifth decimal place (i.e., 0.00001). In other words, 1 pip equals 10 pipettes.
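As a small illustration of these units, the sketch below converts a ratio change into pips and pipettes. The constant and function names are hypothetical, and the default pip size of 0.0001 is the general case described above.

```python
PIP = 0.0001       # fourth decimal place
PIPETTE = 0.00001  # fifth decimal place; 10 pipettes make 1 pip


def change_in_units(open_rate, close_rate, unit_size=PIP):
    """Express the ratio change in multiples of `unit_size`, rounded to one
    decimal to absorb floating-point noise."""
    return round((close_rate - open_rate) / unit_size, 1)


# A move from 1.1500 to 1.1550 is 50 pips, or 500 pipettes.
print(change_in_units(1.1500, 1.1550))           # 50.0
print(change_in_units(1.1500, 1.1550, PIPETTE))  # 500.0
```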

Leverage corresponds to the use of borrowed money when making transactions. A leverage of 1:100 indicates that if one opens a position with a volume of 1, the actual transaction volume will be 100. After using leverage, one can either gain or lose 100 times the amount of that volume. Margin refers to money borrowed by a trader that is supplied by a broker to make investments using leverage. In this way, one can multiply his/her gains or losses.

Bid price is the price at which the trader can sell the base currency. Ask price is the price at which the trader can buy the base currency. Spread is the difference between the ask and bid prices. A lower spread means the trader can profit from small price changes. Spread value is dependent on market volatility and liquidity. Stop loss is an order to sell a currency when it reaches a specified price. This order is used to prevent larger losses for the trader. Take profit is an order by the trader to close the open position (transaction) for a gain when the price reaches a predefined value. This order guarantees profit for the trader without having to worry about changes in the market price. Market order is an order that is performed instantly at the current price. Swap is a simultaneous buy and sell action for the currency at the same amount at a forward exchange rate. This protects traders from fluctuations in the interest rates of the base and quote currencies. If the base currency has a higher interest rate and the quote currency has a lower interest rate, then a positive swap will occur; in the reverse case, a negative swap will occur.
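The spread and the stop loss/take profit orders can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, a long position is assumed to be closed at the bid price, and the first observed bid that crosses a level triggers the order at that observed price (no slippage or partial fills).

```python
def spread_in_pips(bid, ask, pip_size=0.0001):
    """Spread (ask minus bid) expressed in pips."""
    return round((ask - bid) / pip_size, 1)


def close_long_position(bid_prices, stop_loss, take_profit):
    """Scan successive bid prices for an open long position and return
    (index, price, reason) for the first price that triggers an order,
    or None if neither level is reached."""
    for index, bid in enumerate(bid_prices):
        if bid <= stop_loss:
            return index, bid, "stop_loss"
        if bid >= take_profit:
            return index, bid, "take_profit"
    return None


# Hypothetical quotes: a 2-pip spread.
print(spread_in_pips(bid=1.1500, ask=1.1502))  # 2.0

# Hypothetical bid path after opening a long position.
bids = [1.1505, 1.1498, 1.1520, 1.1531, 1.1490]
print(close_long_position(bids, stop_loss=1.1480, take_profit=1.1530))
# (3, 1.1531, 'take_profit')
```

Because a long position is opened at the ask and closed at the bid, the ratio has to move by at least the spread before the position breaks even, which is why a lower spread lets the trader profit from smaller price changes.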

Fundamental analysis and technical analysis are the two techniques commonly used for predicting future prices in Forex. While the first is based on economic factors, the latter is related to price actions (Archer 2010).

Fundamental analysis focuses on the economic, social, and political factors that can cause prices to move higher, move lower, or stay the same (Archer 2010; Murphy 1999). These factors are also called macroeconomic factors. Economic data reports, interest rates, monetary policy, and international trade/investment flows are some examples (Ozorhan et al. 2022).

Technical analysis uses only the price to predict future price movements (Kritzer and Service 2012). This approach studies the effect of price movement. Technical analysis mainly uses open, high, low, close, and volume data to predict market direction or generate sell and buy signals (Archer 2010). It is based on the following three assumptions (Murphy 1999):

Market action discounts everything.

Price moves in trends.

History repeats itself.

Chart analysis and price analysis using technical indicators are the two main approaches in technical analysis. While the former is used to detect patterns in price charts, the latter is used to predict future price actions (Ozorhan et al. 2022).

Long short-term memory (LSTM)

Long short-term memory (LSTM) was proposed by Hochreiter and Schmidhuber (1997). LSTM is a recurrent neural network architecture that was designed to overcome the vanishing gradient problem found in conventional recurrent neural networks (RNNs) (Biehl 2005). Errors between layers tend to vanish or blow up, which causes oscillating weights or unacceptably long convergence times. The initial LSTM structure solves this problem by introducing the constant error carousel (CEC). In this way, the architecture ensures constant error flow between the self-connected units (Hochreiter and Schmidhuber 1997).

The memory cell of the initial LSTM structure consists of an input gate and an output gate. While the input gate decides which information should be kept or updated in the memory cell, the output gate controls which information should be output. This standard LSTM was extended with the introduction of a new feature called the forget gate (Gers et al. 2000). The forget gate is responsible for resetting a memory state that contains outdated information. Furthermore, peephole connections and full back-propagation through time (BPTT) training are final features that were added to the LSTM architecture (Gers and Schmidhuber 2000; Greff et al. 2022). With these modifications, the architecture was renamed Vanilla LSTM (Greff et al. 2022), as shown in Fig. 1.

Fig. 1 Vanilla LSTM (Greff et al. 2022)

LSTM offers an effective and scalable model for learning problems that include sequential data (Greff et al. 2022). It has been used in many different fields, including handwriting recognition (Graves et al. 2009; Pham et al. 2014) and generation (Graves 2013), language modeling (Zaremba et al. 2014) and translation (Luong et al. 2022), acoustic modeling of speech (Zia and Zahid 2022), speech synthesis (Fan et al. 2014), protein secondary structure prediction (Sønderby and Winther 2014), audio analysis (Marchi et al. 2014), and video data analysis (Donahue et al. 2022; Greff et al. 2022).
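The roles of the input, forget, and output gates and the peephole connections can be sketched as one time step of a peephole LSTM block. This is a minimal pure-Python illustration, not the implementation used in this work. The parameter names follow the weight notation of the forward pass section (W for input weights, R for recurrent weights, p for peephole weights, b for biases), and the use of tanh for the block input and output activations is an assumption (a common choice).

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, x, R, y_prev, b):
    """Compute W x + R y_prev + b with list-based matrices and vectors."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(r * yi for r, yi in zip(r_row, y_prev))
        + b_n
        for w_row, r_row, b_n in zip(W, R, b)
    ]


def lstm_step(x, y_prev, c_prev, p):
    """One forward step of a peephole LSTM block (N units, M inputs).
    Returns the new block output y and cell state c."""
    # Block input: candidate values to write into the cell.
    z = [math.tanh(v) for v in affine(p["W_z"], x, p["R_z"], y_prev, p["b_z"])]
    # Input gate: how much of the candidate is written (peeks at old cell).
    i_bar = affine(p["W_i"], x, p["R_i"], y_prev, p["b_i"])
    i = [sigmoid(v + pi * c) for v, pi, c in zip(i_bar, p["p_i"], c_prev)]
    # Forget gate: how much of the old cell state is kept.
    f_bar = affine(p["W_f"], x, p["R_f"], y_prev, p["b_f"])
    f = [sigmoid(v + pf * c) for v, pf, c in zip(f_bar, p["p_f"], c_prev)]
    # New cell state.
    c = [zn * i_n + cp * fn for zn, i_n, cp, fn in zip(z, i, c_prev, f)]
    # Output gate: how much of the cell is exposed (peeks at new cell).
    o_bar = affine(p["W_o"], x, p["R_o"], y_prev, p["b_o"])
    o = [sigmoid(v + po * cn) for v, po, cn in zip(o_bar, p["p_o"], c)]
    y = [math.tanh(cn) * on for cn, on in zip(c, o)]
    return y, c


# Toy check with N = 2 units, M = 3 inputs, and all-zero parameters.
N, M = 2, 3
params = {k: [[0.0] * M for _ in range(N)] for k in ("W_z", "W_i", "W_f", "W_o")}
params.update({k: [[0.0] * N for _ in range(N)] for k in ("R_z", "R_i", "R_f", "R_o")})
params.update({k: [0.0] * N for k in ("p_i", "p_f", "p_o", "b_z", "b_i", "b_f", "b_o")})

y, c = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -2.0], params)
print(c)  # [0.5, -1.0]
```

With all-zero parameters every gate evaluates to 0.5 and the block input to 0, so the cell state is simply halved. This makes the effect of the forget gate visible in isolation.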

Forward pass.

One of the two main operations of LSTM, shown in Fig. 1, is called the forward pass. In the forward pass, the calculation moves forward by updating the weights (Greff et al. 2022). The weights of LSTM can be categorized as follows:

Input weights: \(W_z, W_i, W_f, W_o \in \mathbb{R}^{N \times M}\)
Recurrent weights: \(R_z, R_i, R_f, R_o \in \mathbb{R}^{N \times N}\)
Peephole weights: \(p_i, p_f, p_o \in \mathbb{R}^{N}\)
Bias weights: \(b_z, b_i, b_f, b_o \in \mathbb{R}^{N}\),

where z is the block input, i is the input gate, f is the forget gate, o is the output gate, N is the number of LSTM blocks, and M is the number of inputs. By introducing \(x^t\) as the input vector, \(y^t\) as the block output, and \(c^t\) as the cell at time t, the formulation of the forward pass in Vanilla LSTM can be defined as below: