Kevin Xiao, Anahita Vaidhya, Nathan Pao, Audrey Kuo Stanford University and New York University
One important approach to systematic trading is to base decisions on predicted prices, which are typically produced by statistical and machine learning models. We develop several models for stock price prediction and compare the feasibility and accuracy of each. Our trading strategy acts on these predictions, and we tested it by simulating real-time trading, which shows how each model performs under live conditions. Based on these comparisons, we assess the effectiveness of the different models for stock price prediction and trading.
Applying machine learning and financial algorithms to market trading allows traders to make investment decisions faster and with better information. Algorithmic trading refers to a program that follows a set algorithm to trade. Because such models can be trained on far more data than a person can review, they can base decisions on trends in historical data that a human trader might miss. Currently, companies use strategies such as arbitrage, index fund rebalancing, mean reversion, and market timing to make trading decisions quickly, yet these rely on rigid rules and cannot account for the complexities of shifting stock prices. A machine learning model, by contrast, can learn from past trends and make a more informed decision before trading.
To develop a more accurate prediction model, we analyzed a diverse set of existing prediction models, along with ones still in development, for efficiency and accuracy. These ranged from linear and logistic regression to AutoRegressive Integrated Moving Average (ARIMA), Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), and Holt-Winters Exponential Smoothing (HWES). We ran a backtest to measure the accuracy of each model and set up an email program that sends the investor an automated daily report on the returns of each model.
To optimize our statistical models, such as ARIMA and HWES, we detrended the data and adjusted for seasonality. To optimize our machine learning models, such as LSTM and GRU, we stacked sequential layers, each of which reads the output of the layer before it, creating a deep learning architecture that can train on historical data. To improve performance, we repeatedly adjusted the modeling parameters until we found the set that worked best. After testing our models over a couple of weeks, we found that the LSTM model had the greatest average return.
Various methods have historically been used to apply data science to stock trading, the first of which is algorithmic trading: the automatic buying and selling of stocks based on set rules and calculated decisions. Once a model is defined, machine learning is used to train it to predict future stock prices and their fluctuations from patterns and trends in a window of past data. After backtesting, we compare the model’s predictions with the actual stock returns and use the difference between the two to improve the model’s accuracy.
This approach to algorithmic trading is not new: many different models have been used for trading, including both time series models and classification models. The former use past stock price data, often together with machine learning or deep reinforcement learning, to predict future prices, while the latter assign data points to discrete classes, such as whether a price will rise or fall. Common models in wide use for this purpose include XGBoost, a gradient-boosted decision tree library, and LSTM, a type of neural network.
The first step in our research process was data collection. We requested historical data for a specific time frame from the Yahoo Finance API. The time frame varied with the model under study, but we generally collected data at fifteen-minute intervals over a ten-day window. The data was loaded into a Pandas DataFrame, a two-dimensional, size-mutable table that supports data manipulation.
The next step was to clean and filter the data. Yahoo Finance provides an array of useful data, including opening, closing, high, and low prices, trading volume, and timestamps. Our models require only the timestamps and the closing price of the stock, so our first task was to reduce the data tables to these two fields. After creating a data frame of closing prices and timestamps, we could move on to fitting our data into a trainable dataset.
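The filtering step can be sketched as follows. This is a minimal illustration that assumes the raw data arrives as a DataFrame indexed by timestamp with the usual Open, High, Low, Close, and Volume columns; the sample values and the helper name closing_prices are made up for the example and are not taken from our pipeline.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a price history response: 15-minute bars.
# All values below are invented for illustration.
timestamps = pd.date_range("2022-07-25 09:30", periods=6, freq="15min")
raw = pd.DataFrame(
    {
        "Open": [150.0, 150.4, 150.1, 150.8, 151.0, 151.3],
        "High": [150.6, 150.7, 150.9, 151.2, 151.5, 151.6],
        "Low": [149.8, 150.0, 149.9, 150.5, 150.8, 151.0],
        "Close": [150.4, 150.1, np.nan, 151.0, 151.3, 151.2],
        "Volume": [1200, 900, 0, 1500, 1100, 1000],
    },
    index=timestamps,
)


def closing_prices(frame: pd.DataFrame) -> pd.DataFrame:
    """Keep only the timestamp index and the closing price (hypothetical helper)."""
    closes = frame[["Close"]].dropna()  # drop bars with no recorded close
    return closes.sort_index()          # keep observations in time order


closes = closing_prices(raw)
```

Dropping rows with a missing close is one possible cleaning rule; interpolating the gap would be an equally reasonable choice.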
Data Fitting and Creating Features
The general procedure for building and testing our models required extensive feature engineering to fit the data to each model. After filtering the data down to the closing price, we scaled it to a range between 0 and 1. This is known as “Min-Max Normalization”; it keeps the inputs on a common, bounded scale so that large raw price values do not dominate training. We will call the result the scaled dataset.
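As an illustration, Min-Max Normalization maps each value x to (x - min) / (max - min). The sketch below uses plain NumPy; the function names and sample prices are hypothetical, and the inverse function is included because predictions made on the scaled data must be mapped back to prices.

```python
import numpy as np


def min_max_scale(values):
    """Scale values linearly to [0, 1]; also return the bounds used."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # A constant series has no range; map everything to 0.
        return np.zeros_like(values), lo, hi
    return (values - lo) / (hi - lo), lo, hi


def min_max_invert(scaled, lo, hi):
    """Map scaled values back to the original price range."""
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo


closes = np.array([150.0, 152.5, 149.0, 155.0, 153.0])  # made-up prices
scaled, lo, hi = min_max_scale(closes)
restored = min_max_invert(scaled, lo, hi)
```

One design point worth noting: if the bounds are computed on the full dataset, information from the test period leaks into training, so computing lo and hi on the training portion only is the more conservative choice.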
Next, we split our dataset. This is one of the most crucial steps: a model evaluated on the same data it was trained on can overfit, fitting its training data almost exactly while failing to make reliable predictions on unseen data, and the problem would go unnoticed. We used an 80/20 split, with 80% of the dataset used for training and the remaining 20% for testing.
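A minimal sketch of the 80/20 split is shown below. It assumes the split is chronological, with the earliest 80% of observations used for training, which is the usual choice for time series but is an assumption here; the function name is hypothetical.

```python
import numpy as np


def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series into an early training part and a late test part."""
    series = np.asarray(series)
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


scaled = np.linspace(0.0, 1.0, 50)  # stand-in for the scaled dataset
train, test = chronological_split(scaled)
```

Keeping the order intact means the model is always tested on data that comes after everything it was trained on, which mirrors how it would be used in live trading.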
From the training portion, we created the training dataset that is fed into our models. This requires two further datasets, x_train and y_train. The dataset x_train holds the historical data. Its size depends on the number of data points used per prediction; in this case, we built our models on the past ten data points. The dataset y_train holds the future data points that are compared with the model’s predictions to give the model an accuracy score.
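The construction of x_train and y_train can be sketched as a sliding window. The example assumes each target is the single data point immediately following its ten-point window; the helper name make_windows and the sample series are hypothetical.

```python
import numpy as np


def make_windows(series, lookback=10):
    """Pair each run of `lookback` past values with the value that follows it."""
    series = np.asarray(series, dtype=float)
    x, y = [], []
    for end in range(lookback, len(series)):
        x.append(series[end - lookback:end])  # the past `lookback` points
        y.append(series[end])                 # the next point, used as the target
    return np.array(x), np.array(y)


train = np.arange(40, dtype=float) / 39.0  # stand-in for the scaled training data
x_train, y_train = make_windows(train, lookback=10)
```

Recurrent layers in Keras expect input of shape (samples, timesteps, features), so for the LSTM and GRU models x_train would additionally be reshaped, for example with x_train[..., np.newaxis].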
After training our models on these two datasets, we created two datasets for testing, x_test and y_test. Each model makes predictions from x_test, and these predictions are compared with y_test to produce a prediction score.
We built a total of five distinct models: three statistical models, which build on regression techniques such as linear and logistic regression, and two machine learning models. The statistical models are Autoregressive Integrated Moving Average (ARIMA), Simple Exponential Smoothing (SES), and Holt-Winters Exponential Smoothing (HWES). The machine learning models are Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU).
Using Python’s SMTP library, we built an automatic email program that takes certain inputs at a set time of day and sends a message containing them to a predetermined list of recipients. We used this program to automatically send the return rate of each model from the backtest.
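A report of this kind could be assembled roughly as follows. This is a sketch using the standard-library smtplib and email.message modules; the addresses, the return figures, and the function names are placeholders, and the scheduling at a set time of day (for example through cron) is left out.

```python
import smtplib
from email.message import EmailMessage


def build_report(returns_by_model, sender, recipients, report_date):
    """Build the daily report message (hypothetical helper)."""
    msg = EmailMessage()
    msg["Subject"] = f"Daily model returns for {report_date}"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    # One line per model, e.g. "LSTM: +1.20%".
    lines = [f"{name}: {rate:+.2%}" for name, rate in sorted(returns_by_model.items())]
    msg.set_content("\n".join(lines))
    return msg


def send_report(msg, host, port, username, password):
    """Send the message over an authenticated, TLS-upgraded SMTP connection."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(msg)


# Placeholder values; nothing is sent until send_report is called.
report = build_report(
    {"LSTM": 0.012, "ARIMA": -0.003},
    sender="reports@example.com",
    recipients=["analyst@example.com"],
    report_date="2022-08-05",
)
```

Separating message construction from sending keeps the report content testable without a mail server.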
Every model iteration was evaluated by backtesting, that is, by measuring its performance on historical financial data. We first quantified each model’s success by the accuracy of its stock market predictions. However, given the variations in each model’s implementation, we later implemented a generalized daily return algorithm to allow a concrete statistical comparison.
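One way such a return calculation might work is sketched below. The trading rule is an assumption made for illustration, not a description of our algorithm: the strategy holds the stock over the next interval whenever the model forecasts a price above the current one, stays in cash otherwise, and compounds the resulting interval returns. The prices and forecasts are invented.

```python
import numpy as np


def simulated_return(actual, predicted_next):
    """Compound return of a simple prediction-driven strategy.

    actual[t] is the observed price at step t, and predicted_next[t] is the
    model's forecast of actual[t + 1] made at step t.
    """
    actual = np.asarray(actual, dtype=float)
    predicted_next = np.asarray(predicted_next, dtype=float)
    step_returns = actual[1:] / actual[:-1] - 1.0
    # Hold the stock only over intervals where a rise was forecast.
    in_market = predicted_next[:-1] > actual[:-1]
    return float(np.prod(1.0 + step_returns[in_market]) - 1.0)


actual = [100.0, 102.0, 101.0, 103.0]
predicted_next = [101.0, 101.5, 102.0, 104.0]
total = simulated_return(actual, predicted_next)
```

Because every model is reduced to the same buy-or-hold-cash signal, the resulting figure is comparable across models regardless of how each one produces its forecasts. The sketch ignores transaction costs.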
As a means of gauging generalized market trends, we implemented the following simple yet powerful models: logistic regression and linear regression.
As a binary classification model, logistic regression proved much less flexible in analyzing the stock market than our other models. However, it produced a general gain/loss prediction whose accuracy increased over extended test cycles.
Here, the confusion matrix, displayed in the table above, shows prediction accuracy increasing with longer observation periods on AAPL stock. The ability to accurately predict whether a stock will rise or fall proved crucial for our later models to succeed.
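For reference, a confusion matrix for rise/fall predictions can be computed as in the sketch below. The prices and predicted labels are invented and do not reproduce our AAPL results; the helper names are hypothetical.

```python
import numpy as np


def direction_labels(prices):
    """Label each interval 1 if the price rose and 0 otherwise."""
    prices = np.asarray(prices, dtype=float)
    return (np.diff(prices) > 0).astype(int)


def confusion_matrix(actual, predicted):
    """2x2 count matrix: rows are actual classes, columns are predicted classes."""
    matrix = np.zeros((2, 2), dtype=int)
    for a, p in zip(actual, predicted):
        matrix[a, p] += 1
    return matrix


actual = direction_labels([100, 101, 100.5, 102, 101, 103])
predicted = np.array([1, 1, 1, 0, 0])  # made-up classifier output
matrix = confusion_matrix(actual, predicted)
accuracy = np.trace(matrix) / matrix.sum()  # correct predictions lie on the diagonal
```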
As our first model of closing prices, we implemented a linear regression algorithm to identify general linear trends over observed market intervals.
While rather rough, this is the first semblance of a stock market model. Each line of best fit spans a two-day interval and captures the overarching trend of the stock over that interval.
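A line of best fit over one interval can be obtained with an ordinary least-squares fit, as in this sketch. The time axis is assumed to be the step index within the interval, and the prices are invented.

```python
import numpy as np


def fit_trend(prices):
    """Least-squares line through (step index, price); returns slope and intercept."""
    prices = np.asarray(prices, dtype=float)
    steps = np.arange(len(prices))
    slope, intercept = np.polyfit(steps, prices, deg=1)
    return slope, intercept


prices = np.array([150.0, 150.5, 151.0, 151.5, 152.0])  # made-up closing prices
slope, intercept = fit_trend(prices)
next_price = slope * len(prices) + intercept  # extrapolate one step ahead
```

The sign of the slope gives the direction of the trend over the interval, and extending the line one step yields a rough forecast.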
With general trends in mind, we used the ARIMA-GARCH and HWES models to break the stock market data down into statistical components before extracting our own predictions.
ARIMA-GARCH (AutoRegressive Integrated Moving Average – Generalized AutoRegressive Conditional Heteroskedasticity)
This model uses the aforementioned regression techniques to make predictions on stationary data. Because stock prices are volatile and non-stationary, we had to detrend and difference our data before any relevant patterns could be observed. We then used ARIMA to weight every data point in the series and calculated a future price from these weights. Finally, we estimated each series’ error variance with the GARCH algorithm and applied it to the ARIMA prediction.
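The differencing and autoregressive steps can be illustrated with a hand-rolled sketch in the style of ARIMA(1,1,0): the series is differenced once, a single autoregressive weight is fitted to the differences by least squares, and the forecast difference is added back to the last price. This is a deliberately simplified stand-in, not our implementation; it omits the moving-average terms and the GARCH variance model, and the prices are invented.

```python
import numpy as np


def difference(series):
    """First-order differencing: removes a linear trend from the series."""
    return np.diff(np.asarray(series, dtype=float))


def fit_ar1(diffs):
    """Least-squares estimate of phi in d[t] = phi * d[t-1] + noise."""
    prev, curr = diffs[:-1], diffs[1:]
    denom = float(np.dot(prev, prev))
    return float(np.dot(prev, curr)) / denom if denom else 0.0


def forecast_next(series):
    """Forecast the next price by predicting the next difference and integrating."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    return float(series[-1] + phi * diffs[-1])


prices = [100.0, 101.0, 103.0, 104.0, 106.0, 107.0]
phi = fit_ar1(difference(prices))
forecast = forecast_next(prices)
```

In practice a library such as statsmodels would be used to fit the full model and select its order.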
As we can see, our ARIMA-GARCH algorithm did relatively well at predicting the stock market’s overall trend in the backtested time window. However, its ability to account for market volatility appears rather limited due to the model’s statistical approach. The consistency of its predictions is nonetheless notable.
HWES (Holt-Winters Exponential Smoothing)
Exponential smoothing is a well-established forecasting method that predicts values from a weighted sum of previous values, placing greater importance on recent values and exponentially decreasing importance on older ones. HWES extends exponential smoothing with two additional terms. The first is a weighted average of slopes in the data, which accounts for a general trend, and the second uses a seasonal period to account for seasonality.
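The level, trend, and seasonal updates described above can be written out directly. The sketch below implements the additive form of Holt-Winters in plain Python with a crude initialization from the first two seasonal periods; the smoothing parameters, the initialization, and the sample series are assumptions chosen for illustration, and a library implementation would normally estimate the parameters from the data.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon=1):
    """Forecast `horizon` steps ahead with additive trend and seasonality."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasonal periods")

    # Crude initial estimates taken from the first two periods.
    first = series[:period]
    second = series[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    seasonal = [value - level for value in first]

    for t in range(period, len(series)):
        value = series[t]
        season = seasonal[t % period]
        previous_level = level
        # Level: smoothed, deseasonalized observation.
        level = alpha * (value - season) + (1 - alpha) * (level + trend)
        # Trend: smoothed change in level (the weighted average of slopes).
        trend = beta * (level - previous_level) + (1 - beta) * trend
        # Seasonal: smoothed deviation of the observation from the level.
        seasonal[t % period] = gamma * (value - level) + (1 - gamma) * season

    target = len(series) + horizon - 1  # index of the step being forecast
    return level + horizon * trend + seasonal[target % period]


prices = [1.0, 2.0, 3.0, 4.0] * 3  # purely seasonal toy series with period 4
forecast = holt_winters_additive(prices, period=4, alpha=0.3, beta=0.1, gamma=0.2)
```

On this toy series there is no trend, so the forecast simply continues the seasonal pattern.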
While seemingly volatile, the averaged return of the HWES algorithm proves rather promising. In the graph pictured, it averages above a 0.5 score, indicating the potential for successful implementation in the real-world stock market.
Machine Learning Models:
Using the prior statistical backtests as a baseline, we employed LSTM and GRU neural networks to improve our market predictions beyond those observed in stationary trends.
LSTM (Long Short-Term Memory)
LSTM is a type of recurrent neural network used to predict future data points in a sequence. It processes data through different layers, including an input layer, a hidden layer, and an output layer. In our case, we imported an LSTM model from the Keras package and added dense layers to account for unseen market variables. We then used the Adam optimizer to update the model’s parameters during training on stock market data.
As seen above, our LSTM model does well to account for the general trend of the market, despite a relatively short testing window. Notably, the model makes an effort to account for market volatility, as indicated in sudden spikes and shifts in the graph above.
GRU (Gated Recurrent Unit)
Like LSTM, GRU uses gates to model sequences, making the two structurally similar. However, GRU has only two gates, update and reset, and it uses only a hidden state, with no separate cell state. Here, we trained a GRU model to predict stock market returns on a scale similar to the one we used for HWES.
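To make the two gates concrete, a single GRU step can be written in NumPy as below. This is an illustrative sketch with randomly initialized, untrained weights and a made-up input window, not the Keras layer we trained; the parameter names are hypothetical, and the convention for which side of the blend the update gate weights varies between references.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_step(x, h_prev, params):
    """One GRU time step: returns the new hidden state (there is no cell state)."""
    # Update gate: how much of the candidate state to blend in.
    z = sigmoid(params["Wz"] @ x + params["Uz"] @ h_prev + params["bz"])
    # Reset gate: how much of the previous state feeds the candidate.
    r = sigmoid(params["Wr"] @ x + params["Ur"] @ h_prev + params["br"])
    h_cand = np.tanh(params["Wh"] @ x + params["Uh"] @ (r * h_prev) + params["bh"])
    return (1.0 - z) * h_prev + z * h_cand


rng = np.random.default_rng(0)
hidden, inputs = 4, 1
params = {}
for gate in "zrh":
    params["W" + gate] = rng.normal(scale=0.5, size=(hidden, inputs))
    params["U" + gate] = rng.normal(scale=0.5, size=(hidden, hidden))
    params["b" + gate] = np.zeros(hidden)

window = np.linspace(0.2, 0.8, 10)  # stand-in for ten scaled closing prices
h = np.zeros(hidden)
for price in window:
    h = gru_step(np.array([price]), h, params)
```

After the last step, h summarizes the ten-point window; in a full model a dense layer would map it to the predicted value.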
Though it is difficult to compare our GRU model’s predictions directly with those of our LSTM model, its results were comparably successful. As another point of reference, the GRU model performed similarly to the HWES model on the same scale, suggesting that it is feasible for market implementation.
In short, each model was engineered to provide favorable returns on its respective training dataset. To further test the validity of these results, we implemented a daily return algorithm for use on live stock market data. With more data for analysis, we may be able to substantiate or clarify the above results.
Admittedly, our results were limited by two key factors: model formatting and a limited testing phase. These hindered our analysis of our models’ backtest performances and daily returns, respectively, by leaving too few data points for comparison.
Regarding our backtesting results, variations across performance metrics made it difficult to reach a definitive conclusion about each model’s accuracy. For example, many of our predictive models differed in the time scale of their training and testing datasets. Consequently, one model’s success cannot be directly compared with another’s.
While our daily returns algorithm was able to quantify each model’s performance under a single metric, the test did not run long enough to offer definitive results either. Prior literature suggests that such a short testing window often disadvantages statistical models, which derive their success from relatively slow growth. At the same time, the sporadic returns of our machine learning models did not have enough time to settle into an identifiable average rate of return.
Based on our daily report system, our best model is currently the LSTM model, with an average return rate of around 10% daily. So far it has never produced a negative return rate, which suggests, though over such a short test does not guarantee, that the model could avoid losses in intra-day trading. Our other models, ARIMA and Holt-Winters ES, also have a positive average return rate, and we expect their accuracy to improve as we continue to tune the parameters of these statistical models. We plan to continue testing and to add more features to our models to improve their accuracy, while expanding our dataset to include more training data. After a few more months of testing, we aim to deploy the algorithms for real-time intra-day trading in the stock market. One concern for the near future is the possibility of stock market instability while our models are in use. However, this is a risk inherent in every stock market investment.
All of the models analyzed have been added to the daily performance report, so the automatic emails allow us to review their accuracy and compare them with each other and with the real returns. If the daily results remain accurate and continue to follow a positive trend, the next steps will include refining the best models and making them more user-friendly so that financial analysts can employ them in practice. Other future goals include applying the Universal Portfolio Algorithm to the completed models and expanding the scope of the research to include virtual reality in the modeling process, thereby incorporating more dimensions into the data visualization.