Predictive Load Balancing for Financial Exchanges

Blog, Journal for High Schoolers, Journal for High Schoolers 2021

Authors

Akshat Parikh, Sohrab Hassibi

Abstract

With the continual development of the modern stock market and new stock exchanges, there has been a growing trend of individual participation in our capitalistic financial system. With these newfound changes, numerous participants of certain stock exchanges have elected to use HFTs, or high-frequency traders, to take advantage of powerful computer programs in order to reach higher levels of success in the stock market. Furthermore, some stocks are traded more frequently than others and have a higher trading volume. From this arises an inefficiency in resources, as stocks with lower trading volumes may still be given the same amount of computational resources as stocks with higher trading volumes. In other words, there is a discrepancy between the required resources for smaller stocks and what they are given, which then takes away from stocks with higher volumes. Matching adequate computational resources to symbols on a stock exchange can help equalize the latency needed to transact across symbols and reduce the ability of latency advantaged traders to cause short term price fluctuations. However, performing this match requires an expectation of future trading volume. In this project, we leverage predictive models to forecast stock symbol daily volumes and use this information to effectively shard a synthetic exchange.

Background

Today’s equity markets allow hundreds of millions of users across the world the opportunity to capitalize millions of businesses. Modern stock exchanges are heavily engineered to provide a high level of service for these users. While in 2016, the New York Stock Exchange reported an average of 3.6 billion stocks traded per day several trends point to increased demand for high availability, performance and fair access in equity exchanges. In 2020 during the Covid-19 pandemic, these equity markets added more than 10 million new users. As interest in market participation grows, the ability to provide reliable, fast and fair access to exchanges becomes more challenging.

A recent example which introduces some challenges associated with large trading volumes and stock exchanges is the Gamestop debacle in January 2021. Gamestop, a video game retail company, saw a sharp increase of more than 5000% in trading volume as the stock ticker was promoted on various social media platforms including Reddit and Twitter. These users, among others, used online brokerages to buy and sell large quantities of the stock. Due to high trading volume in Gamestop, several brokerages faced latency issues in reaching exchanges and exchanges themselves halted shares from trading. A measure to mitigate the effects of such halts in the future was introduced by the SEC in June 2021 which outlines “regulatory halts” that enable exchanges to halt shares from trading when significant latency or outages are observed. Although trading was unusually high for only a handful of tickers, latency and access to financial markets across a broad range of symbols were hampered.

Modern exchanges operate a central matching engine, which maintains data structures pertaining to client orders to buy or sell various stock symbols. For each symbol the aggregation of these orders results in a limit order book — a list of buy and sell orders which are prioritized first by price and secondly by time.

As orders enter the central exchange server, these orders are queued and then await processing by the matching engine. Since the limit order books pertaining to each symbol have no overlap in information with one another, symbols may be sharded and routed to separate queues and matching engine shards for processing to improve the total throughput of an exchange. Doing so would result in each shard at the central matching engine receiving and fulfilling orders pertaining to disjoint subsets of symbols which the exchange offers. If volume across shards are roughly equivalent and the orders, the queueing lengths and consequently queueing delays would be roughly equal. However, assorting symbols into shards to balance the shards volume can be challenging for a few reasons. Firstly, volumes for stock symbols are not uniform — some tickers can receive substantially higher volume than others. Secondly, these allocations must be performed ahead of receiving orders which necessitates the prediction of volumes for stock tickers.

Our research aims to analyze the use of several predictive algorithms to forecast daily volume from previous volume data for stock symbols. Consequently, we implement an algorithm to use these forecasts to balance the load across matching engine shards within an exchange. After we predict the volume data for all stocks, the stocks get organized into shards which are a collection of several stocks.

Methods and Dataset

Dataset

To evaluate approaches for load forecasting and balancing, we use historical volume data from the Nasdaq 200 — the 200 largest companies by market capitalization listed on the Nasdaq exchange. For each symbol we compile daily volume across the exchange from June 15, 2016 to June 14, 2021. In some cases, data may be missing for specific days, for example Affirm which listed on Nasdaq in 2021. In these cases, we exclude the ticker from analysis.

Figure 1 (Small subset of the full dataset)

To process the dataset, we created two arrays or collections that contained the “output” data and the “input” data. The input data is pairs of subsequent volume data, and the output data is the volume that follows each pair. For example, if we wanted to process the Apple Stock for the first five days, the input data would look as follows: the first pair is 117780800 and 125307200; the second pair is 125307200 and 244032800; the third pair is 244032800 and 1378647600. The output data is as follows and matches with the corresponding input data: the first output is 24403280;, the second output is 137647600; the third output is 142185600. We construct such input/output pairs for each symbol before applying several algorithms described below.

Algorithms

After processing the dataset, we applied algorithms for three purposes. Firstly, we used various machine learning algorithms to predict the volume for the next day. The algorithms used were linear regression, bayesian regression, and nearest neighbor regression. We chose the linear regression algorithm as a baseline to compare how other algorithms fare. We also chose nearest neighbor regression because it works well with the data we have. Nearest neighbor regression works by looking at the similarity between data points and using that information to make predictions. Consider several data points on a standard x versus y graph with one data point which we are trying to predict for. The nearest neighbors regression algorithm calculates the distance from that point to all other points on the graph. Depending on the number of neighbors, it chooses the number of points closest to its current point. Finally, it averages out the values for all the chosen points and that results in the prediction for the unknown point. In our case, the “neighbors” would be previous stock volumes.

Figure 2 (Example of Nearest Neighbor Regression with Three Neighbors)

To implement the nearest neighbor algorithm, we had to devise a way to find the optimal number of neighbors for any stock volume and for any number of days. Our solution was to generate predictions for all neighbors which would be from one neighbor all the way to predicted day minus one neighbors. Then, for all predictions, we calculate the mean percent error which is the percentage that the predicted amount is off by when compared to the actual amount. Finally, our algorithm chooses the amount of neighbors that has the least amount of percent error. In all, by optimizing the amount of neighbors, we could create strong predictions with the nearest neighbor algorithm, but it led to some issues further down the road with creating figures.

Our second goal was to find the correlation, if any, between the stocks in our dataset. We could then use the correlation and the volumes of the two stocks to predict for one of the stocks. The correlation matrix, a collection that shows the correlation coefficients between all variables in the dataset, was calculated using a pairwise correlation algorithm . Before feeding our data to the algorithm, we converted our entire dataset into percent increase/decrease values using the previous day to calculate the percentage change. This is because it will lead to better correlation results as large companies might have a large average volume and small companies might have a small average volume. This means that after converting the data to percentage changes, it calculates the correlation based on the changes and the wide gap between the average values is gone.

Our third and final goal was to use the predicted volume data from the machine learning algorithms or the correlation algorithm to balance the volume on the shards. We balanced the shards by using a greedy allocation approach. Before adding the volume to any shard, our algorithm would first check which shard had the smallest total volume and it would add to that shard. This process would run until all of the 200 symbols are assigned to a shard and it would somewhat equalize the total volume in the shards.

Results

The first results show the accuracy of the three machine learning algorithms when predicting volume data for 80 days (July 2, 2016 to September, 2016). The accuracy was generated by calculating the average mean percent error between the predicted data and the actual data for ally symbols. Average mean percent error is the average of all the mean percent errors for the specified number of days; mean percent error is just the percentage that the predicted amount is off from the actual amount.

In this chart, the Bayesian and linear regression do not differ that much in terms of accuracy which persits with the full data. Their accuracy mainly hovers between 30 percent to 35 percent. In this chart, the nearest neighbor algorithm is less accurate with its average mean percent error hovering between 38 to 45 percent error. This is because to generate this figure, we had to limit the amount of neighbors for the nearest neighbor regression optimization to 8 days back till 2 days back from the prediction day. As the number of days increased the number of neighbors also increased, exponentially increasing the computing time for the optimization. With the full dataset and optimizations, the nearest neighbor algorithm is much more accurate than the other two algorithms with its average mean percent error hovering around 10 to 30 percent error.

Figure 3 (Accuracy of Bayesian Linear and Nearest Neighbor Regression Algorithms when Predicting Volume Data)

This next graph shows our results for when we balanced the shards using different algorithms. The total number of shards we chose was 5 which meant there were 40 stocks per shard. We landed on this amount of shards because as we increased the total number of shards, the stocks per shard decreased. This led to more error when balancing as the shards had less room to balance and one large prediction could throw off the other members when having a lot of shards. We calculated the mean percent error between the balanced volume and the actual volume for each stock over 96 days. The same optimizations as in the previous results were used as the nearest neighbor algorithm also took a large time to compute with such a large amount of days. Our results show that when balanced, the percent of error is less for bayesian regression than linear regression. The mean percent error for bayesian regression is more tightly packed from around 5% to 35% than linear regression whose percent error is from 8% to 50%. The nearest neighbor algorithm is slightly more accurate when balancing with its mean percent error values being around 4% to 30%. However, there are some outliers for the nearest neighbor algorithm which are around 60% error. One thing to be noted is that when running the same calculations with the full dataset and optimizations, the accuracy of the bayesian and linear regression algorithms stayed the same, but the accuracy of the nearest neighbor algorithm was brought down to 2% to 20%.

Figure 4 (Accuracy between Balanced Sharding and Actual Amount for all Three Algorithms)

This next diagram shows the correlation matrix between stocks in our dataset. An important aspect of a correlation matrix is the correlation matrix metric which shows how strongly two variables are related. The values of the metric are decimals that span from -1 to 1 with -1 representing a strong negative or inverse relationship and 1 representing a one to one direct relationship. As the correlation coefficients approach 0 from either side, the involved variables are more and more unrelated.

We have filtered the dataset to only include stocks who have a large positive or negative correlation with another symbol (0.5 or greater). From this, we can see that symbols like GOOG and GOOGL have high positive correlation; to be precise, it is around 0.9337. There are 19 pairs of stocks that have a high positive correlation above 0.5. There are no stocks with a strong negative correlation, showing that inverse relationships with volume largely do not exist in our dataset.

Figure 5 (Correlation Matrix between Symbols with a Correlation Coefficient higher than 0.5)

Our final diagram shows the accuracy between randomly allocating shard versus balancing the shard allocation. In this chart, the mean percent error is the error between balanced sharding volume and the actual amounts, as well as, the random sharding volume and the actual amounts. The balanced sharding volume is calculated by balancing all the shards and then taking the average of the total volume of the shard. Then, we calculate the mean percent error between the average volume of the shard and the actual amount for the symbol.

For this chart, it also only uses data for 96 days after the initial day and 5 shards for both balanced sharding and random sharding. The chart shows that with random shard allocation the mean percent error is in the thousands. Whereas, the balanced sharding has a mean percent error of 5 to 40 percent. The mean percent error values for the balanced sharding allocation were calculated by taking the average of all the mean percent errors for all the algorithms. The second figure in this section shows a zoomed in version of the balanced sharding accuracy data.

Figure 6 (Accuracy of Random versus Balanced Sharding Allocation)
Figure 7 (Accuracy of Balanced Sharding Allocation)

Conclusion

Through our research, we devised an algorithm that predicted the volume of a stock using historical data and then balanced the load the server or stock exchange would receive. This is supported by our data from the random sharding versus the balanced sharding. In the balanced sharding, we were able to bring down the percent error to 7 to 35 % when compared to the actual data. With randomly allocating the shards, the percent error was several thousand percent. This would have led to an improper allocation of resources in the stock exchange, causing highly skewed latency across symbols in an exchange.

We also found out that the nearest neighbor algorithm was the strongest for this dataset followed by Bayesian regression and linear regression. To generate the figures for the presentation, we had to minimize the nearest neighbor optimization we did. This led to some error which was then demonstrated in the chart. However, when we calculate the mean percent error with full optimizations the nearest neighbor algorithm leads to the most accurate results compared to the other algorithms. A possible explanation for this is that when we look at volume data for multiple successive days, the volume only has minor variations. However, over time the variations compound, so the nearest neighbor algorithm fares the best because it can use multiple neighbors or days and then make assumptions based on that. The other algorithms tend to look at a larger trend than just the selective group of neighbors that this algorithm chooses.

Our research also provided insight into relationships between stocks. Some of the highest correlated stocks were between stocks of different classes. For example, the Google Class C stock and the Google Class A stock had a correlation of 0.9337. Another similar pair was between the ViacomCBS Class A stock and the regular ViacomCBS stock. This makes sense as both of the stocks come from the same company so they would be traded with similar volume. A peculiar relationship we saw was between the Coinbase stock and the Applovin stock which had a high correlation coefficient of 0.85. One possible reason for this is that both companies went public at a similar time and they were both stocks that people were highly anticipating. Hence, they were traded in similar volumes.

Our final load balancing algorithm was able to assist stock exchanges by using machine learning algorithms to predict the volume data and then effectively balance the load. This allowed us to improve the conditions of stock traders and reduce some of the unfair advantages that high frequency traders have.

Future Directions

There are still improvements needed for our algorithm as it still leads to some inaccuracies. This could be attributed to the fact that for a lot of the symbols in the dataset there were missing values or the company went public just recently like coinbase. This meant there was not enough data to perform strong analysis on. Future research could be to use a larger dataset with more stock as well as use other machine learning algorithms such as ridge regression. Currently, our correlated stock prediction model predicts using the two input data for the stocks and their corresponding output data. This is strong in some situations, but it does not put the correlation coefficient into consideration. A better option for variables that have multicollinearity is to use ridge regression which can handle correlated variables.

Another future direction is to apply our algorithm to a real-world scenario, or more specifically, an actual stock exchange such as CloudEx. This will allow us to actually use a stock exchange and see how the allocation of resources and the actual latency depend on the algorithm. Finally, we can use our research for abuse detection. People who can trade lots of a single type of stock in one day using HFTs can directly contribute to higher latencies and problems, as servers do not have adequate resources to handle their trading volume. We can use the algorithm to predict the expected amount of volume for a particular day and notice any sharp increases in trading volume which could signify abuse of the stock exchange. In all, our research this summer used machine learning to predict the volume data of a stock and then balance the total volume across the stock exchange which helped stock exchanges and reduce unfair advantages in the financial realm.

References

  1. Budish, Eric, et al. “The High-Frequency Trading Arms Race: Frequent Batch Auctions as a Market Design Response.” The Quarterly Journal of Economics, vol. 130, no. 4, 23 July 2015, pp. 1547–1621, faculty.chicagobooth.edu/eric.budish/research/HFT-FrequentBatchAuctions.pdf, 10.1093/qje/qjv027.
  2. Geng, Yilong, et al. Exploiting a Natural Network Effect for Scalable, Fine-Grained Clock Synchronization This Paper Is Included in the Proceedings of the 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI ’18). Exploiting a Natural Network Effect for Scalable, Fine-Grained Clock Synchronization. —. This Paper Is Included in the Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI ’19). SIMON: A Simple and Scalable Method for Sensing, Inference and Measurement in Data Center Networks SIMON: A Simple and Scalable Method for Sensing, Inference and Measurement in Data Center Networks. , 2019.
  3. Jagannathan, Ravi. “On Frequent Batch Auctions for Stocks.” Journal of Financial Econometrics, 10 Jan. 2020, www.nber.org/system/files/working_papers/w26341/w26341.pdf, 10.1093/jjfinec/nbz038. Accessed 6 Aug. 2021.
  4. Kushal Vala. “Regression: Kernel and Nearest Neighbor Approach – towards Data Science.” Medium, Towards Data Science, 26 Feb. 2019, towardsdatascience.com/regression-kernel-and-nearest-neighbor-approach-6e27e5e955e7. Accessed 6 Aug. 2021.
  5. Lamport, Leslie. “Time, Clocks, and the Ordering of Events in a Distributed System.” Communications of the ACM, vol. 21, no. 7, 1 July 1978, pp. 558–565, 10.1145/359545.359563.
  6. Lin, Yow-Jian, et al. Sync-MS: Synchronized Messaging Service for Real-Time Multi-Player Distributed Games.
  7. Sviridov, German, et al. Demo Abstract: Leveraging AI Players for QoE Estimation in Cloud Gaming. Trading Algorithms – Resources. “
  8. Trading Algorithms – Resources.” Google Docs, 2021, docs.google.com/document/d/1Ufb6B_yA6LYZ3OQ31nc0m6hFMxuJEX7KTxA53Od95ZM/edit. Accessed 6 Aug. 2021.
  9. SECURITIES AND EXCHANGE COMMISSION [Release No. 34-92071; File No. S7-24-89], 2021 https://www.govinfo.gov/content/pkg/FR-2021-06-03/html/2021-11687.htm Accessed 13 Aug 2021.
  10. Local broker Stake forced to suspend GameStop trading, 2021 https://www.smh.com.au/business/markets/local-broker-forced-to-suspend-gamestop-trading-2021020, 2-p56yqz.html. Accessed 13 Aug 2021.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.