The Price of Latency in Financial Exchanges

Journal for High Schoolers


Fadi Kidess, Madhav Puri, Shiv Trivedi, Vig Sachidananda


Modern financial exchanges facilitate the trading of trillions of dollars of assets per day [1] and process all incoming orders from market participants at one location, a central matching engine. The time it takes a market participant to reach this matching engine (its latency) can vary drastically, with some participants paying for lower latency by co-locating next to or within an exchange. Lower latency means faster access to data and faster order placement. High frequency trading (HFT) relies heavily on low latency, and the zero-sum game of the stock market implies that this advantage likely comes at the expense of all other participants.

Building on work that has proposed mechanisms for mitigating the effects of latency inequality present in current exchanges, we develop an exchange testbed to both simulate and quantify the effect of latency on trading algorithms. By doing so, we examine the consequences of the current high frequency arms race for lower latency and aim to build upon this research to inform future policies on how auctions in matching engines can be designed to provide equality of opportunity for financial market participants.


High frequency trading firms have a few options for reducing the time it takes them to reach an exchange. To communicate quickly between exchanges, these firms have used optimized fiber optic networks [2] and, more recently, microwave links [3]. Within an exchange, trading algorithms can be co-located, that is, housed within the same datacenter as the central matching engine [4].

Recent work has examined the potential negative implications of the latency inequality created by these tools. Perhaps the most famous criticism appears in Michael Lewis’ Flash Boys, which recounts the practice of front-running, in which HFT firms observe the orders of slower market participants and trade ahead of them before those orders can execute. Additionally, quote sniping, an arbitrage across markets or across multiple highly correlated stock symbols, is enabled by purchased low latency [5].

Given the broad implications of latency inequality in the stock market, several proposals have been made to design an exchange that favors no class of participants in particular. IEX is an exchange that enforces a “speed bump” on all orders to lessen the relative difference in latency across participants [6]. Additionally, block auctions have been suggested as an alternative to the continuous limit order books that exchanges currently implement [5]. Lastly, counterfactual simulators for trading algorithms exist which allow for researching the effects of trading algorithms with realistic market order flow [7].

Methods and Materials

In our work we develop trading simulations on the CloudEx trading platform. Because the platform lets us modify both trading behavior and infrastructure, we can use it to understand the interplay between latency and trading algorithms.

For all experiments, we set up an exchange with 1 matching engine, 3 gateways and 9 traders (each gateway serves 3 traders). We index the gateways and traders and we denote traders served by Gateway 1 as traders 1, 2, and 3. We use a FIFO sequencer for our matching engine which results in orders not being held by a resequencing buffer once they reach the matching engine.

Each trader trades using a random strategy in which limit prices and actions (buy or sell) are chosen randomly. When starting an experiment, each trader is given a random seed from which they will generate buy or sell orders. In our setup, we pass the same random seed to a trader every experiment and we pass different random seeds to the 9 traders in our experiment testbed. This setup allows us to generate sequences of orders for each trader that are unique from each other and deterministic over experiment trials. Furthermore, since these traders are not reacting to price, this setup allows us to investigate the effects of latency on order execution without the confounding effects of latency applied to the receipt of market data.
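
The seeding scheme described above can be sketched in a few lines. This is a minimal illustration, not the CloudEx trader code: the function name `generate_orders`, the price range, the quantity range, and the seed values are all assumptions made for the example.

```python
import random

def generate_orders(seed, n_orders, price_range=(90.0, 110.0), max_qty=100):
    """Return a deterministic list of random limit orders for one trader.

    The price range and quantity limit are hypothetical values chosen
    only for this sketch.
    """
    rng = random.Random(seed)  # private generator, so traders share no state
    orders = []
    for _ in range(n_orders):
        side = rng.choice(["buy", "sell"])            # random action
        price = round(rng.uniform(*price_range), 2)   # random limit price
        qty = rng.randint(1, max_qty)
        orders.append((side, price, qty))
    return orders

# One fixed, distinct seed per trader (the seed values here are arbitrary).
trader_seeds = {trader_id: 1000 + trader_id for trader_id in range(1, 10)}
orders_by_trader = {t: generate_orders(s, 5) for t, s in trader_seeds.items()}
```

Because each trader draws from its own generator with a fixed seed, rerunning an experiment reproduces the same order sequence for that trader, while different traders produce different sequences.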

For all experiments, traders start with $200k of cash and 10k shares of one asset. All traders trade on a single asset and over the course of the experiment they place 100,000 orders. 

Experiment Type 1: All gateways equidistant
Experiment Type 2: Gateway 1 has 1ms latency (Gateway 2 and 3 unchanged)
Experiment Type 3: Gateway 1 has 5ms latency (Gateway 2 and 3 unchanged)

We run three types of experiments in our testbed to better understand the effect of latency on our traders. In the first experiment type (Type 1), all of our gateways are equidistant from the matching engine. In the second type, Gateway 1 has 1 millisecond of latency added to its gateway-to-matching-engine transit time through a hold and release buffer implemented on the gateway. In the last experiment type, Gateway 1 has 5 milliseconds of latency added to orders in transit to the matching engine.
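
A hold and release buffer of this kind can be sketched as follows. This is an illustrative model under simple assumptions (a single-threaded gateway that polls the buffer with the current time); the class name `HoldReleaseBuffer` and its methods are hypothetical and are not taken from the CloudEx code base.

```python
import heapq
import itertools

class HoldReleaseBuffer:
    """Hold each order for a fixed delay before forwarding it."""

    def __init__(self, delay_ms):
        self.delay_ms = delay_ms
        self._heap = []                      # entries: (release time, arrival index, order)
        self._counter = itertools.count()    # tie-breaker that preserves arrival order

    def hold(self, order, now_ms):
        """Accept an order at time now_ms and schedule its release."""
        release_at = now_ms + self.delay_ms
        heapq.heappush(self._heap, (release_at, next(self._counter), order))

    def release(self, now_ms):
        """Return all orders whose hold time has elapsed, oldest first."""
        ready = []
        while self._heap and self._heap[0][0] <= now_ms:
            _, _, order = heapq.heappop(self._heap)
            ready.append(order)
        return ready

# Example: a 5 ms delay, as in experiment type 3.
buffer = HoldReleaseBuffer(delay_ms=5.0)
buffer.hold("order-1", now_ms=0.0)
early = buffer.release(now_ms=4.0)   # still held
late = buffer.release(now_ms=5.0)    # released once 5 ms have passed
```

Setting `delay_ms` to 0 would correspond to the equidistant case, in which orders are forwarded as soon as the gateway polls the buffer.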

When running experiments, we collect a variety of statistics on both the systems running the exchange and the trading performance of each trader. On the systems side, we collect timestamps for the transit times between the gateway and matching engine. For each trader, we collect timestamped trades that were executed and the details of each trade. In the following section we present the results of analysis on this data.
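
As a rough sketch of how the systems-side timestamps can be turned into transit times, the snippet below groups records by gateway and subtracts the send time from the receive time. The record layout `(gateway_id, sent_us, received_us)` and the sample values are assumptions made for illustration; they are not measurements from our experiments.

```python
from collections import defaultdict
from statistics import mean

def transit_times_by_gateway(records):
    """Group gateway-to-matching-engine transit times (microseconds) by gateway."""
    times = defaultdict(list)
    for gateway_id, sent_us, received_us in records:
        times[gateway_id].append(received_us - sent_us)
    return dict(times)

# Hypothetical sample records: (gateway id, time sent, time received).
records = [
    (1, 0, 5200), (1, 1000, 6150),   # gateway 1 with roughly 5 ms of added delay
    (2, 0, 210), (2, 1000, 1190),    # gateway 2 with no added delay
]
transit = transit_times_by_gateway(records)
mean_transit = {gateway: mean(values) for gateway, values in transit.items()}
```

A histogram of each gateway's list of transit times gives a delay distribution for that gateway.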

Results

We have set up three different scenarios for experimentation using a variety of simulated traders. In the first experiment type, all the gateways are an equal distance to the matching engine. In the second experiment type, the first gateway has an added 1ms latency and is simulated to be farther away from the matching engine than the others. For the third and last experiment type, the first gateway now has an added 5ms latency and is the furthest away. There are three total gateways which each serve three traders. All the traders are executing a “buy low sell high” algorithm, allowing us to compare them against each other in a consistent manner.

Figures 1-3 show this added latency relative to the other gateways for each experiment. The added latency shifts the peak of gateway 1 further to the right with each experiment type, so that it is clearly separated from the peaks of the other gateways. Each peak marks the delay at which an order is most likely to reach the matching engine.

Figure 1 (Experiment Type 1, all gateways equidistant)

Figure 2 (Experiment Type 2, gateway 1 has an added 1ms latency)

Figure 3 (Experiment Type 3, gateway 1 has an added 5ms latency)

Figure 4 shows the rate of return of traders by experiment type, where the first three traders are connected to gateway 1, the next three to gateway 2, and the last three to gateway 3. We observe a negative impact on the rate of return for traders 1, 2 and 3 in experiment types 2 and 3, in which additional latency was imposed, while the other traders benefited from the latency disadvantage imposed on gateway 1. Figure 5 shows analogous data, plotting the rate of return of each gateway by experiment type. This plot makes the inequity in the exchange clearer: clients trading through gateway 1 are at a clear disadvantage, while those at the other gateways profit from that disadvantage.
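
For readers who want to reproduce a rate of return figure, one common definition, assumed here rather than taken from our analysis scripts, compares the value of a trader's portfolio at the end of an experiment with its value at the start. The mark price of $100 and the ending balances below are made-up numbers; only the starting balances ($200k of cash and 10k shares) come from our setup.

```python
def portfolio_value(cash, shares, mark_price):
    """Value of a portfolio when shares are priced at mark_price."""
    return cash + shares * mark_price

def rate_of_return(start_value, end_value):
    """Fractional change in portfolio value over an experiment."""
    return (end_value - start_value) / start_value

# Starting balances from the setup; the mark price is a hypothetical value.
start_value = portfolio_value(cash=200_000, shares=10_000, mark_price=100.0)

# Hypothetical ending balances for one trader.
end_value = portfolio_value(cash=212_000, shares=10_000, mark_price=100.0)

example_return = rate_of_return(start_value, end_value)  # about 0.01, i.e. 1%
```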

Figure 6 shows the effect of latency on the number of trades executed. Higher latency results in fewer orders executed, since fewer orders reach the matching engine at any given time, which is another way latency can disadvantage some traders. The number of shares traded is also greatly affected, as shown in figure 7, since it depends on the number of trades executed.

From the data in figures 8 and 9, we determine the effect of latency on buy prices. The average buy price for gateway 1 increases as latency is added, and fewer trades remain available to it, because traders with lower latency get their orders matched earlier. This is evident in figure 9, where each data point corresponds to a trade order.

However, we did not reach a clear conclusion from the sell price plots in our experiments, and more tests are needed to verify any correlation between latency and the average sell price. Our results for the sell price data are shown in figures 10 and 11.

In experiment types 2 and 3, the return for gateway 1 dropped significantly below that of gateways 2 and 3, showing the inequitable arbitrage. This is visualized in figures 12, 13 and 14, where each figure shows the return over number of shares for one experiment type.

Conclusion

In our investigation, we have shown the impact of added latency on order execution in a controlled trading exchange environment. Our results show that, with 5 milliseconds of added latency, traders had ~50% fewer trades executed, costing them ~10% on their rate of return. Understanding the cost of latency is important for fairness in a trading environment, and latency considerations should be kept in mind when designing market exchanges. By visualizing the effects of latency on many trading factors, we make the inequitable arbitrage that exists in trading exchanges clearly visible.

Future Directions

Future work would help clarify the impact of specific latencies on the rest of the market, as well as under different exchange matching algorithms. Larger simulations with more clients and gateways would be more realistic. Matching algorithms worth testing include auctions and IEX’s “speed bump”. Experiments with different resequencing buffer delay values in the order’s upstream path, which allow the matching engine to sort orders by their gateway timestamps before processing them, may also provide valuable data and new insights. Though a standard buy low sell high algorithm was used in our trials for consistency, testing with other algorithms may reveal interesting and unpredictable effects at different amounts of latency. The influence of newer technologies such as the fifth generation (5G) telecommunications standard, the latest at the time of writing, should also be analyzed to gain more relevant insights.
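
To make the resequencing idea concrete, the sketch below holds each order until a fixed delay has passed since its gateway timestamp, then releases orders in gateway timestamp order rather than arrival order. It is a simplified model for illustration; the class name `ResequencingBuffer`, its methods, and the timing values are assumptions, not the platform's implementation.

```python
import heapq

class ResequencingBuffer:
    """Release orders sorted by gateway timestamp after a fixed delay."""

    def __init__(self, delay_ms):
        self.delay_ms = delay_ms
        self._heap = []   # entries: (gateway timestamp, arrival index, order)
        self._seq = 0     # tie-breaker for orders with equal timestamps

    def add(self, order, gateway_ts_ms):
        """Store an order together with the time its gateway stamped it."""
        heapq.heappush(self._heap, (gateway_ts_ms, self._seq, order))
        self._seq += 1

    def pop_ready(self, now_ms):
        """Return orders whose timestamp is at least delay_ms old, earliest first."""
        ready = []
        while self._heap and self._heap[0][0] + self.delay_ms <= now_ms:
            _, _, order = heapq.heappop(self._heap)
            ready.append(order)
        return ready

# Order "late" was stamped first but reached the engine second.
resequencer = ResequencingBuffer(delay_ms=5.0)
resequencer.add("early-arrival", gateway_ts_ms=1.0)
resequencer.add("late", gateway_ts_ms=0.5)
processed = resequencer.pop_ready(now_ms=6.0)  # "late" comes out first
```

In this model, an order that takes longer than `delay_ms` to arrive can still be processed out of timestamp order, which is why the choice of delay value is worth studying.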

References

[1] Budish, E., Cramton, P., & Shim, J. (2015). The High-Frequency Trading Arms Race: Frequent Batch Auctions as a Market Design Response. The Quarterly Journal of Economics, 130(4), 1547-1621. doi:10.1093/qje/qjv027

[2] Co-Location (CoLo) – NASDAQ. (n.d.). Retrieved August 07, 2020, from

[3] Daily Market Summary. (n.d.). Retrieved August 07, 2020, from

[4] Hasbrouck, J., & Saar, G. (2013, May 22). Low-latency trading. Retrieved August 05, 2020, from

[5] Huang, R., & Polak, T. (2011). LOBSTER: Limit Order Book Reconstruction System. SSRN Electronic Journal. doi:10.2139/ssrn.1977207

[6] IEX Group. (n.d.). Retrieved August 07, 2020, from

[7] McKay Brothers: Low Latency Microwave. (n.d.). Retrieved August 07, 2020, from

[8] Moallemi, C., Tsoukalas, G., Sun, Z., & Mookerjee, R. (2013, April 25). OR Forum-The Cost of Latency in High-Frequency Trading. Retrieved August 05, 2020, from 

[9] Spread Networks – CME Group. (n.d.). Retrieved August 07, 2020, from
