Colocation promises proximity. Physical distance is compressed to meters. Fiber paths are short, cabinets are next to each other, and the exchange clock is often a few microseconds away. But even in these tightly controlled setups, latency is not constant. Spikes occur. And when they do, they reveal how fragile “low latency” really is.
Latency isn’t a value. It’s a distribution. A trader might quote with 23μs RTT most of the time, but that number hides the tail. Bursts in the 60–100μs range appear sporadically. They’re rare enough to escape dashboards, but frequent enough to disrupt strategy performance.
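To make that concrete, here is a minimal sketch of how a session of round-trip samples can be summarized by its tail instead of its average. The sample values are hypothetical (and the spike frequency is exaggerated so it shows up in a tiny sample); the point is the gap between the median and the max.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <vector>

// Sketch: summarize a session of round-trip samples by the tail, not the average.
// The sample data is hypothetical; real sessions have millions of points.
double percentile(std::vector<double> v, double p) {
    std::sort(v.begin(), v.end());
    std::size_t idx = static_cast<std::size_t>(p * (v.size() - 1));
    return v[idx];
}

int main() {
    // Mostly ~23us round trips, with occasional spikes into the 60-100us band.
    std::vector<double> rtt_us = {22, 23, 23, 24, 22, 23, 25, 23, 91, 23,
                                  22, 24, 23, 23, 64, 23, 22, 23, 24, 23};
    std::printf("p50 %5.1f us\n", percentile(rtt_us, 0.50));
    std::printf("p95 %5.1f us\n", percentile(rtt_us, 0.95));
    std::printf("p99 %5.1f us\n", percentile(rtt_us, 0.99));
    std::printf("max %5.1f us\n", *std::max_element(rtt_us.begin(), rtt_us.end()));
}
```

With realistic sample counts the spikes are far rarer, which is exactly the problem: p50 and p95 can look healthy while the max is what does the damage.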
Consider an order that normally reaches the matching engine in 20μs. If it hits 90μs under spike conditions, the trader is no longer racing peers—they’re already behind. On a venue with deterministic queueing, that delay pushes the order down the stack. On one that favors speed, it changes the outcome completely.
The difference between top of book and nothing can sit in that spike.
The common assumption is network congestion. That’s part of it, but in colocated environments the causes are more varied: a switch rerouting, a thermal throttle, an interrupt serviced late, a lock held a moment too long inside your own process. All of this means latency spikes are not always under your control, even when you own the box.
Traders love averages. They benchmark using medians, 95th percentiles, rolling means. These are good for dashboards. But strategies don’t run on averages. They run on edge cases.
If a quoting strategy expects to cancel within 50μs and a spike pushes that to 100μs, its quotes go stale. If the market moves against it during that window, the system bleeds. The spike might be invisible in daily metrics. The PnL hit isn’t.
Logs should show max deviation per session. Heatmaps should plot deltas against clock syncs, not just RTT. Packet captures need to be reviewed when queue position outcomes look off. And simulations must include latency spike conditions, not just Gaussian noise.
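On the last point, one way to keep a simulator honest is to model latency as a mixture rather than a single Gaussian: a tight baseline plus a rare spike component. A sketch under assumed parameters (a ~23μs baseline with roughly 1% of messages landing in the 60–100μs band, echoing the figures above):

```cpp
#include <algorithm>
#include <cstdio>
#include <random>

// Sketch: a latency model for simulation built as a mixture distribution,
// a tight baseline plus a rare spike component, instead of a single Gaussian.
// All parameters are illustrative, chosen to echo the figures in the text.
double sample_latency_us(std::mt19937_64& rng) {
    static std::normal_distribution<double> baseline(23.0, 1.5);       // typical RTT
    static std::uniform_real_distribution<double> spike(60.0, 100.0);  // tail events
    static std::bernoulli_distribution is_spike(0.01);                 // ~1% of messages
    double us = is_spike(rng) ? spike(rng) : baseline(rng);
    return us < 0.0 ? 0.0 : us;
}

int main() {
    std::mt19937_64 rng(42);
    double worst = 0.0;
    for (int i = 0; i < 100000; ++i)
        worst = std::max(worst, sample_latency_us(rng));
    std::printf("worst simulated RTT: %.1f us\n", worst);
}
```

The exact parameters matter less than the fact that the tail exists at all. A simulator that never produces an 80μs round trip will never exercise the cancel logic that an 80μs round trip breaks.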
Most latency-sensitive models assume tight confidence intervals. The engine places, modifies, and cancels based on deterministic time budgets. These models fail not because the average latency changes, but because assumptions about worst-case boundaries are wrong.
Let’s say a trader builds a fast-twitch strategy with a 60μs round-trip threshold. The model assumes the system can observe a quote change, place a new order, and reach the book within that window. If occasional spikes push it to 80μs, the order either misses the intended price level or lands after the market has already moved. That breaks slippage assumptions and exposes the trade to re-pricing risk.
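One defensive pattern is to check the budget at the send boundary instead of assuming it holds: if the decision path has already consumed the window, the order is withheld or re-priced rather than sent stale. A minimal sketch; the 60μs figure and the `send_order` hook are illustrative assumptions, not a prescribed implementation:

```cpp
#include <chrono>
#include <cstdio>

// Sketch: enforce the worst-case time budget at the send boundary.
// The 60us budget and the send_order() stand-in are illustrative assumptions.
using Clock = std::chrono::steady_clock;

constexpr auto kBudget = std::chrono::microseconds(60);

bool send_order(double price) {  // stand-in for the real gateway call
    std::printf("order sent at %.2f\n", price);
    return true;
}

bool place_within_budget(Clock::time_point quote_seen, double price) {
    auto elapsed = Clock::now() - quote_seen;
    if (elapsed > kBudget) {
        // Budget blown: the book has likely moved. Withhold and re-evaluate
        // instead of sending an order built on stale state.
        auto us = std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
        std::printf("budget exceeded (%lld us), order withheld\n", (long long)us);
        return false;
    }
    return send_order(price);
}

int main() {
    auto quote_seen = Clock::now();  // timestamp taken when the quote change was observed
    place_within_budget(quote_seen, 100.25);
}
```

This doesn’t make the system faster. It makes the failure mode explicit instead of silent.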
Spikes don’t just delay execution. They distort the strategy's assumptions about time and state.
When fill ratios drop or queue positions look off, the temptation is to blame the exchange or the counterparty’s speed. But often the culprit is internal. A 15μs delay in message construction due to an unexpected object allocation. A brief thread lock. An I/O interrupt serviced at a lower priority.
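The allocation case is the easiest of those to eliminate: build outbound messages into a buffer that is allocated once and reused, so the hot path never touches the allocator. A rough sketch of the idea, with a hypothetical encoder and message format:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Sketch: build outbound messages into a buffer allocated once and reused,
// so the hot path never touches the allocator. Field layout is hypothetical.
struct OrderEncoder {
    std::string buf;                      // owned once, reused on every call
    OrderEncoder() { buf.reserve(512); }  // pay for the allocation at startup

    const std::string& encode(const char* symbol, double px, int qty) {
        buf.clear();                      // retains capacity in practice: no reallocation
        char tmp[128];
        int n = std::snprintf(tmp, sizeof(tmp), "NEW %s %.2f %d", symbol, px, qty);
        if (n > 0) buf.append(tmp, static_cast<std::size_t>(n));
        return buf;
    }
};

int main() {
    OrderEncoder enc;
    std::printf("%s\n", enc.encode("BTC-USD", 100.25, 3).c_str());
    std::printf("%s\n", enc.encode("ETH-USD", 50.10, 7).c_str());  // same buffer, no new allocation
}
```

Thread locks and interrupt priorities are harder to remove, but the principle is the same: take the variance out of the parts of the path you control.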
Latency spikes create phantom behavior. Traders start chasing ghosts in the market data. They tweak model parameters to fit outcomes driven not by liquidity or intent, but by infrastructure lag.
This is how signal turns into noise, and backtests start lying.
Spikes can’t be prevented entirely, but they can be mitigated.
Even in the cleanest cabinets of a Tier 1 exchange, latency remains probabilistic. Machines age. Switches reroute. Power dips. Thermal throttling kicks in. Most of it isn’t visible until it ruins a fill.
A strategy that depends on being faster needs more than speed. It needs immunity to momentary failure.
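What that immunity can look like in practice, sketched as a simple guard: track the worst RTT seen over a recent window and back off (wider quotes, or none) while the tail is elevated. The window size and the 50μs limit below are assumptions for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <deque>

// Sketch: a latency guard that flags degraded conditions from the rolling
// worst case, so the strategy can widen or pull quotes instead of trading
// on timing assumptions the infrastructure is not currently meeting.
class LatencyGuard {
public:
    LatencyGuard(std::size_t max_samples, double limit_us)
        : max_samples_(max_samples), limit_us_(limit_us) {}

    void record(double rtt_us) {
        window_.push_back(rtt_us);
        if (window_.size() > max_samples_) window_.pop_front();
    }

    bool degraded() const {
        if (window_.empty()) return false;
        return *std::max_element(window_.begin(), window_.end()) > limit_us_;
    }

private:
    std::deque<double> window_;
    std::size_t max_samples_;
    double limit_us_;
};

int main() {
    LatencyGuard guard(/*max_samples=*/256, /*limit_us=*/50.0);  // illustrative values
    double samples[] = {23.0, 24.0, 22.0, 91.0, 23.0};           // hypothetical RTTs
    for (double rtt : samples) {
        guard.record(rtt);
        std::printf("rtt=%.0f us -> %s\n", rtt, guard.degraded() ? "back off" : "quote");
    }
}
```

It is deliberately simple: no prediction, no modeling, just an admission that while the tail is elevated, the strategy’s assumptions about time no longer hold.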
Axon Trade provides advanced trading infrastructure for institutional and professional traders, offering high-performance FIX API connectivity, real-time market data, and smart order execution solutions. With a focus on low-latency trading and risk-aware decision-making, Axon Trade enables seamless access to multiple digital asset exchanges through a unified API.
Contact Us for more info.