> the main insight was that rather than wait for market signals to then decide what to do, you can precalculate your responses up to and including the actual message to be sent to the exchange.
I saw a talk about this dialed up to eleven: the entire processing occurred in a "smart NIC" instead of the CPU. The response would start getting sent even as the inbound packet was still being received. The go/no-go decision was effectively just sending the final CRC bytes correctly or deliberately incorrectly, thus invalidating the outbound packet that was already 99% sent.
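The real version of this runs in FPGA gateware at wire speed, but the go/no-go idea itself is easy to sketch in software: compute the frame's CRC-32 check value, then either append it correctly (frame valid) or deliberately corrupt it (receiver drops the frame). The framing below is a toy, not an actual Ethernet frame:

```python
import zlib

def frame_with_fcs(payload: bytes, send: bool) -> bytes:
    """Append a CRC-32 check value; corrupt it to 'abort' the frame."""
    fcs = zlib.crc32(payload) & 0xFFFFFFFF
    if not send:
        fcs ^= 0xFFFFFFFF  # flip every bit: the receiver's CRC check fails
    return payload + fcs.to_bytes(4, "little")

def receiver_accepts(frame: bytes) -> bool:
    """Receiver side: recompute CRC-32 over the payload and compare."""
    payload, fcs = frame[:-4], int.from_bytes(frame[-4:], "little")
    return (zlib.crc32(payload) & 0xFFFFFFFF) == fcs
```

In the hardware version the payload bytes have already left the NIC by the time the decision is made, so flipping those last four bytes is the only control you still have over the packet's fate.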
Before that talk I couldn't figure out why there was a market for NICs with embedded FPGAs, CPUs and memory.
Day traders basically subsidised these things, and now they do efficient packet switching for large cloud providers.
Reminds me of how crypto-mining subsidised a lot of GPU development, and now we have 4K ray tracing and AIs thanks to that.
> The go/no-go decision was effectively just sending the final CRC bytes correctly or deliberately incorrectly, thus invalidating the outbound packet that was already 99% sent.
This trick will get you banned on some exchanges now :)
Another one, which has been public knowledge for years and is also often penalized, is to send a TCP fragment containing the header of the message well in advance, "booking" a place in the queue. Then send the finishing fragment with the real order after doing all the calculations.
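A minimal sketch of the booking trick, assuming a protocol where the gateway processes a message only once it has arrived in full: disable Nagle's algorithm so each `send()` goes out as its own segment, push the header immediately, and only then do the pricing work. The header/body split and the order text are hypothetical:

```python
import socket

def book_then_fill(sock: socket.socket, header: bytes, make_order) -> None:
    """Send the message header early to 'book' a queue slot, then finish it."""
    # TCP_NODELAY: flush each send() immediately instead of coalescing.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    sock.sendall(header)   # header goes out now, before any decision is made
    body = make_order()    # the actual (slow) calculation happens here
    sock.sendall(body)     # finishing segment completes the message
```

From the gateway's point of view the message simply arrived slowly; the queue position was claimed by the first segment. This is exactly the kind of pattern exchanges now detect and penalize.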
This raises the question of how an exchange efficiently detects, logs and takes action against these kinds of behaviour without increasing its own latency too much and (perhaps?) affecting the market.
Does it even matter if a centralised exchange increases its own latency when all market participants have to go through it? I can only think of the case when a security is listed on multiple exchanges, where the latency could mean a small arbitrage opportunity.
Exchanges rarely care about their absolute latency. The latency race is for the best place in the order entry queue. As soon as the order is queued for sequential processing by the risk checker or the matching engine, the race is over. I've seen places where you needed sub-microsecond tick-to-order latency to win the race for the front of the queue, but the actual risk check and matching took tens of milliseconds.
They do care about throughput and about providing fair conditions to all of the participants, though. On the busiest derivatives exchanges this means resorting to FPGAs for the initial checks.
Then, every message sent to the exchange is clearly traceable. In some cases participants have dedicated physical lines. When the exchange sees an increased rate of malformed packets from a single line or from a certain participant, they just cut it off and call the contact person on the participant (trader/broker) side to demand an explanation.
Most exchanges have switched to order gateways that are either FPGA- or ASIC-based.
Also every packet you send to an exchange is trivially attributed. They just kick you off if your shenanigans cause a problem. And then they tell all the other exchanges about you.
Cool! I actually wasn't aware of NICs with FPGAs on them. You learn something new every day on HN.
My solution wasn't as fast, and it could never do what you describe (start sending bytes before the packet was fully received). The market signal messages were actually batched together (usually one to five), compressed with zlib and sent as a single multicast packet.
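A rough sketch of that batching scheme, with an assumed framing (2-byte little-endian length prefix per message) since the comment doesn't specify the actual wire format:

```python
import struct
import zlib

def pack_batch(messages: list[bytes]) -> bytes:
    """Length-prefix each message, concatenate, and zlib-compress the batch."""
    raw = b"".join(struct.pack("<H", len(m)) + m for m in messages)
    return zlib.compress(raw)

def unpack_batch(packet: bytes) -> list[bytes]:
    """Decompress one multicast packet and split it back into messages."""
    raw = zlib.decompress(packet)
    out, i = [], 0
    while i < len(raw):
        (n,) = struct.unpack_from("<H", raw, i)
        out.append(raw[i + 2 : i + 2 + n])
        i += 2 + n
    return out
```

The downside for latency is obvious: a receiver must wait for the whole UDP datagram and run the full zlib inflate before it can act on even the first message in the batch, which is why this design can't compete with the smart-NIC approach.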
You could in principle accelerate the most CPU-intensive parts of web servers with smart NICs: gzip, TLS, JSON serdes, HTML templates. There are also accelerators for databases, leaving just the business logic to be executed on the CPU.