Feels like a jumble of data. You should organize this somehow. The list of "high level" items seems pretty reasonable, but you left out historical data and backtesting.
Neither is involved in the intraday production critical path. Maybe I should have used that phrase explicitly.
FYI: "Answer: For various reasons, (2) is the natural entry point (historical data can be used in lieu of real data, strategy requires data, and building an order book simulator can initially obviate the need for true exchange order entry), so let’s start here."
The author seems to imply he reimplemented hash tables.
I was under the impression that there are a fair number of high-quality hash table implementations in the wild and that the subject has been thoroughly studied.
For components not in the critical path, third party implementations are acceptable.
However, it is not surprising to see every piece in the critical path written in-house (even feed handlers, for which many companies sell solutions) or rebuilt with heavy modifications (the kernel).
I had intended to say hash function, but it was very late by the time I posted.
We don't do HFT directly; we work with banks to help them do it well.
The only numbers I could give you are what our data engine (which is what we sell) achieves in our test environment. I cannot share what our customers actually get once it is deployed and configured on their systems.
Generally, they are happy. :)
We're going to post benchmarks soon, as we're coming out of stealth mode.
Well, he didn't write about hash tables in this post...
Anyway, even if there are good hash table implementations, sometimes you have to provide a custom hash function. A default hash function will not necessarily give you a table without collisions.
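To make the custom hash function point concrete, here is a rough sketch, not anything from the article. It assumes the key set is known in advance (the ticker list below is made up) and simply searches for a seed under which no two keys land in the same slot. `slot_for`, `count_collisions` and `find_collision_free_seed` are hypothetical names, and BLAKE2b is used only because it is deterministic and takes a salt; a real critical path would use something far cheaper.

```python
import hashlib

# Hypothetical, fixed key set: the symbols are illustrative only.
SYMBOLS = ["AAPL", "MSFT", "GOOG", "AMZN", "IBM", "ORCL", "INTC", "CSCO"]


def slot_for(key: str, seed: int, n_slots: int) -> int:
    """Map a key to a slot using a seeded hash (BLAKE2b as a stand-in)."""
    digest = hashlib.blake2b(
        key.encode("ascii"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little") % n_slots


def count_collisions(keys, seed: int, n_slots: int) -> int:
    """Number of keys that share a slot with an earlier key."""
    used = {slot_for(k, seed, n_slots) for k in keys}
    return len(keys) - len(used)


def find_collision_free_seed(keys, n_slots: int, max_seed: int = 10_000):
    """Try seeds in order; return the first with zero collisions, else None."""
    for seed in range(max_seed):
        if count_collisions(keys, seed, n_slots) == 0:
            return seed
    return None


if __name__ == "__main__":
    seed = find_collision_free_seed(SYMBOLS, n_slots=16)
    print("collision-free seed:", seed)
```

With 8 keys and 16 slots, roughly one seed in eight is collision-free, so the search ends quickly. The trade-off is that the table is only guaranteed collision-free for this exact key set.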
Bloomberg also offers a comprehensive market data product that isn't just a firehose from the exchange. Lots of work went into normalizing the data stream so that consumers of the stream wouldn't have to learn the oddities of hundreds of different market feeds. The latency is actively monitored (in microseconds) and they offer direct co-lo as well.
I agree though, even if you have the best algo around, it could fall flat on its face unless you have the pockets to be side-by-side with the big shops.
These places are especially paranoid when it comes to co-lo. I've heard that exchanges guarantee every box the exact same length of network cable, so that a shorter run can't favor one shop over another.
I agree with the final conclusion, but this is a terrible way to measure system latency!
For starters, most exchanges block ICMP packets, precluding a ping-based measurement.
Latency should be measured by actually going through their systems. For example, after sending a FIX message to IB, it's not known how well their FIX processor performs (at the very least, they have to do risk checks, so it's not just cut-through).
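A minimal sketch of what measuring through the application path could look like, as opposed to ping: time from sending a message until the reply arrives on the same TCP session. Everything here is an assumption for illustration. The local echo server stands in for the broker's gateway, the payload is a placeholder rather than a valid FIX message, and `measure_round_trip_ns` is a made-up name.

```python
import socket
import threading
import time


def _echo_server(server_sock: socket.socket) -> None:
    """Stand-in for the remote gateway: echo bytes until the client closes."""
    conn, _ = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def measure_round_trip_ns(host: str, port: int, payload: bytes):
    """Send payload and time until the same number of bytes comes back."""
    with socket.create_connection((host, port)) as sock:
        # Disable Nagle so a small message is not held back by the stack.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter_ns()
        sock.sendall(payload)
        received = b""
        while len(received) < len(payload):
            chunk = sock.recv(4096)
            if not chunk:
                break
            received += chunk
        elapsed = time.perf_counter_ns() - start
    return elapsed, received


def demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    thread = threading.Thread(target=_echo_server, args=(server,), daemon=True)
    thread.start()

    payload = b"placeholder-order-message"  # NOT a real FIX message
    elapsed_ns, echoed = measure_round_trip_ns("127.0.0.1", port, payload)

    thread.join(timeout=5)
    server.close()
    return elapsed_ns, echoed, payload


if __name__ == "__main__":
    elapsed_ns, _, _ = demo()
    print(f"round trip: {elapsed_ns / 1000:.1f} us")
```

Against a real counterparty you would time from order send to the acknowledgement, which includes their parsing and risk checks. That is exactly the part a ping never sees.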
I think the point is that if you're not co-located, network transit time from your servers to IB, not even including IB to exchange, is an insurmountable disadvantage.
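For a back-of-the-envelope sense of why distance alone is a problem, here is a small sketch of the propagation-delay floor in optical fiber. The refractive index of 1.47 is an assumed typical value for silica fiber, and the distances are made-up examples. Real routes are longer than the straight-line distance and add switching and queuing delay on top.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # in vacuum, exact by definition
FIBER_REFRACTIVE_INDEX = 1.47         # assumption: typical silica fiber


def one_way_delay_us(distance_m: float,
                     refractive_index: float = FIBER_REFRACTIVE_INDEX) -> float:
    """Lower bound on one-way propagation delay in microseconds."""
    return distance_m * refractive_index / SPEED_OF_LIGHT_M_PER_S * 1e6


if __name__ == "__main__":
    # Hypothetical distances: 10 m inside a co-lo cage, 1000 km for a remote server.
    for label, metres in [("10 m", 10), ("1000 km", 1_000_000)]:
        print(f"{label}: {one_way_delay_us(metres):.3f} us one way")
```

Under these assumptions, 10 m of fiber costs about 0.05 microseconds one way, while 1000 km costs about 4900 microseconds, so close to 10 ms round trip before any processing happens.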