It's a bit unfair to pick on a two-year-old post, but I don't agree with what Josh is saying here, beyond the fact that anyone claiming to be the world's fastest database is talking about a particular benchmark that makes them look good.

There are quite a few things you can do to go faster once you've decided your data is going to fit in memory. An easy one is to build memory-centric indexes that never hit disk. These can be several times faster than the disk-page-sized B-trees that most disk-backed systems use, even if the disk-backed systems are sitting on a RAM drive.

A second thing you can sometimes get away with is coarser-grained locking on data structures. Since you don't have to worry about holding a mutex during a page fault, you don't need super complex concurrent data structures. It might be faster just to shove a queue in front of an index and let each index lookup happen sequentially. This is one of the biggest VoltDB tricks (though I'm simplifying). It's also easier to build something like the MemSQL skip-list in memory, but I've never seen it outperform the simpler VoltDB system at either non-column-based scanning or at index lookups. Lock-free data structures may not block, but they're not simple and they make other tradeoffs that affect performance.
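
To make the queue-in-front-of-an-index idea concrete, here's a rough Python sketch. It is not VoltDB code; the class name `SerializedIndex` and its methods are made up for illustration, and a plain dict stands in for the index. The point is only that one thread owns the structure, so the structure itself needs no locks at all.

```python
import queue
import threading
from concurrent.futures import Future


class SerializedIndex:
    """Hypothetical example: a plain dict owned by a single worker thread."""

    def __init__(self):
        self._index = {}                 # touched only by the worker thread
        self._requests = queue.Queue()   # the queue "in front of" the index
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        # Requests are applied one at a time, in arrival order.
        while True:
            item = self._requests.get()
            if item is None:             # shutdown sentinel
                break
            op, key, value, reply = item
            if op == "put":
                self._index[key] = value
                reply.set_result(None)
            else:                        # "get"
                reply.set_result(self._index.get(key))

    def put(self, key, value):
        reply = Future()
        self._requests.put(("put", key, value, reply))
        return reply

    def get(self, key):
        reply = Future()
        self._requests.put(("get", key, None, reply))
        return reply

    def close(self):
        self._requests.put(None)
        self._worker.join()
```

Callers from any thread do `idx.put("a", 1)` and `idx.get("a").result()`. The only synchronization is the queue handoff, which is only reasonable because nothing inside the worker loop ever waits on disk.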

As far as scanning tuples goes, which the recent MemSQL marketing calls out, I don't think that has much to do with disk or memory, but rather with the architecture and implementation. Vertica's big innovation was using a column-store to scan faster, and this seems to be what MemSQL is doing here. If you want to do full table scans, this is very smart, although you pay a loading penalty as a tradeoff for query performance. Vertica's huge win was technology to make that loading penalty smaller; I'm not sure what the costs to load MemSQL's column data are. I suspect one key innovation is being cheaper than Vertica.
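
A toy Python illustration of the row-vs-column tradeoff, under the assumption of a made-up three-field table (`id`, `category`, `amount`). This says nothing about how Vertica or MemSQL actually lay out data; it just shows that a column layout lets a scan touch one contiguous array, and that building that layout is extra work at load time.

```python
from array import array

# Row layout: each tuple holds every field of one record.
rows = [(i, i % 7, i * 1.5) for i in range(1000)]  # (id, category, amount)


def load_columns(rows):
    """The 'loading penalty': pivot the rows into one typed array per column."""
    return {
        "id": array("q", (r[0] for r in rows)),
        "category": array("q", (r[1] for r in rows)),
        "amount": array("d", (r[2] for r in rows)),
    }


def scan_rows(rows):
    # Visits every tuple even though only one field is needed.
    return sum(r[2] for r in rows)


def scan_column(columns):
    # Reads a single contiguous array of doubles.
    return sum(columns["amount"])


columns = load_columns(rows)
```

Both scans return the same total; the difference is how much unrelated data each one has to walk past to get there.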

Another legit innovation of VoltDB was to give up external transaction control along with disk. The only thing worse than keeping a transaction open while waiting for disk is keeping it open while waiting for a user. Getting rid of both allows for a crazy fast system when you can assume no substantial waits in your fast path. Using batched SQL or stored procs, you can keep the conditional logic you love, but go fast. Many NoSQL systems (and MemSQL) get this same speedup by just dropping transactions altogether. MVCC does a lot to alleviate this problem, but adds its own complexity; it's really hard to get right/fast over a network.
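
Here's a simplified Python sketch of what giving up external transaction control looks like from the inside. The names (`Partition`, `transfer`) are hypothetical and this is not VoltDB's API. The whole transaction, conditional logic included, is submitted as one procedure call and runs to completion without ever waiting on the client. The dict copy is a crude stand-in for a real undo mechanism.

```python
class Partition:
    """Hypothetical example: in-memory state plus registered procedures."""

    def __init__(self):
        self.balances = {}
        self.procedures = {}

    def register(self, name, fn):
        self.procedures[name] = fn

    def call(self, name, *args):
        # The transaction begins and ends inside this one call, so there is
        # no window where it sits open waiting for a user or a network hop.
        snapshot = dict(self.balances)   # crude undo; assumption for brevity
        try:
            return self.procedures[name](self.balances, *args)
        except Exception:
            self.balances = snapshot     # roll back everything on failure
            raise


def transfer(balances, src, dst, amount):
    # Conditional logic lives in the procedure, not in client round trips.
    if balances.get(src, 0) < amount:
        raise ValueError("insufficient funds")
    balances[src] -= amount
    balances[dst] = balances.get(dst, 0) + amount
    return balances[src]
```

After `p.register("transfer", transfer)`, a client does a single `p.call("transfer", "alice", "bob", 30)` instead of a BEGIN, several statements, and a COMMIT spread across round trips.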

Hekaton is another system that has made really different choices when focused on memory and reaped benefits. I think for business reasons, it's really tied to MS SQL Server's disk implementation, though that has pros to go with the cons. Hekaton is also new since Josh wrote this post.

And all of these systems can be fully/mostly persistent on disk. What matters for the design choices is that your data fits in memory, not that it never gets written to disk. VoltDB in particular can perform many millions of synchronous writes per second. It can do this with SSDs or even with spindles using a flash-backed disk controller.

Disclaimer: VoltDB Eng.