
I think this is specifically because random reads are scaling much better than writes, even though you won't see it in standard "that many MB/s" benchmarks, where read throughput is 'just' a few multiples of write performance.

Persisting a transaction to the database is still (and especially under MVCC): "send the data write", "wait for the write to be flushed", "toggle a metadata bit to mark the write as completed", "wait for that bit to be flushed". This serialises transaction commits, while reads can complete in parallel as fast as the device can handle.
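The commit sequence above can be sketched as follows. This is an illustrative skeleton, not any particular database's implementation; the file descriptors and the single-byte "committed" marker are assumptions for the example.

```python
import os

def commit(data_fd: int, meta_fd: int, payload: bytes) -> None:
    # Each step must complete before the next begins, so concurrent
    # commits end up serialised behind these ordered flushes.
    os.write(data_fd, payload)   # send the data write
    os.fsync(data_fd)            # wait for the write to be flushed
    os.write(meta_fd, b"\x01")   # toggle the "committed" metadata bit
    os.fsync(meta_fd)            # wait for that bit to be flushed
```

The two `fsync` calls are the point: the metadata flush cannot be issued until the data flush has returned, so commit latency is bounded by sequential flush latency no matter how many commits are pending.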

Especially now that reads and writes no longer have to share a disk head, it makes sense for random reads to keep scaling better than writes.
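By contrast, reads carry no ordering dependency between them, so they can be issued concurrently and the device can serve them in parallel. A minimal sketch, assuming a file descriptor and a list of byte offsets:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def parallel_random_reads(fd: int, offsets: list[int], size: int) -> list[bytes]:
    # Unlike ordered commit flushes, these reads are independent:
    # each os.pread targets its own offset and none waits on another.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(lambda off: os.pread(fd, size, off), offsets))
```

With enough in-flight requests, an NVMe device can service these from multiple flash channels at once, which is why random-read throughput scales with queue depth in a way serialised commits cannot.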



This is actually the opposite of current hardware limitations. Stick a capacitor and some DRAM on the NVMe controller and you can "instantly" flush writes to disk, but there's no way to anticipate where the next read will come from, and therefore no way to accelerate reads the same way.

You'll see modern NVMe disks with sustained writes greatly outpacing reads until the write cache is saturated, at which point what you say is true and reads will greatly outpace writes. But you don't want your disks to ever be in that regime.




