It explains how to avoid the 40ms delay and still batch data where possible for maximum efficiency. The key part is that you can toggle the TCP options during the lifetime of the connection to force flushes.
By using sendmsg() with MSG_MORE instead of write(), you can avoid the setsockopt() with TCP_CORK to cork before the write, and the later setsockopt() with TCP_NODELAY to push despite the cork.
You can't give MSG_MORE to sendfile(), though with HTTP you don't need to. But if you need an equivalent of MSG_MORE with sendfile() you can in theory use SPLICE_F_MORE with splice() instead.
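To make the MSG_MORE variant concrete, here is a minimal sketch for Linux with a blocking stream socket. The helper names `send_all()` and `send_two_parts()` are made up for illustration and are not from any library. Every piece except the last is sent with MSG_MORE, so the kernel may hold back a partial segment until the final piece arrives.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: send the whole buffer with the given sendmsg() flags.
 * Assumes a blocking socket. Returns 0 on success, -1 on error. */
static int send_all(int fd, const void *buf, size_t len, int flags)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        struct iovec iov = { .iov_base = (void *)p, .iov_len = len };
        struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
        ssize_t n = sendmsg(fd, &msg, flags | MSG_NOSIGNAL);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before sending anything: retry */
            return -1;
        }
        p += n;                 /* short write: continue with the remainder */
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: the first part is sent with MSG_MORE ("more data is
 * coming"), the last part without it, which lets the kernel push everything. */
static int send_two_parts(int fd, const void *first, size_t first_len,
                          const void *last, size_t last_len)
{
    if (send_all(fd, first, first_len, MSG_MORE) < 0)
        return -1;
    return send_all(fd, last, last_len, 0);
}
```

No setsockopt() calls are needed here; the flag travels with each send.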
Ah, I see, you were referring to saving `setsockopt()`, not data-carrying syscalls.
Yes, that makes sense. But I guess that in most cases where you control the flags of the `sendmsg()` calls, you also control their buffers, so in many situations you can build the buffer in userspace and save multiple data-carrying syscalls as well.
The `setsockopt()` approach has the benefit that it works even when you have no control over the sending syscalls, e.g. when some library does it for you that you cannot modify or configure.
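A rough sketch of the `setsockopt()` approach on Linux, for the case where something else performs the writes. The helper names `tcp_set_cork()` and `tcp_push()` are hypothetical. The push relies on the behaviour documented in tcp(7) that setting `TCP_NODELAY` forces a flush of pending output even while `TCP_CORK` is set.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: turn TCP_CORK on (1) or off (0) for a TCP socket.
 * Returns 0 on success, -1 on error (errno set by setsockopt()). */
static int tcp_set_cork(int fd, int on)
{
    assert(on == 0 || on == 1);
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof on);
}

/* Hypothetical helper: ask the kernel to push what is queued so far by
 * setting TCP_NODELAY. The cork option itself is left as it was. */
static int tcp_push(int fd)
{
    int one = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
}
```

The intended use would be to call `tcp_set_cork(fd, 1)` before handing the socket to the code that writes, and `tcp_push(fd)` at the end of each logical message. Clearing the cork with `tcp_set_cork(fd, 0)` also flushes.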
MSG_MORE comes in useful for these examples, where you can't use writev() alone, but do control the sending syscalls:
- HTTP (unencrypted) serving static files or cache files, to combine sendmsg() for the headers followed by sendfile() for the body. You can't batch using writev() in that case, if you want the benefit of sendfile().
- Transmitting a stream of data that is being forwarded or generated. For example, an HTTPS reverse proxy which forwards incoming unencrypted data and formats it into TLS progressively. It can't buffer the whole response as that would add too much delay, so it can send using sendmsg() with MSG_MORE until it reaches the end of the forwarded data.
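The first case could look roughly like the sketch below, assuming Linux, a blocking socket, and headers that the caller has already formatted. The function name `send_headers_then_file()` is invented for this example. It uses send() with a flags argument, which for a single buffer behaves like sendmsg().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send pre-formatted headers with MSG_MORE, then the
 * file body with sendfile(). Returns 0 on success, -1 on error. */
static int send_headers_then_file(int sock, const char *hdr, size_t hdr_len,
                                  const char *path)
{
    struct stat st;
    off_t off = 0;
    int rc = -1;
    int file_fd;

    assert(hdr != NULL && path != NULL);
    file_fd = open(path, O_RDONLY);
    if (file_fd < 0)
        return -1;              /* nothing has been sent yet */
    if (fstat(file_fd, &st) < 0)
        goto out;

    /* Headers: MSG_MORE tells the kernel that the body follows. */
    while (hdr_len > 0) {
        ssize_t n = send(sock, hdr, hdr_len, MSG_MORE | MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto out;
        }
        hdr += n;
        hdr_len -= (size_t)n;
    }

    /* Body: sendfile() advances 'off' by the number of bytes it sent. */
    while (off < st.st_size) {
        ssize_t n = sendfile(sock, file_fd, &off, (size_t)(st.st_size - off));
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto out;
        }
        if (n == 0)
            goto out;           /* file was truncated while sending */
    }
    rc = 0;
out:
    close(file_fd);
    return rc;
}
```

The sendfile() call carries no "more" hint, so once the body is handed over the kernel can send the queued headers together with it.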
I was in that situation 4 years ago and did a short write-up on it:
https://gist.github.com/nh2/9def4bcf32891a336485
It explains how to avoid the 40ms delay and still batch data where possible for maximum efficiency. The key part is that you can toggle the TCP options during the lifetime of the connection to force flushes.
Review appreciated.