Batching syscalls is on my mind. The architecture of Fusion will revolve around ...

gavinhoward · on July 15, 2024

Ack! I was just about to write a blog post on this idea!

Good minds think alike, I guess.

But as a sibling comment says, this is essentially what io_uring is. Read Lord of the io_uring [1] if you want to know more. Polling mode is the key.

[1]: https://unixism.net/loti/index.html

cb321 · on July 15, 2024

You might both still be interested in the tiny (EDIT: 27 lines! https://github.com/c-blake/batch/blob/9c7e07670ef1fd0e98687c...) "virtual machine interpreter" I linked to. Interpreter loop overheads are below 100 CPU cycles on several CPUs, on hardware of roughly the same generation as that in https://www.usenix.org/system/files/atc20-gu.pdf, which mentions 700-1000 cycles for microkernel IPC latencies. I'm not sure whether the idea coheres with io_uring-style dispatch (especially the emphasized polling mode), though. Maybe with some work.

The reason I mentioned it after elcritch's RTOS mention is partly that the little interpreter has no backward jumps, so there are no loops and no halting problem issue. You can still embed conditional error-handling logic in system call batches, but the time of the overall call is bounded by the times of its component calls (that is, if they are bounded, anyway...). That might be a useful property for letting higher levels of the system guarantee a real-time budget in many situations with very low complexity. I'm not sure whether any of this is original to that GitHub repo, but I haven't seen it before in this specific context.
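To make the forward-jump-only point concrete, here is a rough user-space model in Python. It is not the repo's code: `Call`, `run_batch`, `skip_on_fail`, and `skip_on_ok` are names made up for illustration, and plain Python functions returning a negative errno stand in for real syscalls. The only thing it tries to show is that when every jump offset is non-negative, the index advances by at least one per step, so a batch of N entries makes at most N calls.

```python
import errno
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Call:
    """One batch entry (hypothetical layout, not the repo's struct)."""
    fn: Callable[..., int]   # stand-in for a syscall: returns >= 0 or -errno
    args: Tuple = ()
    skip_on_fail: int = 0    # extra entries to skip forward if the result is < 0
    skip_on_ok: int = 0      # extra entries to skip forward otherwise


def run_batch(calls: List[Call]) -> List[Optional[int]]:
    """Run a batch; entries that were jumped over keep a result of None."""
    # Reject backward jumps up front: this is what rules out loops.
    for c in calls:
        if c.skip_on_fail < 0 or c.skip_on_ok < 0:
            raise ValueError("backward jumps are not allowed")

    results: List[Optional[int]] = [None] * len(calls)
    i = 0
    while i < len(calls):
        c = calls[i]
        r = c.fn(*c.args)
        results[i] = r
        # i always grows by at least 1, so at most len(calls) calls happen.
        i += 1 + (c.skip_on_fail if r < 0 else c.skip_on_ok)
    return results


if __name__ == "__main__":
    # Assumed example: if the "open" fails, skip the "read" but still clean up.
    batch = [
        Call(lambda: -errno.ENOENT, skip_on_fail=1),  # pretend open() failed
        Call(lambda: 10),                             # pretend read()
        Call(lambda: 0),                              # pretend cleanup
    ]
    print(run_batch(batch))
```

The conditional error handling lives entirely in the two skip fields, so the worst-case time of the batch is just the sum of the worst-case times of its entries.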

Perhaps the most complete example of "adding a new sys_batch-based syscall" is https://github.com/c-blake/batch/blob/master/examples/total.... which adds a `mapall` that either mmaps a whole file or fails trying (at a couple of points), for the purpose of just totaling the bytes as an example calculation.
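For comparison, this is roughly what the unbatched version of that calculation looks like from user space, sketched in Python. I am assuming the sequence is open, fstat, mmap; the repo's `mapall` may differ in detail, and `total_bytes` is just a name for this sketch. Each step is a separate kernel crossing with its own failure point, which is the part a batched call would fold into one.

```python
import mmap
import os


def total_bytes(path: str) -> int:
    """Sum the byte values of a file by mapping it whole (unbatched sketch)."""
    fd = os.open(path, os.O_RDONLY)           # failure point 1: open
    try:
        size = os.fstat(fd).st_size           # failure point 2: fstat
        if size == 0:
            return 0                          # an empty file cannot be mapped
        # failure point 3: mmap
        with mmap.mmap(fd, size, access=mmap.ACCESS_READ) as m:
            return sum(m[:])                  # m[:] copies to bytes; fine for a sketch
    finally:
        os.close(fd)
```

Any of the three steps raises `OSError` on failure, and the `finally` makes sure the descriptor is closed either way.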

cb321 · on July 15, 2024

Sounds interesting - kind of like microkernels meet io_uring (in Elevator-pitch-ese).