> Hence, it'll be a while before it permeates through various ecosystems. this m...

mrcode007 · on May 22, 2023

Windows has been using this mechanism for well over a decade now. It’s called IOCP (I/O completion ports).

hawk_ · on May 22, 2023

IOCP is syscall-free only on one side, i.e. completion. io_uring can be syscall-free on the submission side as well.

riceart · on May 22, 2023

> well over a decade now.

3 decades.

blibble · on May 22, 2023

and Linux has had POSIX async IO since 2003

(which no-one uses either because the API doesn't compose well onto existing application structures)

mananaysiempre · on May 22, 2023

If you had mentioned Solaris, I’d have agreed.

But POSIX async I/O (the aio_* functions) in Linux is basically worthless performance-wise AFAIU, because Glibc implements it in userspace by spawning threads to do standard sync I/O. Now Linux also has non-POSIX async I/O (the io_* functions), but it’s very situational because it works only if you bypass the cache (O_DIRECT) and can still randomly block on metadata operations (so can Win32, to be fair). There’s select/poll/epoll with O_NONBLOCK of course, which is what people normally use, but those do not really work with files on disk (neither do their WinSock equivalents). Hell, signal-driven IO (O_ASYNC) exists, I’ve used it to make a single-threaded emulator (CPU-bound unlike a network server) interact with the terminal. But asynchronous I/O of normal, cached files is only possible on Linux through the use of io_uring, as far as I’ve been able to figure out.

That said, I’ve read people here saying[1] that overlapped I/O on Windows also works by scheduling operations on a thread pool, even referencing KB articles[2]. This does not mesh with everything I’ve read about I/O in the NT kernel, which is supposed to be natively async to the point where the I/O request datastructure (the IRP) has what’s essentially an emulated call stack inside of it, in order to allow the I/O subsystem to juggle continuations. What am I missing? Does the Win32 subsystem need to dumb things down that much even inside its own implementation?

(Windows 8 also introduced a ringbuffer-based, no-syscalls thing called Registered I/O that looks very much like io_uring.)

[1] https://news.ycombinator.com/item?id=11867351

[2] https://support.microsoft.com/kb/156932
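For illustration, the "async I/O emulated by threads doing sync I/O" pattern mentioned above for the aio_* functions can be sketched roughly as below. This is only an analogy in Python, not Glibc's actual implementation; `read_at` and `demo` are hypothetical names made up for the example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_at(path, offset, length):
    """Ordinary blocking positional read; it runs on a worker thread."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)


def demo():
    """Submit two reads 'asynchronously' and collect both results."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"hello, cached file")
        path = f.name
    try:
        with ThreadPoolExecutor(max_workers=2) as pool:
            # submit() returns immediately, but each read still blocks
            # a worker thread for its whole duration.
            first = pool.submit(read_at, path, 0, 5)
            second = pool.submit(read_at, path, 7, 6)
            return first.result(), second.result()
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

The caller never blocks on submission, but nothing is asynchronous at the kernel level: every in-flight request costs a thread, which is the performance complaint made above.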

cyberax · on May 22, 2023

> That said, I’ve read people here saying[1] that overlapped I/O on Windows also works by scheduling operations on a thread pool

The _kernel_ thread pool. Eventually, most work has to be done in an actual thread, after all.

> [2] https://support.microsoft.com/kb/156932

It's a bit misleading. What they mean is that some operations can act as barriers for further operations. E.g. async calls to ReadFile won't run until the call to WriteFile finishes (if it's writing past the end of the file).

GoblinSlayer · on May 23, 2023

Why do you think you can't use epoll with disk files on Linux?

mananaysiempre · on May 23, 2023

Per open(2) [1], you can’t really ask the kernel to not block on regular files:

> O_NONBLOCK [...] has no effect for regular files and will (briefly) block when device activity is required, regardless of whether O_NONBLOCK is set. [...] O_NONBLOCK semantics might eventually be implemented[.]

I’m actually not sure if the reported readiness for them is of any use, but the documentation for select(2) [2] doesn’t give me a lot of hope:

> A file descriptor is ready for writing if a write operation will not block. However, even if a file descriptor indicates as writable, a large write may still block.

This is for data operations; if you want open() itself to avoid spelunking through NFS or spinning up optical drives or whatnot, before io_uring you simply had no way to tell that to the kernel: you call open*() or perhaps creat(), which must give you an fd, and thus must block until they can do so.

(As far as I’ve seen, tutorial documentation usually rounds this down to “you can’t do nonblocking I/O on disk files”.)

[1] https://man7.org/linux/man-pages/man2/open.2.html

[2] https://man7.org/linux/man-pages/man2/select.2.html
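A small probe of the behaviour discussed above, as a sketch: it opens a temporary regular file with O_NONBLOCK, reads from it, and asks select() and (on Linux) epoll about it. The function name `probe_regular_file` is made up for the example. As far as I know, Linux epoll refuses regular files outright with EPERM, while select() simply reports them as always ready.

```python
import errno
import os
import select
import tempfile


def probe_regular_file():
    """Return (data, select_says_readable, epoll_errno) for a temp file."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"data")
        path = f.name
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    try:
        # O_NONBLOCK is accepted but has no effect on a regular file:
        # this read never raises BlockingIOError.
        data = os.read(fd, 4)
        # select() treats regular files as always ready, so the answer
        # carries no information about whether the disk will be hit.
        readable, _, _ = select.select([fd], [], [], 0)
        epoll_errno = None
        if hasattr(select, "epoll"):  # epoll is Linux-only
            ep = select.epoll()
            try:
                ep.register(fd, select.EPOLLIN)
            except OSError as e:
                epoll_errno = e.errno
            finally:
                ep.close()
        return data, fd in readable, epoll_errno
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    data, readable, epoll_errno = probe_regular_file()
    print(data, readable, errno.errorcode.get(epoll_errno))
```

So the readiness APIs either reject disk files or declare them permanently ready, which is why they say nothing useful about whether a read will actually wait on the device.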

GoblinSlayer · on May 23, 2023

You use io_uring to spin optical drives?

loeg · on May 22, 2023

Yeah, but everything on Windows uses IOCP.

fanf2 · on May 22, 2023

I am looking forward to io_uring support in libuv

CoolCold · on May 23, 2023

Isn't it a thing already? https://www.phoronix.com/news/libuv-io-uring