
> Hence, it'll be a while before it permeates through various ecosystems.

This may take a while, as it's a completely different I/O model.

It took us 30-odd years to get from select/epoll to async/coroutines being popular.



Windows has been using this mechanism for well over a decade now. It’s called IOCP (I/O completion ports).


IOCP is syscall-free only on one side, i.e. completion. io_uring can be syscall-free on the submission side as well.


> well over a decade now.

3 decades.


and Linux has had POSIX async IO since 2003

(which no-one uses either because the API doesn't compose well onto existing application structures)


If you had mentioned Solaris, I’d have agreed.

But POSIX async I/O (the aio_* functions) on Linux is basically worthless performance-wise AFAIU, because glibc implements it in userspace by spawning threads to do standard sync I/O. Now Linux also has non-POSIX async I/O (the io_* functions), but it’s very situational because it works only if you bypass the cache (O_DIRECT) and can still randomly block on metadata operations (so can Win32, to be fair). There’s select/poll/epoll with O_NONBLOCK of course, which is what people normally use, but those do not really work with files on disk (neither do their WinSock equivalents). Hell, signal-driven I/O (O_ASYNC) exists; I’ve used it to make a single-threaded emulator (CPU-bound, unlike a network server) interact with the terminal. But asynchronous I/O of normal, cached files is only possible on Linux through the use of io_uring, as far as I’ve been able to figure out.
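To make the "spawning threads to do standard sync I/O" idea concrete, here is a minimal Python sketch of that general approach: each "async" read is just a blocking pread() handed to a worker thread. This is not glibc's code; the names submit_read and demo are hypothetical and exist only for illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def submit_read(pool, fd, size, offset):
    """Queue a blocking pread() on a worker thread and return a Future.

    Hypothetical helper: it only illustrates userspace "async" I/O built
    from ordinary synchronous calls, as the comment above describes.
    """
    return pool.submit(os.pread, fd, size, offset)


def demo():
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"hello, cached file")
        tmp.flush()
        fd = os.open(tmp.name, os.O_RDONLY)
        try:
            with ThreadPoolExecutor(max_workers=2) as pool:
                # Both reads are "in flight" at once, but each one still
                # blocks a whole thread inside a normal pread() syscall.
                first = submit_read(pool, fd, 5, 0)
                second = submit_read(pool, fd, 11, 7)
                return first.result(), second.result()
        finally:
            os.close(fd)


if __name__ == "__main__":
    print(demo())
```

The caller never blocks on submission, but the cost is one parked thread per outstanding request, which is the performance complaint made above.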

That said, I’ve read people here saying[1] that overlapped I/O on Windows also works by scheduling operations on a thread pool, even referencing KB articles[2]. This does not mesh with everything I’ve read about I/O in the NT kernel, which is supposed to be natively async to the point where the I/O request data structure (the IRP) has what’s essentially an emulated call stack inside of it, in order to allow the I/O subsystem to juggle continuations. What am I missing? Does the Win32 subsystem need to dumb things down that much even inside its own implementation?

(Windows 8 also introduced a ringbuffer-based, no-syscalls thing called Registered I/O that looks very much like io_uring.)

[1] https://news.ycombinator.com/item?id=11867351

[2] https://support.microsoft.com/kb/156932


> That said, I’ve read people here saying[1] that overlapped I/O on Windows also works by scheduling operations on a thread pool

The _kernel_ thread pool. Eventually, most work has to be done in an actual thread, after all.

> [2] https://support.microsoft.com/kb/156932

It's a bit misleading. What they mean is that some operations can act as barriers for further operations. E.g. async calls to ReadFile won't run until the call to WriteFile finishes (if it's writing past the end of the file).


Why do you think you can't use epoll with disk files on Linux?


Per open(2) [1], you can’t really ask the kernel to not block on regular files:

> O_NONBLOCK [...] has no effect for regular files and will (briefly) block when device activity is required, regardless of whether O_NONBLOCK is set. []O_NONBLOCK semantics might eventually be implemented[.]
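A small sketch of what that man page excerpt means in practice, assuming a Unix-like system: the flag is accepted on a regular file but a read never reports EAGAIN, whereas an empty non-blocking pipe does. The function names are hypothetical. Note that this cannot show the brief blocking itself, only the absence of "would block" behaviour.

```python
import os
import tempfile


def read_regular_file_nonblocking(payload=b"x" * 4096):
    """Open a regular file with O_NONBLOCK and read it back."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(payload)
        tmp.flush()
        fd = os.open(tmp.name, os.O_RDONLY | os.O_NONBLOCK)
        try:
            # The kernel accepts the flag, but for a regular file the read
            # simply completes (waiting on the device if it has to).
            return os.read(fd, len(payload))
        finally:
            os.close(fd)


def read_empty_pipe_nonblocking():
    """Contrast: a non-blocking read on an empty pipe fails with EAGAIN."""
    r, w = os.pipe()
    os.set_blocking(r, False)
    try:
        os.read(r, 1)
        return "data"
    except BlockingIOError:
        return "EAGAIN"
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print(len(read_regular_file_nonblocking()), read_empty_pipe_nonblocking())
```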

I’m actually not sure if the reported readiness for them is of any use, but the documentation for select(2) [2] doesn’t give me a lot of hope:

> A file descriptor is ready for writing if a write operation will not block. However, even if a file descriptor indicates as writable, a large write may still block.
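As a rough check of how readiness reporting treats regular files, the sketch below (assuming Linux; probe_regular_file is a hypothetical name) asks select() about a temporary file and then tries to register the same descriptor with epoll. select() reports the file as ready in both directions regardless of what the disk is doing, and epoll_ctl(2) is documented to fail with EPERM for regular files.

```python
import errno
import select
import tempfile


def probe_regular_file():
    """Return ((readable, writable) per select, epoll errno name or None)."""
    with tempfile.TemporaryFile() as f:
        # Zero timeout: just poll the current readiness state.
        readable, writable, _ = select.select([f], [f], [], 0)
        select_says_ready = (f in readable, f in writable)

        epoll_error = None
        if hasattr(select, "epoll"):  # epoll is Linux-only
            ep = select.epoll()
            try:
                ep.register(f.fileno(), select.EPOLLIN)
            except OSError as exc:
                epoll_error = errno.errorcode.get(exc.errno)
            finally:
                ep.close()
        return select_says_ready, epoll_error


if __name__ == "__main__":
    print(probe_regular_file())
```

So the readiness answer for a disk file is "always ready", which carries no information about whether the next read or write will actually wait on the device.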

This is for data operations; if you want open() itself to avoid spelunking through NFS or spinning up optical drives or whatnot, before io_uring you simply had no way to tell that to the kernel: you call open*() or perhaps creat(), which must give you an fd, and thus must block until they can do so.

(As far as I’ve seen, tutorial documentation usually rounds this down to “you can’t do nonblocking I/O on disk files”.)

[1] https://man7.org/linux/man-pages/man2/open.2.html

[2] https://man7.org/linux/man-pages/man2/select.2.html


You use io_uring to spin optical drives?


Yeah, but everything on Windows uses IOCP.


I am looking forward to io_uring support in libuv




