> It allows for printing an error message, associating the error with some conte...

krilovsky · 2024-08-25T00:09:39 1724544579

You're assuming here that mmap is only used for writing, where TFA is actually describing a read-only scenario, in which case EIO is synchronous as the read can't be completed.

As for the triviality of writing a SIGBUS handler correctly, that is an oversimplification at best. I/O errors aren't always fatal, even in the write case, and handling SIGBUS in the way you describe wouldn't work when e.g. you're also out of file descriptors, or when the cause of SIGBUS isn't due to an I/O error. So what works for 95% of your usecases doesn't necessarily apply to the 95% of other people's usecases.

jcalvinowens · 2024-08-25T00:36:18 1724546178

The point is the same for reads: the vast majority of usecases just immediately abort() when a read fails. Writing byzantine fault logic to deal with broken storage media is like trying to recover from SIGSEGV, it's almost never a good idea.

> I/O errors aren't always fatal, even in the write case

Linux will not return -EIO unless the disk is in an unrecoverable state. Generally the assumption is that userspace will treat -EIO as fatal, so the kernel won't return it unless it's truly hosed. Sometimes the error is specific to a file, but that's the far less common case in practice.

> e.g. you're also out of file descriptors,

ENFILE is easy to deal with in a fatal path, by closing stdin so fd #0 can be reused (you're about to call abort(), you don't need it anymore). Try again :)

> or when the cause of SIGBUS isn't due to an I/O error.

It's either -EIO, or it's I/O beyond EOF. The second thing is a bug equivalent to a buffer overrun. That's synchronous, you can handle it just like you handle SIGSEGV if you want to emit more debugging or even write byzantine recovery logic.

krilovsky · 2024-08-25T01:19:28 1724548768

> Generally the assumption is that userspace will treat -EIO as fatal

A single bad disk doesn't make the situation fatal (unless that's the only disk in your system, in which case you're not even guaranteed to have your signal handler code in memory).

> ENFILE is easy to deal with in a fatal path, by closing stdin so fd #0 can be reused

That's assuming you have stdin open, which again, may work for 95% of your usecases, but isn't universal.

> It's either -EIO, or it's writing beyond EOF

That's an unfounded statement. A quick search of the kernel code will show that there are other reasons for getting a SIGBUS, which are unrelated to mmap (non-disk hardware failures, certain CPU exceptions, to name a few). So yeah, if you know that apart from the disk (or filesystem, at any rate) your hardware is in order, and that the only reason for SIGBUS could be a failed I/O through a memory mapped file, and you know that all of the code in your process is well behaved, writing a SIGBUS handler that terminates the process with a message indicating an mmap I/O error might be reasonable, but that's not the reality for every process, and likely not even 95% of processes.

Regardless, my main point wasn't that lack of file descriptors makes your suggestion problematic, but that your description of it as trivial is an oversimplification at best. mmap has its uses (as does writing a SIGBUS handler to deal with errors), but that doesn't mean that it doesn't have issues. Highlighting them doesn't mean that plain read/write are perfect and free from issues either, and certainly code that isn't ready to deal with EIO will have a bad time when a VFS operation fails. But there are cases where making I/O explicit is better, and I'm not sure why you seem to be making blanket statements that trivialise the issues with mmap.

jcalvinowens · 2024-08-25T01:33:28 1724549608

> A single bad disk doesn't make the situation fatal (unless that's the only disk in your system, in which case you're not even guaranteed to have your signal handler code in memory).

Yes it does. Your point about signal handlers is why I'm right, that's beyond the point where you can expect the machine to function in a sane way. Trying to recover is often actively harmful.

> That's assuming you have stdin open, which again, may work for 95% of your usecases, but isn't universal.

If you've hit EMFILE, you absolutely have some FD which you can sacrifice to collect debug info, is my point. If you don't you can reserve one a priori, this isn't that hard to deal with.
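For anyone following along, the "reserve one a priori" idea might look roughly like the sketch below. It assumes Linux/POSIX; `reserve_spare_fd`, `open_for_debug` and the choice of `/dev/null` as the placeholder are made up for illustration, not taken from any real codebase.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Spare descriptor held open for the fatal path; -1 when not held. */
static int spare_fd = -1;

/* Call once at startup, while descriptors are still plentiful. */
int reserve_spare_fd(void)
{
    spare_fd = open("/dev/null", O_RDONLY | O_CLOEXEC);
    return spare_fd < 0 ? -1 : 0;
}

/* Open path read-only. If the process (EMFILE) or the system (ENFILE) is
 * out of descriptors, give up the spare and retry once.
 * Returns the new descriptor, or -1 with errno set. */
int open_for_debug(const char *path)
{
    int fd = open(path, O_RDONLY | O_CLOEXEC);

    if (fd < 0 && (errno == EMFILE || errno == ENFILE) && spare_fd >= 0) {
        close(spare_fd);
        spare_fd = -1;
        fd = open(path, O_RDONLY | O_CLOEXEC);
    }
    return fd;
}
```

`open()` and `close()` are both async-signal-safe, so in principle this can be called from a handler, but the retry is still best-effort: another thread can grab the freed slot first.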

> writing a SIGBUS handler that terminates the process with a message indicating an mmap I/O error might be reasonable, but that's not the reality for every process, and likely not even 95% of processes.

You're completely wrong here: you've invented an ambiguity that does not exist. Take a look at the manpage for sigaction(), and you'll see that all the non-I/O cases you mention are independently identifiable via members of the siginfo_t struct passed to your SIGBUS handler (just like the I/O cases).
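A minimal sketch of such a handler, assuming Linux/POSIX and `SA_SIGINFO`. The names `fmt_hex`, `on_sigbus` and `install_sigbus_handler` are hypothetical, and it only reports `si_code` and `si_addr` before aborting; deciding what those values mean for a given process is left to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes v as hex into buf, returns the digit count.
 * printf() is not async-signal-safe, so the handler formats by hand. */
static size_t fmt_hex(char *buf, uintptr_t v)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[2 * sizeof v];
    size_t n = 0, i = 0;

    do {
        tmp[n++] = digits[v & 0xf];
        v >>= 4;
    } while (v != 0);
    while (n > 0)
        buf[i++] = tmp[--n];
    return i;
}

/* Report the fault code and faulting address on stderr, then abort. */
static void on_sigbus(int signo, siginfo_t *info, void *ucontext)
{
    static const char head[] = "SIGBUS: si_code=0x";
    static const char mid[] = " si_addr=0x";
    char msg[96];
    size_t len = 0;

    (void)signo;
    (void)ucontext;
    memcpy(msg + len, head, sizeof head - 1);
    len += sizeof head - 1;
    len += fmt_hex(msg + len, (uintptr_t)(unsigned)info->si_code);
    memcpy(msg + len, mid, sizeof mid - 1);
    len += sizeof mid - 1;
    len += fmt_hex(msg + len, (uintptr_t)info->si_addr);
    msg[len++] = '\n';
    if (write(STDERR_FILENO, msg, len) < 0) {
        /* Nothing more can be done here; fall through to abort(). */
    }
    abort();
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_sigbus_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```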

> but that your description of it as trivial is an oversimplification at best.

I'm not oversimplifying: you're spewing unfounded FUD about the mmap() interface, and I'm telling you that none of these details matter for 95% of usecases.

krilovsky · 2024-08-25T04:25:14 1724559914

> Yes it does. Your point about signal handlers is why I'm right

If it's not a single disk system, not necessarily. To give you a concrete example: a process that writes logs to a disk dedicated for log collection can simply ignore an EIO/ENOSPC if logging isn't its main task. It can't easily recover from a SIGBUS in that scenario though.

> If you've hit EMFILE, you absolutely have some FD which you can sacrifice to collect debug info, is my point. If you don't you can reserve one a priori, this isn't that hard to deal with.

I'm not sure why you keep sticking to this example, when I already said that it was just an example of another detail that you need to take into account when implementing a SIGBUS handler. Sure, you can open /proc/self/maps a-priori and side-step the issue, but that's another detail that you need to take into account (and that you didn't mention until I brought it up). I never said that it was hard, only that writing a proper handler that deals with the edge cases isn't as trivial as you claim.

> you've invented an ambiguity that does not exist [...] you'll see that all the non-I/O cases you mention are independently identifiable via members of the siginfo_t struct

I'm not sure what's the ambiguity that you're claiming that I've invented. Yes, some of the specific examples that I gave (specifically CPU exceptions) are identifiable if you already know the details, but not all of them: non-disk faults can still result in SIGBUS with BUS_ADRERR, so that alone isn't enough to identify EIO errors or EOF coming from memory-mapped files, and I know that from personal experience debugging SIGBUS crashes.

> you're spewing unfounded FUD about the mmap() interface

I don't know where this is coming from. I never said that using mmap is bad or that it's impossible to write a SIGBUS handler to output debug info before crashing. I merely pointed out that it's not necessarily trivial, as there are details that should be taken care of, and that it may not in fact be suitable for 95% of usecases as you claimed.

You have a mental model of an ideal system which either can't recover from I/O errors, or doesn't get SIGBUS for reasons other than EIO or reading beyond EOF. I'm trying to tell you that not every system is like that, and that while mmap is useful, there are cases where explicit I/O is better suited for the task, and that your 95% might not be everyone's 95%. If you see FUD in simple facts, then I'm sorry, but I see no point in continuing this discussion.

jcalvinowens · 2024-08-25T05:08:25 1724562505

> If it's not a single disk system, not necessarily.

Again, you miss the point. 95%+ of Linux systems are single disk. That's the expected case.

>> If you've hit EMFILE

> I'm not sure why you keep sticking to this example

You brought this up initially, saying it was difficult to handle. I'm demonstrating that you're wrong, it's actually quite trivial to handle. Handwaving about "edge cases" is FUD, if you have some specific point to make then make it.

> I'm not sure what's the ambiguity that you're claiming that I've invented.

You claimed it wasn't possible to be sure SIGBUS is from an I/O error. That's wrong.

> non-disk faults can still result in SIGBUS with BUS_ADRERR, so that alone isn't enough to identify EIO errors or EOF coming from memory-mapped files

Wrong. You can resolve that ambiguity from the cited address and si_errno etc. Try it next time.

> I'm trying to tell you that not every system is like that.

The fact you think I need to be told that is amusing. You're completely missing the point.

Let me try one more time:

Don't make things hard when they don't have to be. 95% of the time, they don't have to be. Saying "no, this is actually really hard, and you need to care about these normally irrelevant things" without first acknowledging the simple case is FUD in my book.

krilovsky · 2024-08-25T06:10:16 1724566216

> Again, you miss the point. 95%+ of Linux systems are single disk. That's the expected case.

I specifically added ENOSPC as an example that's relevant on single disk systems as well.

Regardless, I thought we were talking about 95% of usecases in relation to implementations, not runtime systems, but even if we're talking about runtime systems, I'm not sure where you're pulling that 95% number from (or why you felt the need to add a plus sign this time around). That may be true for personal computers, but most Linux systems are servers, which generally aren't deployed in a single disk configuration.

> You brought this up initially, saying it was difficult to handle

I didn't say anything about difficulty. I only said that it wasn't trivial as you made it out to be, which isn't the same thing. Also, when I initially brought it up all I said was that in the FD exhaustion case it wouldn't work in the way you described in the comment that I responded to.

> You claimed it wasn't possible to be sure SIGBUS is from an I/O error. That's wrong.

I didn't. All I said in response to your claim of "It's either -EIO, or it's writing beyond EOF" was that there are other reasons for getting a SIGBUS. Moreover, I actually said (in the same paragraph), that if you know that a SIGBUS is caused by an I/O error, and that all of the code in your process is well-behaved (and by that I meant that terminating it with an abort() wouldn't cause side-effects due to e.g. atexit() handlers not running), using mmap with a SIGBUS handler might be reasonable.

> Wrong. You can resolve that ambiguity from the cited address. Try it next time.

First you claimed that I invented an ambiguity that doesn't exist, and that SIGBUS causes can be identifiable if I just read the sigaction(7) manpage. Now you say that there is an ambiguity, but that it can be resolved using the address, so which is it? [0]

I never said that using mmap is impossible, or even hard (and definitely not "this is actually really hard"). I actually agreed that in some cases it might be reasonable to do it with a SIGBUS handler. All I did say was that it isn't trivial to deal with errors, and that the 95% figure might be true for your usecases, but that it doesn't necessarily apply to other people's usecases.

The only one who said that something was "hard" during this discussion was you.

I get it, it's easier to attack the strawman rather than respond to my comments. I'm just not sure why you think it has anything to do with what I said.

[0] EDIT: I now see that you edited the sentence I quoted to say "from the cited address and si_errno etc.". It might surprise you to learn that si_errno is almost never set in Linux (the manpage is actually explicit about it with "si_errno is generally unused in Linux"), and definitely not in mmap-related SIGBUS coming from memory mapped files. I have no idea why you added this remark telling me that I should try it, when you clearly didn't.

jcalvinowens · 2024-08-25T06:54:35 1724568875

> I have no idea why you added this remark telling me that I should try it, when you clearly didn't.

You are hilariously hostile here, I don't get it. si_errno is the second field in the struct after si_signo, saying "si_errno etc." is obviously in reference to the rest of the fields in the structure...

krilovsky · 2024-08-25T16:30:05 1724603405

> You are hilariously hostile here, I don't get it.

I apologise if it came out hostile. That was not my intention. I was in a bit of a hurry when I made the edit, and I was just trying to expand my comment in response to your edit, and explain that non-I/O and non-disk SIGBUS errors sometimes look exactly like disk and filesystem errors that return EIO (not just signum being SIGBUS, but also si_code being set to BUS_ADRERR, etc.), so looking at the siginfo_t fields alone wouldn't be enough to disambiguate.

Then there's the address field, which can probably be used in combination with parsing /proc/self/maps, but my point in that comment was that the information on the manpage alone wouldn't have helped people trying to implement a handler correctly.
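As a rough illustration of the address lookup, the sketch below scans /proc/self/maps for the range containing an address. `find_mapping` is a hypothetical name, and it uses stdio, which is not async-signal-safe, so as written it belongs outside the handler (or would need rewriting on top of `open()`/`read()`).

```c
#include <stdio.h>
#include <string.h>

/* Look up addr in /proc/self/maps. On a hit, copy the mapping's pathname
 * (empty for anonymous mappings) into path and return 1.
 * Returns 0 if no mapping contains addr, -1 if the file cannot be opened. */
int find_mapping(const void *addr, char *path, size_t path_len)
{
    FILE *f = fopen("/proc/self/maps", "r");
    unsigned long target = (unsigned long)addr;
    char line[4352];
    int found = 0;

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        unsigned long start, end;
        int off = 0;

        line[strcspn(line, "\n")] = '\0';
        /* Fields: start-end perms offset dev inode [pathname] */
        if (sscanf(line, "%lx-%lx %*s %*s %*s %*s %n", &start, &end, &off) < 2)
            continue;
        if (target < start || target >= end)
            continue;
        snprintf(path, path_len, "%s", off > 0 ? line + off : "");
        found = 1;
        break;
    }
    fclose(f);
    return found;
}
```

Opening /proc/self/maps needs a file descriptor of its own, which is where the earlier FD-exhaustion point comes back in.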

In any case, I already described a scenario where crashing would be the wrong thing to do IMO, which you seemed to ignore. Even in scenarios where crashing is reasonable, I'm sure there's a solution for every edge case that I would bring up, but I never said that it was impossible, so I'm not sure why asking me to list every possible edge case is relevant when my point was just that there are edge cases, and that you'd need to consider them (and they would be different for different apps), thus making an implementation not trivial. That doesn't mean that it's necessarily difficult, just that it might be a more complex solution when compared to dealing with a failing VFS operation.

As it seems that we've reached an impasse, I'll just say that simplicity depends on the context and is sometimes a matter of personal taste. I don't have anything against mmap, and I was only trying to argue that there's a trade-off, but you are of course free to disagree and use mmap everywhere if that works for you.

I don't think I have anything more to add to what I already said, and I'm sorry again if you felt personally attacked, or that I had something against mmap and trying to spread FUD.

jcalvinowens · 2024-08-25T06:43:51 1724568231

> but most Linux systems are servers, which generally aren't deployed in a single disk configuration.

You are incorrect about that: most Linux servers in the world have one disk. Most servers are not storage servers.

> I didn't say anything about difficulty. I only said that it wasn't trivial as you made it out to be

...and I demonstrated by counterexample that you're wrong, it is trivial. If you think I'm missing some detail, you are free to explain it. You're just handwaving.

> First you claimed that I invented an ambiguity that doesn't exist, and that SIGBUS causes can be identifiable if I just read the sigaction(7) manpage. Now you say that there is an ambiguity, but that it can be resolved using the address, so which is it?

Both, obviously? If you only look at signo there's an "ambiguity", but with the rest of siginfo_t the "ambiguity" ceases to exist. There is no case where you cannot unambiguously handle -EIO in a mmap via SIGBUS.

You claimed that you could only use SIGBUS with mmap if you were sure there were no other sources of SIGBUS. Quoting you directly:

> So yeah, if you know that apart from the disk (or filesystem, at any rate) your hardware is in order, and that the only reason for SIGBUS could be a failed I/O through a memory mapped file, and you know that all of the code in your process is well behaved, writing a SIGBUS handler that terminates the process with a message indicating an mmap I/O error might be reasonable

That statement is completely wrong: you can always tell whether it came from the mmap or something else, by looking at the siginfo_t fields.

> and by that I meant that terminating it with an abort() wouldn't cause side-effects due to e.g. atexit() handlers not running

Any system that breaks if atexit() handlers don't run is fundamentally broken by design. There are a dozen reasons the process can die without running those.

> All I did say was that it isn't trivial to deal with errors

Yes, and that statement is wrong. Most of the time it is trivial, because you just call abort(). There is no possibly simpler error handling than printing a message and calling abort(). For 95% of the workloads running across the world on Linux, that is entirely sufficient.

It is very unusual to try to recover from I/O error, and most programmers who try are really shooting themselves in the foot without realizing it.

You're free to disagree obviously, but I'm directly refuting the points you're making. Calling it a "strawman" makes you look really really silly.

loeg · 2024-08-25T00:05:13 1724544313

First, you're free to not use buffered IO. Second, EIO on fsync or close for buffered IO is still adjacent to the relevant file descriptor.
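A small sketch of that second point, assuming POSIX; `flush_and_close` is a hypothetical helper name. Whatever `fsync()` or `close()` reports arrives as a return value tied to the descriptor you passed in.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <unistd.h>

/* Flush and close fd. Returns 0 on success, or the errno value (for
 * example EIO) from the first step that failed. close() is attempted
 * even when fsync() fails. */
int flush_and_close(int fd)
{
    int err = 0;

    if (fsync(fd) < 0)
        err = errno;
    if (close(fd) < 0 && err == 0)
        err = errno;
    return err;
}
```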

jcalvinowens · 2024-08-25T00:38:38 1724546318

> adjacent to the relevant file descriptor.

So is SIGBUS: you get the address in the handler. You probably have a data structure associating the two things somewhere anyway, and if you don't you can look it up in /proc.
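One possible shape for that data structure is a small fixed table of mapped ranges, sketched below. The names (`track_mapping`, `mapping_name_for`, `MAX_TRACKED`) are invented for illustration, and the sketch assumes the table is filled in before the handler can fire and that the name strings outlive it.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED 16

/* One record per mmap()ed file the process wants to identify later. */
struct tracked_mapping {
    uintptr_t start;
    size_t len;
    const char *name;
};

static struct tracked_mapping tracked[MAX_TRACKED];
static size_t tracked_count;

/* Record a mapping right after mmap() succeeds.
 * Returns 0, or -1 if the table is full. */
int track_mapping(const void *base, size_t len, const char *name)
{
    if (tracked_count == MAX_TRACKED)
        return -1;
    tracked[tracked_count].start = (uintptr_t)base;
    tracked[tracked_count].len = len;
    tracked[tracked_count].name = name;
    tracked_count++;
    return 0;
}

/* Return the name of the tracked mapping containing addr, or NULL.
 * Only reads memory, so it can be called with si_addr from a handler. */
const char *mapping_name_for(const void *addr)
{
    uintptr_t a = (uintptr_t)addr;
    size_t i;

    for (i = 0; i < tracked_count; i++) {
        if (a >= tracked[i].start && a - tracked[i].start < tracked[i].len)
            return tracked[i].name;
    }
    return NULL;
}
```

A NULL result means the faulting address is not in any range the process registered, so the signal should not be reported as an I/O error on one of its files.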