> Again, you miss the point. 95%+ of Linux systems are single disk. That's the expected case.
I specifically added ENOSPC as an example that's relevant on single disk systems as well.
Regardless, I thought we were talking about 95% of usecases in relation to implementations, not runtime systems, but even if we're talking about runtime systems, I'm not sure where you're pulling that 95% number from (or why you felt the need to add a plus sign this time around). That may be true for personal computers, but most Linux systems are servers, which generally aren't deployed in a single disk configuration.
> You brought this up initially, saying it was difficult to handle
I didn't say anything about difficulty. I only said that it wasn't trivial as you made it out to be, which isn't the same thing. Also, when I initially brought it up all I said was that in the FD exhaustion case it wouldn't work in the way you described in the comment that I responded to.
> You claimed it wasn't possible to be sure SIGBUS is from an I/O error. That's wrong.
I didn't. All I said in response to your claim of "It's either -EIO, or it's writing beyond EOF" was that there are other reasons for getting a SIGBUS. Moreover, I actually said (in the same paragraph), that if you know that a SIGBUS is caused by an I/O error, and that all of the code in your process is well-behaved (and by that I meant that terminating it with an abort() wouldn't cause side-effects due to e.g. atexit() handlers not running), using mmap with a SIGBUS handler might be reasonable.
> Wrong. You can resolve that ambiguity from the cited address. Try it next time.
First you claimed that I invented an ambiguity that doesn't exist, and that SIGBUS causes can be identifiable if I just read the sigaction(7) manpage. Now you say that there is an ambiguity, but that it can be resolved using the address, so which is it? [0]
I never said that using mmap is impossible, or even hard (and definitely not "this is actually really hard"). I actually agreed that in some cases it might be reasonable to do it with a SIGBUS handler. All I did say was that it isn't trivial to deal with errors, and that the 95% figure might be true for your usecases, but that it doesn't necessarily apply to other people's usecases.
The only one who said that something was "hard" during this discussion was you.
I get it, it's easier to attack the strawman rather than respond to my comments. I'm just not sure why you think it has anything to do with what I said.
[0] EDIT: I now see that you edited the sentence I quoted to say "from the cited address and si_errno etc.". It might surprise you to learn that si_errno is almost never set in Linux (the manpage is actually explicit about it with "si_errno is generally unused in Linux"), and definitely not in mmap-related SIGBUS coming from memory mapped files. I have no idea why you added this remark telling me that I should try it, when you clearly didn't.
> I have no idea why you added this remark telling me that I should try it, when you clearly didn't.
You are hilariously hostile here, I don't get it. si_errno is the second field in the struct after si_signo, saying "si_errno etc." is obviously in reference to the rest of the fields in the structure...
> You are hilariously hostile here, I don't get it.
I apologise if it came out hostile. That was not my intention. I was in a bit of hurry when I made the edit, and I just trying to expand my comment in response to your edit, and explain that non-I/O and non-disk SIGBUS errors sometimes look exactly like disk and filesystem errors that return EIO (not just signum being SIGBUS, but also si_code being set to BUS_ADRERR, etc.), so looking at the siginfo_t fields alone wouldn't be enough to diambiguate.
Then there's the address field, which can be probably be used in combination with parsing /proc/self/maps, but my point in that comment was that the information on the manpage alone wouldn't have helped people trying to implement a handler correctly.
In any case, I already described a scenario where crashing would be the wrong thing to do IMO, which you seemed to ignore. Even in scenarios where crashing is reasonable, I'm sure there's a solution for every edge case that I would bring up, but I never said that it was impossible, so I'm not sure why asking me to list every possible edge case is relevant when my point was just that there are edge cases, and that you'd need to consider them (and they would be different for different apps), thus making an implementation not trivial. That doesn't mean that it's necessarily difficult, just that it might be a more complex solution when compared to dealing with a failing VFS operation.
As it seems that we've reached an impasse, I'll just say that simplicity depends on the context and is sometimes a matter of personal taste. I don't have anything against mmap, and I was only trying to argue that there's a trade-off, but you are of course free to disagree and use mmap everywhere if that works for you.
I don't think I have anything more to add to what I already said, and I'm sorry again if you felt personally attacked, or that I had something against mmap and trying to spread FUD.
> but most Linux systems are servers, which generally aren't deployed in a single disk configuration.
You are incorrect about that: most Linux servers in the world have one disk. Most servers are not storage servers.
> I didn't say anything about difficulty. I only said that it wasn't trivial as you made it out to be
...and I demonstrated by counterexample that you're wrong, it is trivial. If you think I'm missing some detail, you are free to explain it. You're just handwaving.
> First you claimed that I invented an ambiguity that doesn't exist, and that SIGBUS causes can be identifiable if I just read the sigaction(7) manpage. Now you say that there is an ambiguity, but that it can be resolved using the address, so which is it?
Both, obviously? If you only look at signo there's an "ambiguity", but with the rest of siginfo_t the "ambiguity" ceases to exist. There is no case where you cannot unambiguously handle -EIO in a mmap via SIGBUS.
You claimed that you could only use SIGBUS with mmap if you were sure there were no other sources of SIGBUS. Quoting you directly:
> So yeah, if you know that apart from the disk (or filesystem, at any rate) your hardware is in order, and that the only reason for SIGBUS could be a failed I/O through a memory mapped file, and you know that all of the code in your process is well behaved, writing a SIGBUS handler that terminates the process with a message indicating an mmap I/O error might be reasonable
That statement is completely wrong: you can always tell whether it came from the mmap or something else, by looking at the siginfo_t fields.
> and by that I meant that terminating it with an abort() wouldn't cause side-effects due to e.g. atexit() handlers not running
Any system that breaks if atexit() handlers don't run is fundamentally broken by design. There are a dozen reasons the process can die without running those.
> All I did say was that it isn't trivial to deal with errors
Yes, and that statement is wrong. Most of the time it is trivial, because you just call abort(). There is no possibly simpler error handling than printing a message and calling abort(). For 95% of the workloads running across the world on Linux, that is entirely sufficient.
It is very unusual to try to recover from I/O error, and most programmers who try are really shooting themselves in the foot without realizing it.
You're free to disagree obviously, but I'm directly refuting the points you're making. Calling it a "strawman" make you look really really silly.
I specifically added ENOSPC as an example that's relevant on single disk systems as well.
Regardless, I thought we were talking about 95% of usecases in relation to implementations, not runtime systems, but even if we're talking about runtime systems, I'm not sure where you're pulling that 95% number from (or why you felt the need to add a plus sign this time around). That may be true for personal computers, but most Linux systems are servers, which generally aren't deployed in a single disk configuration.
> You brought this up initially, saying it was difficult to handle
I didn't say anything about difficulty. I only said that it wasn't trivial as you made it out to be, which isn't the same thing. Also, when I initially brought it up all I said was that in the FD exhaustion case it wouldn't work in the way you described in the comment that I responded to.
> You claimed it wasn't possible to be sure SIGBUS is from an I/O error. That's wrong.
I didn't. All I said in response to your claim of "It's either -EIO, or it's writing beyond EOF" was that there are other reasons for getting a SIGBUS. Moreover, I actually said (in the same paragraph), that if you know that a SIGBUS is caused by an I/O error, and that all of the code in your process is well-behaved (and by that I meant that terminating it with an abort() wouldn't cause side-effects due to e.g. atexit() handlers not running), using mmap with a SIGBUS handler might be reasonable.
> Wrong. You can resolve that ambiguity from the cited address. Try it next time.
First you claimed that I invented an ambiguity that doesn't exist, and that SIGBUS causes can be identifiable if I just read the sigaction(7) manpage. Now you say that there is an ambiguity, but that it can be resolved using the address, so which is it? [0]
I never said that using mmap is impossible, or even hard (and definitely not "this is actually really hard"). I actually agreed that in some cases it might be reasonable to do it with a SIGBUS handler. All I did say was that it isn't trivial to deal with errors, and that the 95% figure might be true for your usecases, but that it doesn't necessarily apply to other people's usecases.
The only one who said that something was "hard" during this discussion was you.
I get it, it's easier to attack the strawman rather than respond to my comments. I'm just not sure why you think it has anything to do with what I said.
[0] EDIT: I now see that you edited the sentence I quoted to say "from the cited address and si_errno etc.". It might surprise you to learn that si_errno is almost never set in Linux (the manpage is actually explicit about it with "si_errno is generally unused in Linux"), and definitely not in mmap-related SIGBUS coming from memory mapped files. I have no idea why you added this remark telling me that I should try it, when you clearly didn't.