
> but most Linux systems are servers, which generally aren't deployed in a single disk configuration.

You are incorrect about that: most Linux servers in the world have one disk. Most servers are not storage servers.

> I didn't say anything about difficulty. I only said that it wasn't trivial as you made it out to be

...and I demonstrated by counterexample that you're wrong, it is trivial. If you think I'm missing some detail, you are free to explain it. You're just handwaving.

> First you claimed that I invented an ambiguity that doesn't exist, and that SIGBUS causes can be identifiable if I just read the sigaction(7) manpage. Now you say that there is an ambiguity, but that it can be resolved using the address, so which is it?

Both, obviously? If you only look at signo there's an "ambiguity", but with the rest of siginfo_t the "ambiguity" ceases to exist. There is no case where you cannot unambiguously handle -EIO in a mmap via SIGBUS.

You claimed that you could only use SIGBUS with mmap if you were sure there were no other sources of SIGBUS. Quoting you directly:

> So yeah, if you know that apart from the disk (or filesystem, at any rate) your hardware is in order, and that the only reason for SIGBUS could be a failed I/O through a memory mapped file, and you know that all of the code in your process is well behaved, writing a SIGBUS handler that terminates the process with a message indicating an mmap I/O error might be reasonable

That statement is completely wrong: you can always tell whether it came from the mmap or something else, by looking at the siginfo_t fields.
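To make that concrete, here is a rough sketch of a handler that uses si_addr to tell a fault inside the file mapping apart from any other SIGBUS. The globals and helper names (g_map_base, g_map_len, addr_in_range, install_sigbus_handler) are made up for illustration, and it assumes the process tracks a single mapping; a real program would keep a list of mappings.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical bookkeeping: the one file mapping this process cares about. */
static void *volatile g_map_base;
static volatile size_t g_map_len;

/* Returns 1 if addr lies inside [base, base + len), else 0. */
static int addr_in_range(const void *addr, const void *base, size_t len)
{
    uintptr_t a = (uintptr_t)addr;
    uintptr_t b = (uintptr_t)base;
    return a >= b && a - b < len;
}

/* SA_SIGINFO handler: only async-signal-safe calls (write, abort). */
static void on_sigbus(int signo, siginfo_t *info, void *ucontext)
{
    static const char mmap_msg[]  = "fatal: I/O error in memory-mapped file\n";
    static const char other_msg[] = "fatal: SIGBUS outside the file mapping\n";
    ssize_t ignored;

    (void)signo;
    (void)ucontext;

    /* si_addr is the faulting address; compare it with the mapping. */
    if (addr_in_range(info->si_addr, g_map_base, g_map_len))
        ignored = write(STDERR_FILENO, mmap_msg, sizeof mmap_msg - 1);
    else
        ignored = write(STDERR_FILENO, other_msg, sizeof other_msg - 1);
    (void)ignored;
    abort();
}

/* Records the mapping and installs the handler; returns 0 on success. */
static int install_sigbus_handler(void *base, size_t len)
{
    struct sigaction sa;

    g_map_base = base;
    g_map_len = len;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```

You would call install_sigbus_handler(ptr, length) right after mmap() returns. Checking si_code as well is possible, but the address comparison alone is enough to separate "my mapping" from "something else".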

> and by that I meant that terminating it with an abort() wouldn't cause side-effects due to e.g. atexit() handlers not running

Any system that breaks if atexit() handlers don't run is fundamentally broken by design. There are a dozen reasons the process can die without running those.

> All I did say was that it isn't trivial to deal with errors

Yes, and that statement is wrong. Most of the time it is trivial, because you just call abort(). There is no simpler error handling possible than printing a message and calling abort(). For 95% of the workloads running across the world on Linux, that is entirely sufficient.
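For the non-mmap case, "print a message and abort" looks roughly like this. read_or_die is a hypothetical wrapper name, not anything standard, and retrying on EINTR is my own assumption about what a caller would want:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical wrapper: read(), but any real error is fatal. */
static ssize_t read_or_die(int fd, void *buf, size_t len)
{
    ssize_t n;

    /* Retry if a signal interrupted the call before any data arrived. */
    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);

    if (n < 0) {
        /* EIO and friends: report and die, no recovery attempted. */
        fprintf(stderr, "fatal: read failed: %s\n", strerror(errno));
        abort();
    }
    return n; /* bytes read, or 0 at end of file */
}
```

The caller never sees an error return, so there is no error path to get wrong.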

It is very unusual to try to recover from an I/O error, and most programmers who try are really shooting themselves in the foot without realizing it.

You're free to disagree, obviously, but I'm directly refuting the points you're making. Calling it a "strawman" makes you look really, really silly.


