
From what I understand:

1. Initially, they just wanted to give compiler makers more freedom, both in the sense of "do whatever is simplest" and of "do something platform-specific that the dev wants".

2. Compiler devs found that they could use UB for optimization: e.g. if we assume that a branch with UB is unreachable, we can generate more efficient code.

3. Sadly, compiler devs started to exploit every opportunity for optimization, e.g. removing code with a potential segfault.

I.e., the people who made the standard thought a compiler would remove a no-op call to memcpy, but GCC removes the whole branch that makes the call, since it considers that branch impossible. Standard makers thought that compiler devs would be more reasonable.
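
A hedged sketch of that memcpy case (function name invented): passing a null pointer to memcpy is UB even when the length is zero, so the compiler may infer the pointer is non-null and drop a later check as dead.

    #include <stddef.h>
    #include <string.h>

    int copy_or_flag(char *dst, const char *src, size_t n) {
        memcpy(dst, src, n);   /* UB if dst or src is null, even when n == 0,  */
        if (dst == NULL)       /* ...so the compiler may treat this test as    */
            return -1;         /* always false and delete the branch entirely. */
        return 0;
    }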



> Standard makers thought that compiler devs would be more reasonable

This is a bit of a terrible take? Compiler devs never did anything "unreasonable", they didn't sit down and go "mwahahaha we can exploit the heck out of UB to break everything!!!!"

Rather, repeatedly applying a series of targeted optimizations, each one in isolation being "reasonable", results in an eventual "unreasonable" total transformation. But this is more an emergent property of modern compilers having hundreds of optimization passes.

At the time the standards were created, the idea of compilers applying so many optimization passes was just not conceivable. Compilers struggled to just do basic compilation. The assumption was a near 1:1 mapping between code & assembly, and that just didn't age well at all.


One could argue that "optimizing based on signed overflow" was an unreasonable step to take, since any given platform will have some sane, consistent behavior when the underlying instructions cause an overflow. A developer using signed operations without poring over the standard might have easily expected incorrect values (or maybe a trap if the platform likes to use those), but not big changes in control flow. In my experience, signed overflow is generally the biggest cause of "they're putting UB in my reasonable C code!", followed by the rules against type punning, which are violated every day by ordinary usage of the POSIX socket functions.


> One could argue that "optimizing based on signed overflow" was an unreasonable step to take

That optimization allows using 64-bit registers / offset loads for signed ints, which the compiler can't do if the value has to wrap, since that wrap would have to happen at 32 bits. That's not an uncommon thing.
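
A minimal sketch of the sort of loop where this matters, assuming a 64-bit target and an optimizing compiler (function name invented):

    /* Because signed overflow is UB, the compiler may assume i never wraps,
       promote it to a 64-bit induction variable, and address a[i] with a
       simple scaled access instead of re-truncating i to 32 bits each trip. */
    void scale(float *a, int n) {
        for (int i = 0; i < n; i++)
            a[i] *= 2.0f;
    }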


I've started to like the signed overflow rules, because they make it really easy to find problems using sanitizers.
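
For what it's worth, a minimal sketch of that workflow; -fsanitize=signed-integer-overflow is GCC/Clang's UBSan option, and the file name is made up:

    /* Build with: cc -O2 -fsanitize=signed-integer-overflow overflow.c
       At runtime the sanitizer reports something like:
       "runtime error: signed integer overflow: 2147483647 + 1 ..." */
    #include <limits.h>
    #include <stdio.h>

    int main(void) {
        int x = INT_MAX;
        printf("%d\n", x + 1);   /* signed overflow: flagged by the sanitizer */
        return 0;
    }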

The strict aliasing rules are not violated by typical POSIX socket code, since a cast to a different pointer type, i.e. to `struct sockaddr *`, is by itself well-defined behavior. (And POSIX could of course just define something even if ISO C leaves it undefined, but I don't think this is needed here.)
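
For reference, a sketch of the usual pattern (function name invented); the cast in the bind() call is the part commonly suspected of breaking strict aliasing:

    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>

    int bind_any(int fd, unsigned short port) {
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_port = htons(port);
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        /* the cast itself is well-defined; strict aliasing is only about how
           the pointed-to object is subsequently accessed */
        return bind(fd, (struct sockaddr *)&addr, sizeof addr);
    }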


> The strict aliasing rules are not violated by typical POSIX socket code, since a cast to a different pointer type, i.e. to `struct sockaddr *`, is by itself well-defined behavior.

One big example I've run into is that basically all usage of sendmsg() and recvmsg() with a static char[N] buffer is UB, unless you memcpy every value into and out of the buffer, which literally no one does. Also, reading sa_family from the output of accept() (or putting it into a struct sockaddr_storage and reading ss_family) is UB unless you memcpy it out, which, again, literally no one does.
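
Concretely, roughly the accept() idiom being described (function name invented):

    #include <sys/socket.h>

    int peer_family(int listen_fd) {
        struct sockaddr_storage ss;
        socklen_t len = sizeof ss;
        int fd = accept(listen_fd, (struct sockaddr *)&ss, &len);
        if (fd < 0)
            return -1;
        return ss.ss_family;   /* the direct read being called UB above */
    }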


Using a static char buffer would indeed be UB, but we just made a change to C2Y that makes this OK (and in practice it always was). Incorrect use of sockaddr_storage may lead to UB. But again, most socket code I see is actually correct.


> Compiler devs never did anything "unreasonable", they didn't sit down and go "mwahahaha we can exploit the heck out of UB to break everything!!!!"

Many compiler devs are on record gleefully responding to bug reports with statements along the lines of "your code has undefined behaviour according to the standard, we can do what we like with it, if you don't like it write better code". Less so in recent years, as they've realised this was a bad idea or at least a bad look, but in the '00s it was a normal part of the culture.


What stops compiler makers from treating UB as platform-specific behavior rather than as something which cannot happen?

"You are not allowed to do this, and thus..." reasoning assumes that programmers are language lawyers, which is unreasonable.


    bool foo(some_struct* bar) {
        if (bar->blah()) {
            return true;
        }
        if (bar == nullptr) {
            return false;
        }
        return true;
    }
Can the compiler eliminate that nullptr comparison, in your opinion: yes or no? While this example looks stupid, after inlining it's quite plausible to end up with code in this type of a pattern. Dereferencing a nullptr is UB, and typically the "platform-specific" behavior is a crash, so... why should that if statement remain? And then if it can't remain, why should an explicit `_Nonnull` assertion have different behavior than an explicit deref? What if the compiler can also independently prove that bar->blah() always evaluates to false, so it eliminates that entire branch - does the `if (bar == nullptr)` still need to remain in that specific case? If so, why? The code was the same in both cases; the compiler just got better at eliminating dead code.
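
A hedged C sketch of the inlining point (names invented): neither function looks suspicious on its own, but after inlining the null test ends up dominated by an unconditional dereference - exactly the pattern above.

    #include <stddef.h>

    struct some_struct { int flag; };

    static int check(struct some_struct *bar) {
        return bar->flag;              /* unconditional dereference */
    }

    int handle(struct some_struct *bar) {
        if (check(bar))                /* after inlining: bar->flag */
            return 1;
        if (bar == NULL)               /* compiler may now drop this test */
            return 0;
        return 1;
    }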


There isn't a "find UB branches" pass that is seeking out this stuff.

Instead what happens is that you have something like a constant folding or value constraint pass that computes a set of possible values that a variable can hold at various program points by applying the constraints implied by various operations. Then you have a dead code elimination pass that identifies dead branches. This pass doesn't know why the "dest" variable can't hold the NULL value at the branch; it just knows that it can't, so it kills the branch.

Imagine the following code:

   int x = abs(get_int());
   if (x < 0) {
     // do stuff
   }
Can the compiler eliminate the branch? Of course. All that's happened here is that the constraint propagation feels "reasonable" to you in this case and "unreasonable" to you in the memcpy case.


Why is it allowed to eliminate the branch? On most architectures, abs(INT_MIN) returns INT_MIN, which is negative.


Calling abs(INT_MIN) on a two's-complement machine is not allowed by the C standard. The behavior of abs() is undefined if the result would not fit in the return value.


Where does it say that? I thought this was a famous example from formal methods showing why something really simple could be wrong. It would be strange for the standard to say to ignore it. The behavior is also well defined in two’s complement. People just don’t like it.


https://busybox.net/~landley/c99-draft.html#7.20.6.1

"The abs, labs, and llabs functions compute the absolute value of an integer j. If the result cannot be represented, the behavior is undefined. (242)"

242 The absolute value of the most negative number cannot be represented in two's complement.


I didn't believe this so I looked it up, and yup.

Because of two's-complement limitations, the absolute value of INT_MIN can't actually be represented, so abs(INT_MIN) ends up returning INT_MIN.
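
A quick sketch of what that looks like in practice; the result shown in the comment is typical but not guaranteed, precisely because the call is UB:

    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        /* undefined behavior per 7.20.6.1; on typical two's-complement
           machines this often prints -2147483648 */
        printf("%d\n", abs(INT_MIN));
        return 0;
    }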


It's possible that there is an edge case in the output bounds here. I'm just using it as an example.

Replace it with "int x = foo() ? 1 : 2;" if you want.


> value constraint pass that computes a set of possible values that a variable can hold

Surely that value constraint pass must be using reasoning based on UB in order to remove NULL from the set of possible values?

Being able to disable all such reasoning, then comparing the generated code with and without it enabled would be an excellent way to find UB-related bugs.


There are many such constraints, and often ones that you want.

"These two pointers returned from subsequent calls to malloc cannot alias" is a value constraint that relies on UB. You are going to have a bad time if your compiler can't assume this to be true and comparing two compilations with and without this assumption won't be useful to you as a developer.

There are a handful of cases that people do seem to look at and say "this one smells funny to me", even if we cannot articulate some formal reason why it feels okay for the compiler to build logical conclusions from one assumption and not another. Eliminating null checks that are "dead" because they are dominated by some operation that is illegal if performed on null is the most widely expressed example. Eliminating signed integral bounds checks by assuming that arithmetic operations are non-overflowing is another. Some compilers support explicitly disabling some (but not all) optimizations derived from deductions from these assumptions.
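
For the signed-overflow case, a classic sketch; GCC and Clang's -fwrapv (and -fno-delete-null-pointer-checks for the null-check case) are examples of the per-assumption opt-outs mentioned above.

    int always_greater(int i) {
        /* assuming i + 1 cannot overflow, this may be compiled to "return 1;" */
        return i + 1 > i;
    }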

But if you generalize this to all UB you probably won't end up with what you actually want.


More reasonable: emit a warning or error, to make the code (and the human writing it) better.

NOT-reasonable: silently 'optimize' a 'gotcha' into behavior the programmer(s) didn't intend.


NOT-reasonable: expecting the compiler to read the programmer's mind.


OK, you want a FORMAL version?

Acceptable UB: Do the exact same type of operation as for defined behavior, even if the result is defined by how the underlying hardware works.

NOT-acceptable UB: Perform some operation OTHER than what the valid code path would do, EXCEPT for a failure to compile, or a warning message stating which code has been transformed into what other operation as a result of UB.


I don't understand: if the operation is not defined, what exactly should the compiler do?

If I tell you "open the door", that implies that the door is there. If the door is not there, how would you still open the door?

Concretely, what do you expect this to return:

  #include <cstddef>
  void sink(ptrdiff_t);
  ptrdiff_t source();

  int foo() {    
    int x = 1;
    int y;
    sink(&y-&x);
    *(&y - source()) = 42;
    return x;
  }
assuming that source() returns the parameter passed to sink()?

Incidentally I had to launder the offset through sink/source, because GCC has a must-alias oracle to mitigate miscompiling some UB code, so in a way it already caters to you.


Evaluated step by step...

Offhand, for *sink(&y-&x);*: the compiler is not _required_ to lay out the variables adjacently, so the computation of the pointer difference fed to sink does not have to be defined and might not be portable.

It would be permissible for the compiler to refuse to compile that ('line blah, op blah' does not conform to the standard's allowed range of behavior).

It would also be permissible to just allow that operation to happen. It's the difference of two pointer-sized values being passed. That's the operation the programmer wrote, and that's the operation that will happen. Do not verify bounds or alter behavior just because the compiler could calculate that the value happens to be PTRMAX-sizeof(int)+1 (it placed x and y in the reverse of how one might naively assume).

The = 42 line might write to any random address in memory. Again, just compile the code to perform the operation. If that happens to write 42 somewhere in the stack frame and that leads to the program corrupting itself / a segfault, that's fine. If the compiler says 'wait, that's not a known memory location' or 'that's going to write onto the protected stack!', it can ALSO refuse to compile and say why that code is not valid.

I would expect valid results to be: a return of 42, a return of 1 (possibly with a warning message about undefined operations and the affected lines), OR the program does not compile and there is an error message that says what's wrong.


&y-&x doesn't require the variables to be adjacent, just to exist in the same linear address space. It doesn't even imply any specific ordering.

> Again, just compile the code to perform the operation. If that happens to write 42 somewhere in the stack frame that leads to the program corrupting / a segfault that's fine. If the compiler says 'wait that's not a known memory location' or 'that's going to write onto the protected stack!

As far as the compiler is concerned, source() could return 0 and the line would be perfectly defined, so there is no reason to produce an error. In fact, as far as the compiler is concerned, 0 is the only valid value source could return, so that line can only be writing to y. As that variable is a local that is going out of scope, the compiler omits the store. Or do you also believe that dead store elimination is wrong?
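
For anyone unfamiliar with the term, a stripped-down sketch of that dead-store case (function name invented):

    int dse_local(void) {
        int y;
        y = 42;        /* y is never read and goes out of scope here,  */
        return 0;      /* so the compiler is free to omit the store    */
    }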

> possibly with a warning message about undefined operations and the affected lines

There is no definitely undefined operation in my example; there can be UB depending on the behaviour of externally compiled functions, but that's true of almost any C++ statement.

What most people in the "compiler must warn about UB" camp fail to realize is that 99.99% of the time the compiler has no way of knowing that some code is likely to cause UB: from the compiler's point of view my example is perfectly standard-compliant [1]; UB comes only from the behaviour of source and sink, which are not analysable by the compiler.

[1] technically to be fully conforming the code should cast the pointers to uintptr_t before doing the subtraction.


I'm not familiar with the stack-like functions mentioned, but that is indeed something it should NOT eliminate.

In fact, the compiler should not eliminate 'dead stores'. That should be a warning (and emit the code) OR an error (do not emit a program).

The compiler should inform the programmer so the PROGRAM can be made correct. Not so its particular result can be faster.



