> Clang and gcc do reasonable things - as you have pointed out there are helpful flags to configure it too.
Without these specific flags, they don't do 'reasonable things'. They happily assume that UB can't happen and use that information to declare certain code paths dead and optimise them away.
The simplest example is something like this:
signed char x = 20;
while(x < x+1) {
    do_something();
    x++;
}
GCC and Clang when told to optimise will happily assume that you have an infinite loop here. (Unless explicitly told otherwise via the compiler flags we mentioned.)
The parent post has a mistake: it should use an "int" type instead of "signed char"; otherwise the implicit integer promotion in "x+1" (both operands get converted to int before the addition) defeats the intended demonstration.
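To make the promotion point concrete, here is a minimal sketch. The helper names are made up for illustration, and it assumes the usual case where int is wider than signed char. Because "x + 1" is computed in int, the comparison stays true even at SCHAR_MAX and nothing overflows:

```c
#include <limits.h>

/* Hypothetical helper (not from the thread): evaluates the loop condition
   for a signed char. Both operands of x + 1 are promoted to int, so the
   addition cannot overflow when int is wider than signed char. */
int char_condition_holds(signed char x) {
    return x < x + 1;
}

/* The expression x + 1 has type int, not signed char. */
int sum_is_int_sized(void) {
    signed char x = 0;
    return sizeof(x + 1) == sizeof(int);
}
```

So with "signed char" the condition itself never misbehaves, which is why the example needs "int" to show overflow in the comparison.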
With that fixed, it wouldn't be an infinite loop if "x+1" did what the hardware does, i.e. two's complement wraparound: "x < x+1" would become false for x==INT_MAX. Yet clang treats the overflow as UB, so the loop is UB once it hits the wraparound point, and that can result in unintentional computer melting: https://godbolt.org/z/833KzYY5G
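For comparison, a rough sketch of how the same intent ("keep going until x can no longer be incremented") can be written without relying on overflow at all. The function name is hypothetical; the only idea is to compare against INT_MAX instead of computing x+1:

```c
#include <limits.h>

/* Hypothetical helper: counts how many increments are possible from `start`
   before reaching INT_MAX. The condition compares against INT_MAX directly,
   so no signed overflow ever occurs. */
unsigned long long steps_until_int_max(int start) {
    unsigned long long steps = 0;
    int x = start;
    while (x < INT_MAX) {  /* what "x < x+1" would mean under wraparound */
        x++;               /* safe: x < INT_MAX at this point */
        steps++;
    }
    return steps;
}
```

Written this way the loop terminates at INT_MAX regardless of optimisation level or compiler flags.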
Of course you still generally need to have whatever harmful code you don't want run here to still be present in the binary somewhere, and "melt_computer();" is a rather weird thing to have, but a "perform_heavy_work();" or "tolerate_high_temps_for_a_bit();" or "void debug_override_temperature_readings(){...}" are more realistic. (of course you might need special permissions to do those, but, uh, it's possible to get said permissions, especially if the point of the software is to operate those things.. and there are plenty of harmful things one can do without special perms)
(edit: actually, even without that char↔int fix, the parent post still shows UB because infinite loops are UB in C (as long as the condition isn't a compile-time constant) (also the x++ is UB on reaching the char limit); copy-pasting it directly into harmless_loop still results in computer melting)
No, infinite loops aren't generally UB in C. Only when they essentially "don't do anything".
Assume that `do_something();` does some IO, and C is perfectly happy with that loop.
You could have also added extra conditions to make the loop look less infinite to the C compiler. E.g. do a check in the loop body and optionally break out of it.
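A small sketch of that shape, with a made-up do_something() that does I/O and a made-up exit check. Two things keep this loop well-defined: the body performs I/O, and the controlling expression is a constant (the "may be assumed to terminate" rule in C11 6.8.5p6 only covers loops with a non-constant controlling expression and no I/O, volatile access, or synchronisation):

```c
#include <stdio.h>

/* Hypothetical stand-in for the thread's do_something(): it performs I/O,
   which is an observable side effect. */
static void do_something(void) {
    puts("working");
}

/* Loops "forever" but leaves through an explicit check in the body.
   Returns the value of x at the point where the loop was left. */
int run_until(int limit) {
    int x = 0;
    for (;;) {              /* constant controlling expression */
        do_something();
        if (x >= limit)     /* explicit exit check inside the body */
            break;
        x++;                /* safe: x < limit <= INT_MAX here */
    }
    return x;
}
```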
> Of course you still generally need to have whatever harmful code you don't want run here to still be present in the binary somewhere, [...]
Well, assume that we are eg looking at the 'sudo' program. In any case, almost any code can become harmful, if memory gets corrupted, and if attackers control the input.
Oops, yeah, need to remove the `do_something();` for the loop to be UB.
> Well, assume that we are eg looking at the 'sudo' program. In any case, almost any code can become harmful, if memory gets corrupted, and if attackers control the input.
Not really. OOB stores, sure; but other than that, even something basic like a username comparison that does OOB loads on too-long names could technically be completely safe if OOB loads were defined as "either returns an arbitrary value, or crashes". It would just result in superfluous rejections.
And, in a safe language, no matter how wrongly you'd implement flag/argument parsing (besides some equivalent of just straight up passing the args to system() or equivalent), as long as the final thing actually processing the request & comparing passwords doesn't assume anything specific of the parsed internal data format, it could cause no actual exploitable issues.
> and that can result in unintentional computer melting:
If your program already has the capability to melt your computer, then you can accidentally trigger that with a bug. That’s a massive if, and that’s a risk of any bug in such a program.
Once again, you can already configure the behavior to avoid this (-ftrapv or -fwrapv), which is as much as Rust will do.
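For context, -fwrapv makes signed overflow wrap and -ftrapv makes it trap, for the whole translation unit. If you'd rather not depend on flags, a portable sketch of an explicit check looks roughly like this (checked_add_int is a hypothetical helper, not something from the thread or from either compiler):

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *out and returns true when the sum is
   representable as int; returns false and leaves *out untouched otherwise. */
bool checked_add_int(int a, int b, int *out) {
    if (b > 0 && a > INT_MAX - b) return false;  /* would overflow upward */
    if (b < 0 && a < INT_MIN - b) return false;  /* would overflow downward */
    *out = a + b;                                /* known to be in range */
    return true;
}
```

The caller then decides what an overflow means (abort, saturate, report an error), which is the choice the flags otherwise make globally.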
But UB extends that to "a bug anywhere can potentially cause anything to go wrong", whereas traditional logic bugs require the bug to be functionally related to the bad-if-misused code (for which you can do things like carefully vet the potentially-harmful code and be sure your program will never do anything harmful regardless of how many bugs all other code in the program has).
See all the exploits that go from a single OOB write (or other sources of UB) to arbitrary code execution; it's really not that hard. Whereas such spooky-bugs-at-a-distance is just plain impossible in safe Rust or Java.
> Whereas such spooky-bugs-at-a-distance is just plain impossible in safe Rust or Java.
There are all kinds of bugs in these compilers all the time. Not to mention what Java might do with a multithreading bug. I don’t think you can assert that.
And once again - you can match those languages' intended treatment of signed integers by using a compiler flag.
I think clang should insert a ret at the end of the function. GCC chooses to loop forever instead, which is far better. This is unfortunate for clang.
> There are all kinds of bugs in these compilers all the time. I don’t think you can assert that.
That's a separate discussion; generally you need specific conditions in the source code to have compiler bugs affect your code (and for any given compiler bug it's much more likely for it to visibly break your program (i.e. be a clear release blocker) than quietly break on some specific user input; assuming you have appropriate testing).
I don't think I've ever even heard of any exploitable cases of spooky-bug-at-a-distance in Java, whereas such are commonplace for C projects.
Bad multi-threading in Java still won't produce spooky bugs at a distance - worst you may get is a long torn on a 32-bit boundary on 32-bit systems, or reordered reads/writes.
> I think clang should insert a ret at the end of the function.
......But UB means it's not required to do that... ...that's, like, the main thing the discussion is about. This isn't any form of issue or bug in clang to be fixed, it's intentionally chosen that that behavior is fine on UB.
(to be clear, despite all this, I think UB is a fair enough thing, and am perfectly fine with compilers "exploiting" it for perf (I've even had a case of compiler optimizations on integer overflow fixing a bug! found out when I realized I didn't run certain tests on ubsan/debug builds), and, indeed, outside of OOB stores, is generally not too exploitable; but it's still very far from not being potentially very problematic (and the "potential"ness is generally independent from specific code), especially for projects where safety matters)