From the LLVM developer mailing list comes this remarkable exchange, in which Chris Lattner of LLVM says that compilers' use of undefined behavior (UB) is so “crappy” that the only solution is to abandon C programming (my bold).

On Jul 7, 2017, at 1:40 PM, Peter Lawrence <peterl95124 at sbcglobal.net> wrote:
> Chris,
>           The issue the original poster brought up is that instead of a compiler 
> that as you say “makes things work” and “gets the job done” we have a compiler
> that intentionally deletes “undefined behavior”, on the assumption that since it 
> is the user's responsibility to avoid UB this code must be unreachable and 
> is therefore safe to delete.
> 
> It seems like there are three things the compiler could do with undefined behavior
> 1)   let the code go through (perhaps with a warning)
> 2)   replace the code with a trap
> 3)   optimize the code as unreachable (no warning because we’re assuming this is the user's intention)

Hi Peter,

I think you have a somewhat fundamental misunderstanding of how UB works (or rather, why it is so crappy and doesn’t really work :-). The compiler can and does do all three of those, and it doesn’t have to have consistent algorithms for how it picks. I highly recommend you read some blog posts I wrote about it years ago, starting with:
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html

John Regehr also has written a lot on the topic, including the recent post:
https://blog.regehr.org/archives/1520

What you should take from this is that while UB is an inseparable part of C programming, that this is a depressing and faintly terrifying thing. The tooling built around the C family of languages helps make the situation “less bad”, but it is still pretty bad. The only solution is to move to new programming languages that don’t inherit the problem of C. I’m a fan of Swift, but there are others.

In the case of this particular thread, we aren’t trying to fix UB, we’re trying to switch one very specific syntactic idiom from UB to defined.

-Chris
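
To make the three options in Lawrence's list concrete, here is a small sketch in C. It is my own illustration, not code from the thread, and the function names are made up. The first function uses a common overflow check that depends on signed overflow, which is UB when `x` is `INT_MAX`. For that one input a compiler is allowed to emit the add and compare as written, to insert a trap (GCC and Clang do this under `-ftrapv`), or to treat the overflow as impossible and fold the comparison to false. The second function asks the same question without UB.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical example: relies on signed overflow, which is undefined
   behavior when x == INT_MAX. For every other input the result is
   well defined and is false. */
bool next_overflows_naive(int x) {
    return x + 1 < x;
}

/* Hypothetical example: the same question with no overflow involved,
   so the result is defined for every int. */
bool next_overflows_defined(int x) {
    return x == INT_MAX;
}
```

The two functions agree on every input except `INT_MAX`, which is exactly the input the programmer cared about. Nothing in the standard says which of the three outcomes the naive version gets, or that it gets the same one at every optimization level.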

When Lattner writes “The compiler can and does do all three of those, and it doesn’t have to have consistent algorithms for how it picks”, the key terms are “can” and “doesn’t have to”. The C standard may permit “depressing and … terrifying” operations in a compiler, but nothing in it requires compilers to make UB into a minefield for programmers. The compiler writers have seized on what is essentially a loophole in the standard as permission to do stupid things and call them optimizations. It seems only reasonable to hope that WG14, the C standards committee, would consider that their standard is being used by compiler writers to justify deprecating use of the language, but maybe they don’t care either.

The other interesting fact I learned from this exchange is that it is impossible to write conforming C implementations of “malloc” and “free”. Why? Because once “free” returns, any access to the pointer is UB, so malloc cannot allocate the same memory again.
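
To show what “allocate the same memory again” means in code, here is a toy fixed-size allocator. It is a minimal sketch with invented names (`toy_alloc`, `toy_release`), not how any real C library implements “malloc”. Its memory is a static array, so the blocks never reach the end of their lifetime in the standard's sense. The real “free” differs on exactly that point: the standard says the lifetime of the allocated object ends and the pointer's value becomes indeterminate.

```c
#include <stddef.h>

#define TOY_BLOCK_SIZE 64
#define TOY_BLOCK_COUNT 4

/* A block is either user data or, while released, a free-list link. */
typedef union toy_block {
    union toy_block *next;                 /* link while on the free list */
    unsigned char bytes[TOY_BLOCK_SIZE];   /* payload while allocated */
    max_align_t align;                     /* force suitable alignment */
} toy_block;

static toy_block toy_arena[TOY_BLOCK_COUNT];
static toy_block *toy_free_list = NULL;
static size_t toy_next_unused = 0;

/* Hand out a released block if there is one, else a never-used block. */
void *toy_alloc(void) {
    if (toy_free_list != NULL) {
        toy_block *b = toy_free_list;
        toy_free_list = b->next;
        return b;
    }
    if (toy_next_unused < TOY_BLOCK_COUNT) {
        return &toy_arena[toy_next_unused++];
    }
    return NULL;  /* arena exhausted */
}

/* Push the block onto the free list so a later toy_alloc can reuse it. */
void toy_release(void *p) {
    if (p == NULL) {
        return;
    }
    toy_block *b = p;
    b->next = toy_free_list;
    toy_free_list = b;
}
```

Releasing a block and then allocating again returns the same address, because the released block sits at the head of the free list. With this toy that reuse is ordinary array access; the question raised above is whether the same pattern can be expressed in standard C once the pointer has been passed to the real “free”.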

C compiler developers are hostile to C programming