logoalt Hacker News

beeforporkyesterday at 8:22 AM6 repliesview on HN

The UB in unaligned pointers is even worse: an unaligned pointer in itself is UB, not only an access to it. So even implicit casting a void*v to an int*i (like 'i=v' in C or 'f(v)' when f() accepts an int*) is UB if the cast pointer is not aligned to int.

It is important to understand that this is a C level problem: if you have UB in your C program, then your C program is broken, i.e., it is formally invalid and wrong, because it is against the C language spec. UB is not on the HW, it has nothing to do with crashes or faults. That cast from void* to int* most likely corresponds to no code on the HW at all -- types are in C only, not on the HW, so a cast is a reinterpretation at C level -- and no HW will crash on that cast (because there is not even code for it). You may think that an integer value in a register must be fine, right? No, because it's not about pointers actually being integers in registers on your HW, but your C program is broken by definition if the cast pointer is unaligned.


Replies

thomashabets2yesterday at 9:06 AM

Author here.

> an unaligned pointer in itself is UB

Yup. Per the "Actually, it was UB even before that" section in the post.

> UB is not on the HW, it has nothing to do with crashes or faults

Yeah. I tried to convey this too, but I'm also addressing the people who say "but it's demonstrably fine", by giving examples. Because it's not.

account42yesterday at 8:50 AM

Which is totally fine and expected for any decent programmer. Casting pointers is clearly here be dragons territory.

show 2 replies
201984yesterday at 12:23 PM

>an unaligned pointer in itself is UB, not only an access to it.

Can someone point to where the standard states this?

stilley2yesterday at 10:56 AM

Does that mean that if I have a struct with #pragma pack(push, 1) I can't use pointers to any members that don't happen to be aligned?

show 1 reply
imtringuedyesterday at 11:56 AM

The problem with C UBI is that originally it meant the compiler has the freedom to map your code to the hardware inspite of machine instructions differing slightly between one another. The same C program may express different behaviour depending on which architecture it is running on.

This type of UB is fine and nobody really complains about hardware differences leading to bugs.

However, over time aggressive readings of UB evolved C into an implicit "Design by Contract" language where the constraints have become invisible. This creates a similar problem to RAII, where the implicit destructor calls are invisible.

When you dereference a pointer in C, the compiler adds an implicit non-nullable constraint to the function signature. When you pass in a possibly nullable pointer into the function, rather than seeing an error that there is no check or assertion, the compiler silently propagates the non-nullable constraint onto the pointer. When the compiler has proven the constraints to be invalid, it marks the function as unreachable. Calls to unreachable functions make the calling function unreachable as well.

show 1 reply
tovejyesterday at 8:42 AM

But that seems obvious. You can't load an integer from an unaligned address.

It's not only C-level is it. There's no (guarantee across architectures for) machine code for that either.

show 4 replies