Hacker News

ncruces today at 8:00 AM

It's not really signed vs unsigned that's the issue, IMO. It's (mostly, in C) undefined behavior and implicit conversions?

I'm not sure Go is saner just because len is an int. Well, maybe, depending on how you look at it. Defining len to be a signed int means the largest valid len is half your address space, which also means half of all possible indexes are always invalid, which makes some things easier.

But it's really that integer arithmetic is not undefined behavior regardless of signedness, that bounds are checked, and that even indexing your slice with an int64 on a 32-bit CPU does the full correct bounds check. In fact, you can use any integer type as an index.
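To make the "any integer type as an index" point concrete, here is a minimal sketch. The helper `at` is hypothetical and only exists to turn the out-of-range panic into a boolean so the result can be printed.

```go
package main

import "fmt"

// at is a hypothetical helper: it indexes s with an int64 and reports
// ok=false instead of panicking when the index is out of range.
func at(s []string, i int64) (v string, ok bool) {
	defer func() {
		if recover() != nil {
			v, ok = "", false
		}
	}()
	return s[i], true // the index expression may be any integer type
}

func main() {
	s := []string{"a", "b", "c"}
	fmt.Println(at(s, 1))     // in range
	fmt.Println(at(s, -1))    // negative: rejected by the bounds check
	fmt.Println(at(s, 1<<40)) // does not fit in 32 bits: still rejected
}
```

The last call is the interesting one: the index is compared against the length at full width rather than being truncated first.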

Given all of the above, indexing with a uint or an int actually makes no difference. In that case, the bounds check is a single unsigned <len compare (despite the fact that len is signed).
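A rough sketch of why one unsigned compare is enough, written out by hand. The function `inBounds` is a hypothetical illustration, not what the compiler literally emits: since len is never negative, converting both sides to uint turns a negative index into a huge value that fails the same test as an index that is too large.

```go
package main

import "fmt"

// inBounds is a hypothetical stand-in for the check behind s[i].
// n is assumed to be a valid length, so n >= 0.
func inBounds(i, n int) bool {
	// A negative i wraps to a value above any valid length,
	// so this covers both i < 0 and i >= n.
	return uint(i) < uint(n)
}

func main() {
	fmt.Println(inBounds(0, 4))  // true
	fmt.Println(inBounds(3, 4))  // true
	fmt.Println(inBounds(4, 4))  // false
	fmt.Println(inBounds(-1, 4)) // false
}
```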

What's really painful is trying to handle a full 32-bit address space with 32-bit addresses and sizes, like in Wasm: you need 33-bit math. So in a sense, limiting sizes to 31 bits (signed) does help. But at the language level, IMO, the rest matters more.
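The 33-bit problem shows up as soon as you validate a (pointer, size) pair against a memory size. A sketch, with hypothetical function names and made-up values: the 32-bit sum can wrap, and a full 4 GiB memory size (1<<32) does not even fit in a uint32, so the safe version widens to 64 bits.

```go
package main

import "fmt"

// rangeOKNaive checks ptr+size <= memSize using only 32-bit math.
// The sum can wrap around, so some out-of-range requests pass.
func rangeOKNaive(ptr, size, memSize uint32) bool {
	return ptr+size <= memSize
}

// rangeOKWide widens to 64 bits first, so the sum cannot wrap and
// memSize can represent a full 4 GiB memory.
func rangeOKWide(ptr, size uint32, memSize uint64) bool {
	return uint64(ptr)+uint64(size) <= memSize
}

func main() {
	// 0xFFFFFFF0 + 0x20 wraps to 0x10 in 32 bits.
	fmt.Println(rangeOKNaive(0xFFFFFFF0, 0x20, 0xFFFF0000)) // true (wrong)
	fmt.Println(rangeOKWide(0xFFFFFFF0, 0x20, 1<<32))       // false
	fmt.Println(rangeOKWide(0xFFFFFFF0, 0x10, 1<<32))       // true
}
```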


Replies

uecker today at 9:13 AM

For signed overflow we have sanitizers, and for conversions C compilers have warnings. Bounds checking can also be done with sanitizers (but is a bit more tricky). So no, I do not think the undefined behavior is really a big problem. In fact, it helps us find the problem, because every overflow can be considered a programming error.

Errors due to unsigned wraparound are a much bigger issue, because they lead to subtle bugs where neither automatic warnings nor sanitizers help, precisely because the behavior is well-defined and no automatic tool can tell whether it is intended or wrong.
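A small illustration of the kind of bug meant here, sketched in Go, where unsigned arithmetic wraps just as it does in C. The function `remaining` and its values are hypothetical.

```go
package main

import "fmt"

// remaining returns how many bytes are left in a buffer.
// If used > capacity, the subtraction wraps silently instead of
// failing, because unsigned wraparound is well-defined behavior.
func remaining(capacity, used uint32) uint32 {
	return capacity - used
}

func main() {
	fmt.Println(remaining(10, 3))  // 7
	fmt.Println(remaining(10, 12)) // 4294967294, not an error
}
```

Nothing flags the second call: the result is a perfectly valid uint32, so a tool has no way to know the wrap was unintended.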
