Hacker News

Fraterkes yesterday at 11:53 AM | 9 replies

Is there any reason to not just switch to 1-based indexing if we could? Seems like 0-based indexing really exacerbates off-by-one errors without much benefit


Replies

SkiFire13 yesterday at 12:12 PM

I'm not sure what that has to do with the article, but anyway: https://www.cs.utexas.edu/~EWD/transcriptions/EWD08xx/EWD831...

That said, I'm not sure how 1-based indexing will solve off-by-1 errors. They naturally come from the fencepost problem, i.e. the fact that sometimes we use indexes to indicate elements and sometimes to indicate boundaries between them. Mixing between them in our reasoning ultimately results in off-by-1 issues.
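To make the element-versus-boundary distinction above concrete, here is a minimal Python sketch. The names `items`, `element_indices`, `boundary_indices` and `element_between` are made up for illustration and do not come from the article.

```python
# Hypothetical illustration: the same integers can name elements or the
# boundaries between them, and the two readings have different counts.
items = ["a", "b", "c"]

element_indices = list(range(len(items)))       # one index per element
boundary_indices = list(range(len(items) + 1))  # one more boundary than elements


def element_between(seq, left_boundary):
    """Return the element lying between boundary k and boundary k + 1."""
    return seq[left_boundary:left_boundary + 1][0]
```

Three elements have four boundaries, so reasoning that silently switches between the two readings is off by exactly one.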

dgrunwald yesterday at 2:52 PM

When accessing individual elements, 0-based and 1-based indexing are basically equally usable (up to personal preference). But this changes for other operations! For example, consider how to specify the index of where to insert in a string. With 0-based indexing, appending is str.insert(str.length(), ...). With 1-based indexing, appending is str.insert(str.length() + 1, ...). Similarly, when it comes to substr()-like operations, 0-based indexing with ranges specified by inclusive start and exclusive end works very nicely, without needing any +1/-1 adjustments. Languages with 1-based indexing tend to use inclusive-end for substr()-like operations instead, but that means empty substrings now are odd special cases. When writing something like a text editor where such operations happen frequently, it's the 1-based indexing that ends up with many more +1/-1 in the codebase than an editor written with 0-based indexing.
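A rough Python sketch of the insert and substring cases described above. The helpers `insert_0based` and `insert_1based` are hypothetical, written only to contrast the two conventions; Python itself is 0-based, so the 1-based variant is simulated.

```python
def insert_0based(s: str, index: int, text: str) -> str:
    """Insert text so that it starts at 0-based position `index`."""
    return s[:index] + text + s[index:]


def insert_1based(s: str, index: int, text: str) -> str:
    """Same operation, but `index` counts positions starting from 1."""
    return s[:index - 1] + text + s[index - 1:]


s = "abc"

# Appending: the 0-based call uses the length directly,
# the 1-based call needs a +1 adjustment.
appended_0 = insert_0based(s, len(s), "!")
appended_1 = insert_1based(s, len(s) + 1, "!")

# Half-open substring [start, end): an empty range is just start == end,
# with no special case.
empty = s[2:2]
```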

adrian_b yesterday at 1:06 PM

This is a matter of opinion.

My opinion is that 1-based indexing really exacerbates off-by-one errors, besides requiring a more complex implementation in compilers, which is more bug-prone (with 1-based addressing, the compilers must create and use, in a manner transparent to the programmer, pointers that do not point to the intended object but to an invalid location before the object, which must never be accessed through the pointer; this is why using 1-based addressing was easier in languages without pointers, like the original FORTRAN, but it would have been more difficult in languages that allow pointers, like C, the difficulty being to avoid exposing the internal representation of pointers to the programmer).

Off-by-one errors are caused by mixing conventions for expressing indices and ranges.

If you always use a consistent convention, e.g. 0-based indexing together with half-open intervals, where the count of elements equals the difference between the interval bounds, there is no chance of ever making off-by-one errors.
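A small Python sketch of the half-open convention mentioned above, under the assumption that an interval is a plain `(start, end)` tuple. The helpers `length` and `split_half_open` are hypothetical names used only for this illustration.

```python
def length(interval):
    """Element count of a half-open interval [start, end)."""
    start, end = interval
    return end - start


def split_half_open(interval, mid):
    """Split [start, end) at `mid` into [start, mid) and [mid, end)."""
    start, end = interval
    return (start, mid), (mid, end)


whole = (0, 10)
left, right = split_half_open(whole, 4)

items = list(range(10))
# The two halves share the bound `mid` but no element, so nothing is
# duplicated or dropped when they are joined again.
rejoined = items[left[0]:left[1]] + items[right[0]:right[1]]
```

The count is always the difference of the bounds, and adjacent intervals reuse the same number as one interval's end and the next one's start.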

GuB-42 yesterday at 3:22 PM

Because it is not how computers work. It doesn't matter much for high-level languages like Lua, where you rarely manipulate raw bytes and pointers, but in systems programming languages like Zig, it matters.

To use the terminology from the article, with 0-based indexing, offset = index * node_size. If it was 1-based, you would have offset = (index - 1) * node_size (or (index - 1) * node_size + 1 if offsets were counted from 1 as well).
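As a sketch of the offset arithmetic, here is a Python version over a flat byte buffer. It assumes byte offsets into the buffer stay 0-based, so the 1-based variant only subtracts one from the index. `NODE_SIZE`, `buffer` and both helper functions are hypothetical and not taken from the article.

```python
NODE_SIZE = 4                # assumed size of one node in bytes
buffer = bytes(range(12))    # three 4-byte nodes stored back to back


def node_0based(buf: bytes, index: int) -> bytes:
    """Fetch a node when the first node has index 0."""
    offset = index * NODE_SIZE
    return buf[offset:offset + NODE_SIZE]


def node_1based(buf: bytes, index: int) -> bytes:
    """Fetch a node when the first node has index 1 (offsets still 0-based)."""
    offset = (index - 1) * NODE_SIZE
    return buf[offset:offset + NODE_SIZE]
```

The extra subtraction is cheap, but it has to be repeated at every place where an index is turned into an offset.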

And it became a convention even for high-level languages, because no matter what you prefer, inconsistency is even worse. An interesting case is Perl, which, in classic Perl fashion, lets you choose by setting the $[ variable. Most people, even Perl programmers, consider it a terrible feature, and 0-based indexing is used by default.

layer8 yesterday at 6:55 PM

1-based indexing doesn’t work well as soon as you have a start offset within a sequence, from which you want to index. Then the first element is startIndex + 0, not startIndex + 1. 0-based indexing generalizes better in that way.
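A short Python sketch of indexing relative to a start offset. Since Python is 0-based, the 1-based convention is simulated with a hypothetical accessor `get_1based`; all names here are illustrative only.

```python
data = ["a", "b", "c", "d", "e"]


def relative_0based(seq, start_index, i):
    """0-based start, 0-based relative index: positions simply add."""
    return seq[start_index + i]


def get_1based(seq, k):
    """Simulated 1-based element access: k == 1 is the first element."""
    return seq[k - 1]


def relative_1based(seq, start_index, i):
    """1-based start, 1-based relative index: one of the two 1s must go."""
    return get_1based(seq, start_index + i - 1)


# Both calls name the first element of the subsequence starting at "c".
first_0 = relative_0based(data, 2, 0)
first_1 = relative_1based(data, 3, 1)
```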

pansa2 yesterday at 2:20 PM

Fundamentally, CPUs use 0-based addresses. That's unavoidable.

We can't choose to switch to 1-based indexing - either we use 0-based everywhere, or a mixture of 0-based and 1-based. Given the prevalence of off-by-one errors, I think the most important thing is to be consistent.

tialaramex yesterday at 12:06 PM

I would bet that in the opposite circumstance you'd say the same thing:

"Is there any reason to not just switch to 0-based indexing if we could? Seems like 1-based indexing really exacerbates off-by-one errors without much benefit"

The problem is that humans make off-by-one errors and not that we're using the wrong indexing system.

bruce343434 yesterday at 12:10 PM

You say "seems like"; can you argue/show/prove this?

naasking yesterday at 3:37 PM

> Is there any reason to not just switch to 1-based indexing if we could? Seems like 0-based indexing really exacerbates off-by-one errors without much benefit

You'd just get a different set of off-by-one errors with 1-based indexing.