I don't care if computers think in 0s and 1s or if arrays should start at 0 beca...

scott_s · on Aug 21, 2009

We have this convention for the benefit of humans. Indexing starting at 0 has the convenient property that when we take an arbitrary non-negative number modulo the size of a range, the result is a valid index into that range. This comes up when implementing things that map a larger set onto a smaller set, such as buffers or hash tables.

We could still achieve the same effect if we started indexing at 1, but it would require more code, and it would be less clear.
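
To make the modulo point concrete, here is a minimal sketch in C. The function names `slot_zero_based` and `slot_one_based` are hypothetical, chosen only for illustration, and `n` is assumed to be an unsigned value such as a hash or a running counter.

```c
#include <assert.h>
#include <stddef.h>

/* Zero-based: the remainder is already a valid index in [0, size). */
size_t slot_zero_based(size_t n, size_t size)
{
    assert(size > 0);
    return n % size;
}

/* One-based: the remainder has to be shifted up to land in [1, size]. */
size_t slot_one_based(size_t n, size_t size)
{
    assert(size > 0);
    return n % size + 1;
}
```

Both versions work. The one-based version carries an extra `+ 1` that every caller has to remember.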

TweedHeads · on Aug 21, 2009

"We could still achieve the same effect if we started indexing at 1, but it would require more code, and it would be less clear."

Really?

How about dealing with strings?

"123456789" how many chars we have? 9

char[1] = 1

char[2] = 2

:

char[9] = 9

Now, let's try the C approach:

"123456789"

char[0] = 1

char[1] = 2

:

char[8] = 9

where is char "1"? at zero index!

how many elements we have?

last index plus one!

see? there is already confusion in place, or as you say, more code and less clear...

Locke1689 · on Aug 21, 2009

"0123456789"

  ch[0] = 0;
  ch[1] = 1;
  ch[2] = 2;
  :
  ch[9] = 9;

  ch[1] = 0;
  ch[2] = 1;
  ch[3] = 2;

....

mononcqc · on Aug 21, 2009

Please re-read his question: "how many chars we have?" "how many elements we have?"

  ch[1] = 0;
  ch[2] = 1;
  ch[3] = 2;
  ...
  ch[10] = 9;

This is about having the Nth element of the array with the index N rather than N-1.

scott_s · on Aug 21, 2009

That's not code, that's English.

When we start indexing from 0, the only time we need to do a +/- 1 is when we need to index the last element. All other times (iterating, mapping) we don't need to.

Also, starting indices at 1 would change the idiomatic iteration to:

   for (int i = 1; i <= SIZE; ++i)

I find that the <= and >= operators add more to my cognitive load than the strict relational operators < and > because there's an "or" in there. I don't find the alternative to that idiom any better:

  for (int i = 1; i < SIZE + 1; ++i)

Your original argument was that we should make it easy for humans to understand, not computers. I think that starting indices from 0 is easier for humans to understand because it simplifies the code we must write and read.
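
As a rough side-by-side of the two iteration idioms, the sketch below sums the same array both ways. The function names are hypothetical, and because C arrays are zero-based, the one-based loop has to subtract 1 when it touches the storage.

```c
#include <assert.h>
#include <stddef.h>

/* Zero-based idiom: strict comparison, index used directly. */
long sum_zero_based(const int *a, size_t size)
{
    assert(a != NULL || size == 0);
    long total = 0;
    for (size_t i = 0; i < size; ++i)
        total += a[i];
    return total;
}

/* One-based idiom: inclusive comparison, index shifted on each access. */
long sum_one_based(const int *a, size_t size)
{
    assert(a != NULL || size == 0);
    long total = 0;
    for (size_t i = 1; i <= size; ++i)
        total += a[i - 1];
    return total;
}
```

Both loops visit every element exactly once. In a language with native one-based arrays the `i - 1` would disappear, but the inclusive `<=` bound would remain.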

TweedHeads · on Aug 21, 2009

No, your language limitations don't have to force the rest of the world to accept them.

As I already said, some languages use the simpler construct:

For i=1 to N

And C could easily use:

for(i=1;i==n;i++)

just by changing the way the loop condition works.

So as you say, it is all about the language.

jonsen · on Aug 21, 2009

No way I would trust a language design with your semantics for this syntactic construct: for(i=1;i==n;i++)

thunk · on Aug 21, 2009

But he was making the point that there are valid human reasons for zero-based indexing - e.g. keeping the lower bound within the natural numbers; having the upper bound be the number of elements in the preceding sequence, etc. The "way computers think" doesn't enter the discussion.

TweedHeads · on Aug 21, 2009

"keeping the lower bound within the natural numbers; having the upper bound be the number of elements in the preceding sequence, etc."

Read your post and understand the same can be said of using one-based indexing.

thunk · on Aug 21, 2009

That's false. You'd be forced to sacrifice the exclusive upper bound, which would then force you to sacrifice the difference between the upper and lower bounds being the number of elements in the collection. Unless the lower bound was exclusive, in which case it could be an unnatural number.

No, the only way there's no lump in the carpet is his way.
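
The bookkeeping behind this argument can be sketched in C as follows. The helper names `half_open_len` and `closed_len` are hypothetical, and the bounds are assumed to be unsigned integers with `lo <= hi`.

```c
#include <assert.h>
#include <stddef.h>

/* Number of integers in the half-open range [lo, hi): just the difference. */
size_t half_open_len(size_t lo, size_t hi)
{
    assert(lo <= hi);
    return hi - lo;
}

/* Number of integers in the closed range [lo, hi]: the difference plus one. */
size_t closed_len(size_t lo, size_t hi)
{
    assert(lo <= hi);
    return hi - lo + 1;
}
```

With half-open bounds, `[0, 5)` has length 5, the empty range is simply `[k, k)`, and splitting `[0, 5)` at 3 gives `[0, 3)` and `[3, 5)` with no adjustment. With closed bounds, `[1, 5]` also has length 5, but only after the `+ 1`, and an empty range needs an upper bound below the lower one.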

TweedHeads · on Aug 21, 2009

Look at your hand and start counting your fingers while naming them:

[1] thumb

[2] index

[3] middle

[4] ring

[5] pinkie

so we have a simple range [1..5]

which can be represented in so many ways in computer programs:

for i=1 to 5 print finger[i]

for(i in [1..5]) print finger[i]

my first finger[1] is my thumb

my last finger[5] is my pinkie

how many fingers we have? as many as the last index in the list: 5

-

now, C programmers like to count this way:

for(i=0;i<5;i++) print finger[i]

where finger[0] is thumb

and finger[4] is pinkie

how many fingers we have?

as many as the last index in the list plus one: 4+1

how human-like!

And that's why you get so messy when trying to get the string position of a substring:

if(pos>-1) exists, since pos=0 means the substring is in the starting position

again, how human-like!

But how dare I argue with C programmers without being burned at the stake?
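
For reference, the substring check mentioned above might look like this in C. This is only a sketch: `find_zero_based` and `find_one_based` are hypothetical helpers built on the standard `strstr`, which itself returns a pointer (or `NULL`) rather than a position.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Zero-based position of needle in haystack, or -1 if it is absent. */
ptrdiff_t find_zero_based(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    const char *hit = strstr(haystack, needle);
    return hit ? hit - haystack : -1;
}

/* One-based position of needle in haystack, or 0 if it is absent. */
ptrdiff_t find_one_based(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    const char *hit = strstr(haystack, needle);
    return hit ? hit - haystack + 1 : 0;
}
```

With zero-based positions the "found" test is `pos > -1` (or `pos >= 0`), because 0 is a valid position. With one-based positions 0 is free to serve as the "not found" value, so the test becomes `pos > 0`.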

jongraehl · on Aug 21, 2009

What makes you think people who 0-index arrays and prefer half-open intervals count any differently? Is this argument directed at a four year old?

4+1=5 never enters into it. What a rubbish argument. I might as well complain that your [a,b] has (b-a+1) integers in it. (5-0)=5 so the half open interval has 5 things in it.

What's the measure of a real interval 1<=x<=5? 4. 0<=x<5? 5.

TweedHeads · on Aug 21, 2009

Don't try your cheap tricks on me.

We go from 1 to 5, so the math would be 5-1 +1

You go from 0 to 4, the math would be the same, 4-0 +1

Using 5 as your upper bound, then starting from 0 to 4 is the same as me using 6 as my upper bound then going from 1 to 5.

maweaver · on Aug 21, 2009

for i=1 to 5 print finger[i]

for(i in [1..5]) print finger[i]

You're using an inclusive upper range here. Not that that's necessarily wrong, but it isn't standard for the reason Dijkstra mentioned (how many fingers do I have? 5 - 1 = 4. Oops.)

TweedHeads · on Aug 21, 2009

"how many fingers do I have? 5 - 1 = 4. Oops."

Upper-Lower+1, easy. Whenever the lower is 1, just the upper is enough.

In your case you go from 0 to 4, how's that different from how many fingers you have? 4-0=4 Oops.

gloob · on Aug 22, 2009

Congratulations; you have successfully introduced a whole new way to have off-by-one errors.

People are used, due to forty years or more of tradition in the great majority of major languages, to zero-based indexing. Tossing that away Because It's Inelegant won't do much good to anyone, especially when you take into consideration the fact that an awful lot of math is more convenient with 0-based indexing.