They really should be listing these answers in terms of sizeof(int) instead of assuming all 64-bit platforms have sizeof(int) == 4. Pointers are confusing in C, but so is the common assumption that sizeof(int) is always 4.
sizeof(int) is either 2 or 4 on all reasonable architectures; on all modern ones it's 4.
Short answer: if sizeof(int) == 8, then you lose either the 2-byte integer or the 4-byte integer. This would make it harder to minimize memory footprint in programs that work with large amounts of data.
The reason for this is that char, short, int, long, and long long are the only names you have for various integer sizes. Since sizeof(char) is fixed at 1
and sizeof(char) <= sizeof(short) <= sizeof(int)
if you make int eight bytes, you must drop either the two-byte or the four-byte integer, since you only have one name left for it. Dropping one of those types will make programmers unhappy, since packing structs carefully is a valid technique for minimizing memory in programs that work with a lot of data.
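A quick way to see the naming squeeze (a minimal sketch; the 1 2 4 8 8 output assumes a typical LP64 system such as x86-64 Linux):

    #include <stdio.h>

    int main(void)
    {
        /* On a typical LP64 system this prints 1 2 4 8 8: each width up
           to 8 bytes has its own keyword. Under ILP64 (int == 8), short
           could cover 2 bytes or 4 bytes, but not both -- one of those
           widths would be left without a standard name. */
        printf("%zu %zu %zu %zu %zu\n",
               sizeof(char), sizeof(short), sizeof(int),
               sizeof(long), sizeof(long long));
        return 0;
    }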
Note that int32_t and friends are not valid names since they MUST be a typedef for one of char, short, int, long or long long.
Disclaimer: if C11 has fixed this, I'm ignorant of it.
C99 specified stdint.h, which includes int16_t/uint16_t. So compliant compilers are required to support them even if they don't map to a built-in type. So you won't lose the short.
That said, it wouldn't make it any less insane, so no one does this, and in fact int is 32 bits everywhere except microcontrollers (and the 8086, for the tiny handful of people writing BIOS or bootloader code).
Right, which it does: we're talking about ABI variations within a single architecture. The OS and toolchain enforcing ILP64 vs. LP64 semantics on C programs has nothing to do with the i386's ability to operate on 16 bit chunks.
The "implementation" in the meaning of the C standard includes the OS and toolchain. If the C toolchain does not provide a 16 bit type, then it need not define (u)int16_t, regardless of the CPU that it is running on.
However, as mentioned recently in an HN comment somewhere, there are architectures that can only operate on 32-bit or larger chunks, so sizeof(char) == sizeof(int) (I think it may have been an older Cray). I can't find the specific comment, but here's one that mentions a platform with 16-bit chars: http://news.ycombinator.com/item?id=3112704
All machines were word-oriented until the IBM 360s arrived, and many persisted well into the '80s (the PDP-10 is a particularly famous one for hackers). Many of them got C compilers at one point or another.
That's not really the issue though. My point was that the choice of ILP64 vs. LP64 on a single architecture could not cause you to "lose" a 16 bit quantity. It can't, because those machine instructions obviously don't go away when you change your compiler's calling conventions. So a C99-compliant compiler would still be required to provide int16_t.
Which is... maybe too much minutiae even for a C minutiae thread. But it was my point, anyway.
Yes, I see. It can be annoying when someone widens the scope of an already-narrowed discussion, as my comment tried to do. Thanks for your restatement and clarification.
FWIW, I've actually once used a really silly chip that came with a horrible-tastic toolchain, where char and int were the same size, both either 16 or 32 bits (I forget which, unfortunately). (In case anyone doesn't notice: on such a system, sizeof(int) is still 1, as it is measured in units of sizeof(char).)
I don't remember (this was 2002); I just remember it being a really cheap part on my friend's MicroMouse motor controller (I wrote the maze searching algorithm for them), and that the C compiler didn't actually work very well.
Then breaking a little quiz like this will be the least of your worries.
Of course, that's the kind of logic that got us into this situation today. Fortunately we should be able to live with CHAR_BIT == 8 for the foreseeable future.
Isn't the assumption that sizeof(int) equals the platform word size in bytes (e.g. 4 for 32-bit and 8 for 64-bit platforms)? I expected int on a 64-bit platform to be 8 bytes in size.
You'll see this typically listed as an LP64 model (long and pointers are 8 bytes; int stays at 4). The other common option is ILP64, where int == long == long long == void * == 8.
It should also be noted that this has nothing to do with the underlying hardware. Instead it's a decision made by the OS ABI. There's nothing that says you can't have sizeof(int) == 8 on a 32-bit system.
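You can see which model your toolchain picked with a three-line check (a sketch; the commented outputs assume the usual LP64/ILP64/LLP64 conventions):

    #include <stdio.h>

    int main(void)
    {
        /* LP64 (typical 64-bit Unix): 4 8 8
           ILP64:                      8 8 8
           LLP64 (64-bit Windows):     4 4 8
           The hardware doesn't pick one of these; the OS ABI and
           toolchain do. */
        printf("%zu %zu %zu\n", sizeof(int), sizeof(long), sizeof(void *));
        return 0;
    }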
While it has nothing to do with hardware per se, ILP64 is often seen on platforms where some significant 32-bit operations (often memory accesses, but surprisingly sometimes even ALU ops) are, or used to be, slower than 64-bit ones.
It depends on your platform AND on your compiler, so just saying "Windows 64" isn't enough information. Conceivably, there are compilers for "Windows 64" such that sizeof(int) == 8.
Your point (that int isn't a qword on all 64-bit architectures) still stands. But your statement is potentially incorrect.
The OS ABI basically defines what the compiler will do. While it's possible to run a compiler in ILP64 mode on Windows 64, you won't get too far if you try to pass a 64-bit integer > 2^32 into a Windows system call.
... or pass any 64-bit integer. Windows API calls are mostly "stdcall" and use the stack for parameter passing - pushing 8 bytes instead of 4 could be disastrous in many ways, especially considering that the callee cleans the stack in this case.
There is no stdcall/cdecl distinction on 64-bit Windows; there is only one ABI convention, which sadly uses a different set of registers for parameter passing than Linux. Only the first four parameters use registers; the rest are passed on the stack.
Even on 64-bit machines an int is 4 bytes. I don't think this is mandated by the standard, though. I guess that many implementations believe this is better than having a huge size difference between 32- and 64-bit architectures. There is always long int if you want a 64-bit integer.
I don't disagree, but I thought it was nicer for there to be a more concrete, actual memory address for the answer.
I thought about explicitly stating "sizeof(int) is 4 on this system" in the intro blurb, but that primes you a bit more than I'd like for the answer to #2, so I thought it was a little cleaner not to.
I feel that anyone who doesn't already know "it depends on sizeof(int)" isn't going to get it anyway; I was briefly thrown off because I thought "surely if sizeof(int) matters like I think it does, I would have been told what it is?"
But I'm more annoyed that I failed the last two by not understanding C than that I failed the second by guessing sizeof(int) incorrectly.
My first instinct was that sizeof(int) was the same width as the system architecture (64 bits)... specifically because you mentioned it was a 64 bit system.
Either way, after I realised that sizeof(int) == 4 the test was surprisingly simple. Are pointers really that hard to grasp?
Or they could use C99's int32_t in the problem's code to make the size of the integer explicit... (Or just state that sizeof(int) == 4 on this hypothetical system.)
Well, sizeof(int) (and short, and long!) could also be 1, and %p could print out the pointer address as a roman numeral. All legal according to the standard.
What commonly used platforms have 64-bit ints? The only ones I vaguely recall are really, really old versions of Solaris. After a while, IIRC, they decided that ILP64 was too much of a PITA and went with LP64.
The standard requires that "int" be able to represent from -32767 to +32767. Unless "byte" is redefined to contain a different number of bits, this means that sizeof(int) >= 2. Similarly, long has a range that requires at least 32 bits.
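If you want to see what your own implementation guarantees, limits.h spells it out (a minimal sketch):

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        printf("CHAR_BIT = %d\n", CHAR_BIT);   /* at least 8 */
        printf("INT_MAX  = %d\n", INT_MAX);    /* at least 32767 */
        printf("LONG_MAX = %ld\n", LONG_MAX);  /* at least 2147483647 */
        return 0;
    }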
sizeof(char) is 1 per definition and also one byte, but not necessarily 8 bits.
Apparently some DSPs cannot address an 8-bit quantity -- they're not easy to find, but I've found a comp.lang.c posting where Jack Klein mentions a 32-bit Sharp DSP with CHAR_BIT == 32 (so sizeof(char) == sizeof(int) == 1 !) and a Texas Instruments TMS32F28xx DSP with CHAR_BIT == 16.
On such a system, C99 might not define int8_t, but you could use int_least8_t.
My guess is that they are more common than 64-bit ints, but they tend to be supported by their own weird C compiler/toolchain.
I could have used the hint about what the output of %p looks like (I missed the leading 0x). Of course nobody's keeping score, but that doesn't seem essential to the question.
This is a bit of an idiotic question because one is not testing knowledge of computing or software systems, but rather trivial knowledge of an arcane corner of the language. Indeed, this question is an interview anti-pattern that I have historically labelled the "where-is-the-bathroom-in-my-house" question: if someone has not been in your house, they would not know, and if someone were in your house and had to take a leak, I trust they could figure it out. In my experience, these questions are most likely to be asked by intellectual midgets who themselves would not be able to answer an equivalent (but different) question.
So in the spirit of performing that experiment and exploring this interview anti-pattern, here's my counter-challenge, which I argue is intellectually equivalent:
What does that program do? Yeah, exactly: you just ran it. And you're surprised, aren't you? And most importantly: who cares? Certainly not I when I'm interviewing you -- where you can trust I will ask you deeper questions than language arcana...
I don't need to run that to know that it calls foo exactly once. Function designators decay to a pointer to the function if they are not the subject of the & or sizeof operators, so the string of *s simply converts it repeatedly between a function designator and a function pointer.
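(The counter-challenge program wasn't reproduced here, but presumably it was the classic repeated-dereference trick, something like this sketch with a hypothetical foo:)

    #include <stdio.h>

    void foo(void)
    {
        puts("foo");
    }

    int main(void)
    {
        /* Each * converts the function pointer back to a function
           designator, which immediately decays to a pointer again,
           so the whole chain is a no-op: foo is called exactly once. */
        (**********foo)();
        return 0;
    }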
> This is a bit of an idiotic question because one is not testing knowledge of computing or software systems, but rather trivial knowledge of an arcane corner of the language.
Keep in mind that this is Ksplice; for the work they do, this might not be meaningless arcana.
In C, an array as a function parameter is different from an array as a variable/struct member. So, as a parameter, sizeof(x) is sizeof(int*). As a variable, sizeof(x) is sizeof(int[5]).
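A minimal sketch of the difference (the 20 assumes sizeof(int) == 4, the 8 a 64-bit pointer):

    #include <stdio.h>

    void f(int x[5])
    {
        /* As a parameter, "int x[5]" is adjusted to "int *x",
           so this prints sizeof(int *) -- e.g. 8. */
        printf("%zu\n", sizeof(x));
    }

    int main(void)
    {
        int x[5];
        printf("%zu\n", sizeof(x));  /* sizeof(int[5]) == 20 */
        f(x);
        return 0;
    }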
I got the 4th question wrong; my only caveat is, professional C programmers probably all learn to avoid constructions like this in favor of more explicit ones.
You are mistaken (as is _delirium). Pointers to arrays are fundamental to how multi-dimensional arrays are implemented in C, and this question is designed to test your understanding of them. (It is most certainly not about a fancy way of getting a pointer to the first element after the array, since this pointer has a completely different type.)
Multidimensional arrays are arrays of arrays, not arrays of pointers to arrays (as in Java). Therefore when manipulating, say, rows of a 2-dimensional array you are dealing with pointers to arrays like &x in this example.
It's not an every-day thing but it does come up, and in some specialised fields no doubt it is very common.
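For instance (a sketch, assuming 4-byte int, so each row is 16 bytes):

    #include <stdio.h>

    int main(void)
    {
        int grid[3][4];

        /* A pointer to a row has type int (*)[4]; adding 1 advances
           by a whole row (16 bytes), just like &x + 1 in the quiz. */
        int (*row)[4] = &grid[0];
        printf("%p\n", (void *)row);
        printf("%p\n", (void *)(row + 1));
        return 0;
    }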
> (It is most certainly not about a fancy way of getting a pointer to the first element after the array, since this pointer has a completely different type.)
Have to be careful there. Going more than one element past the end of the array is undefined (see question 6.17 of the C FAQ).
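A minimal sketch of where the line falls:

    #include <stdio.h>

    int main(void)
    {
        int x[5];

        /* Pointing one past the end is explicitly allowed
           (you just can't dereference it)... */
        int (*one_past)[5] = &x + 1;
        printf("%p\n", (void *)one_past);

        /* ...but going further is undefined behavior:
           int (*two_past)[5] = &x + 2; */
        return 0;
    }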
Actually, that whole FAQ should be of interest to anyone who liked this article. No matter how many times I read through it, I seem to find something new.
I use it all the time in network and graphics programming, but I name the variables to state the fact. It's like a poor man's struct; sometimes you are doing, say, some kind of packet analysis, and you have some really convoluted spec to work off of. You can either formalize it in a bunch of headers and try to synthesize the structure of it, or use tricks like this to parse it nicely.
In graphics programming - or more specifically, file format programming - sometimes I have to parse through, or generate, a colortable, and then go through a bunch of serialized data and transform it into a buffer where I have something like coordinate[x][y]{[z]}.{rgb/cmyk/yuv/hsv/rgba/bgr...} and a bunch of unions and structs; having char * data[3] as my payload and then being able to offset around into it is fantastically convenient, especially when trying to apply kernels or do transformations over the entire space.
The code is far more readable at the lower level manipulation with this kind of stuff.
And when you say 'Well, I use OTS solutions for graphics', I say "I do too, when I can. But when I can't, it's nice to be able to express what I need to do efficiently."
Same here. For the cases where you really do want to find the first memory address past the end of the array (if you're doing some hackish memory management, probably), I think x + sizeof(x) is more idiomatic than &x+1, because it avoids that particular arrays-are-almost-pointers weirdness in favor of regular pointer arithmetic.
Though I used to program a lot of C, I'm not a C wizard by a long shot, so I might be wrong. Are there cases where using expressions based on &x, where x is an array name, is idiomatic C?
edit: Stupid mistake, see cygx's reply (sizeof(x) gives the size of the array x in bytes, not in elements).
These questions just show that C is inconsistent. If you have int x[5]; then sizeof(x) is 20, so x is some data of size 20. Yet if you do int y[5]; and then y=x, C refuses to do this even though the types match. That's because it is inconsistent: x is not really an object of size 20 to C, it is also partly a pointer. But then again it is not really a pointer. Which it is depends on confusing rules. If instead y and x were structs with 5 int fields, y=x would work.
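To make that concrete, here's a minimal sketch of the asymmetry (the struct wrapper name is made up):

    #include <stdio.h>

    struct five { int v[5]; };

    int main(void)
    {
        int x[5] = {1, 2, 3, 4, 5};
        int y[5];
        /* y = x;   -- error: arrays are not assignable */
        (void)y;

        struct five a = {{1, 2, 3, 4, 5}};
        struct five b;
        b = a;   /* fine: the same 20 bytes, copied by plain assignment */

        printf("%d\n", b.v[4]);   /* prints 5 */
        return 0;
    }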
The Ksplice blog is one of my all time favorite programming blogs. I am very happy it's back. If you haven't seen it before, it's well worth looking at the archive.
In question 2 I got tripped up because I assumed sizeof(int) == 8 on a 64 bit system. I also got question 4 wrong because I didn't know that &x gives a pointer to an array of size 5.
Arrays in C have always confused me a little, because books say "an array is just a pointer to the first element". Making matters even more confusing is that most C programmers first deal with arrays in the form of character buffers, which are logical arrays but are typed as pointers to the beginning of the string.
Also confusing was the first time I saw this wonderful construct:
Writing C involves more care than usual because there are so many weird things that are easy to not understand, and not understanding those will cause subtle undefined behavior that allows people controlling the input data to 0wn your computer. Frightening.
I think you might be paraphrasing what the books say. What they usually say is that arrays and pointers are equivalent - which is true. It's easy to read that as saying that arrays and pointers are identical, but that's not true at all.
Just to make it confusing though, you can't pass an array by value to a function. It gets automatically converted to a pointer. Arrays are unique in the C language in this sense, and it contributes to the illusion that arrays and pointers are the same thing. (Try passing an array by reference, though, and all becomes clear.)
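A sketch of the by-reference version, which makes the distinction visible (assumes 4-byte int):

    #include <stdio.h>

    void by_ref(int (*a)[5])
    {
        /* A genuine pointer to the whole array: sizeof(*a) is 20. */
        printf("%zu\n", sizeof(*a));
    }

    int main(void)
    {
        int x[5];
        by_ref(&x);   /* must pass &x; plain x has the wrong type */
        return 0;
    }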
The key is that "&" gives a pointer to the declared type. If you have a variable of array type (int x[100]), the pointer is to the array type (pointer to int[100]), not pointer to int.
And in this age, with technology so advanced and computing resources so inexpensive ... why? Because it's the geek's athletic challenge? Whoever's brain holds the most memorized facts is the smartest brain in the world?
I'd guess someone would answer "because you need to know this stuff to be good at your programming job!" To which I reply: if you ever make the assumption that you know how some bit of code will work in a system, you're well on your way to becoming the infallible coder that no one likes to work with. This is why we have testing methodologies.
Yes, you need to be aware of this particular nuance of C. My answers were more high-level ("the address of the beginning of the array", "the next int in memory, not the next byte", etc) and I have no need to know the precise memory locations when I can ask the computer to tell me.
If you're interested in even more detail and discussion on pointers vs arrays, the book Expert C Programming by Peter van der Linden has several really good chapters on the subject. Plus, the whole book is a really great read.
I knew all these and I am self-taught. I never did too much actual programming, but I just enjoyed reading about these things and understanding how they work, so over the years I've gained a ton of technical knowledge that (according to what I'm always reading on HN and reddit) the average programmer doesn't know (or care to know).
Quite good - I almost got the last one wrong. I did immediately realise my mistake, and got it wrong again before nailing it, but IRL there is no button to press to tell you you got it wrong (unit tests, maybe?) - there are much worse gotchas and more useful fringe functionality floating around, though. What do these print, for example?
or how about int i = 5; int aiFoo[ i ]; which is valid C99 but not C89?
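Expanded into a runnable sketch (note that sizeof on a VLA is evaluated at run time):

    #include <stdio.h>

    int main(void)
    {
        int i = 5;
        int aiFoo[i];   /* a VLA: valid C99, rejected by C89 compilers */

        printf("%zu\n", sizeof(aiFoo));   /* 20 with 4-byte int */
        return 0;
    }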
At any rate - much more important than learning the minutiae of C is learning to get things done. If you are passionate about writing some OS, you don't need to be taught - you would already know, or be learning. It costs nothing but time...
Well, it got me on the pointer arithmetic questions. I remembered that C automatically handles pointer arithmetic (multiplying by sizeof(type)), however:
Question 2: I didn't think that x+1 would be interpreted as a pointer for some reason, so I guessed 0x7fffdfbf7f01. Wrong.
Question 4: I incorrectly thought that what I remembered about pointer arithmetic would apply here. 1 * sizeof(int) = 0x04, so I guessed 0x7fffdfbf7f04. Wrong.
I don't work in C professionally, but I'd like to not forget things. My error in question 2 shows forgetfulness, and my error in question 4 is from not ever completely mastering every nook & cranny in C.
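For reference, a minimal sketch of the two rules (assuming sizeof(int) == 4 and the quiz's starting address):

    #include <stdio.h>

    int main(void)
    {
        int x[5];

        /* If x starts at 0x7fffdfbf7f00:
           x + 1  moves by one int    -> 0x7fffdfbf7f04
           &x + 1 moves by one int[5] -> 0x7fffdfbf7f14 */
        printf("%p\n", (void *)x);
        printf("%p\n", (void *)(x + 1));
        printf("%p\n", (void *)(&x + 1));
        return 0;
    }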
On 64-bit machines, int is typically 4 bytes. Longs are typically 8 bytes. But if you really need to assume a certain length, use the typedefs from stdint.h. (For example, if you wanted a 4-byte signed integer, you would say int32_t. An 8-byte unsigned integer would be uint64_t.)
The third answer's explanation could be better than what's given:
"That is, whenever an array appears in an expression, the compiler implicitly generates a pointer to the array's first element, just as if the programmer had written &a[0]"
If we could turn pedantry into electrical power, geeks on the internet could power the planet forever.
The takeaway summed the point up nicely, although everyone's arguing over sizeof(int). Sure, there are cases when sizeof(int) is not 4, but that's not the point of the exercise. If you solved it believing it was 8, you'd still be correct in my opinion.
I'm reminded of so many great articles with comments like "it's you're and not your"
Setting aside that sizeof(int) could be 4: one can code a 64-bit application that uses 32-bit pointers - yes, it's rare - in reality it's still a 64-bit application, just with small pointers - it's called the x32 ABI (heh, and I first read about it here on this forum).
I got all right except 2 because I thought that sizeof(int)==8 on 64 bit systems.
Array vs. pointer is one of the more advanced interview questions I like to use to probe C knowledge. I don't actually expect any correct answers, just the knowledge that array != pointer.
I am a Java programmer and got the first and the last answers correct - the last one only because the explanations of the second and third were good enough for me to understand it.
In my humble opinion, the only challenge here is guessing what sizeof(int) the author had in mind. A mild challenge would be asking about something that is not part of the pointer-arithmetic basics, such as a[i] == i[a], and obfuscating a bit (e.g. "const char* i[] = {"Hello","world"}; char c = (2&*i[1])["Hello world"]; what is c?" The guess about character encoding is as good as the guess about sizeof(int)).
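For the curious, the obfuscated example unpacks like this (a sketch, assuming ASCII):

    #include <stdio.h>

    int main(void)
    {
        const char *i[] = {"Hello", "world"};

        /* a[b] is *(a + b), so n["Hello world"] == "Hello world"[n].
           *i[1] is 'w' (0x77 in ASCII), and 2 & 0x77 == 2,
           so c is "Hello world"[2]. */
        char c = (2 & *i[1])["Hello world"];
        printf("%c\n", c);   /* prints 'l' */
        return 0;
    }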
I'm pretty sure the code is undefined once you do x+1 (that is, treat an array like a pointer). Arrays are auto-converted to pointers in, for example, function calls, but until that is done doing arithmetic on them is undefined.
From the near-final draft of the C99 standard I happen to have sitting around:
"Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined."
In short, the moment you write a variable of array type and it's not one of those very limited cases, it immediately becomes a pointer to the first element. x+1 is completely valid.