They really should be listing these answers in terms of sizeof(int) instead of assuming all 64-bit platforms have sizeof(int) == 4. Pointers are confusing in C, but so is the common assumption that sizeof(int) is always 4.
sizeof(int) is either 2 or 4 on all reasonable architectures; on all modern ones it's 4.
Short answer: if sizeof(int) == 8, then you lose either the 2-byte integer or the 4-byte integer. This would make it harder to minimize memory footprint in programs that work with large amounts of data.
The reason for this is that char, short, int, long, and long long are the only names you have for various integer sizes. Since sizeof(char) is fixed at 1
and sizeof(char) <= sizeof(short) <= sizeof(int)
if you make int eight bytes, you must drop either the two-byte or the four-byte integer, since you only have one name left for it. Dropping one of those types will make programmers unhappy, since packing structs carefully is a valid technique for minimizing memory in programs that work with a lot of data.
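A quick way to see the naming squeeze (a minimal sketch; the 1 2 4 8 8 output assumes a typical LP64 system such as x86-64 Linux):

    #include <stdio.h>

    int main(void)
    {
        /* On a typical LP64 system this prints 1 2 4 8 8: each width up
           to 8 bytes has its own keyword. Under ILP64 (int == 8), short
           could cover 2 bytes or 4 bytes, but not both -- one of those
           widths would be left without a standard name. */
        printf("%zu %zu %zu %zu %zu\n",
               sizeof(char), sizeof(short), sizeof(int),
               sizeof(long), sizeof(long long));
        return 0;
    }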
Note that int32_t and friends are not valid names since they MUST be a typedef for one of char, short, int, long or long long.
Disclaimer: if C11 has fixed this, I'm ignorant of it.
C99 specified stdint.h, which includes int16_t/uint16_t. So compliant compilers are required to support them even if they don't map to a built-in type. So you won't lose the short.
That said, it wouldn't make it any less insane, so no one does this, and in fact int is 32 bits everywhere except microcontrollers (and the 8086, for the tiny handful of people writing BIOS or bootloader code).
Right, which it does: we're talking about ABI variations within a single architecture. The OS and toolchain enforcing ILP64 vs. LP64 semantics on C programs has nothing to do with the i386's ability to operate on 16 bit chunks.
The "implementation" in the meaning of the C standard includes the OS and toolchain. If the C toolchain does not provide a 16 bit type, then it need not define (u)int16_t, regardless of the CPU that it is running on.
However, as mentioned recently in an HN comment somewhere, there are architectures that can only operate on 32-bit or larger chunks, so sizeof(char) == sizeof(int) (I think it may have been an older Cray). I can't find the specific comment, but here's one that mentions a platform with 16-bit chars: http://news.ycombinator.com/item?id=3112704
All machines were word-oriented until the IBM 360s arrived, and many persisted well into the '80s (the PDP-10 is a particularly famous one for hackers). Many of them got C compilers at one point or another.
That's not really the issue though. My point was that the choice of ILP64 vs. LP64 on a single architecture could not cause you to "lose" a 16 bit quantity. It can't, because those machine instructions obviously don't go away when you change your compiler's calling conventions. So a C99-compliant compiler would still be required to provide int16_t.
Which is... maybe too much minutiae even for a C minutiae thread. But it was my point, anyway.
Yes, I see. It can be annoying when someone widens the scope of an already-narrowed discussion, as my comment tried to do. Thanks for your restatement and clarification.
FWIW, I've actually once used a really silly chip that came with a horrible-tastic toolchain, where char and int were the same size, both either 16 or 32 bits (I forget which, unfortunately). (In case anyone doesn't notice: on such a system, sizeof(int) is still 1, as it is measured in units of sizeof(char).)
I don't remember (this was 2002); I just remember it being a really cheap part on my friend's MicroMouse motor controller (I wrote the maze searching algorithm for them), and that the C compiler didn't actually work very well.
Then breaking a little quiz like this will be the least of your worries.
Of course, that's the kind of logic that got us into this situation today. Fortunately we should be able to live with CHAR_BIT == 8 for the foreseeable future.
Isn't the assumption that sizeof(int) equals the platform word size in bytes (e.g. 4 for 32-bit and 8 for 64-bit platforms)? I expected int on a 64-bit platform to be 8 bytes in size.
You'll see this typically listed as an LP64 model (long and pointers are 8 bytes; int stays at 4). The other common option is ILP64, where int == long == long long == void * == 8.
It should also be noted that this has nothing to do with the underlying hardware. Instead it's a decision made by the OS ABI. There's nothing that says you can't have sizeof(int) == 8 on a 32-bit system.
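You can see which model your toolchain picked with a three-line check (a sketch; the commented outputs assume the usual LP64/ILP64/LLP64 conventions):

    #include <stdio.h>

    int main(void)
    {
        /* LP64 (typical 64-bit Unix): 4 8 8
           ILP64:                      8 8 8
           LLP64 (64-bit Windows):     4 4 8
           The hardware doesn't pick one of these; the OS ABI and
           toolchain do. */
        printf("%zu %zu %zu\n", sizeof(int), sizeof(long), sizeof(void *));
        return 0;
    }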
While it has nothing to do with hardware per se, ILP64 is often seen on platforms where some significant 32-bit operations (often memory accesses, but surprisingly sometimes even ALU ops) are, or used to be, slower than 64-bit ones.
It depends on your platform AND on your compiler, so just saying "Windows 64" isn't enough information. Conceivably, there are compilers for "Windows 64" such that sizeof(int) == 8.
Your point (that int isn't a qword on all 64-bit architectures) still stands. But your statement is potentially incorrect.
The OS ABI basically defines what the compiler will do. While it's possible to run a compiler in ILP64 mode on Windows 64, you won't get too far if you try to pass a 64-bit integer > 2^32 into a Windows system call.
... or pass any 64-bit integer. Windows API calls are mostly "stdcall" and use the stack for parameter passing - pushing 8 bytes instead of 4 could be disastrous in many ways, especially considering that the callee cleans the stack in this case.
There is no stdcall/cdecl distinction on 64-bit Windows; there is only one ABI convention, which sadly uses a different set of registers for parameter passing than Linux. Only the first four parameters use registers; the rest are passed on the stack.
Even on 64-bit machines an int is 4 bytes. I don't think this is mandated by the standard, though. I guess that many implementations believe this is better than having a huge size difference between 32- and 64-bit architectures. There is always long int if you want a 64-bit integer.
I don't disagree, but I thought it was nicer for there to be a more concrete, actual memory address for the answer.
I thought about explicitly stating "sizeof(int) is 4 on this system" in the intro blurb, but that primes you a bit more than I'd like for the answer to #2, so I thought it was a little cleaner not to.
I feel that anyone who doesn't already know "it depends on sizeof(int)" isn't going to get it anyway; I was briefly thrown off because I thought "surely if sizeof(int) matters like I think it does, I would have been told what it is?"
But I'm more annoyed that I failed the last two by not understanding C than that I failed the second by guessing sizeof(int) incorrectly.
My first instinct was that sizeof(int) was the same width as the system architecture (64 bits)... specifically because you mentioned it was a 64 bit system.
Either way, after I realised that sizeof(int) == 4 the test was surprisingly simple. Are pointers really that hard to grasp?
Or they could use C99's int32_t in the problem's code to make the size of the integer explicit... (Or just state that sizeof(int) == 4 on this hypothetical system.)
Well, sizeof(int) (and short, and long!) could also be 1, and %p could print out the pointer address as a roman numeral. All legal according to the standard.
What commonly used platforms have 64-bit ints? The only ones I vaguely recall are really, really old versions of Solaris. After a while, IIRC, they decided that ILP64 was too much of a PITA and went with LP64.
The standard requires that "int" be able to represent from -32767 to +32767. Unless "byte" is redefined to contain a different number of bits, this means that sizeof(int) >= 2. Similarly, long has a range that requires at least 32 bits.
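If you want to see what your own implementation guarantees, limits.h spells it out (a minimal sketch):

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        printf("CHAR_BIT = %d\n", CHAR_BIT);   /* at least 8 */
        printf("INT_MAX  = %d\n", INT_MAX);    /* at least 32767 */
        printf("LONG_MAX = %ld\n", LONG_MAX);  /* at least 2147483647 */
        return 0;
    }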
sizeof(char) is 1 per definition and also one byte, but not necessarily 8 bits.
Apparently some DSPs cannot address an 8-bit quantity -- they're not easy to find, but I've found a comp.lang.c posting where Jack Klein mentions a 32-bit Sharp DSP with CHAR_BIT == 32 (so sizeof(char) == sizeof(int) == 1 !) and a Texas Instruments TMS32F28xx DSP with CHAR_BIT == 16.
On such a system, C99 might not define int8_t, but you could use int_least8_t.
My guess is that they are more common than 64-bit ints, but they tend to be supported by their own weird C compiler/toolchain.
I could have used the hint about what the output of %p looks like (I missed the leading 0x). Of course nobody's keeping score, but that doesn't seem essential to the question.
This is a bit of an idiotic question because one is not testing knowledge of computing or software systems, but rather trivial knowledge of an arcane corner of the language. Indeed, this question is an interview anti-pattern that I have historically labelled the "where-is-the-bathroom-in-my-house" question: if someone has not been in your house, they would not know, and if someone were in your house and had to take a leak, I trust they could figure it out. In my experience, these questions are most likely to be asked by intellectual midgets who themselves would not be able to answer an equivalent (but different) question.
So in the spirit of performing that experiment and exploring this interview anti-pattern, here's my counter-challenge, which I argue is intellectually equivalent:
What does that program do? Yeah, exactly: you just ran it. And you're surprised, aren't you? And most importantly: who cares? Certainly not I when I'm interviewing you -- where you can trust I will ask you deeper questions than language arcana...
I don't need to run that to know that it calls foo exactly once. Function designators decay to a pointer to the function if they are not the subject of the & or sizeof operators, so the string of *s simply converts it repeatedly between a function designator and a function pointer.
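(The counter-challenge program wasn't reproduced here, but presumably it was the classic repeated-dereference trick, something like this sketch with a hypothetical foo:)

    #include <stdio.h>

    void foo(void)
    {
        puts("foo");
    }

    int main(void)
    {
        /* Each * converts the function pointer back to a function
           designator, which immediately decays to a pointer again,
           so the whole chain is a no-op: foo is called exactly once. */
        (**********foo)();
        return 0;
    }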
> This is a bit of an idiotic question because one is not testing knowledge of computing or software systems, but rather trivial knowledge of an arcane corner of the language.
Keep in mind that this is Ksplice; for the work they do, this might not be meaningless arcana.
In C, an array as a function parameter is different from an array as a variable/struct member. So, as a parameter, sizeof(x) is sizeof(int*). As a variable, sizeof(x) is sizeof(int[5]).
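A minimal sketch of the difference (the 20 assumes sizeof(int) == 4, the 8 a 64-bit pointer):

    #include <stdio.h>

    void f(int x[5])
    {
        /* As a parameter, "int x[5]" is adjusted to "int *x",
           so this prints sizeof(int *) -- e.g. 8. */
        printf("%zu\n", sizeof(x));
    }

    int main(void)
    {
        int x[5];
        printf("%zu\n", sizeof(x));  /* sizeof(int[5]) == 20 */
        f(x);
        return 0;
    }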
I got the 4th question wrong; my only caveat is, professional C programmers probably all learn to avoid constructions like this in favor of more explicit ones.
You are mistaken (as is _delirium). Pointers to arrays are fundamental to how multi-dimensional arrays are implemented in C, and this question is designed to test your understanding of them. (It is most certainly not about a fancy way of getting a pointer to the first element after the array, since this pointer has a completely different type.)
Multidimensional arrays are arrays of arrays, not arrays of pointers to arrays (as in Java). Therefore when manipulating, say, rows of a 2-dimensional array you are dealing with pointers to arrays like &x in this example.
It's not an every-day thing but it does come up, and in some specialised fields no doubt it is very common.
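For instance (a sketch, assuming 4-byte int, so each row is 16 bytes):

    #include <stdio.h>

    int main(void)
    {
        int grid[3][4];

        /* A pointer to a row has type int (*)[4]; adding 1 advances
           by a whole row (16 bytes), just like &x + 1 in the quiz. */
        int (*row)[4] = &grid[0];
        printf("%p\n", (void *)row);
        printf("%p\n", (void *)(row + 1));
        return 0;
    }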
> (It is most certainly not about a fancy way of getting a pointer to the first element after the array, since this pointer has a completely different type.)
Have to be careful there. Going more than one element past the end of the array is undefined (see question 6.17 of the C FAQ).
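A minimal sketch of where the line falls:

    #include <stdio.h>

    int main(void)
    {
        int x[5];

        /* Pointing one past the end is explicitly allowed
           (you just can't dereference it)... */
        int (*one_past)[5] = &x + 1;
        printf("%p\n", (void *)one_past);

        /* ...but going further is undefined behavior:
           int (*two_past)[5] = &x + 2; */
        return 0;
    }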
Actually, that whole FAQ should be of interest to anyone who liked this article. No matter how many times I read through it, I seem to find something new.
I use it all the time in network and graphics programming, but I name the variables to state the fact. It's like a poor man's struct; sometimes you are doing, say, some kind of packet analysis, and you have some really convoluted spec to work off of. You can either formalize it in a bunch of headers and try to synthesize the structure of it, or use tricks like this to parse it nicely.
In graphics programming - or more specifically, file format programming - sometimes I have to parse through, or generate, a colortable, and then go through a bunch of serialized data and transform it into a buffer where I have something like coordinate[x][y]{[z]}.{rgb/cmyk/yuv/hsv/rgba/bgr...} and a bunch of unions and structs; having char * data[3] as my payload and then being able to offset around into it is fantastically convenient, especially when trying to apply kernels or do transformations over the entire space.
The code is far more readable at the lower level manipulation with this kind of stuff.
And when you say 'Well, I use OTS solutions for graphics', I say "I do too, when I can. But when I can't, it's nice to be able to express what I need to do efficiently."
Same here. For the cases where you really do want to find the first memory address past the end of the array (if you're doing some hackish memory management, probably), I think x + sizeof(x) is more idiomatic than &x+1, because it avoids that particular arrays-are-almost-pointers weirdness in favor of regular pointer arithmetic.
Though I used to program a lot of C, I'm not a C wizard by a long shot, so I might be wrong. Are there cases where using expressions based on &x, where x is an array name, is idiomatic C?
edit: Stupid mistake, see cygx's reply (sizeof(x) gives the size of the array x in bytes, not in elements).
These questions just show that C is inconsistent. If you have int x[5]; then sizeof(x) is 20, so x is some data of size 20. Yet if you do int y[5]; and then y=x, C refuses to do this even though the types match. That's because it is inconsistent: x is not really an object of size 20 to C, it is also partly a pointer. But then again it is not really a pointer. Which it is depends on confusing rules. If instead y and x were structs with 5 int fields, y=x would work.
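To make that concrete, here's a minimal sketch of the asymmetry (the struct wrapper name is made up):

    #include <stdio.h>

    struct five { int v[5]; };

    int main(void)
    {
        int x[5] = {1, 2, 3, 4, 5};
        int y[5];
        /* y = x;   -- error: arrays are not assignable */
        (void)y;

        struct five a = {{1, 2, 3, 4, 5}};
        struct five b;
        b = a;   /* fine: the same 20 bytes, copied by plain assignment */

        printf("%d\n", b.v[4]);   /* prints 5 */
        return 0;
    }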
The Ksplice blog is one of my all time favorite programming blogs. I am very happy it's back. If you haven't seen it before, it's well worth looking at the archive.
In question 2 I got tripped up because I assumed sizeof(int) == 8 on a 64 bit system. I also got question 4 wrong because I didn't know that &x gives a pointer to an array of size 5.
Arrays in C have always confused me a little, because books say "an array is just a pointer to the first element". Making matters even more confusing is that most C programmers first deal with arrays in the form of character buffers, which are logical arrays but are typed as pointers to the beginning of the string.
Also confusing was the first time I saw this wonderful construct:
Writing C involves more care than usual because there are so many weird things that are easy to not understand, and not understanding those will cause subtle undefined behavior that allows people controlling the input data to 0wn your computer. Frightening.
I think you might be paraphrasing what the books say. What they usually say is that arrays and pointers are equivalent - which is true. It's easy to read that as saying that arrays and pointers are identical, but that's not true at all.
Just to make it confusing though, you can't pass an array by value to a function. It gets automatically converted to a pointer. Arrays are unique in the C language in this sense, and it contributes to the illusion that arrays and pointers are the same thing. (Try passing an array by reference, though, and all becomes clear.)
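A sketch of the by-reference version, which makes the distinction visible (assumes 4-byte int):

    #include <stdio.h>

    void by_ref(int (*a)[5])
    {
        /* A genuine pointer to the whole array: sizeof(*a) is 20. */
        printf("%zu\n", sizeof(*a));
    }

    int main(void)
    {
        int x[5];
        by_ref(&x);   /* must pass &x; plain x has the wrong type */
        return 0;
    }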
The key is that "&" gives a pointer to the declared type. If you have a variable of array type (int x[100]), the pointer is to the array type (pointer to int[100]), not pointer to int.
And in this age, with technology so advanced and computing resources so inexpensive ... why? Because it's the geek's athletic challenge? Whoever's brain holds the most memorized facts is the smartest brain in the world?
I'd guess someone would answer "because you need to know this stuff to be good at your programming job!" To which I reply: if you ever make the assumption that you know how some bit of code will work in a system, you're well on your way to becoming the infallible coder that no one likes to work with. This is why we have testing methodologies.
Yes, you need to be aware of this particular nuance of C. My answers were more high-level ("the address of the beginning of the array", "the next int in memory, not the next byte", etc) and I have no need to know the precise memory locations when I can ask the computer to tell me.
If you're interested in even more detail and discussion on pointers vs arrays, the book Expert C Programming by Peter van der Linden has several really good chapters on the subject. Plus, the whole book is a really great read.
I knew all these and I am self-taught. I never did too much actual programming, but I just enjoyed reading about these things and understanding how they work, so over the years I've gained a ton of technical knowledge that (according to what I'm always reading on HN and reddit) the average programmer doesn't know (or care to know).
Quite good - I almost got the last one wrong. I did immediately realise my mistake, and got it wrong again before nailing it, but IRL there is no button to press to tell you you got it wrong (unit tests, maybe?) - there are much worse gotchas and more useful fringe functionality floating around, though. What do these print, for example?
or how about int i = 5; int aiFoo[ i ]; which is valid C99 but not C89?
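Expanded into a runnable sketch (note that sizeof on a VLA is evaluated at run time):

    #include <stdio.h>

    int main(void)
    {
        int i = 5;
        int aiFoo[i];   /* a VLA: valid C99, rejected by C89 compilers */

        printf("%zu\n", sizeof(aiFoo));   /* 20 with 4-byte int */
        return 0;
    }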
At any rate - much more important than learning the minutiae of C is learning to get things done. If you are passionate about writing some OS, you don't need to be taught - you would already know, or be learning. It costs nothing but time...
Well, it got me on the pointer arithmetic questions. I remembered that C automatically handles pointer arithmetic (multiplying by sizeof(type)), however:
Question 2: I didn't think that x+1 would be interpreted as a pointer for some reason, so I guessed 0x7fffdfbf7f01. Wrong.
Question 4: I incorrectly thought that what I remembered about pointer arithmetic would apply here. 1 * sizeof(int) = 0x04, so I guessed 0x7fffdfbf7f04. Wrong.
I don't work in C professionally, but I'd like to not forget things. My error in question 2 shows forgetfulness, and my error in question 4 is from not ever completely mastering every nook & cranny in C.
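For reference, a minimal sketch of the two rules (assuming sizeof(int) == 4 and the quiz's starting address):

    #include <stdio.h>

    int main(void)
    {
        int x[5];

        /* If x starts at 0x7fffdfbf7f00:
           x + 1  moves by one int    -> 0x7fffdfbf7f04
           &x + 1 moves by one int[5] -> 0x7fffdfbf7f14 */
        printf("%p\n", (void *)x);
        printf("%p\n", (void *)(x + 1));
        printf("%p\n", (void *)(&x + 1));
        return 0;
    }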
On 64-bit machines, int is typically 4 bytes. Longs are typically 8 bytes. But if you really need to assume a certain length, use the typedefs from stdint.h. (For example, if you wanted a 4-byte signed integer, you would say int32_t. An 8-byte unsigned integer would be uint64_t.)
The third answer's explanation could be better than what's given:
"That is, whenever an array appears in an expression, the compiler implicitly generates a pointer to the array's first element, just as if the programmer had written &a[0]"
If we could turn pedantry into electrical power, geeks on the internet could power the planet forever.
The takeaway summed the point up nicely, although everyone's arguing over sizeof(int). Sure, there are cases when sizeof(int) is not 4, but that's not the point of the exercise. If you solved it believing it was 8, you'd still be correct in my opinion.
I'm reminded of so many great articles with comments like "it's you're and not your"
Setting aside that sizeof(int) could be 4: one can code a 64-bit application that uses 32-bit pointers - yes, it's rare - in reality it's still a 64-bit application, just with small pointers - it's called the x32 ABI (heh, and I first read about it here on this forum).
I got all right except 2 because I thought that sizeof(int)==8 on 64 bit systems.
Array vs. pointer is one of the more advanced interview questions I like to use to probe C knowledge. I don't actually expect any correct answers, just the knowledge that array != pointer.
I am a Java programmer and got the first and the last answers correct - the last one only because the explanations of the second and third were good enough for me to understand it.
In my humble opinion, the only challenge here is guessing what sizeof(int) the author had in mind. A mild challenge would be asking about something that is not part of the pointer-arithmetic basics, such as a[i] == i[a], and obfuscating a bit (e.g. "const char* i[] = {"Hello","world"}; char c = (2&*i[1])["Hello world"]; what is c?" The guess about character encoding is as good as the guess about sizeof(int)).
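For the curious, the obfuscated example unpacks like this (a sketch, assuming ASCII):

    #include <stdio.h>

    int main(void)
    {
        const char *i[] = {"Hello", "world"};

        /* a[b] is *(a + b), so n["Hello world"] == "Hello world"[n].
           *i[1] is 'w' (0x77 in ASCII), and 2 & 0x77 == 2,
           so c is "Hello world"[2]. */
        char c = (2 & *i[1])["Hello world"];
        printf("%c\n", c);   /* prints 'l' */
        return 0;
    }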
I'm pretty sure the code is undefined once you do x+1 (that is, treat an array like a pointer). Arrays are auto-converted to pointers in, for example, function calls, but until that is done doing arithmetic on them is undefined.
From the near-final draft of the C99 standard I happen to have sitting around:
"Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined."
In short, the moment you write a variable of array type and it's not one of those very limited cases, it immediately becomes a pointer to the first element. x+1 is completely valid.