> I began programming C and assembler on the VAX and the original PC. At that time, C was a reasonable approximation of the assembly code level. We didn't get into expanding C to assembly that much but the translation was reasonably clear.
Right: On the VAX, there wasn't much else for a compiler to do other than the simple, straightforward thing, and I'm including optimizations like common subexpression elimination, dead code pruning, and constant folding as straightforward. Maybe loop unrolling and rejuggling arithmetic to make better use of a pipeline, if the compiler was that smart.
> As far as I know, what's changed that mid-80s world and now is that a number of levels below ordinary assembler have been added.
You make good points about caches and memory protection being invisible to C, but they're invisible to application programmers, too, most of the time, and the VAX had those things as well.
Another thing that's changed is that chips have grown application-visible capabilities which C can't model. gcc transforms certain string operations into SIMD code, vectorizing a loop into a few fast opcodes. You can't tell a C compiler to do that portably without relying on another standard. C didn't even get official, portable support for atomics until C11.
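For what it's worth, the C11 atomics are a decent model of what standardizing a lowest common denominator looks like; a minimal sketch of a shared counter (the names here are just illustrative):

    /* C11: <stdatomic.h> is the portable interface; the compiler picks
       the instruction (typically LOCK XADD on x86, LDADD or LL/SC on ARM). */
    #include <stdatomic.h>
    #include <stdio.h>

    static atomic_int counter;   /* statically zero-initialized */

    void hit(void)
    {
        atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
    }

    int main(void)
    {
        hit();
        printf("%d\n", atomic_load(&counter));
        return 0;
    }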
You can dance with the compiler, and insert code sequences and functions and hope the optimizer gets the hint and does the magic, but that's contrary to the spirit of a language like C, which was a fairly thin layer over assembly back in the heyday of scalar machines. I don't know any modern language which fills that role for modern scalar/vector hybrid designs.
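To give a flavor of that dance: the usual move is to write the loop as plainly as possible, restrict-qualify the pointers, maybe add a GCC/Clang alignment hint, and then read the generated assembly to see whether the vectorizer took the bait. A sketch (the __builtin_assume_aligned call is a GCC/Clang extension, not standard C):

    #include <stddef.h>

    /* Nothing here *requires* SIMD; you compile with -O3 and hope. */
    void saxpy(float *restrict y, const float *restrict x, float a, size_t n)
    {
        x = __builtin_assume_aligned(x, 16);
        y = __builtin_assume_aligned(y, 16);
        for (size_t i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }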
SIMD design itself isn't constant between different processor families. Any purported standardized language for scalar/vector hybrid either has to rely on a smart optimizer or be utterly platform specific.
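To make "utterly platform specific" concrete: if you use intrinsics directly, the same four-float add has to be written once per instruction set. A sketch against the SSE and NEON intrinsic headers:

    /* Which branch you get is decided at compile time by the target. */
    #if defined(__SSE__)
      #include <xmmintrin.h>
      void add4(float *dst, const float *a, const float *b)
      {
          _mm_storeu_ps(dst, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
      }
    #elif defined(__ARM_NEON)
      #include <arm_neon.h>
      void add4(float *dst, const float *a, const float *b)
      {
          vst1q_f32(dst, vaddq_f32(vld1q_f32(a), vld1q_f32(b)));
      }
    #else
      void add4(float *dst, const float *a, const float *b)
      {
          for (int i = 0; i < 4; i++)
              dst[i] = a[i] + b[i];
      }
    #endif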
> SIMD design itself isn't constant between different processor families. Any purported standardized language for scalar/vector hybrid either has to rely on a smart optimizer or be utterly platform specific.
That is indeed part of the problem. There might be enough lowest-common-denominator there to standardize, like there is with atomics, I don't know, but I'm not saying that C needs to add SIMD support. I'm saying that any low-level language needs to directly expose machine functionality, which includes some SIMD stuff on some classes of processor.
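The closest existing thing to that lowest common denominator is probably the GCC/Clang vector_size extension: you declare a generic vector type, write ordinary arithmetic on it, and leave instruction selection to the backend. Not ISO C, but it's a sketch of what a standardized subset might look like:

    /* A 16-byte vector of four floats; arithmetic is element-wise.
       GCC/Clang map it onto SSE, NEON, or scalar code per target. */
    typedef float v4f __attribute__((vector_size(16)));

    v4f madd(v4f a, v4f b, v4f c)
    {
        return a * b + c;
    }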
Maybe there will be a shakeout, like how scalar processors largely shook out to being byte-addressable machines with flat address spaces and pointers one machine word in size, as opposed to word-addressable systems that pack two pointers into a machine word (the PDP-10 family) or segmented systems, of which there were plenty, including the redoubtable IBM PC. C can definitely run on those "odd" systems, which weren't so odd when C was first being standardized, but char array access definitely gets more efficient when the machine can access a char in one opcode. (You could have a char the same size as an int. It's standards-conformant. But it doesn't help your compiler compile code intended for other systems.) C could standardize SIMD access once a similar shakeout happens for SIMD. However, it would be nice to have a semi-portable high-level assembly which targets all 'sane' architectures and is close to the hardware.
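(On the char-the-same-size-as-an-int point: sizeof(char) is 1 by definition, but CHAR_BIT only has to be at least 8, which is how some word-addressable DSPs end up with 16- or 32-bit chars. Code that assumes 8-bit bytes can at least say so; a sketch:)

    #include <limits.h>

    #if CHAR_BIT != 8
    #error "this code assumes 8-bit bytes"
    #endif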
You’re mistaken about the PDP-10. Yes, you could pack two pointer-to-word pointers into a single word; but a single word could also contain a single pointer-to-byte. See http://pdp10.nocrew.org/docs/instruction-set/Byte.html for all the instructions that deal with bytes, including auto-increment! And bytes could be any size you want, per pointer, from 1 to 36 bits.
C#/.NET has vector operations in the System.Numerics.Vectors namespace, which will use SSE, AVX2, or NEON if available. However, there are numerous SIMD instructions that cannot be mapped that way.
For that reason, .NET Core 3 added SIMD intrinsics, so now you can issue AVX/whatever instructions directly.
If I remember correctly, someone made a very performant physics engine with the vector API.