
The point is that C has no concept of caches. Yet efficient cache usage is crucial to writing performant low-level code.

So C programmers have to engage in cargo-culted patterns to try to trigger the correct behaviour from their optimising compiler.
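
A rough sketch of the kind of pattern I mean, assuming a flat row-major array (the function names and the `rows`/`cols` parameters are made up for illustration). Both functions compute the same sum and nothing in the language says one is better, but the first walks memory sequentially while the second jumps `cols * sizeof(int)` bytes per step, which tends to use caches poorly on large arrays:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums a rows x cols matrix stored row-major in a flat
 * array, visiting elements in the order they sit in memory. */
long sum_row_major(const int *a, size_t rows, size_t cols) {
    assert(a != NULL);
    long total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += a[r * cols + c];   /* consecutive addresses */
    return total;
}

/* Same result, but the inner loop strides by a whole row each step.
 * C gives no hint that this ordering interacts badly with caches. */
long sum_col_major(const int *a, size_t rows, size_t cols) {
    assert(a != NULL);
    long total = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            total += a[r * cols + c];   /* stride of cols elements */
    return total;
}
```

Whether the difference is measurable depends on the array size, the hardware and what the compiler does with the loops, so you would have to time it on your own machine.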



The instruction set architecture has no concept of caches either, nor of instruction-level parallelism, branch prediction or speculative execution. There is no way of getting this right other than knowing how the processor is implemented under the hood and issuing the right assembly based on that. Often experimentation is needed to get it right.

Fixing this would require changing the instruction set, but the instruction set has to be kept as it is to stay compatible with earlier versions. Thus it remains unchanged, with the exception that new instructions may be added.


I'm always amazed that multi-threaded code works as well as it does. I can have some data in multiple caches being used by multiple threads on separate CPUs and as long as I get the memory barriers right, it will work. Combine that with branch prediction and out-of-order execution and it feels even more magical.
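
For what "getting the memory barriers right" can look like in practice, here is a minimal sketch using C11 atomics and POSIX threads (the names `payload`, `ready` and `run_handoff` are my own, purely for illustration). The release store on the flag pairs with the acquire load in the other thread, so once the consumer sees the flag set it is also guaranteed to see the earlier plain write to `payload`:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;        /* ordinary, non-atomic data */
static atomic_int ready;   /* flag that publishes the payload */

static void *producer(void *arg) {
    (void)arg;
    payload = 42;          /* plain write */
    /* Release: the write to payload is visible before the flag is. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg) {
    int *out = arg;
    /* Acquire: once the flag reads as 1, payload reads as 42. */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;                  /* spin until published */
    *out = payload;
    return NULL;
}

/* Hypothetical driver: runs one handoff and returns what the consumer saw. */
int run_handoff(void) {
    pthread_t p, c;
    int seen = 0;
    payload = 0;
    atomic_store(&ready, 0);
    int rc = pthread_create(&c, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&p, NULL, producer, NULL);
    assert(rc == 0);
    (void)rc;              /* silence unused warning under NDEBUG */
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

Swap both orderings for `memory_order_relaxed` and the language no longer promises that the consumer reads 42, even though it may appear to work on some hardware.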


Yeah, multithreading on a modern processor basically seems like witchcraft to me.



