
That argument is often made for JITs, but I have never seen a real-world example where the extra runtime knowledge a JIT has is used in a way that couldn't be done better by a static compiler, except in cases where runtime code loading is used.


Alias analysis is a good example. A JIT compiler may speculatively add dynamic disambiguation guards (p1 != p2 ==> p1[i] cannot alias p2[i]). If the assumption turns out to be wrong, the JIT compiler dynamically attaches a side branch to the guard using the new assumption (p1 == p2 ==> p1[i] == p2[i], which is an even more powerful result).
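To make that concrete, here is a rough C-style sketch of what the speculated code might effectively look like (the function, the names and the loop body are my own illustration, not taken from any particular JIT):

    /* Source loop being compiled (hypothetical example): */
    void scale_add(double *p1, double *p2, int n, double k) {
        for (int i = 0; i < n; i++)
            p1[i] = p1[i] + p2[i] * k;
    }

    /* What the JIT might effectively emit after speculation: */
    void scale_add_speculated(double *p1, double *p2, int n, double k) {
        if (p1 != p2) {
            /* Guard held: p1[i] and p2[i] are assumed not to alias, so loads
               can be hoisted, reordered and vectorized freely. */
            for (int i = 0; i < n; i++)
                p1[i] = p1[i] + p2[i] * k;
        } else {
            /* Side branch attached when the guard fails: p1 == p2, so both
               loads are the same value and the body simplifies. */
            for (int i = 0; i < n; i++)
                p1[i] = p1[i] * (1.0 + k);
        }
    }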

Doing this in a static compiler is hard, because it would have to compile both paths for every such disambiguation possibility. This quickly leads to a code explosion. You'd need very, very smart static PGO to cover this case: there are no branch probabilities to measure, since the compiler doesn't know that inserting such a branch might be beneficial. It could only derive this by running PGO on code that already contains these branches, which leads back to the code explosion.

Auto-vectorization is another example: a static compiler may have to cover all possible alignments for N input vectors and M output vectors. This can get very expensive, so most static compilers simply don't do it and generate slower, generic code. A JIT compiler can specialize to the runtime alignment and even compile secondary branches in case the alignment changes later on (e.g. a filter fed with different kernel lengths at runtime).
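To make the alignment point concrete, here is a rough sketch (my own illustration, using SSE intrinsics) of the kind of dispatch that has to exist when the alignment isn't known statically; a JIT only needs to compile the variant it actually observes:

    #include <immintrin.h>
    #include <stdint.h>

    /* out[i] = a[i] + b[i]; n is assumed to be a multiple of 4 for brevity. */
    void add_f32(float *out, const float *a, const float *b, int n) {
        int aligned = (((uintptr_t)out | (uintptr_t)a | (uintptr_t)b) & 15) == 0;
        if (aligned) {
            /* All three pointers are 16-byte aligned: use aligned loads/stores. */
            for (int i = 0; i < n; i += 4) {
                __m128 va = _mm_load_ps(a + i);
                __m128 vb = _mm_load_ps(b + i);
                _mm_store_ps(out + i, _mm_add_ps(va, vb));
            }
        } else {
            /* Generic fallback: unaligned accesses (or peeling, or scalar code).
               A static compiler must carry a path like this for every
               combination it can't rule out; a JIT specializes to the
               alignment it actually sees. */
            for (int i = 0; i < n; i += 4) {
                __m128 va = _mm_loadu_ps(a + i);
                __m128 vb = _mm_loadu_ps(b + i);
                _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
            }
        }
    }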


I agree in general, although I will point out that virtually no C developers use PGO while it's on by default in HotSpot and now V8. (Of course, it looks like Java needs PGO just to try to catch up with gcc -O3.)
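For reference, gcc's PGO workflow is only a couple of extra steps (the flags are real; the file names are just an example):

    # build an instrumented binary, run it on a representative workload,
    # then rebuild using the collected profile
    gcc -O2 -fprofile-generate -o app app.c
    ./app < representative-input
    gcc -O2 -fprofile-use -o app app.c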


Not strictly the same, but take a look at the CPU world:

VLIW designs (where the compiler schedules execution based on static knowledge, e.g. Intel's Itanium) vs. current Intel CPUs (descended from the P3/P4 architectures), which allocate execution resources dynamically based on runtime knowledge.

Runtime information can help compilers. Just look at profile guided optimizations in current static compilers.

The real trouble in JIT compilers is usually that the target language's semantics are very high-level. For example, an integer in C is machine-sized and is never expanded to fit its value, unlike integers in some dynamic languages.
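A trivial illustration of the difference (assuming a 32-bit unsigned int; the comparison to a dynamic language in the comment is my own addition):

    #include <stdio.h>

    int main(void) {
        unsigned int x = 4294967295u;  /* UINT_MAX for a 32-bit unsigned int */
        x = x + 1;                     /* wraps around to 0; the representation never grows */
        printf("%u\n", x);             /* prints 0 */
        /* In a dynamic language such as Python, 4294967295 + 1 is simply
           4294967296: the runtime silently switches to an arbitrary-precision
           integer, which is the kind of high-level semantics a JIT must handle. */
        return 0;
    }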


http://weblogs.java.net/blog/2008/03/30/deep-dive-assembly-c...

This link doesn't quite give what you are after (it's mostly about static compilation in the Java HotSpot compiler), but I believe the lock elision optimizations (http://www.ibm.com/developerworks/java/library/j-jtp10185/in...) have to be done at runtime in the JVM (because of late binding).

Obviously this doesn't totally invalidate your argument ("except in the cases where runtime code loading is used"), but it is worth noting that in many languages late binding is normal, and so this is the general case.

Also, HP's research Dynamo project "inadvertently" became practical. Programs "interpreted" by Dynamo are often faster than if they were run natively. Sometimes by 20% or more. http://arstechnica.com/reviews/1q00/dynamo/dynamo-1.html


This is a common misinterpretation of the Dynamo paper: they compiled their C code at the _lowest_ optimization level and then ran the (suboptimal) machine code through Dynamo. So there was actually something left to optimize.

Think about it this way: a 20% difference isn't unrealistic if you compare -O1 vs. -O3.

But it's completely unrealistic to expect a 20% improvement if you tried the same thing with machine code generated by a modern C compiler at the highest optimization level.



