44 languages 42 languages compile to Machine Code Wait, what? There are 2 langua...

chrisseaton · on July 6, 2018

> how is it even possible?

Instead of compiling to machine code, you compile to another language instead. C++ was originally compiled to C, for example.

Why would you think it wasn't possible?

gnulinux · on July 6, 2018

Compilation is the act of generating code for some back end. Most languages compile to machine code, but you could also compile to some other language, like C, Haskell, or JavaScript, that you know has a very well-optimized compiler. Say you generate C code, and do it well: your language can then be almost as fast as C.
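
To make "compile to another language" concrete, here is a minimal sketch of a toy compiler whose output is C source text. The names `ToyToC`, `Expr`, `Num`, `Add` and `Mul` are made up for illustration and are not taken from any real compiler.

```java
// Toy "compiler" whose target is C source text rather than machine code.
// All class names here are hypothetical, chosen only for this sketch.
final class ToyToC {
    // Wraps the translated expression in a complete C program.
    static String compile(Expr program) {
        return "#include <stdio.h>\n"
             + "int main(void) {\n"
             + "    printf(\"%d\\n\", " + program.toC() + ");\n"
             + "    return 0;\n"
             + "}\n";
    }

    public static void main(String[] args) {
        // Source program: 1 + 2 * 3
        Expr program = new Add(new Num(1), new Mul(new Num(2), new Num(3)));
        System.out.print(compile(program));
    }
}

// Every node of the source language knows how to print itself as C.
interface Expr {
    String toC();
}

final class Num implements Expr {
    private final int value;
    Num(int value) { this.value = value; }
    public String toC() { return Integer.toString(value); }
}

final class Add implements Expr {
    private final Expr left;
    private final Expr right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
    public String toC() { return "(" + left.toC() + " + " + right.toC() + ")"; }
}

final class Mul implements Expr {
    private final Expr left;
    private final Expr right;
    Mul(Expr left, Expr right) { this.left = left; this.right = right; }
    public String toC() { return "(" + left.toC() + " * " + right.toC() + ")"; }
}
```

The emitted text can be handed to any C compiler, which is where the optimization and the machine code generation then happen.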

koonsolo · on July 6, 2018

Haxe basically compiles to any other language. So why would it need to go straight to Machine Code?

It is super-multiplatform.

DannyB2 · on July 6, 2018

Java compiles to bytecode, which is not machine code. Once the bytecode is executed on the target platform's runtime, it is then compiled down to machine code.

But there's more to it than that. The bytecode is actually interpreted at first by the JVM runtime. The code is also continuously and dynamically profiled. There are two compilers, C1 and C2.

Whatever functions are using the most CPU time get compiled using C1. C1 rapidly compiles to poorly optimized code, but this is still a big speedup over the bytecode interpreter. The function is also scheduled to be compiled again in the near future using the C2 compiler. The C2 compiler spends a lot of time compiling, optimizing and aggressively inlining.
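
One rough way to watch this from the outside is to time the same hot method in repeated batches. The sketch below assumes a HotSpot-style JVM; `WarmUp`, `sumOfSquares` and `timeBatch` are hypothetical names, and the timings it prints will vary by machine and JVM, so treat it as an illustration rather than a benchmark.

```java
// Times repeated batches of calls to one small, CPU-bound method.
// On a JIT-compiling JVM the early batches are often slower than later ones.
final class WarmUp {
    // The "hot" method: cheap per call, but called many times.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Runs one batch of calls and returns the elapsed time in nanoseconds.
    static long timeBatch(int calls, int n) {
        long sink = 0;
        long start = System.nanoTime();
        for (int c = 0; c < calls; c++) {
            sink += sumOfSquares(n);
        }
        long elapsed = System.nanoTime() - start;
        // Use the result so the work cannot be discarded as dead code.
        if (sink == 42) {
            System.out.println("unlikely");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        for (int round = 1; round <= 10; round++) {
            long nanos = timeBatch(20_000, 1_000);
            System.out.println("round " + round + ": " + (nanos / 1_000) + " us");
        }
    }
}
```

On HotSpot, running with `-XX:+PrintCompilation` logs methods as they are compiled, and `-Xint` forces interpreter-only execution for comparison.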

But there's more. C2 can optimize its compile for the exact target instruction set, plus extensions, of the actual hardware it is running on at the moment. An ahead-of-time C compiler usually cannot do that. A binary that gets distributed needs to contain x86-64 code that runs on a large variety of processors.

But there's more. The C2 compiler can optimize based on the entire global program. Suppose a function call from one author's library to another author's library can be optimized in some way by writing a different version of that function. C2 can take advantage of this and do it where a C compiler cannot, because the C compiler doesn't know anything about the insides of the other library it is calling -- which might be rewritten tomorrow, or might not be written yet. Once the Java program is started, the C2 compiler can see all parts of the running program and optimize as needed.

But there's more. Suppose YOUR function X calls MY function Y. If your function X is using much CPU, it gets compiled to machine code by C1, and then in a short time gets recompiled again by C2. The C2 compiler might inline my Y function into your X function. Now suppose the class containing my Y function gets dynamically reloaded. Your X function now has a stale inlined version of my Y function. So the JVM runtime changes your X function back to being bytecode interpreted once again. If your X function is still using a lot of CPU, then it gets compiled again by C1, and then in a while, by C2.
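
Class reloading is awkward to show in a few lines, so here is a sketch of a related trigger that is easier to reproduce: hot code is compiled while only one implementation of an interface is loaded, and then a second implementation shows up. `DeoptDemo`, `Shape`, `Square` and `Circle` are hypothetical names, and whether a given JVM actually inlines and later discards the compiled code here is an assumption that depends on the JVM version and flags.

```java
// Warm up a call site with a single implementation, then introduce another.
final class DeoptDemo {
    interface Shape {
        double area();
    }

    static final class Square implements Shape {
        private final double side;
        Square(double side) { this.side = side; }
        public double area() { return side * side; }
    }

    // Not loaded until first use, which happens after the warm-up loop.
    static final class Circle implements Shape {
        private final double radius;
        Circle(double radius) { this.radius = radius; }
        public double area() { return Math.PI * radius * radius; }
    }

    // The "X" of the discussion: it calls area(), the "Y".
    static double totalArea(Shape[] shapes) {
        double total = 0.0;
        for (Shape s : shapes) {
            total += s.area();
        }
        return total;
    }

    public static void main(String[] args) {
        Shape[] squares = new Shape[100];
        for (int i = 0; i < squares.length; i++) {
            squares[i] = new Square(1.0);
        }
        // Phase 1: only Square exists, so the JIT may inline Square.area().
        double sum = 0.0;
        for (int i = 0; i < 20_000; i++) {
            sum += totalArea(squares);
        }
        // Phase 2: Circle is loaded, so any such assumption no longer holds.
        Shape[] mixed = { new Square(1.0), new Circle(1.0) };
        sum += totalArea(mixed);
        System.out.println(sum);
    }
}
```

On HotSpot, `-XX:+PrintCompilation` marks invalidated compiled methods as "made not entrant", which is the point where execution drops back to the interpreter.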

All this happens in a garbage collected runtime platform.

This is why Java programs start up, but then take a few minutes to "warm up" before they start running fast. Many Java workloads are long-running servers, so startup is infrequent.

Now you know why Java can run fast for only six times the amount of memory as a C program.

pjmlp · on July 6, 2018

This describes how the HotSpot JVM would execute a Java application.

There are other ways to execute Java applications.

DannyB2 · on July 6, 2018

That is true.

progval · on July 6, 2018

There's ActionScript, which can only be interpreted.

auscompgeek · on July 6, 2018

Hm. I thought Flash has bytecode rather than ActionScript being plain interpreted.

larsiusprime · on July 6, 2018

You are correct. ActionScript compiles to ABC (ActionScript ByteCode) which is run by the AVM (ActionScript Virtual Machine).
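
For anyone unsure what "bytecode run by a virtual machine" means in practice, here is a minimal sketch of a stack-based bytecode interpreter. The opcodes and the `TinyVm` class are invented for this example; they are not real ABC or JVM instructions.

```java
// A tiny stack machine: the program is an int array of opcodes and operands.
final class TinyVm {
    // Invented opcodes for illustration only.
    static final int PUSH = 0; // push the next int in the code array
    static final int ADD = 1;  // pop two values, push their sum
    static final int MUL = 2;  // pop two values, push their product
    static final int HALT = 3; // stop and return the top of the stack

    static int run(int[] code) {
        int[] stack = new int[16];
        int sp = 0; // index of the next free stack slot
        int pc = 0; // index of the next instruction
        while (true) {
            int op = code[pc++];
            switch (op) {
                case PUSH:
                    stack[sp++] = code[pc++];
                    break;
                case ADD: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case MUL: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a * b;
                    break;
                }
                case HALT:
                    return stack[sp - 1];
                default:
                    throw new IllegalStateException("unknown opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // Bytecode for 1 + 2 * 3
        int[] program = { PUSH, 1, PUSH, 2, PUSH, 3, MUL, ADD, HALT };
        System.out.println(run(program));
    }
}
```

A compiler front end emits the array; the loop in `run` is the interpreter. A JIT replaces that loop, for hot code, with generated machine code.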

bkul_ · on July 6, 2018

Does it count interpreters?

SideQuark · on July 6, 2018

Interpreted only