1. A modern JIT compiler written in Java that takes bytecode and transforms it into machine code. There is a plan that it might someday replace HotSpot [1]. However, we are probably a couple of years away from this.
2. A native image compiler [2] that uses ahead-of-time compilation technology to produce executable binaries of class files. This means startup times and memory usage similar to a language such as Go.
3. A framework for building abstract syntax tree interpreters, called Truffle, which allows you to easily implement languages on top of GraalVM, with the performance of compiled languages but using an interpreter. You can read more about this here [3].
There are also other features, such as an LLVM bitcode engine called Sulong, and various polyglot functionality to support integration of whatever language you want [4].
The 'party trick' combination of (2) and (3) is that you can take a Ruby application, and all of the C extensions it's using, compile them all AOT, and then run whole-program optimisation. It was kinda/sorta working a couple of years ago, I believe; I don't know what the progress has been since.
I read that as Futurama, and when I clicked through I was expecting to see a picture of Fry or Hypnotoad as some time-travel concept as inspiration. Even better: when I clicked through I read it the first time and thought “wow, cool, someone’s last name is Futurama”, then I read it again haha.
Because you wrote Futamura and referred to it so many times in this reply, when I skimmed the thread I read the word in this post and the one above it as Futurama. It took me like 3 re-reads to realize the original did NOT say Futurama.
Made a short explainer video on how they're using Futamura's ideas of "partial evaluation" to essentially generate an optimizing compiler from just an interpreter.
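For a feel of what partial evaluation means here, consider a toy sketch (plain Java, nothing GraalVM-specific; all names invented): if you freeze the program argument of an interpreter and aggressively constant-fold, the leftover "residual" code is effectively compiled output. That's Futamura's first projection.

```java
// Toy illustration of partial evaluation; not real GraalVM code.
public class FutamuraToy {
    // A tiny AST: an expression is either a constant or an addition.
    sealed interface Expr permits Const, Add {}
    record Const(int value) implements Expr {}
    record Add(Expr left, Expr right) implements Expr {}

    // The generic interpreter: re-walks the tree on every run.
    static int interpret(Expr e) {
        if (e instanceof Const c) return c.value();
        Add a = (Add) e;
        return interpret(a.left()) + interpret(a.right());
    }

    public static void main(String[] args) {
        Expr program = new Add(new Const(40), new Const(2)); // "40 + 2"
        System.out.println(interpret(program)); // 42, via the tree walk

        // Partially evaluating interpret() with `program` held constant
        // folds the tree walk away entirely; the residual code is just:
        System.out.println(40 + 2); // what the "generated compiler" would emit
    }
}
```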
This is a good example of where GraalVM's native image is interesting. This is a CLI tool that lets Clojure developers create scripts that are as fast as bash scripts. Clojure may well not be your bag, but stay with me - we're talking about CLIs or TUIs written in Java (or a Java-based language) that are easy to distribute as statically compiled binaries.
I have a hard time being deeply invested in GraalVM, considering that the most interesting stuff tends to be kept proprietary (e.g. profile-guided optimizations).
That's the worrying thing to me. Oracle's business model seems to be to keep prices reasonable just as long as it takes to build up a customer base, and then start squeezing.
I'm fascinated by GraalVM, but I'm hesitant to take even one step down a path that leads to Oracle using my actual software implementation platform to shake me down for money. I'd actually rather take the development effort and performance hit on using gRPC or similar to get my Java and Ruby modules talking to each other than to get locked into a potentially weaponized version of the Java platform.
Realistically, this isn't super new. After what happened to Dalvik 10 years back, and later to Adopt, one could argue that the Java Community Process itself, and therefore the very soul of Java, has been weaponized.
If anything, their biggest commercial error may be open sourcing too much. GraalVM EE is quite expensive for what it adds over the open source versions.
Basically more performance, plus managed Sulong (which blocks memory-management errors in C/C++ you run on the JVM), and, for native AOT-compiled images, the G1 garbage collector (which is itself open source, but the integration isn't).
The profile-guided optimization is a pretty big deal if you're concerned about performance. Based on the benchmarks I've seen, GraalVM Community tends to outperform JIT-compiled Java for applications where warm-up time is a significant portion of total run time, but JIT-compiled Java tends to outperform Graal, sometimes by a significant margin, for anything long-lived. Profile-guided optimization would theoretically close that gap.
Depends on how far people doing Hackintosh could be considered customers, and there are plenty of examples of Apple lawyers visiting the courtroom, but don't let that get in the way of hating Oracle.
GPL is nice, but it seems it's a constant uphill battle - maybe after the GPLed thing becomes the standard things become slightly easier (economically upstreaming becomes cheaper than maintaining a fork).
Can someone please give a sober explanation of what GraalVM is, and how Truffle works?
I think it is somewhat analogous to LLVM: Graal exposes a "language agnostic" API which allows multiple front-ends. There are front-ends for Ruby, JS, C++, and others. But this API is higher level than LLVM's: it knows about objects, and allows tricks like querying properties across language boundaries.
But I can't imagine what this API looks like! For example, graaljs exists: how do JS semantics get expressed in a Java VM? What does a getProperty instruction look like? How does it know when to invalidate inline caches?
> Can someone please give a sober explanation of what GraalVM is, and how Truffle works?
GraalVM is basically a JVM with a new compiler, the Graal compiler.
Truffle is the language implementation framework for GraalVM, designed for building AST interpreters.
The Graal compiler can optimize Truffle ASTs through partial evaluation (see "One VM to rule them all" paper [1]), and produces machine code directly without using Java bytecode as IR.
> how do JS semantics get expressed in a Java VM?
Graal.js comes with a parser for JavaScript code that generates a Truffle AST for a given JavaScript program. JS semantics are defined within the AST nodes (see [2]).
> What does a getProperty instruction look like?
I think getProperty is implemented in `CachedGetPropertyNode` [3], but I am not sure as there are multiple other property-related nodes.
> How does it know when to invalidate inline caches?
Unfortunately, Graal.js lacks documentation for `CachedGetPropertyNode`. But I encourage you to have a look at SimpleLanguage [4], a JS-like toy language and the reference language implementation for Truffle with proper documentation. [5] explains how reading properties and invalidating inline caches works and it's all done using the Truffle DSL. The `@Specialization` annotation [6] might be a good starting point if you want to learn how it works. You may also want to check out the docs on the GraalVM website (e.g. [7]).
You express JS semantics by writing an interpreter for JS in Java. You can't write it in any way you want. Truffle provides a class library for the construction of interpreters, in which you define nodes in a tree. Typically this will be something like the abstract syntax tree of your language but it doesn't have to be. Each node is a class you define, with an execute method (handwaving away some details here).
At the start, you parse source or binary code into a tree of these node objects and then call execute() on one of the root nodes. For example each method or function in your language might be an independent tree, and the root node would be the node at the start of the function. Execute then calls the execute methods of the sub nodes and combines the results together, as per any basic interpreter.
The node objects have fields that contain information about the program, for example:
"return a + 5"
might turn into 4 nodes: a return node, a + node, a variable read node, and a constant numeric node which has a field containing 5.
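In plain Java that tree could look something like the sketch below (this is not the real Truffle API; the class names are just illustrative):

```java
// Illustrative only: a hand-rolled node tree for "return a + 5".
abstract class Node {
    abstract Object execute(Object[] locals);
}

class ConstantNode extends Node {
    final int value;                                 // e.g. 5
    ConstantNode(int value) { this.value = value; }
    Object execute(Object[] locals) { return value; }
}

class ReadLocalNode extends Node {
    final int slot;                                  // e.g. the slot holding `a`
    ReadLocalNode(int slot) { this.slot = slot; }
    Object execute(Object[] locals) { return locals[slot]; }
}

class AddNode extends Node {
    final Node left, right;
    AddNode(Node left, Node right) { this.left = left; this.right = right; }
    Object execute(Object[] locals) {
        // Combine the results of the sub-nodes, as per any basic interpreter.
        return (int) left.execute(locals) + (int) right.execute(locals);
    }
}

class ReturnNode extends Node {
    final Node value;
    ReturnNode(Node value) { this.value = value; }
    Object execute(Object[] locals) { return value.execute(locals); }
}

// "return a + 5" becomes:
// new ReturnNode(new AddNode(new ReadLocalNode(0), new ConstantNode(5)))
```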
The Truffle class library contains code that measures how often a root node is invoked. After a while some roots will get hot because they get invoked a lot. This method has become a hot spot.
What happens then is the Graal compiler starts compiling the execute method of the root node. It compiles in a different mode to how Java methods are normally compiled:
1. Any time the code reads a field, the compiler pretends it's a constant if it's been annotated with @CompilationFinal. Even if the field is a mutable variable, the compiler acts as if it's not, and will read the value of the field as a constant. This can then trigger constant folding and further optimisations.
2. Any method call is inlined. This proceeds recursively until everything is inlined, stopping only at methods marked with @TruffleBoundary. The compiler ends up with a single huge method representing the entire interpreter contents of the guest language method. Any calls past a @TruffleBoundary are in effect calls into the language runtime.
Once this is done, Graal starts optimising. After inlining the method may be enormous; however, a huge amount of the code in the interpreter can be removed by these optimisations.
Dynamic languages have behaviour too complex to fully compile to native code. The amount of code required would end up being enormous and very slow, as it'd constantly need to look up basic things, like whether you redefined what + means. Therefore Truffle supports a variety of techniques to make them run faster.
One is an Assumption object. You can create these in your nodes and check them in your execute methods. It's a boolean flag. When compiling, the JITC assumes the assumption is true, and deletes any code that would have been called if it were false. However, your execute methods can call a special method on an Assumption to set it to false. When that happens, HotSpot will de-optimise all the compiled methods and force them back to your Java interpreter.
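With the real Truffle API, the pattern looks roughly like this (the "plus not redefined" scenario and the class are invented for illustration):

```java
import com.oracle.truffle.api.Assumption;
import com.oracle.truffle.api.Truffle;
import com.oracle.truffle.api.nodes.InvalidAssumptionException;

// Illustrative sketch of the Assumption pattern.
class PlusDispatch {
    static final Assumption PLUS_UNCHANGED =
            Truffle.getRuntime().createAssumption("plus not redefined");

    int executeAdd(int a, int b) {
        try {
            PLUS_UNCHANGED.check();  // in compiled code this folds away to nothing
            return a + b;            // fast path, compiled as if it's the only path
        } catch (InvalidAssumptionException e) {
            return slowPathLookupAndCall(a, b); // only reached after a de-opt
        }
    }

    void redefinePlus() {
        PLUS_UNCHANGED.invalidate(); // de-optimises every compiled method that checked it
    }

    private int slowPathLookupAndCall(int a, int b) {
        return a + b; // stand-in for a full method lookup in the language runtime
    }
}
```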
Another is the transferToInterpreter() method. It's special: it says that any code that could execute past the point where this method is called should not be compiled. It means you can keep code that handles obscure cases out of the compiled method. Again, if the compiled method would end up executing a transfer, a de-opt happens.
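A rough sketch of how the transfer and @CompilationFinal pieces combine (the node is invented; transferToInterpreterAndInvalidate() is the variant that also throws away the compiled code, so the field's new value is baked in on recompilation):

```java
import com.oracle.truffle.api.CompilerDirectives;
import com.oracle.truffle.api.CompilerDirectives.CompilationFinal;

// Illustrative node: the rare branch is kept out of the compiled code.
class AbsNode {
    @CompilationFinal private boolean seenNegative = false;

    int executeAbs(int value) {
        if (value < 0 && !seenNegative) {
            // Nothing past this call is compiled; hitting it de-optimises,
            // and recompilation treats the updated field as a constant.
            CompilerDirectives.transferToInterpreterAndInvalidate();
            seenNegative = true;
        }
        return value < 0 ? -value : value;
    }
}
```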
Another is specialisation. Whilst your interpreter executes, it's allowed to change the node objects in the trees. For example, your interpreter may observe that at that point in the code you only ever add numbers together, not numbers and strings, and swap in a numbers-only node. The node can do a type check and, if it fails, de-optimise and swap itself back to a slower but more generic node. Truffle has a thing it calls a "DSL" (really it's a bunch of Java annotations) that automates this whole process for you, so you can define a template node class with a whole bunch of different execute methods. It then generates all the actual node classes and the code to do the type checks and swapping behind the scenes, as in the sketch below.
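As a sketch of what that DSL looks like (modelled loosely on SimpleLanguage's add node, with the child-node and execute-method plumbing omitted, so this won't build as-is without them), you write a template class and the annotation processor generates the concrete specialised nodes and the swapping logic:

```java
import com.oracle.truffle.api.CompilerDirectives.TruffleBoundary;
import com.oracle.truffle.api.dsl.Specialization;
import com.oracle.truffle.api.nodes.Node;

// Template node for the Truffle DSL; concrete subclasses are generated.
abstract class AddTemplateNode extends Node {

    // Tried first. If the addition overflows, the DSL de-optimises and
    // rewrites this node to a more generic specialisation.
    @Specialization(rewriteOn = ArithmeticException.class)
    protected long addLongs(long left, long right) {
        return Math.addExact(left, right);
    }

    // Chosen once string operands have been observed at this point in the code.
    @Specialization
    @TruffleBoundary
    protected String addStrings(String left, String right) {
        return left + right;
    }
}
```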
There's lots more in Truffle to do with supporting debuggers, profilers, language interop etc, but that's the gist of it.
All these techniques added together give you a high level API for building HotSpot or V8 style advanced speculating JITCs, with little more than a specially written interpreter. It's not entirely automatic, but it's far easier than any other framework out there.
As an engineer who lives and breathes Java, both in the context of high-performance servers and Android apps, can someone explain what this does for me besides quick startup times through the native image feature?
What are the benefits of this?
Java’s linker is based on an open world assumption, where you can add arbitrary stuff to the classpath and dynamically link it at runtime. This defeats most compiler optimizations, so you’re left with the JIT.
GraalVM makes a closed world assumption, so it can do things like dead code elimination (for library functions that don’t get called), and apply static analysis to inline virtual method calls, perform constant propagation, and so on.
This lets it greatly outperform the JVM, at the expense of some rarely used functionality.
Basically, it’s like switching from the JIT to a C compiler’s -O3.
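A toy example of the closed-world win (all names invented): if the build can prove that an interface has exactly one implementation anywhere in the program, the virtual call can be devirtualised, inlined, and constant-folded, something the open-world JVM can only do speculatively.

```java
// Illustrative only.
interface Greeter {
    String greet();
}

final class EnglishGreeter implements Greeter {
    @Override
    public String greet() { return "hi"; }
}

public class Main {
    public static void main(String[] args) {
        Greeter greeter = new EnglishGreeter();
        // Closed world: provably always EnglishGreeter.greet(), so this can
        // collapse to System.out.println("hi") at build time. Open world: a
        // class added to the classpath later could invalidate that proof.
        System.out.println(greeter.greet());
    }
}
```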
GraalVM is a JIT compiler for the most part. That's the one Twitter uses, I believe. SubstrateVM is what lets you perform a closed-world, ahead-of-time build. And that does indeed run less performantly most of the time, but it uses less memory and starts faster.
> This lets it greatly outperform the JVM, at the expense of some rarely used functionality.
Outperform in what metric? Startup time? Granted. Anything else? Not so much.
>Basically, it’s like switching from the JIT to a C compiler’s -O3.
Yeah, and that would be pretty bad (it's not a good analogy to begin with). "-O3" doesn't have anything that a JIT couldn't have. The only advantage is, again, startup time. JIT compilation otherwise has only advantages over static compilation, especially in highly polymorphic code such as Java's. Static compilation for polymorphic code is a joke in terms of performance... and every C++ programmer should know that.
I didn't look at the spec, but I would assume that AOT covers only the bootup, and then the JIT will take over anyway. This means they'd do some basic precompilation for fast bootup and then use the JIT again to optimize the code further based on runtime analysis. Everything else would be a ridiculous step backwards in time and would make AOT completely useless, except for some niche scenarios.
As far as I know, Truffle is able to do some absolutely incredible compile-time optimizations, reaching deeply into what we would normally regard as strictly semantic territory.
(For an example, see some of the optimizations done by TruffleRuby for things like `myArray.sort.first` - which it apparently optimizes by terminating the sort as soon as the first element is sorted to the front of the array... and all that without any special hints in the standard library. Please correct me if I’m wrong... it’s been a few years since I’ve read in depth about TruffleRuby. And granted this example isn’t Java, but I imagine there are great parallels there.)
> Startup time? Granted. Anything else? Not so much.
Startup time matters a lot, especially for Java applications. The reason Java never got to the desktop (including the browser), I believe, was the startup time.
Startup time matters a lot also for micro-services.
A 2nd great benefit of GraalVM I think is it makes it easy to integrate programs written in different languages, say Node.js and Java for instance.
It was always very easy: just package the JRE with the application, use an installer like any other desktop application, or buy one of those commercial AOT compilers available since the start of the century.
There's also an expectation that memory footprint will be lower and that can contribute to speed by reducing GC time. The "JIT can do it too" arguments generally don't consider bounded memory.
You can compile your Java code into an executable which doesn't require Java to be present. Advantages would be reduced memory footprint and quicker startup times, which could be ideal for microservices started on demand instead of constantly running.
That being said, the people at Spring don't recommend it for production use yet. Here's what they have to say about it:
"While GraalVM is now GA, GraalVM native image feature which allows ahead-of-time compilation of Java applications into executable images is only available as an early adopter plugin, so we don't consider it production ready yet."
As a counter-point, Quarkus [1] has mature support for building Java apps as native binaries via GraalVM, with a large number of extensions that enable 3rd party libraries to be used in native binaries, and people use this successfully in production. Issues from the past, like lack of debug symbol support with GraalVM CE, have also been resolved by now.
Shameless plug for one real-world usage: the search feature of my personal blog is built as a GraalVM native binary, running as a serverless app on AWS Lambda [3].
Disclaimer: I work for Red Hat, who sponsor the development of Quarkus
While what Quarkus promises sounds delightful, the actual dev experience is still not something I'd recommend. The amount of libraries one can use is still very limited. The compiler errors, when one chooses to step outside of the approved list of supported libraries, are very cryptic (or non-existent). In my case, I tried to use Apache FreeMarker for templating. I could've tried Quarkus' own templating library, but it didn't exist at that time. What was worse, I had to shut down nearly every app on my 16 GB, 6-core machine to compile the image! If I didn't do that, compilation would fail with no useful info. The Quarkus team does a wonderful job of demoing a Hello World example; I was just unable to achieve the same success. I love what they're trying to build, so I'll check back with regularity. Our one Quarkus app is using OpenJDK 1.8 for the time being.
Agreed that the experience when trying to native-enable existing libs isn't the greatest. Although those things are reported by the GraalVM compiler rather than Quarkus. The working model and assumption is that somebody goes through this once and then either contributes changes to the library in question back upstream or provides a Quarkus extension for that library, sparing others from that hassle.
That RAM consumption definitely sounds over the top; if you still have the context, logging an issue would be very welcome. That said, there are many libraries enabled by Quarkus (see quarkus.io/guides/), so every essential piece of functionality should be covered by now. It's still a question, of course, whether your specific library in a given space (like FreeMarker vs. Quarkus Qute) is already supported. In any case, thanks for checking back regularly; things might look better next time already, as the framework evolves rapidly.
Note that Quarkus starts very fast on Java 14 too. With JDK 14 their hello world app starts in about 600msec on my laptop. HotSpot got optimised a lot over the years and Quarkus gets some of its speed boost by just doing less dynamic stuff at startup.
Static linking is back in fashion, but it means many stale copies of the guts of Java are present, where before we had just one complete and managed copy. It's very similar to bundling your own libc when pretty much every system already has one.
You're not wrong, but with Java evolving rapidly, we're in a situation where there are dozens of commonly used JVM versions in the wild... being able to ship the most recent JVM (which can be as little as 25MB depending on which JDK modules you depend on) with your application not only allows you to benefit from using modern language features, but it also makes your application safer by using the latest runtime with the most up-to-date security fixes, not to mention performance and testing becomes a lot easier when you only have one JVM version to worry about. Also, you can have both options. I have a somewhat popular Java app that I distribute as a tiny jar (300KB, requires Java 9+) and as a stand-alone app (packaged with jlink, around 35MB including JDK 11) - and users can choose which one they want.
If I understand correctly, the difference between jpackage and GraalVM is this:
Jpackage bundles a small Java VM (with only the features you use) together with your compiled bytecode into a single executable. When it runs it starts up the VM and executes bytecode on that VM exactly the same as it would be if you were to run a jar on a preexisting JDK/JRE installation.
Graal compiles your full app ahead of time into a native code binary. There is no bytecode/translation happening at runtime. That is why GraalVM advertises faster startup and lower memory footprint.
So there are essentially 3 ways to run JVM (Java/Scala/Kotlin/...) code:
* compile into bytecode jar -> requires existing VM runtime
* compile into bytecode jar + bundle VM runtime -> no dependencies required, runs as the previous option
* compile into native binary -> no dependencies required, runs native, starts and runs faster
> Jpackage bundles a small Java VM (with only the features you use) together with your compiled bytecode into a single executable. When it runs it starts up the VM and executes bytecode on that VM exactly the same as it would be if you were to run a jar on a preexisting JDK/JRE installation.
This is incorrect. Jpackage creates an installer which will unpack the VM image and any other resources.
I've tried it with a small CLI tool and it actually doesn't work that well (at least on Windows).
It's actually jlink that is already available and creates a minimal JVM with your application code that you can distribute as OS-specific bundles... jpackage, which is still in beta (and doesn't work very well, I've tried it recently), will take the output of jlink and turn that into some common OS-specific packages, like RPM, Debian and dmg.
I believe because dead code elimination isn't currently possible with JVM, "package to binary" would mean including the whole (or most of) JVM with your program.
It's amazing how badly certain companies consistently put their feet in their mouths and cannot even give a basic explanation of what their products do.
To compound the inanity - Graal is also the name of a new compiler used in the newer JVMs, in addition to the name of a completely different kind of 'JRE/SDK'.
GraalVM allows for polyglot execution of a number of languages: Java, JavaScript, Python, and anything compiled to LLVM bitcode.
It runs them all 'side by side' so there's no translation barrier when interacting between languages.
GraalVM also comes with a native compiler that allows you to have much faster startup time, though it comes at the cost of not getting more advanced runtime optimisations.
As far as 'performance' goes, I don't think there is anything fundamentally different from the newer JVMs.
GraalVM is “open core”: the community edition is free as in freedom, but the enterprise version (which you need for any serious production usage) is commercial and “Oracle expensive”.
It’s a bit of a shame, because it’s really nice tech, but it will likely continue to struggle for adoption as long as it’s owned by Oracle. I wish they would spin it off; the same product, with more or less the same commercial model, would probably work very well in the market, as long as it lived far away from the toxic Oracle-licensing nightmare.
I'm happy with language interoperability as a concept, but how can I mix Java and JS in the same codebase with GraalVM? And would it make sense? What are the advantages of this approach?
I said "currently", because you currently have to use the host Java, that is the Java on top of which all other languages are implemented. The GraalVM team is working on Project Espresso, a Java written in Truffle. That will make polyglot programming with Java much more consistent with the rest of the languages. Project Espresso isn't public yet, but here's the latest update from the team working on it:
You can use it for doing server side rendering of React or other JS framework components. I have a VERY rough repository[0] where I tested this out a while back. The way I did it was to export functions on the global object in JS[1] which could then easily be called from Java code[2]. I'm sure there's a better way to accomplish the same thing.
What I found interesting with this approach is that it allows you to do server side rendering without needing a separate server and without needing any third party dependencies aside from GraalVM itself.
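For reference, a minimal sketch of that approach with the polyglot API; the renderApp function here is a hypothetical stand-in for a real JS bundle:

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Value;

public class ServerSideRender {
    public static void main(String[] args) {
        try (Context context = Context.create("js")) {
            // Stand-in for loading the real JS bundle, which would define
            // a render function in the global scope.
            context.eval("js",
                "var renderApp = function(props) { return '<div>' + props + '</div>'; };");

            // Look the function up in the JS global bindings and call it from Java.
            Value renderApp = context.getBindings("js").getMember("renderApp");
            String html = renderApp.execute("hello").asString();
            System.out.println(html); // prints: <div>hello</div>
        }
    }
}
```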
Instead of compiling Java to Java bytecode, which is then executed by the JVM (e.g. HotSpot), you compile it to native machine code/assembly plus a runtime (Substrate VM). However, you lose the cross-platform ability, and Java reflection won't work out of the box.
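For example, reflective lookups have to be declared to the native-image builder at build time; a minimal sketch (the class name is made up):

```java
// On HotSpot this just works; in a native image the class must be
// registered for reflection (e.g. via a reflect-config.json passed to
// the native-image tool), or the lookup fails at run time because the
// closed-world analysis never saw it.
public class ReflectionExample {
    public static void main(String[] args) throws Exception {
        Class<?> clazz = Class.forName("com.example.PluginImpl"); // hypothetical class
        Object plugin = clazz.getDeclaredConstructor().newInstance();
        System.out.println(plugin);
    }
}
```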
But isn't it more than Java? You can run other languages like Node.js and Python besides Java. So what I'm asking for would be an IDE where I can easily program calls from Java to Node.js or vice versa, for example. Or some other poly-language IDE, if such exists.
[1] https://jaxenter.com/openjdk-project-metropolis-137318.html
[2] https://www.graalvm.org/docs/reference-manual/native-image/
[3] https://www.beyondjava.net/truffle-compiler-compiler
[4] https://github.com/oracle/graal/blob/master/sulong/README.md