You express JS semantics by writing an interpreter for JS in Java. You can't write it in any way you want. Truffle provides a class library for the construction of interpreters, in which you define nodes in a tree. Typically this will be something like the abstract syntax tree of your language but it doesn't have to be. Each node is a class you define, with an execute method (handwaving away some details here).
At the start, you parse source or binary code into a tree of these node objects and then call execute() on one of the root nodes. For example each method or function in your language might be an independent tree, and the root node would be the node at the start of the function. Execute then calls the execute methods of the sub nodes and combines the results together, as per any basic interpreter.
The node objects have fields that contain information about the program, for example:
"return a + 5"
might turn into four nodes: a return node, a + node, a variable-read node and a constant numeric node with a field containing 5.
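To make the node-tree idea concrete, here is a minimal sketch of "return a + 5" as a tree of node objects in plain Java. This is not the real Truffle API (real nodes extend Truffle base classes and execute against a frame object); the class names and the `Map`-as-frame are invented for illustration.

```java
import java.util.Map;

// Each node is a class with an execute method.
abstract class Node {
    abstract int execute(Map<String, Integer> frame);
}

class ConstNode extends Node {
    final int value;                     // field containing the constant, e.g. 5
    ConstNode(int value) { this.value = value; }
    int execute(Map<String, Integer> frame) { return value; }
}

class ReadVarNode extends Node {
    final String name;                   // field containing the variable name, e.g. "a"
    ReadVarNode(String name) { this.name = name; }
    int execute(Map<String, Integer> frame) { return frame.get(name); }
}

class AddNode extends Node {
    final Node left, right;
    AddNode(Node left, Node right) { this.left = left; this.right = right; }
    int execute(Map<String, Integer> frame) {
        // Combine the results of the sub-nodes, as in any basic interpreter.
        return left.execute(frame) + right.execute(frame);
    }
}

class ReturnNode extends Node {
    final Node child;
    ReturnNode(Node child) { this.child = child; }
    int execute(Map<String, Integer> frame) { return child.execute(frame); }
}

class Demo {
    public static void main(String[] args) {
        // "return a + 5" as four nodes
        Node root = new ReturnNode(new AddNode(new ReadVarNode("a"), new ConstNode(5)));
        System.out.println(root.execute(Map.of("a", 4)));   // prints 9
    }
}
```

Calling execute() on the root walks the tree and evaluates the expression, which is exactly the slow baseline that the machinery described below speeds up.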
The Truffle class library contains code that measures how often each root node is invoked. After a while some roots get hot because they are invoked a lot: that guest-language method has become a hot spot.
What happens then is the Graal compiler starts compiling the execute method of the root node. It compiles in a different mode to how Java methods are normally compiled:
1. Any time the code reads a field annotated with @CompilationFinal, the compiler treats its value as a constant. Even though the field is mutable, the compiler acts as if it weren't and reads its current value as a constant. This can then trigger constant folding and further optimisations.
2. Any method call is inlined. This proceeds recursively until everything is inlined, stopping only at methods marked @TruffleBoundary. The compiler ends up with a single huge method representing the entire interpreter contents of the guest-language method. Any calls past a @TruffleBoundary are in effect calls into the language runtime.
Once this is done, Graal starts optimising. After inlining the method may be enormous, but a huge amount of the interpreter code can be removed by these optimisations.
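To see why so much interpreter code melts away, you can do the partial evaluation by hand for "return a + 5": treat every node field (the child links, the constant 5, the variable name "a") as a compile-time constant and inline every execute() call. The whole tree collapses to one small expression. This is only an illustrative sketch of the result, with an invented name and a `Map` standing in for the frame:

```java
import java.util.Map;

class PartialEvalSketch {
    // Hand-done equivalent of what partial evaluation leaves behind for the
    // "return a + 5" tree: no node objects, no virtual dispatch, just the
    // folded expression. The name of this method is made up for illustration.
    static int compiledReturnAPlus5(Map<String, Integer> frame) {
        return frame.get("a") + 5;
    }
}
```

All the tree-walking overhead is gone; only the guest program's actual work remains.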
Dynamic languages have behaviour too complex to fully compile to native code. The amount of code required would end up being enormous and very slow, as it'd constantly need to look up basic things, like whether you redefined what + means. Therefore Truffle supports a variety of techniques to make them run faster.
One is the Assumption object. You can create these in your nodes and check them in your execute methods. It's essentially a boolean flag. When compiling, the JITC assumes the assumption is true and deletes any code that would have run if it were false. However, your execute methods can call a special method on an Assumption to invalidate it. When that happens, HotSpot de-optimises all the affected compiled methods and forces them back to your Java interpreter.
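Here is a toy version of that pattern in plain Java (not the real Truffle Assumption class, and without the actual de-optimisation machinery, which the comment stands in for):

```java
// Toy stand-in for Truffle's Assumption: a flag that fast paths check.
class Assumption {
    private volatile boolean valid = true;
    boolean isValid() { return valid; }
    // In real Truffle, invalidating also de-optimises every compiled
    // method that speculated on this assumption.
    void invalidate() { valid = false; }
}

class PlusNode {
    // Speculation: nobody has redefined what + means.
    static final Assumption plusNotRedefined = new Assumption();

    int execute(int a, int b) {
        if (plusNotRedefined.isValid()) {
            // Fast path. The compiled code would contain ONLY this branch,
            // because the JITC assumes the flag is true and deletes the rest.
            return a + b;
        }
        // Slow path: consult the runtime for a user-redefined '+'.
        return genericAdd(a, b);
    }

    int genericAdd(int a, int b) { return a + b; }
}
```

The point is that the check is free in compiled code: it's folded away entirely, and correctness is preserved by de-optimising when the flag flips.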
Another is the transferToInterpreter() method. It's special: it tells the compiler that any code that could execute past the point of the call should not be compiled. That lets you keep code handling obscure cases out of the compiled method. Again, if the compiled method would end up executing a transfer, a de-opt happens.
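The usage pattern looks roughly like this. In real Truffle the call lives on CompilerDirectives; here it's a do-nothing stand-in, and the node and helper names are invented:

```java
class DivNode {
    int execute(int a, int b) {
        if (b == 0) {
            // Compiled code ends here: the compiler refuses to compile
            // anything reachable past this call, so the rare-case handling
            // below never bloats the compiled method.
            transferToInterpreter();
            return handleDivisionByZero();
        }
        return a / b;   // the only path that actually gets compiled
    }

    // Stand-in for CompilerDirectives.transferToInterpreter(), which in
    // real Truffle triggers a de-opt back to the interpreter.
    void transferToInterpreter() { }

    // Obscure-case handling, kept out of the hot compiled code.
    int handleDivisionByZero() { return 0; }
}
```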
Another is specialisation. Whilst your interpreter executes, it's allowed to change the node objects in the trees, for example because it observes that at that point in the code you only ever add numbers together, never numbers and strings. The node can do a type check and, if it fails, de-optimise and swap itself back to a slower but more generic node. Truffle has what it calls a "DSL", really a bunch of Java annotations, that automates this whole process for you: you define a template node class with a whole bunch of different execute methods, and it generates all the actual node classes and the code to do the type checks and swapping behind the scenes.
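Hand-written, the rewrite the DSL generates for you looks roughly like this. Plain Java again: real Truffle nodes replace themselves in the tree via the framework, which the `replacement` field stands in for, and all names here are invented:

```java
abstract class AddNode {
    abstract Object execute(Object left, Object right);
}

// Specialised node: speculates that both operands are always Integers.
class IntAddNode extends AddNode {
    AddNode replacement = this;   // stand-in for Truffle's in-tree node rewriting

    Object execute(Object left, Object right) {
        if (left instanceof Integer && right instanceof Integer) {
            return (Integer) left + (Integer) right;   // fast, fully compilable path
        }
        // Speculation failed: de-optimise and rewrite to the generic node,
        // which handles this and all future calls.
        replacement = new GenericAddNode();
        return replacement.execute(left, right);
    }
}

// Generic node: slower, but handles any operand combination,
// e.g. JS-style string concatenation.
class GenericAddNode extends AddNode {
    Object execute(Object left, Object right) {
        if (left instanceof Integer && right instanceof Integer) {
            return (Integer) left + (Integer) right;
        }
        return String.valueOf(left) + String.valueOf(right);
    }
}
```

With the DSL you'd instead write one template class with an `execute(int, int)` and an `execute(Object, Object)` method plus annotations, and this check-and-swap plumbing is generated for you.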
There's lots more in Truffle to do with supporting debuggers, profilers, language interop etc, but that's the gist of it.
All these techniques added together give you a high level API for building HotSpot or V8 style advanced speculating JITCs, with little more than a specially written interpreter. It's not entirely automatic, but it's far easier than any other framework out there.