> And actually we’re now entering a world where at minimum supporting LLVM as a backend option is mandatory.
I would have said this too a few years ago, but Go is a huge counterexample. It's become very popular without using LLVM, and the compiler that does use a mature, optimizing backend (gccgo) doesn't get nearly as much use as the Plan 9-based one.
I should also mention that increasingly it's become popular to have a mid-level IR that you lower the AST to before lowering to LLVM IR. GHC does it, Swift is doing it, and Rust is working on doing it. The reasons are (a) you can do language-specific optimizations like devirtualization, which are often pretty hard to do in LLVM; (b) going from AST to LLVM IR is often a big, messy leap for any language that isn't C.
Go is an outlier, a social phenomenon brought about by just the right combination of people, companies and the niche it occupies. Something like Go won't happen again in the near future, not at this scale for sure.
Which can be a bad or good thing, depending on how you decide to look at the issue.
> I would have said this too a few years ago, but Go is a huge counterexample.
I've been wondering about this ever since Go came out. If I recall correctly, LLVM was already a viable option back then.
The latest release of Go earlier this week featured a pretty big revamp of the compiler, with a new SSA-based middle-end/backend. I wonder how much work went into that, and whether they could have achieved it faster had they been on LLVM since day 1.
There's an "official answer" in the FAQ [1] ("We considered using LLVM for gc but we felt it was too large and slow to meet our performance goals." Note: lowercase gc refers to Go compiler, whereas uppercase GC would refer to Garbage Collector). That said, I seem to recall other additional/more detailed reasons were given on various occasions too, including:
- ease of iteration and freedom to make huge breaking changes (and they did make use of it, see e.g. various generations of the GC, or the SSA work);
- thorough knowledge of the internals of the compiler they used (given they were its authors; also this is linked to the reason I listed above);
- one of the explicit goals for the Go toolchain was that it must compile very fast (which I believe is also somewhat at odds with advanced optimizations, like those used in LLVM) - they even quipped something like "we designed Go while waiting for C++ to compile";
- additionally, though I'm not sure if it was ever spelled out explicitly, I suppose they wouldn't want to get tied to LLVM as a huge dependency, including having to adjust to each and every change in the LLVM API, sensible or not (I seem to recall from some articles that LLVM does introduce breaking API changes from time to time?);
Yes, GC is a big issue... or was back when Go was just getting started. GC support in LLVM should be MUCH better now. I.e. it has everything you need to hook up a GC, but there isn't a built-in GC in it.
And the fact that the first people working on Go had a good knowledge of the Plan 9 toolchain probably gave them a head start. But that benefit probably eroded long ago.
And I completely disagree with the Go team's emphasis on compile times. Fast result programs are far more important than fast compile times. Fast compile times are nice to have, but not much more than that. But this is just my opinion, it's as good as anyone else's.
Russ Cox gave a really good answer to this question. I have to paraphrase because I can't find the original comment.
It was done for implementation speed. They had a compiler backend which they, the early Go team, were all familiar with. It was small so they could quickly change it to suit Go, rather than the other way around.
Russ Cox expressed the sentiment that if they had tried to use LLVM there would have been a lot of up-front work that would have taken a long time, and the Go project probably would have fizzled out. Using the Plan 9 toolchain they were up and running very quickly, allowing them to focus on the most important work: designing the language itself.
(Probably shouldn't attribute this to Russ Cox; I've probably read too much into his comments, which were rather briefer than what I wrote above.)
In my opinion this was the right choice. If LLVM turns out to be the best backend for Go it's easy to use it later. However, the language itself is largely fixed after 1.0. Better to focus on the language than the compiler backend which can be swapped out later.
LLVM does take a long time to compile, but four hours is an exaggeration. A mid-range computer can compile it in 30 minutes with -j4, and a good computer can compile it in 10 with -j8.
Heh, I was in this same boat two years ago. Servo used to build its own Rust back then, and that involved building LLVM. I hadn't yet discovered ccache [1], and I had an old 4GB RAM laptop with a Core-something processor. Took 4-6 hours total, IIRC. It was a really old laptop, so that was understandable. But very, very annoying, since sometimes I'd need to rebase over a rustup and then wait hours for the build to finish. This was a large part of my impetus to move Servo to a state where we download Rust snapshots.
[1]: Btw, if LLVM compile times are getting to you, Use. Ccache. It is amazeballs.
Those compilers don't do nearly as much optimization as LLVM does. It's hard to overstate how wide the gap between the classical compilers you like and LLVM/GCC truly is. LLVM has no fewer than three IRs, each with a full suite of compiler optimizations (four if you count ScheduleDAG). LICM, just to name one example, runs at each stage (see MachineLICM.cpp). Turbo Pascal did nowhere near that level of optimization, and it produced unacceptably slow programs by modern standards.
Maybe the biggest issue is optimizing register allocation while providing the information about all roots for garbage collection... For non-GC languages you don't need to care about that and can just aggressively optimize register allocation. For GC languages you have to be careful, because the GC must know 1) which registers are holding references and which are not, and 2) which of those registers are "alive" and which are not in use anymore (so that the objects they referenced can be garbage-collected)...
Is LLVM that well supported on Windows? Go aims at running on as many platforms as possible, and it seems to me a lot of languages built with LLVM don't run on Windows (Swift, Crystal, ...).
Yes, there are no immediate plans for anything other than MIR at the imperative-semantic level (the compiler also uses HIR, which is mostly a desugared AST and not related to LLVM IR).
Currently, initial support for lowering MIR to LLVM IR is close to complete and it might be enabled in the next beta (or the one after that).
None of the analysis passes have started their transition to MIR, though (there is a well-typedness check to catch bugs in MIR generation, but that's about it).
There's also talk of using MIR for compile-time (pure) function evaluation [1] and some out-of-tree experimentation with interpreting it [2].
MIR looks a bit more low-level than Haskell's Core. Like Core, MIR is typed, but unlike Core, match statements disappear. This is interesting, because matching is really the main computational engine of Core. I wonder if there's a good explanation why it was deemed advantageous to desugar match.
I also wonder if there are plans to integrate the MIR interpreter into the compiler. IIRC, interpretation of Core between optimisation phases is used to good effect in Haskell.
Depends what you mean by "match" statements: if you're referring to complex patterns, yes, they get desugared.
If you're instead referring to matching over ADT (enum) variants, that is a MIR primitive, the "Switch" terminator (there is also a "SwitchInt" for integer-like values, as in C or LLVM IR).
As for the interpreter, it's quite likely that we'll use (something like) it for evaluating constants, but not entirely certain atm.
I personally hope we can get the CTFE semantics I described (see [1] above), which despite being pure, go far beyond C++17 constexpr capabilities (e.g. you could potentially run an entire compiler for a different language at compile-time).
We don't really have our own optimizations atm, but we do want MIR optimization passes.
The benefits would be two-fold:
We could do transformations LLVM can't figure out itself, like NRVO: `fn bar() -> T {let mut x = ...; foo(&mut x); x} ... box bar()` can (theoretically) end up calling `foo` with a pointer to the heap instead of having to copy `x` from the stack later.
And we can lift the burden of LLVM trudging through monomorphized instances from MIR: since MIR still has type parameters, you can transform it and simplify the resulting LLVM IR for all instances, e.g.:
`mem::replace(mut_ref, x)` is `{mem::swap(mut_ref, &mut x); x}`, which can be reduced once to `{Return = *arg0; *arg0 = arg1;}` in MIR and translated to a load, a store and a ret (for immediates) or two memcpys (for indirect argument & return), depending on the type.
As most of the time rustc spends is in LLVM optimizations, this will result in compilation time speedups linear in the number of instances, sometimes exponential in the size of the original code.
All true, but there is more. SPJ and S Marlow [1] list the following additional advantages of using a typed intermediate representation, which would/could apply to MIR too.

1. Running a type-checker on the IR "is a very powerful consistency check on the compiler itself. Imagine that you write an 'optimisation' that accidentally generates code that treats an integer value as a function, and tries to call it. The chances are that the program will segmentation fault, or fail at runtime in a bizarre way. Tracing a seg-fault back to the particular optimisation pass that broke the program is a long road. Now imagine instead that we run [the type-checker on the IR] after every optimisation pass: it will report a precisely located error immediately after the offending optimisation. What a blessing [...] in practice we have found that it is surprisingly hard to accidentally write optimisations that are type-correct but not semantically correct."

2. Running a type-checker after translation to IR serves as a "100% independent check on the type inference engine".

3. A high-level typed IR is a "[s]anity check on the design of the source language."
You may be better off building up the LLVM IR as text, vs. using the FFI builder. (If you want/need type safety as opposed to just waiting for the LLVM verifier to tell you what went wrong, you could write the builder in Lua.)
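As a rough illustration (the builder API here is made up on the spot; the output is just ordinary textual LLVM IR that you'd hand to the parser), a text-building approach in Lua might look like:

    -- toy, hypothetical text builder: accumulate lines, join at the end
    local function fn_builder(name)
      local lines = {}
      return {
        emit = function(instr) lines[#lines + 1] = "  " .. instr end,
        done = function()
          return "define i32 @" .. name .. "(i32 %a, i32 %b) {\n"
            .. table.concat(lines, "\n") .. "\n}\n"
        end,
      }
    end

    local f = fn_builder("add")
    f.emit("%sum = add i32 %a, %b")
    f.emit("ret i32 %sum")
    print(f.done())  -- valid textual IR; llvm-as / the verifier will catch mistakes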
So I've done basically the same thing with a few differences:
(1) I chose Common Lisp as the language I would implement
(2) I used C++ template programming to automatically generate wrappers for C++ library functions (like boost::python and luabind do) so that I could expose C++ functions directly to Common Lisp. That allowed me to use the LLVM C++ API directly.
The project is called clasp -> github.com/drmeister/clasp
Send me a message - we should talk and compare notes.
Instead of generating LLVM code, you could generate 'typed Lua', using Lua as a metaprogramming language. Then the 'typed Lua' can get compiled down by LLVM into object code.
I never understood in simple terms what terralang is actually about. Even in your sentence, if typed Lua is being "generated" (i.e., it's the target), how come Lua is being used as a metaprogramming language (i.e., a language that does the generation)?
I understand generative metaprogramming as consisting of a language that a human deals with and a language that gets generated. In the case of Terra/Lua it's not clear which generates and which is generated, or whether it's a completely different approach (to which I say, what's the point?).
On the Terra webpage, it says 'Terra is low-level' but then 'Terra is code embedded in a Lua meta-program', not to mention 'Terra-Lua programs'.
Terra is lower-level, and on evaluation compiles to object code. Lua is higher-level, and on evaluation compiles to bytecode before being executed. A source file written entirely in Terra would produce a binary, and a source file written in Lua would produce Lua bytecode. In either case, you can run the result immediately, or save the resulting object code/bytecode to run later.
The strength of the two is that Terra can be manipulated using Lua as a metalanguage: for example, you could write a Lua function which produces Terra code, which is then executed. In some sense, this is like C, which has a lower-level language (C) and a corresponding high-level language which can manipulate that lower-level language (the C preprocessor). In this case, though, Lua is much more powerful and flexible than CPP, and Terra is also a smaller, simpler language than C.
It gets more powerful, though, because the Lua parts don't just have to exist at compile-time. You could use Lua at run-time to produce Terra code, which in turn can get executed immediately, producing object code specialized to a given situation or machine: for example, you might write a language, use Lua to parse it, and then convert it to Terra to produce object code which is then invoked immediately: this effectively just-in-time compiles your code. Or, you could write an algorithm in Terra with specific constants that aren't known until run-time, and then use Lua to stitch those constants in before running the algorithm: for example, to take advantage of specific machine features or parameters.
These examples also assume that you have only "compile" and "run" stages; you could also write programs with more stages: for example "compile source", "specialize for machine", and "run" could be three distinct stages. Terra/Lua would afford you the flexibility to stage your computations in this way.
Use `pairs(a_table)` to get `key, val` variables bound respectively. Use `ipairs` to get a loop over indices and values. This is very similar to Python, only with "do/end" instead of indent/dedent. What you can't do in Python, but can in Lua, is provide fewer variable names than the iterator function returns. For example, to get a list of keys when iterating over `(key, val)` pairs in Python you'd have to write something like `keys = [k for k, v in d.items()]`, naming both variables even though one goes unused.
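In Lua, by contrast, you can just drop the extra variable. A small sketch (assuming a table `t` with non-numeric keys):

    local t = {a = 1, b = 2, c = 3}

    -- the second loop variable can simply be omitted
    for k in pairs(t) do
      print(k)          -- keys only
    end

    -- ipairs walks the array part in order, yielding index and value
    for i, v in ipairs({"x", "y", "z"}) do
      print(i, v)
    end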
The terminology seems to be tripping you up (and I agree these terms aren't super clear). I think it's easiest to understand this through examples. Say you wanted to create a specialized function in pure Lua:
    function make_adder_func(n)
      return function(x) return n + x end
    end

    local add3_func = make_adder_func(3)
    local result = add3_func(6)
    print(result)  -- prints "9"
This totally works: because Lua supports both closures and first-class functions, the parameter "3" is captured in the environment of the inner function, so we have a specialized function for adding 3 to any other number.
Now the way Lua will execute this isn't super optimized. It treats "n" as a local variable which could theoretically be modified (it's not here, but proving that takes more of an optimizer than most dynamic languages possess). Also Lua doesn't even statically know the type of "n", so the "+" operation could be string concatenation, or even a user-defined function for the __add metamethod if "n" was a custom type. So even though it looks like the add3_func() is pretty specialized, it's actually not specialized at all in practice.
(For the record, LuaJIT is smarter about this and does have a trace compiler + optimizer that makes this extremely efficient).
But what Terra gives us is the ability to generate fully specialized and optimized functions at runtime. In Terra we could write:
    function make_adder_func(n)
      return terra(x : int) return n + x end
    end

    local add3_func = make_adder_func(3)
    local result = add3_func(6)
    print(result)  -- prints "9"
Unlike the Lua function, the Terra function is strongly typed and does not contain a closure over its environment. The inner Terra function isn't referring to the Lua variable "n", it is taking the value of n and making it a constant inside the Terra function. This allows the Terra function to be more thoroughly optimized (more like what LuaJIT would do).
So getting back to your question, the phrase "typed Lua is being generated" is a little confusing. What it means is that these Terra functions are strongly typed, and we're using Lua code to programmatically generate Terra functions, like in make_adder_func() above.
I know less about this than the other people replying, but will chip in because I have a different perspective that you might find useful.
When I saw what this guy was doing, my first reaction was also "that's like Terra!"
Here's how developers work at the moment: you write some code. Then - you /fork a process/ that reads your code and puts it into a pipeline to produce machine code or bytecode. It's a ceremony. There are all kinds of obscure little technologies to know about - variables defined in your shell language, makefiles, the peculiar arguments to specify when you invoke your compiler.
This article and Terra show that there might be a simpler path than this available. In Terra, you describe a system of code using lua structures. (They could as easily have been s-expressions). Terra then abstracts those into their structural components, and passes that tree to LLVM. And an artifact comes out. This is cool, because now you can easily integrate automatic code generation /into your application layer/.
What is happening here is not revolutionary in its mechanism. But you get access to the whole thing in a way that I haven't seen before. I think this will significantly change the way we think about programming.
You can see this writer struggling with multiple issues. And he's used Lua, which is supposed to be an easy platform to integrate with. And he's having problems. Still, once these issues are worked through, I'd expect to see a hacker language emerge that is a thin wrapper around the LLVM API and lets you dance close to it.
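To make that concrete, here's a purely hypothetical sketch of what such a thin wrapper might feel like from Lua - every name below is invented, just the shape I'd expect:

    -- hypothetical thin LLVM wrapper; none of these names exist today
    local llvm = require("llvm")
    local mod = llvm.module("demo")
    local fn  = mod:fn("add", llvm.i32, { llvm.i32, llvm.i32 })
    local bb  = fn:block("entry")
    bb:ret(bb:add(fn:param(0), fn:param(1)))
    print(mod:ir())  -- dump textual IR, or hand the module to the JIT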
So here's yet another attempt at describing terra :)
Firstly, 'typed Lua' really meant the Terra sublanguage, because it looks like Lua with types.
Going back to Terra: consider that Terra is a language that can be compiled by LLVM into machine code. So you can choose either to compile and emit object code, or to compile and immediately run the generated object code. The other choice you have is either to write Terra code by hand, or to generate Terra code from Lua code.
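If I remember the Terra API right (treat the details as approximate), those two choices look roughly like this:

    -- define a Terra function by hand...
    terra add(a : int, b : int) : int
      return a + b
    end

    -- ...then either run the JIT-compiled code immediately from Lua:
    print(add(1, 2))                        -- 3

    -- ...or emit object code for later linking (terralib.saveobj,
    -- if I recall the name correctly):
    terralib.saveobj("add.o", { add = add })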
You say 'if typed Lua is being "generated" (i.e., it's the target) how come Lua is being used as a metaprogramming language'. Why is that not possible? Many languages have built-in metaprogramming using the same language (notably, Lisp code can generate Lisp code).
Another way to look at this: Lua is the metaprogramming and control language for generating, compiling and running Terra code.
I also had trouble understanding that from their description, but then at some point I think I got it. As I understand, it's like if you used Lua (actually, a Lua-like language dubbed "Terra") in places where you'd normally use the C preprocessor, or C++ templates, or Lisp macros, or Rust macros, or inline assembly fragments in C/C++, while your "base language", which you'd use to write most of the code, would be your regular old Lua. Does it make more sense when described this way?
edit: so, from reading the other comments around, I think the language roles are actually reversed vs. what I thought - i.e. Lua is the "macro language", and Terra is the "main language".
Essentially it's an implementation of the concept that metaprogramming should use the same semantics and syntax as normal programming. This is the case in Lisp, for example, but Lisp is not (or not typically) compiled, so Lisp metaprogramming is essentially limited to defining syntactic sugar. In Terra, metaprogramming (or multi-stage programming, as they call it), can actually result in improved performance compared to a typical C-like implementation of the same algorithm.
The website has a couple papers that provide benchmarks for some interesting test cases.
> This is the case in Lisp, for example, but Lisp is not (or not typically) compiled …
Say what? Lisp is often compiled; the standard extensively discusses compilation[1] and — to address your point re. performance-improving metaprogramming — compiler macros[2] specifically exist in order to advise the compiler, e.g. for performance.
So, no, Lisp macros are not limited to defining syntactic sugar. It's like I keep on saying: the Common Lisp standard from 1994 contains within its covers functionality that people still don't know about.
Most are compiled. Many also typically have types you can add in to get the performance benefits of static typing. Add in compilation of individual functions and live updates of a running app to round out the key benefits of Lisp interpreters/compilers.
I have the “Code Model” (that is what I call the AST, because there is no syntax) up for a very small, C-like language, and I have a frontend that can manipulate parts of it. However, these are not yet combined, and I stopped working on it because I am in the middle of transforming my efforts into a web app. This will allow you to edit and run a code model in your browser, and download a compiled binary when you are finished.
Btw, you will be able to link code elements such as classes to online resources such as a task tracker, a diagram, a mindmap, etc., to have the whole ALM process embedded into your code base.
Interesting article. Looking at the code, it's pretty clear that this is not a particularly practical approach, but I think it points in the direction of where language design should head. Abstractions shouldn't be forced, unless absolutely necessary; forced abstractions lead to inefficiency.
I've been working on a related concept, but more along the lines of developing a low-level IR (LLIR) and corresponding syntax that allows direct coding in the LLIR. It would essentially be a portable assembly language. The LLIR could then be compiled down to machine code for different architectures, or even run directly in a VM.
This isn't too different from the LLVM concept, except the IR would be a practical programming language, much like assembly. Furthermore, the IR would provide access control primitives (reference capabilities) that most languages do not support, but that are extremely useful for low-overhead secure computing and concurrent computing. These primitives could be especially useful in the design of exokernel OS's, and related applications.
> I've been working on a related concept, but more along the lines of developing a low-level IR (LLIR) and corresponding syntax that allows direct coding in the LLIR. It would essentially be a portable assembly language. The LLIR could then be compiled down to machine code for different architectures, or even run directly in a VM.
Isn't this WebAssembly? Bonus points: they have an s-expression-based format and a binary format, are working on an LLVM backend for it plus a bunch of JIT integrations, are making sure it's portable, and have a lot of people working on it from multiple companies.
I've run into so many of these issues myself, trying to work through the forest of LLVM. The abysmal state of the generated docs and those cryptic segfaults are the bane of my existence.
So I started to slap together a little library that gives some order to the madness. I call it petard [0], for reasons that should be obvious if you've heard that word. At the moment I'm focused on JavaScript bindings, but the idea is to build a sensible object-based API that's simple enough to export pure C bindings to make FFI easy.
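For a taste of the "pure C bindings make FFI easy" idea, here's roughly what consuming such a flat C surface from LuaJIT's FFI looks like in general - the petard function names below are invented for illustration, not the real API:

    -- hypothetical flat C surface; real petard exports will differ
    local ffi = require("ffi")
    ffi.cdef[[
      typedef struct pt_builder pt_builder;
      pt_builder *pt_create(const char *module_name);
      void        pt_destroy(pt_builder *b);
    ]]
    local petard = ffi.load("petard")
    local b = petard.pt_create("demo")
    petard.pt_destroy(b)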
There's currently support for lots of the basics, but some significant issues too (lack of features as well as poor C++ style). Stern memory-management focused pull requests would be very welcome, and feature additions would be helpful, too.
This seems like a fun way to poke around with LLVM but I don't see how this is any easier than writing down a grammar and feeding it into a parser generator...
"Writing a parser" sounds really annoying and complicated, but in my admittedly limited experience with ocamlyacc, it was easy peasy and dare I say it, fun!
The hard part I guess was defining the language and maybe that's what he's trying to avoid.
But even still, scripting around the AST building functions in LLVM and coding directly to them seems way more annoying to me in the long run. (especially the writing programs part)
Here's the potential I see. Imagine if it was routine to write an application stack entirely in a repl language. Not just the application code - everything. You'd write your application code. Then other functions would dispatch the tree of application code to the LLVM API where it would be compiled. Then other functions could take the resulting artifacts and push them through to deployment.
Doing this at the moment would involve writing fiddly wrappers around lots of layers of technology - dynamically constructing makefiles, and shelling out to call things.
> But even still, scripting around the AST building functions in LLVM and coding directly to them seems way more annoying to me in the long run. (especially the writing programs part)
Using the IRBuilder to build IR is quite easy in my experience.
There's an enormous lump of complexity that this chap seems to be unaware of: the type system. This is the semantic analysis bit that he seems to think is done by LLVM, but it isn't.
It's like he mostly thinks of compilers as turning statements and expressions into executable code, but IMO that's one of the easier bits. Defining how different values interact, what operations are legal and aren't, and the runtime support - this is a lot of work, and in particular, it's different for every language, because it's what makes languages distinct.
I think she's a woman not a "chap". But anyways, you'd be relying on the IR semantics and checking to do the legwork for you. LLVM IR is a language after all.
Yeah. LLVM IR is the most suitable representation for doing this kind of thing, but it's the one that's not compatible across LLVM versions. Maybe it would be worth defining an alternate format for the bytecode that can be read and written by external tools?
Excellent post highlighting the problem solving skills and tenacity of a talented engineer. Of course, this is also the "duct tape and paper clips" approach to writing software that I certainly would not advocate for production systems.
For everyone here who mentioned building wrappers around llvm-c or the LLVM C++ API: there is a nice tool for automating this job (and many other similar things), and it's called "Clang". Or, more specifically, cindex. It's easy to use Python + cindex to parse LLVM headers and spit out a nice wrapper library for your language of choice.
I really like the idea. Why not hack a scripting engine to add bindings to itself, e.g. in the Duktape JavaScript engine? Then a JavaScript function could use the C API of Duktape to generate ASTs at runtime for itself. But I think the VM would be the limit then... support for coroutines etc.