Unlike this recursive-descent parser, my minimalistic language uses a table-driven shift-reduce parser which is entirely contained in the C code. Both the lexer and the parser can be extended very easily in order to accommodate new tokens and new syntactic forms.
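For readers who have not seen the shift-reduce style, here is a rough sketch in Python. It is not the parser mentioned above (that one lives in C and I have not seen its tables); it is a plain operator-precedence evaluator, and the names PREC, APPLY, tokenize and parse are made up for illustration. The point is only that supporting a new binary operator means adding a table row rather than writing a new parsing function.

```python
import re

# Precedence table (hypothetical): adding an operator means adding a row
# here and in APPLY, not writing a new parsing routine.
PREC = {"+": 1, "-": 1, "*": 2, "/": 2}
APPLY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a // b,   # integer division, as in a small integer language
}

def tokenize(src):
    """Split the input into numbers, operators and parentheses."""
    return re.findall(r"\d+|[-+*/()]", src)

def parse(src):
    """Evaluate an arithmetic expression with a shift-reduce loop."""
    values, ops = [], []

    def reduce():
        # Pop one operator and its two operands, push the result.
        op = ops.pop()
        b = values.pop()
        a = values.pop()
        values.append(APPLY[op](a, b))

    for tok in tokenize(src):
        if tok.isdigit():
            values.append(int(tok))          # shift an operand
        elif tok == "(":
            ops.append(tok)                  # shift an opening parenthesis
        elif tok == ")":
            while ops[-1] != "(":
                reduce()
            ops.pop()                        # discard the matching "("
        else:
            # Reduce while the operator on the stack binds at least as tightly.
            while ops and ops[-1] != "(" and PREC[ops[-1]] >= PREC[tok]:
                reduce()
            ops.append(tok)                  # shift the operator
    while ops:
        reduce()
    return values[0]
```

A recursive-descent parser would instead have one function per precedence level, which is easier to read but needs new code for each new level.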
I guess at the end of the day there's no real alternative to "just dive into the source code for a while" with these sorts of things - hence the importance of having "decades of experience" under your belt - but I'm curious if there's anywhere in particular that you might recommend as a starting point.
Yes, examples of minimal ELF files are good starting points for getting the back end part up and running. I'm not sure that reading real-life code will get you very far, because it often deals with lots of edge cases and attempts to be efficient and generate clever code. However, there are compilers that have been made as educational tools or with simplicity as a major design goal. I would stick to those in the beginning. See my home page (http://t3x.org) for lots of examples!
Then, books. I'll make a list some day, I promise! ;) Off the top of my head: Wirth's "Compiler Construction", Richards' "BCPL, the Language and its Compiler", and - shameless plug - "Practical Compiler Construction" by myself. I also read the Dragon Book back then - it's good, but too heavy on theory for a beginner.
> Yes, examples of minimal ELF files are good starting points for getting the back end part up and running.
Mmm, one of several components.
> I'm not sure that reading real-life code will get you very far, because it often deals with lots of edge cases and attempts to be efficient and generate clever code.
Very good point. It takes a bit more time and effort to build a proper mental map, but there's really no worthwhile alternative.
> However, there are compilers that have been made as educational tools or with simplicity as a major design goal. I would stick to those in the beginning.
I agree! These help a lot, and make study almost (if not very) fun :)
> See my home page (http://t3x.org) for lots of examples!
Will do, thanks!
> Then, books. I'll make a list some day, I promise! ;) Off the top of my head: Wirth's "Compiler Construction", Richards' "BCPL, the Language and its Compiler", and - shameless plug - "Practical Compiler Construction" by myself. I also read the Dragon Book back then - it's good, but too heavy on theory for a beginner.
Interesting. I actually stumbled on a copy of the Dragon Book while at an op-shop some time ago. I didn't actually know what I was grabbing at the time, just that it looked like a really good idea to prioritize getting it. I'm very happy to have it, and hope to be able to make sense of it at some point :)
BCPL, the Language and its Compiler and Practical Compiler Construction don't seem to have free options; that's fine, I've added these (and your other books!) to my (burgeoning) wishlist :P
Another really cool and intricate aspect of your compiler is that scanning, parsing, and code generation are all done in a single pass. I'm not convinced there is a practical benefit to that, but it certainly contributes to the small size of the compiler and reduces the memory footprint, since it eliminates the need to keep a list of tokens or a parse tree in memory, effectively using only the stack as storage?
The biggest benefit is simplicity; the greatest drawback is lack of extensibility (adding an optimizer, etc.).
Memory footprint is not really optimal, because T3X9 keeps both the entire source code and the entire executable in memory all the time. I wanted fast compilation, though. For small footprint, I would have made it a two-pass compiler.
An early T3X compiler ran in the tiny memory model on DOS (64KB for code and data combined).
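To make the single-pass idea concrete, here is a minimal sketch in Python. It is not T3X9's code: the function compile_expr and the PUSH/ADD/SUB/MUL/DIV instruction names are invented for illustration, and it only handles arithmetic expressions. What it shows is that the scanner, the parser and the code generator are interleaved, so instructions are emitted while the input is still being read and no token list or parse tree is ever built.

```python
def compile_expr(src):
    """Translate an arithmetic expression to stack-machine code in one pass."""
    pos = 0          # current position in the source text
    out = []         # emitted instructions (the "executable")

    def peek():
        # Scanner: skip blanks and return the next character ("" at the end).
        nonlocal pos
        while pos < len(src) and src[pos] == " ":
            pos += 1
        return src[pos] if pos < len(src) else ""

    def factor():
        nonlocal pos
        c = peek()
        if c == "(":
            pos += 1
            expr()
            if peek() != ")":
                raise SyntaxError("missing ')'")
            pos += 1
        elif c.isdigit():
            start = pos
            while pos < len(src) and src[pos].isdigit():
                pos += 1
            out.append(("PUSH", int(src[start:pos])))   # emit immediately
        else:
            raise SyntaxError("unexpected %r at %d" % (c, pos))

    def term():
        nonlocal pos
        factor()
        while peek() in ("*", "/"):
            op = src[pos]
            pos += 1
            factor()
            out.append(("MUL",) if op == "*" else ("DIV",))

    def expr():
        nonlocal pos
        term()
        while peek() in ("+", "-"):
            op = src[pos]
            pos += 1
            term()
            out.append(("ADD",) if op == "+" else ("SUB",))

    expr()
    if peek() != "":
        raise SyntaxError("trailing input at %d" % pos)
    return out
```

The only state besides the source and the output is the call stack of factor/term/expr, which is the "stack as storage" mentioned above. The flip side is also visible: there is no intermediate form an optimizer could work on.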
I've also invented something like this: NISC. (Or {}ISC) Null Instruction Set Computer. You can arbitrarily reduce the die size for implementations of this architecture, as well as scale the clock speed arbitrarily. It also revolutionizes cooling.
I don't think the description as minimal was intended in any formal way. It's informally minimal in the sense that it features less complexity than more well known languages like C or Fortran.
A formal notion of a minimal language/machine gets messy anyway and is of limited utility even in a purely academic context, as far as I know.
>How is this able to omit basic type information? Is everything just an int?
Data is untyped and operators are typed. So in X+Y, X and Y are integers, in X::Y, X is a byte vector and Y is an integer, and in Y[X], X is an integer and Y is a vector (and the types of its elements again depend on the operators applied to them).
Note that not all combinations of data and operators make sense. E.g., "foo"*[1,2,3] is a valid expression, but probably will not deliver any meaningful result.
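A toy model in Python may help here. It assumes (my assumption, not something stated above) that every value is a 32-bit machine word and that a string or vector is represented by the address of its first element in a flat memory; all names below (MEM, store_bytes, store_words, add, mul, byte_at, word_at) are made up for the sketch. The same word is treated as an integer, a byte vector, or a word vector depending only on the operator applied to it.

```python
import struct

WORD = 4                 # assumed word size in bytes
MEM = bytearray(64)      # toy flat memory

def store_bytes(addr, data):
    """Place raw bytes in memory and return their address."""
    MEM[addr:addr + len(data)] = data
    return addr

def store_words(addr, words):
    """Place little-endian words in memory and return their address."""
    struct.pack_into("<%dI" % len(words), MEM, addr, *words)
    return addr

def add(x, y):           # X+Y: both operands are treated as integers
    return (x + y) & 0xFFFFFFFF

def mul(x, y):           # X*Y: both operands are treated as integers
    return (x * y) & 0xFFFFFFFF

def byte_at(x, y):       # X::Y: X is treated as a byte vector, Y as an index
    return MEM[x + y]

def word_at(y, x):       # Y[X]: Y is treated as a word vector, X as an index
    return struct.unpack_from("<I", MEM, y + WORD * x)[0]

foo = store_bytes(8, b"foo\0")       # stands in for "foo"
vec = store_words(16, [1, 2, 3])     # stands in for [1,2,3]
```

With this layout, byte_at(foo, 1) gives the character code of "o", word_at(vec, 2) gives 3, and mul(foo, vec) happily multiplies the two addresses (8 * 16 = 128), which is valid but meaningless, just like "foo"*[1,2,3].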
I'll leave the interpretation of the statement to you! ;)
It is, of course, a bit of an oxymoron to answer this question myself but, FWIW, I have been writing compilers since the 1990s, both as a hobby and as a job.
I have written compilers for lots of different processors and virtual machines with varying key aspects, like compilation speed, memory footprint, executable size, etc.
Check out my home page (http://t3x.org), there are compilers for many different languages, from mainstream to pretty exotic. There are also excerpts from many of my books and even the full texts of some of the older ones.
The author is unknown to me (and I am nobody!). FWIW, I just bought the book and quickly read through the chapter on bootstrapping. IMHO, it is really well written.
What could it teach? Compiler design? Imperative language design? By "bootstrapping", are you referring to building a higher-level language that compiles to T3X?
It can teach the most fundamental basics of procedural language design and compiler construction: how to implement selection, loops, and functions/procedures as well as scanning, parsing, and code generation.
By bootstrapping, I'm referring to building a T3X9 compiler for a new target platform and then using that to create something more interesting. You can start the process by re-implementing the compiler or by re-targeting it.
Something related to functional programming, databases, data analysis, dev methodology, or system architecture. I use a procedural language in my job, but this doesn't seem relevant to it. My degree isn't in CS, if that helps.
Cool, looks like I have something to read this weekend. I am glad their e-book caters to people with little comp-sci / compiler design background.
>The compiler translates from T3X9 directly to FreeBSD-386-ELF. It does not require any assembler, linker, or other additional programs. I suspect that it can be ported to other 32-bit ELF Unixes in about half an hour.
Compiled executables are, by their very nature, usually specific to an operating system. This minimal compiler outputs one specific executable file format and hardwires a specific platform and instruction set architecture into that format's header.
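As a rough illustration of what "hardwired in the header" means, here is a Python sketch that lays out a 32-bit ELF header plus one program header. This is not how T3X9 does it internally, and I have not tested the resulting file on FreeBSD; the function name elf32_image and the load address 0x08048000 are assumptions for the example. The field layout and constants (ELFOSABI_FREEBSD = 9, ET_EXEC = 2, EM_386 = 3, PT_LOAD = 1) follow the ELF specification.

```python
import struct

ELFOSABI_FREEBSD = 9     # e_ident[EI_OSABI]: the target operating system
ET_EXEC = 2              # e_type: executable file
EM_386 = 3               # e_machine: Intel 80386
PT_LOAD = 1              # p_type: loadable segment

def elf32_image(code, load_addr=0x08048000, osabi=ELFOSABI_FREEBSD):
    """Return ELF header + one program header + code as a single byte string."""
    ehsize, phsize = 52, 32                  # sizes fixed by the ELF32 format
    entry = load_addr + ehsize + phsize      # code starts right after the headers
    size = ehsize + phsize + len(code)
    # e_ident: magic, 32-bit class, little-endian, version 1, OS ABI, padding
    ident = b"\x7fELF" + bytes([1, 1, 1, osabi]) + bytes(8)
    ehdr = struct.pack(
        "<16sHHIIIIIHHHHHH",
        ident, ET_EXEC, EM_386, 1,           # e_type, e_machine, e_version
        entry, ehsize, 0, 0,                 # e_entry, e_phoff, e_shoff, e_flags
        ehsize, phsize, 1,                   # e_ehsize, e_phentsize, e_phnum
        0, 0, 0,                             # no section headers
    )
    phdr = struct.pack(
        "<8I",
        PT_LOAD, 0, load_addr, load_addr,    # p_type, p_offset, p_vaddr, p_paddr
        size, size, 5, 0x1000,               # p_filesz, p_memsz, R+X, p_align
    )
    return ehdr + phdr + code
```

In this sketch the operating system is a single byte (osabi) and the instruction set a single 16-bit field (EM_386), which fits the claim that re-targeting to another 32-bit ELF Unix is mostly a matter of changing a few constants. The system call sequences in the generated code would presumably need changing as well, which the sketch does not cover.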
Yes, I know that about executables, but I guess compiler-related subtleties are lost on me. I will clearly benefit from reading the book I just bought.
The Mandelbrot example has implementation problems. The check for a point being outside the set is implemented incorrectly, and even the comment (which is not consistent with the code) is wrong.
Not judging the language itself, but that is not really convincing.
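For reference, the usual escape test looks like the Python sketch below. I am not reproducing the example's code here, and the function name escape_iteration and the max_iter default are my own choices; the sketch also uses floating point, whereas an integer-only language would need fixed-point arithmetic. A point c is known to be outside the set as soon as |z| exceeds 2, which is checked without a square root as zr*zr + zi*zi > 4.

```python
def escape_iteration(cr, ci, max_iter=100):
    """Return the iteration at which z escapes, or None if it never does
    within max_iter iterations (the point is then assumed to be in the set)."""
    zr = zi = 0.0
    for i in range(max_iter):
        # |z| > 2 is equivalent to zr^2 + zi^2 > 4; no square root needed.
        if zr * zr + zi * zi > 4.0:
            return i
        # z = z^2 + c, split into real and imaginary parts
        zr, zi = zr * zr - zi * zi + cr, 2.0 * zr * zi + ci
    return None
```

For example, c = 0 and c = -1 never escape, while c = 1 escapes after a few iterations (0, 1, 2, 5, ...).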