I still think the core problem is that most people, even those with a 4-year degree, haven't done a compilers course. Both other times we've had this discussion I haven't noticed anyone popping up to say "Yeah, I've written like 3 compilers and Wasabi was just an insane idea." (Of course, the Internet being what it is, someone will probably say that now. But the point is I haven't seen it before I asked for it.) A lot of people are doing the cost/benefit analysis with an order of magnitude or two too much in the "cost" column. Yeah, of course it looks insane then... but the problem is the analysis, not the reality.
Compilers just aren't that magically difficult. I'll cop to not having written a true compiler yet, but I've written a number of interpreters, and I've written all the pieces several times (parse to AST, interpret, serialize back out); I've just never needed the whole shebang at once.
If you're reading this, and you're still in a position where you can take a compilers course, take it! It's one of the most brutally pragmatic courses in the whole of computer science and it's a shame how it's withered. (Even if, like me, you'll probably write more interpreters than compilers. And nowadays you really ought to have a good reason not to pick an existing serialization off-the-shelf. But it's still useful stuff.) It's one of those things that is the difference between a wizard and a code monkey.
(If I said that too concisely for your tastes, see: http://steve-yegge.blogspot.com/2007/06/rich-programmer-food... )
I've written like 3 compilers, and while I don't think Wasabi was quite insane (they had an interesting set of constraints, so I could at least follow the logic), it's not the choice I would've made. Or rather, it's totally the choice I would've made as a fresh college grad in 2005 having written my first compiler for work (which was ripped out in about 2 months...it didn't take me that long to realize my mistake), but it's not what I would've done with the hindsight of that and other compiler projects.
The cost of an in-house programming language isn't in writing the compiler. It's in training every new team member in the language. It's in documenting the language constructs, including the corner cases. It's in not being able to go to Stack Overflow when you have problems. It's in every bug potentially being in your application code, your compiler, or your runtime libraries, and needing to trace problems across those boundaries. It's in integrating with third-party libraries, in not being able to use tooling developed for an existing mainstream language, and in having to add another backend to every other DSL that compiles to a mainstream language.
All that said, I agree that if you're ever in a position to take a compiler course, do it. It's one of the most valuable courses I ever took, and really peels back the mystery on why programming languages are the way they are. It's just that the difference between wisdom and intelligence is in knowing when not to use that brilliant technique you know.
"It's just that the difference between wisdom and intelligence is in knowing when not to use that brilliant technique you know."
Which is precisely why I've never written a full compiler, even though I've written all the pieces many times.
For instance, instead of writing a parser, could you perhaps get away with just a direct JSON serialization of some AST? Do you really need to emit something, or will an interpreter do? So far I've never been so backed against the wall that I've actually needed a full compiler.
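To make that concrete, here's a minimal sketch (Python, with an invented three-operator expression language) of the "JSON as your AST, interpreter instead of a backend" approach:

```python
import json

# The "program" arrives as JSON: no grammar, no parser, no codegen.
# The node shapes ("op", "left", "right", "value") are invented for
# illustration, not any particular product's format.
program = json.loads("""
    {"op": "add",
     "left": {"op": "lit", "value": 2},
     "right": {"op": "mul",
               "left": {"op": "lit", "value": 3},
               "right": {"op": "lit", "value": 4}}}
""")

def interpret(node):
    op = node["op"]
    if op == "lit":
        return node["value"]
    left, right = interpret(node["left"]), interpret(node["right"])
    return left + right if op == "add" else left * right

print(interpret(program))   # 14
print(json.dumps(program))  # serializing back out comes for free
```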
Yeah, one of the compilers I wrote just used JSON as the AST, with it being generated by a GUI interface. Another used HTML with annotations (although go figure, I wrote an HTML parser [1] for it, because there weren't any C++ options at the time that didn't bring along a browser engine). A third had a custom front-end but then emitted Java source code as the back-end.
The interesting thing is that the more experience you get, the more alternatives you find to writing your own language. Could you use Ruby or Python as the front-end, much like Rails [2], Rake [3], or Bazel [4]? Could you build up a data-structure to express the computation, and then walk that data-structure with the Interpreter pattern? [5] Could you get away with a class library or framework, much like how Sawzall has been replaced by Flume [6] and Go libraries within Google?
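As a minimal sketch of that Interpreter-pattern option (Python, with invented node classes): the "program" is just objects composed in the host language, so there's no grammar, parser, or toolchain to maintain:

```python
# Interpreter pattern: each node of the computation is an object that
# knows how to evaluate itself against an environment. Class names are
# illustrative only.
class Literal:
    def __init__(self, value):
        self.value = value
    def evaluate(self, env):
        return self.value

class Variable:
    def __init__(self, name):
        self.name = name
    def evaluate(self, env):
        return env[self.name]

class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right
    def evaluate(self, env):
        return self.left.evaluate(env) + self.right.evaluate(env)

class Multiply:
    def __init__(self, left, right):
        self.left, self.right = left, right
    def evaluate(self, env):
        return self.left.evaluate(env) * self.right.evaluate(env)

# price * quantity + shipping, with no front-end in sight:
total = Add(Multiply(Variable("price"), Variable("quantity")),
            Variable("shipping"))
print(total.evaluate({"price": 10, "quantity": 3, "shipping": 5}))  # 35
```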
In general, you want to use the tool with the least power that actually accomplishes your goals, because every increase in power is usually accompanied by an increase in complexity. There are a bunch of solutions with less power than a full programming language that can still get you most of the way there.
I'm doing this now, for a crappy language and a crappy processor. It's been a nightmarish hellscape of a project, but also very mind-expanding. Highly recommend.
(If you're interested in goofing around with Starfighter, you're going to get an opportunity to get handheld through a lot of this stuff.)
Assuming I could transfer the benefit of hindsight back to Joel's position in 2005, including all the knowledge of how the market has evolved over the past 10 years? I would've jumped on the SaaS bandwagon, hard, and converted the existing VBScript codebase to a hosted solution, discontinuing support for the PHP/Linux version and freeing the company up to migrate code as it wished on its own servers.
I recognize that this would've been a huge leap for anyone in 2005, when 37signals was basically the only company doing small-business SaaS and the vast majority of companies insisted that when they bought software, they actually bought it, with the source code and data sitting within the company firewall. Heck, when Heroku came out in 2007 I was like "Who the hell would use this, turning over all of their source code to some unnamed startup?"
But looking at how the industry's evolved, that's pretty much the only way they could've stayed relevant. Many companies don't even have physical servers anymore. That's the way FogBugz did evolve, eventually, but they were late getting there and had to back out all the existing Wasabi code and the fixes they'd made to keep it easily deployable (which was one of their core differentiators, IIRC; they were much easier to set up than Bugzilla or other competitors).
It makes me appreciate how tough the job is for CEOs like Larry Page or Steve Jobs, who have managed to stay at the leading edge of the industry for years. Larry was pretty insane for buying a small mobile phone startup called Android in 2005, but it turned out to be worth billions eventually.
Tangent: Your description of how people resisted SaaS a decade ago makes me wonder if the only reason the industry did eventually move toward SaaS was that most on-premises apps were such a nightmare to deploy. After all, some of the disadvantages of SaaS, such as lack of control over one's own data, are real. If Sandstorm.io had existed back in 2004, might we have avoided SaaS altogether? (Of course, if Sandstorm.io had existed back then, Fog Creek would still have needed to port FogBugz to Linux.)
I think the move to SaaS was a combination of factors:
1. The primary product of many companies got too large to deploy on their own server farms, and so they started moving toward AWS etc. for scalable hosting. Once your product is in the cloud, it makes sense to deploy your supporting infrastructure & tooling there as well, because otherwise you're paying the support, hosting, & sysadmin costs for just your non-critical corporate infrastructure.
2. Bandwidth became a non-issue. In the 1990s there was a very measurable difference between 10BaseT internally vs. an ISDN line to your hosting provider. In the 2010s, there's little practical difference between gigabit Ethernet vs. 10M broadband.
3. HTTPS became ubiquitous, taking care of many security risks.
4. Employees started to blur the line between work and home, leading to demand for work services that could be used, over an encrypted connection, from a user's home network. VPNs were a huge PITA to set up. This was a big issue for much of the early 2000s; one of my employers made some clever network software to punch through corporate firewalls with a minimum of configuration.
5. Development speed increased. SaaS companies could push new versions of their product faster, react to customer feedback quicker, and generally deliver better service. Because all customer interactions go through the company's servers (where they can be logged), they have much better information about how people are using their products. Deployed services were left in the dust.
tl;dr: #1-3 made lots of businesses go "Why not?", while #4 and #5 made them go "Yessss."
It's interesting that many of the arguments for why you should not use SaaS businesses now (like privacy, security, and lack of ownership) were relatively minor reasons then. I do kinda wish (in an abstract way) that something like Sandstorm would catch on, but I think they may be too early: SaaS just isn't that painful, and until we have a major shake-out where a lot of businesses get taken out because their dependencies go down, it seems unlikely that it will become so. Or the other way this could play out is that a new, powerful computing platform comes out that lets you do things that aren't possible with thin clients, and you see a rush back to the client for functionality.
All very good reasons. I'll add another - accounting.
The monthly bills for small purchases of SaaS fit within what can be expensed on a corporate card. By the time IT gets wind, the product has already infiltrated the organization. If there's a very large up-front cost, then IT is involved, you need a formal RFP process, lots of people weigh in, those opposed to the purchase can try to block it... As soon as "Put it on the corporate card" became viable, power moved back to the business units.
With Sandstorm, we could actually get that effect on-prem. Since no technical expertise is needed for deployment, the security model is so strong, and the IT department can manage resource quotas per user rather than per application, it's entirely reasonable that people outside of IT could be permitted to install software without IT approval.
Granted, it may take a while to convince IT people that this is OK, but fundamentally they have every reason to prefer this over people cheating with SaaS.
Actually, not that late. I think their main problem was that the environment changed around them. Besides SaaS, the whole developer ecosystem changed as well: when I look at who really won the bugtracking market, it's GitHub, who added it as a feature on code hosting.
If winning the bugtracking market was the goal, they probably would've taken VC money. You may notice that everyone who's in a position to make that claim has done so (GitHub, Atlassian, etc.).
They did learn from this, as you can see by the very different paths StackExchange and Trello are on.
Joel wrote an essay about this. [1] His basic thesis is that organic growth wins over VC when there are entrenched competitors, few network effects, and little customer lock-in. VC wins when there are wide-open markets, strong network effects, and strong customer lock-in. Stack Exchange's investment was consistent with this thesis [2].
The developer tools market changed from one with very few network effects to one with a large network effect around 2010. The drivers for these were GitHub, meetups, forums like Hacker News, and just its general growth - they made coding social. When I started programming professionally in 2000, each company basically decided on a bugtracker and version control system independently, and it didn't matter what every other company did. By 2015, most new companies just use git, they host on GitHub, and if they don't do this, they're at a strong disadvantage when recruiting & training up developers, because that's what much of the workforce uses.
Interestingly, both GitHub and Atlassian resisted taking investment for many years - GitHub was founded in 2007 and took its first investment in 2012, while Atlassian was founded in 2002 and took its first investment in 2010.
Right! And this isn't even a compiler, the way most people think of "compilers". It's a transpiler to three target languages, each of which has an extraordinarily full-featured runtime. Two of which are (a) widely available and (b), as source languages, awful.
And it's not their own language, it's an extension of VBScript. And now that the tools around C# are better and Linux support for .NET is official, they have used those tools to transition to C#. Like you, I don't get the outrage.
Do you think that the name Wasabi contributes to the outrage?
CoffeeScript has a similar name to JavaScript, so you can quickly draw an association between the two.
The name Wasabi doesn't have an obvious connection to the VBScript it's based on, which seems to be why people talk about them writing a whole new language, etc.
I've written some toy compilers, and I can at least say:
1. compilers have bugs
2. it really sucks not knowing if a bug is in your code or in your compiler
3. it sucks not having a source-level debugger
Anyone can write a simple compiler, just like anyone can make a simple database. The hard part (at least for a non-optimizing compiler) isn't the comp-sci theory, it's building the tooling around it, and the extensive testing needed to be sure you don't have subtle data-corrupting bugs lying around waiting to bite you.
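To illustrate what that testing can look like in miniature, here's a hedged sketch (Python; the toy expression language and all names are invented) of differential testing, where random programs are checked against a reference interpreter to flush out codegen bugs:

```python
import random

# Reference interpreter for a toy expression language: tuples of
# ("lit", n), ("add", a, b), or ("mul", a, b).
def interp(node):
    op, *kids = node
    if op == "lit":
        return kids[0]
    a, b = interp(kids[0]), interp(kids[1])
    return a + b if op == "add" else a * b

# "Compiler" backend: emit Python source for the same expression.
def compile_to_python(node):
    op, *kids = node
    if op == "lit":
        return str(kids[0])
    a, b = compile_to_python(kids[0]), compile_to_python(kids[1])
    return f"({a} {'+' if op == 'add' else '*'} {b})"

def random_expr(depth=0):
    if depth > 4 or random.random() < 0.3:
        return ("lit", random.randint(-10, 10))
    return (random.choice(["add", "mul"]),
            random_expr(depth + 1), random_expr(depth + 1))

# Differential test: the compiled output must agree with the
# interpreter on thousands of random programs.
for _ in range(10000):
    e = random_expr()
    assert interp(e) == eval(compile_to_python(e)), e
```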
I won't categorically reject the idea, for instance I think Facebook writing their HipHop compiler was completely defensible. But you need people with compiler experience, and people who know the pain of working with crappy, undocumented, buggy toolchains to make that decision, not people who once took a compiler course.
I've written like 3 compilers* and Wasabi seems like it was probably a reasonable solution for the problem they had at the time. Compilers just aren't that magically difficult.
There are very few situations where writing your own langue and toolchain is a good idea. I used to work on a proprietary company language that was actually a compiler generator for language-to-language translation, plus a bunch of other stuff, and it was a horrible pain.
Documentation? None
Online community? None
Transferability of skillset? None, apart from knowing how compilers work. Makes for good nerd conversation, but that's it.
Writing your own toolchain is almost as bad. I've seen multiple talented people leave companies I've worked at when they were forced to build and maintain horrible tools for the in-house ecosystem. Some too-big-for-his-britches second-system-as-a-first-system ass had written them, and everybody else got stuck with it.
As the other commenter noted, this seems like the epitome of bad software engineering, and I'm surprised employees put up with it if they were any good.
EDIT: I learned to program in assembly, so compilers didn't seem super mysterious to me as they are for someone who learns Java first perhaps.
Can't you say the same things about a proprietary database, or a proprietary template language? What are the kinds of computer science that we can safely deploy without taking extra precautions to document and maintain them?
Both of those should be looked upon with suspicion. I can't say "never do it", given that every employer I've ever worked at has had its own proprietary database, and one of the projects I worked on at Google was a proprietary template language. But all of them were a large maintenance burden, much larger than originally anticipated.
I think the old business adage about "In-source your core competencies, outsource everything else" applies here. If you derive a big competitive advantage from having a proprietary database or proprietary template, and it generates enough revenue to afford a dedicated team of experts to maintain it, build it. But if you have a bunch of smart & motivated developers who can build a proprietary database, but your product isn't databases or templates and your core differentiator isn't the performance or query patterns you get from building it yourself? Put them to work improving the product, and work with the infrastructure that other firms have built already.
I'd actually be way more suspicious of a proprietary database, unless there was a very compelling reason why none of the existing ones worked. Maybe this is just my inexperience in the field, but a database engine seems orders of magnitude harder to get right and maintain than a compiler (especially a transpiler, whose output you can even inspect by hand!).
Yes. Any proprietary system will require you to document it and educate the users, and you will not have the benefit of an online community to get help from, or bug fixes, or security analyses. There are very few problems where rolling your own solution is the right solution. Maybe if you are Google and no existing database is big enough for your data, or something.
If you have great people building the software, or at least competent ones, and you have competent users, you might succeed, maybe. But that's assuming you have a maintenance plan and a roadmap, which most software companies do not. Maintain software? YOLO! What happens when you have a bunch of morons using and maintaining the software?
In short, computer science in industry is largely practiced as shamanism by people who cannot engineer their way out of a crackerjack box.
"There are very few situations where writing your own langue"
Well, I can see how you might struggle there ;-)
Good-natured snark about spelling aside, part of the issue is that writing, documenting and maintaining your own language is only hard if your toolchain sucks.
If you're interested in writing a specialized language to solve a particular problem, take a look at PEG for JS, and either Racket or Common Lisp (the latter if you need native compilation).
I've recently been involved in the design and implementation of an English-like language for the expression of business domain concepts in web apps. It's a great approach if done thoughtfully and professionally.
That's probably the key, actually. The horror stories we hear are of the bad examples. And we all know that shitty tools, weak languages and bad documentation can come out of large software companies as commercial products as well.
I didn't take a course on compiler construction, and I don't remember whether my university's CS department even had one (it was a fairly mediocre CS department at a state university). Now I wish I had.
Do you think a good compiler course would prepare the student to do a project with the scope and complexity of Wasabi? For one project, I wrote an interpreter for a little domain-specific language, then later reworked that interpreter into an on-the-fly compiler (to Lua, to avoid double interpretation). But that's a long way from writing a compiler for a general-purpose language, that can do global type inference and produce human-readable output in a target language that's fairly different from the original VBScript (if not Wasabi itself).
The trickiest bit of Wasabi is the type inference, which I admit is not "production-ready" (or "good code") because we basically invented it from scratch. If I were to do it now, I would know just enough to realize that I need to read about Hindley-Milner rather than reinvent the wheel.
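For a sense of what reading about Hindley-Milner buys you, here's a toy sketch (Python; all names invented, nothing like Wasabi's actual code) of the unification step at the core of that style of type inference:

```python
# Toy unification, the core move in Hindley-Milner style inference:
# make two types equal by growing a substitution, or fail.
class TypeVar:
    _n = 0
    def __init__(self):
        TypeVar._n += 1
        self.name = f"t{TypeVar._n}"
    def __repr__(self):
        return self.name

class TypeCon:
    """A concrete type, e.g. Int, Bool, or Fn(arg, ret)."""
    def __init__(self, name, args=()):
        self.name, self.args = name, tuple(args)
    def __repr__(self):
        return self.name + (str(list(self.args)) if self.args else "")

def resolve(t, subst):
    # Chase substitutions until we hit a concrete type or a free variable.
    while isinstance(t, TypeVar) and t in subst:
        t = subst[t]
    return t

def unify(a, b, subst):
    a, b = resolve(a, subst), resolve(b, subst)
    if a is b:
        return
    if isinstance(a, TypeVar):
        subst[a] = b
    elif isinstance(b, TypeVar):
        subst[b] = a
    elif a.name == b.name and len(a.args) == len(b.args):
        for x, y in zip(a.args, b.args):
            unify(x, y, subst)
    else:
        raise TypeError(f"cannot unify {a} with {b}")

# Infer that an identity-like function applied to an Int returns Int:
Int = TypeCon("Int")
arg, ret, subst = TypeVar(), TypeVar(), {}
unify(arg, ret, subst)      # identity: result type equals argument type
unify(arg, Int, subst)      # the argument is an Int literal
print(resolve(ret, subst))  # -> Int
```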
Producing human-readable output is an exercise in tedium and bookkeeping, not any particular amount of skill or brilliance.
Thanks for confirming my guess that the type inference was the trickiest part. These days, I guess Flow (http://flowtype.org/) would also be worth studying. Edit: Or PyPy's RPython.