Writing a compiler for a DSL in Haskell using Parsec
This winter, I took Compiler Design with Prof. Laurie Hendren.
Ever since I began programming, I’ve been interested in programming languages. Languages (both the natural kind, like English (the rules of which you’re using to parse this sentence inside your head (unless you’re a bot, in which case 01101000 01101001 00100001)) as well as the programming kind like Java or C) are tools which people use to make Stuff. Once you understand how to use the symbols of a language, you can compose them in arbitrarily complex patterns to write arbitrarily complex things like a novel or a search engine. Programming languages, in particular, are reminiscent of wizardry (to use a somewhat tired cliche). Invoking a language is spell-casting.
Now, knowing how to use a language (or a spell!) makes you powerful. But True Power comes from knowing the meta-language, and True Wielders of Awesome Power are language-writers like Richard Stallman, James Gosling and Grace Hopper; people who invent programming languages, who give our imagination hammers and chisels, who define the way we create things, who give us the abstractions we need to build the things we do.
Yay italics.
I wanted to be a True Wielder of Power. I wanted to learn how programming languages work, and how they’re created, and what a compiler does. Also, Steve Yegge says that students who don’t take Compilers “run the risk of forever being on the programmer B-list” (this is also good reading if you’re debating whether or not to take Compilers; it’s what convinced me!), and by golly, that was not going to happen to me. I’m a Good Programmer Software Engineer! I’m not just a JavaScript junkie who strings together APIs (jk <3 js); I want to know how things work From The Inside! I wanted to be free from the tyranny of not knowing what the JVM is or does, or what it means to transpile or cross-compile, etc.!
Anyway.
The course was fun and hard and I learned loads. A massive component of the course was a project where you actually write a compiler for a high-level source language. Two source languages were on offer: a subset of Go, or OncoTime, an experimental domain-specific language for radio-oncology research data (phew). My friends Yusaira and Brendan and I decided to implement the latter because OncoTime is still under development, so we would get to make language-design decisions and be creative and all that instead of just implementing a language spec.
We ended up doing a good job, I think. We implemented our compiler in Haskell (using Parsec, a monadic parser combinator library), and compiled down to JavaScript. We made some interesting modifications to the traditional compiler pipeline. You can check out the source code for the compiler at the GitHub repo and read about the design decisions we made here.