Project 4: vectors, set!, and loops
Project due date:
Download the starter files here
This project is inspired by Essentials of Compilation (EoC), but in this project I take a departure from the book’s content in several important ways. The language in this project, R3, is an imperative-style language with loops, but at the core the important thing happening is dynamic memory allocation on the heap. In the EoC book, this is covered alongside garbage collection–in my class (at least this iteration) I prefer to skip implementing garbage collection: you need to know C well (to do it correctly), and the algorithms are both tricky to describe and challenging to implement and debug in class where we have not covered C programming. Thus, I opt for a simpler route: just use malloc, and (optionally) get garbage collection by using (e.g.,) the Boehm garbage collector; I also leave something such as arena allocation as a stretch goal, which you should pursue if you are a PhD student or interested to learn more.
Input Language (R3)
this language, which I am calling R3, adds imperative features to R2/LIf. Specifically, we add the forms:
(let* ([x e] ...) e-b) ;; let*, allows multiple simultaneously definitions
(void) ;; operations like vector-set! return void
(make-vector i) ;; Allocate a vector of size i, initialize all entries to 0
(vector-ref e i) ;; Dereference the vector e at index i, i must be static
(vector-set! e i ev) ;; e[i] := v
(set! x e) ;; Any variable can be set!d
(while e-g e-b) ;; While e-g, do e-b, eval to (void)
(begin es ... e-last) ;; Do all of es..., eval to the result of e-last
As an example of an R3 program…
(program
(let* ([x (read)]
[i 0]
[acc 0])
(begin
(while (< i x)
(begin
(set! acc (+ acc i))
(set! i (+ i 1))))
acc)))
The major extensions are these:
-
We have a new literal, called
(void). This is sometimes called the “unit value,” and carries no computational information, i.e., control-flow never depends on the value of its contents; in an ideal world we would eliminate voids entirely via optimization, but their inclusion allows us to answer questions like: “what should vector-set! return?” -
Programs may allocate vectors using
(make-vector i), which zeros all entries of the vector. These get translated into calls to an allocation function inruntime.c. -
Programs may perform vector operations, but these operations are limited to constant indices (i.e., we can’t compute the index that we want to write / read via
vector-ref/vector-set!))–we may generalize this in subsequent projects. -
All variables may be mutated via set!, our language is no longer pure. We handle this by “boxing” every variable, a process we describe below.
-
Mutability goes hand-in-hand with loops, in the sense that mutable variables are not much fun unless you add loops. So we add a form,
(while e-g e-b), which continually evaluates the loop guarde-gand executes the bodye-b(discarding its result) until the loop is finished.(while ...)reduces to(void).
Repository Layout
compile.rkt– Your pass implementations. You will edit functions provided here. -> This is the only file you will edit! The rest are read-onlyirs.rkt– IR definitions and predicates likeanf-program?,c1-program?, etc. (see also typed/shrunk variants)interpreters.rkt– Reference interpreters for several IRs (used by tests and for your own debugging).system.rkt– System/ABI configuration, pass names, runtime filenames, output paths, etc.main.rkt– Driver that runs all passes, can build a binary, and can launch a debug server.test.rkt– Test harness. Runs isolation tests or end-to-end native tests depending on-mmode.runtime.c– Minimal runtime (read_int64,print_int64, etc.).test-programs/– Example programs (.scm).input-files/– Input streams for programs (lines of integers).goldens/– Instructor goldens (IR snapshots, interpreter outputs, and stdout baselines).
Please do not rename files or directories (grading infra depends on them).
The Compiler Pipeline (passes)
We skip typechecking in this project; programs are assumed type-correct. If not, behavior is undefined (e.g., segfaults). You’ll implement the following passes in compile.rkt. Each pass must produce an output that satisfies the corresponding predicate in irs.rkt, and preserve behavior across interpretable stages.
-
shrinkRemoves binary-,and/or, and<=, >, >=by rewriting toeq?/</if. Preserveslet,while, vectors,set!, andbegin(canonically asletchains). Rewriteswhileso that it appears only in non-tail position:(while e-g e-b) => (let ([_ (while e-g e-b)]) (void))Input:R3?→ Output:shrunk-R3? -
uniqueifyAlpha-renames so that each bound identifier is written exactly once. Output program’s info field is().
Input:shrunk-R3?→ Output:unique-source-tree? assignment-convertBoxes variables. Rewrites:let→(make-vector 1)+vector-set!to initialize- uses of
x→(vector-ref x 0) (set! x e)→(vector-set! x 0 e)Loops and vector ops remain.
Input:unique-source-tree?→ Output:assignment-converted-program?
-
anf-convertConverts to ANF: RHS are atoms or small ops over atoms; side-effects sequenced vialet ([_ (vector-set! …)]) …andlet ([_ (while …)]) …;ifguards are atomic. Input:assignment-converted-program?→ Output:anf-program? -
explicate-controlLowers ANF to C2: a map from labels to tails withseq (assign x rhs) …,if (cmp a0 a1) (goto lt) (goto lf),goto,return a, and standalonevector-set!.whilebecomesheader/body/restlabeled blocks. Input:anf-program?→ Output:c2-program? -
uncover-localsComputes the set of locals (variables assigned or used as vector operands) for stack layout; carries it as the program’sinfoset. Input:c2-program?→ Output:locals-program? -
select-instructionsSelects pseudo-x86 instructions still using(var x)operands. Usescmpq+set{e,l}+movzbqfor comparisons, callsread_int64/make_vector, and onreturnjumps to a specialconclusionlabel (no earlyret). Input:locals-program?→ Output:instr-program? -
assign-homesReplaces(var x)with stack slots(deref (reg rbp) n)using the locals set; negative 8-byte offsets. Input:instr-program?→ Output:homes-assigned-program? -
patch-instructionsFixes illegal instructions (e.g.,movq (deref …) (deref …), certaincmpqforms) by shuttling through a temp register. Input:homes-assigned-program?→ Output:patched-program? -
prelude-and-conclusionInserts function prologue/epilogue and theconclusionblock that prints%rax, sets%rax := 0, and returns. Ensures 16-byte stack alignment beforecallqand reserves aligned space for locals. Input:patched-program?→ Output:x86-64? dump-x86-64Renders GAS assembly text. Emits a global for the entry symbol (rt-sym), includesruntime-function-externs, and prints operands as$imm,%reg,disp(%reg). Input:x86-64?→ Output:string?
Your task: implement these passes so each output satisfies its predicate in irs.rkt and preserves semantics (the harness checks using interpreters where applicable).
“Boxing” and assignment-conversion
In this project we embrace mutability by enabling set!, which allows
us to make mutable updates to variables:
(program
(let* ([x (read)]
[y (read)]
[tmp 0])
(begin
(set! tmp x)
(set! x y)
(set! y tmp)
y)))
Notice that we can use set!, which mutably updates a variable; this
is not possible in “pure” functional programs, and mutation is simply
not provided in those languages (at least as a builtin). The question
is how to implement set!. It may seem like an obvious choice: each
variable becomes a single register-sized (8-byte) value on the
stack. However, this approach starts to fall apart when variables
outlive their stack frames (which will happen when we add functions /
procedures). Instead, we use this as an opportunity to discuss another
necessary challenge to work towards a general-purpose programming
language: heap-allocated data. You likely know about the heap from
programming in C with malloc (or C++ with new). These procedures
allow us to allocate memory from a large block of available memory
(the heap), with the obligation that we later release (“free”) it back
to the operating system.
To enable mutability and set!, we will employ a technique called
“boxing.” In short, we assume that all data is put in a vector
(“box”) on the heap. In our case, we will use a very simple
representation of vectors:
typedef struct {
int64_t len;
int64_t data[];
} TinyVecPacked;
In terms of memory representation, a TinyVecPacked struct has a very
simple layout: a single 64-bit value representing the length, followed
by a data array of 8-byte values. For example, the vector containing
the sequence 5 followed by 20 (starting at the address 0xAAAA00)
would be represented as:
lower addresses → higher addresses →
┌───────────────────────┬───────────────────────┬───────────────────────┐
│ len (int64) │ data[0] (int64) │ data[1] (int64) │
│ 8 bytes │ 8 bytes │ 8 bytes │
├───────── ─ 0xAAAA00 ──┼────────── 0xAAAA08 ──┼──────── 0xAAAA10 ────┤
│ e.g., 2 │ e.g., 5 │ e.g., 20 │
└───────────────────────┴───────────────────────┴───────────────────────┘
The length can be used to avoid accesses outside of the buffer: we don’t want to be able to write outside of the bounds of a buffer because it could contain junk data, we want to be able to write outside of the bounds of a buffer because we could mess up other data, unrelated to this vector (but still within our same address space, assuming the OS provides virtual memory). To handle this, we could take one of three positions:
(a) don’t worry about it–just be unsafe, and let the program crash.
(b) check the bound before each access, if a bad access would occur, jump to an exception handler.
(c) use a static type system to ensure this could never happen.
Each of these approaches has different trade offs: C, C++, and similar
languages opt for (a), which is fast but risky–the program might
crash in a subtle or even exploitable way. Managed languages like
Java, C#, etc. often opt for (b), but can sometimes optimize to avoid
checks if certain conditions are met (e.g., using profiling data and
just-in time compilation). The issue with (b) is that it is slow:
every single access necessitates a branch, compared with (a) which
might just be as simple as movq. Languages like Rust use (c), which
enables the benefits of (b) as a zero-cost abstraction, often matching
(or even exceeding) the performance of (a).
In this project we will do (a): we assume the program is type correct. This means that as you generate code, you assume that operations for various builtins are also type correct. You will not be given any programs to test which contain type errrors.
In short:
- To implement
make-vector, you will ultimately emit a call tomake_vectorinruntime.c. - To impelment
(vector-ref v i), you will emit code that usesmovqto dereference the vectorv. You want to movq the value at 8(i+1), the +1 is due to the fact that the first 8 bytes of every vector holds its length. You could also consider emitting a *safe vector reference, checking the bounds each time (but it is not necessary in this specific case, because we don’t test any offensive programs). - To implement
vector-set!, you will also emit amovqinstruction.
Translating Loops
A while e-g e-b lowers to three blocks:
header: evaluate guard, then if (eq? a-g #f) (goto rest) (goto body).body: code for e-b, then goto header.rest: continuation after the loop.
In ANF/C2 we sequence any side-effects (e.g., vector-set!) via seq and keep the tail forms restricted to return, goto, or a single if.
IR Predicates and Interpreters
-
Predicates (
irs.rkt): Each pass has a precise predicate (e.g.,anf-program?,c1-program?, etc.). Tests check both input and output predicates at every step to catch malformed IR early. You should read and understand each predicate. -
Interpreters (
interpreters.rkt): For many stages we run the interpreter over your IR to check semantic consistency—different IRs must compute the same result on the same input stream. The harness reports whether outputs match and whether stdout across passes is consistent. Please do not modify this file.
How to Compile and Run
To directly compile a single file using main.rkt:
racket main.rkt test-programs/<prog>.scm
The compiler dumps a long summary of the work done at each pass, and
ultimately yields a program ./output which it compiles using
system-specific infrastructure (works on Linux/Mac).
Example:
$ racket main.rkt test-programs/example.scm
Compiling IR (using *your* compile.rkt) …
'(pushq (reg rbp))
...
🏗️ Now building a binary... 🏗️
-> Host: macosx/x86_64 Target: x86_64-apple-darwin Entry: main
/usr/bin/clang -target x86_64-apple-darwin -Wall -O2 -c ./output.s -o ./output.o
/usr/bin/clang -target x86_64-apple-darwin -Wall -O2 -c ./runtime.c -o ./runtime.o
/usr/bin/clang -target x86_64-apple-darwin -Wall -O2 ./output.o ./runtime.o -o ./output
Success! Executable produced at: ./output
Done!
$ ./output
42
Programs that (read) input will prompt or consume stdin. You can use
input redirection:
./output < input-files/1.in
Debugging Guide
This is a tricky project, and it is really important that you lean into a debugging methodology that works for you. Let me share with you some advice I used when I was writing the compiler myself.
- First, I started by writing a single manual test in the top-level of
compile.rkt, so that I could easily just sit at the command line (or Dr. Racket) and runcompile.rkt(with no command-line arguments) and see if the test would pass. At the early stages of debugging, this is an excellent strategy, since it means fully-pushbutton test. For example, I had (appended tocompile.rkt) something like:
(displayln ;; or pretty-print
(dump-x86-64
(prelude-and-conclusion
(patch-instructions
...))))
-
I started with a fairly large test in my case–since I was merely adding forms not present in my previous implementation. In this case, I would often be hitting match failures–this is a good thing, it allows me to trace down exactly where I need to add match cases and handle new behavior. Of course, the issue is that I also need to mix that with thinking holistically about the specs of each IR.
-
After I thought I had each pass of the compiler working, I started switching over to
main.rkt, which will run all passes of the compiler and will report their outputs. I also needed to write the interpreters (you do not), and so I debugged some of those usingmain.rkt. -
Last, once things are acutally working, I used
test.rkt, which I adjusted to account for the possibility of expected type errors in malformed inputs. -
My advice to you is similar: start with something where you can press “Run” (or continually reinvoke
racket compile.rkt). It will facilitate rapid testing, and it is really important to build some skill and intuition for how to accomplish that exercise. -
Remember, debugging is a key concept that you are practicing in this class. The worst possible thing you can do is “guess and check” debugging (running the tests and hoping they pass)–the issue is that doing this gives you very little observability into why the code is broken. To fix this, you need to have some way of interacting with the code. Experts debug their code using an iterative, hypothesis-driven methodology, where they (a) articulate a falsifiable hypothesis (“this match pattern never matches anything”), (b) change their code to observe the bug (“add a displayln at every match handler”), and (c)
Tricky Parts of this Project
BE CAREFUL of the following:
-
There is no type checking.
-
In this case,
explicate-controlneeds to be updated, because when generating loops, you need to “take hold” of the return value from the translated block and use it for some purpose (e.g., branch on it, or ignore it and jump back to the header). I have generalizedexpr->blocksto accept a third argument,k, which is a continuation that gets applied to the eventual return value. -
Non-loop conclusion flow: select-instructions must end “returns” by jumping to conclusion; don’t emit ret in the middle of blocks.
-
Indices must be static:
vector-ref/vector-set!indices are fixnum literals at compile time in this project. -
On Mac OSX, you need to prefix labels for functions, meaning
_maininstead ofmain. To facilitate this, I provide a function(rt-sym s), which converts a symbol to a system-specific variant (based on the OS). However, you only need to worry about this for labels that correspond to exported function entrypoints, not other labels that are internal to the program (e.g., the target of a jump, or evenconclusion). This is important, because you don’t want to call(rt-sym ...)until the very last pass (it is ugly to make the previous passes OSX/Linux-specific). The issue is this: if you naively rename every label fromfooto_foo, then you also need to ensure that you clean up jumps so that instead of jumping tofoo, they go to_foo. In my implementation, I handled this as follows: I simply used(rt-sym ...)on the(entry-symbol)(i.e.,main), which is the only function in this project. -
Ensure 16-byte stack alignment before
callq
Testing Infrastructure
The file test.rkt runs a set of formal tests, as in the last
project.
frontend– runstypecheck→anf-convert, checks ANF via interpretermiddleend– runsexplicate-control→uncover-locals, checks viainterpret-c1backend– runsselect-instructions→patch-instructions, checks viainterpret-instrnative– full compile → build binary → run with stdin → compare stdout to golden
Usage:
racket test.rkt -m <mode> -i <comma-separated input files> -g <comma-separated goldens> <program.scm>
Example:
racket test.rkt -m native -i ./input-files/1.in -g goldens/example_1_uniqueify.stdout test-programs/example.scm
Goldens: What They Are
We use goldens to verify correctness:
- Isolation goldens – for non-native modes:
- Compare serialized ASTs and interpreter stdout
- Native goldens – for full pipeline:
- Run your compiled binary, compare stdout
Layout:
goldens/<prog>_<n>_<pass>.in-astgoldens/<prog>_<n>_<pass>.out-astgoldens/<prog>_<n>_<pass>.interpgoldens/<prog>_<n>_<pass>.stdout
Instructor-only gengoldens mode regenerates these.
Autograder Tests
The autograder invokes test.rkt in JSON mode. Your job is to make
each pass satisfy its predicates and produce semantically equivalent
IRs until final native code passes all tests.