Project 5: Functions, Lambdas, and Closures ↦ x86-64
Project due date:
Download the starter files here
This is the final language that everyone in the class will
implement. We’ll implement a language, R4, which includes functions
(define (f ...) ...) and application of user-defined functions (f
...). As an extension, the language R5 will also add lambdas, and
closures, via closure conversion covered in this class YouTube
video. The R5 extension
is extra credit, but the R4 baseline is a seriously impressive
language, with an impressive degree of expressivity.
Still, however, there are some painfully obvious things this language omits, which represent extensions I encourage you to implement on your own (as part of your final project, if you are in CIS531):
-
On the easier end, we don’t even have things like multiplication
*, modulo/, etc. built into the language. Adding these would be trivial but a bit of work. -
Our program can still display only a single value: generalizing this would require either (a) adding strings (which has its own set of runtime choices, if we allow ourselves to dynamically allocate strings) or (b) adding some primitives like
(print_int ...)and(print_char ...). Adding just these two could allow us to print whole lines. To get format strings, the easiest way is to write a runtime function inruntime.c -
On the more concerning end, we have no real type/memory safety in this language, even still–making it a very poor choice for 2025. In the last language, typechecking was easy because we didn’t have functions. For this language, the obvious “safe” choice is to do runtime type tagging and dynamic types, which would be the subject of the next mandatory project (if we had time). You could also imagine using a typechecker, but it starts to get complicated once we mix functions, vectors, etc. unless we are okay doing type annotations and skipping polymorphism, which is a practical limitation as we must monomorphize.
Congrats on making it to this point, by the end of this project you’ll have completed the whole (core) language!
Input Language (R4/5)
All students must implement the language R4, which extends our
language to include top-level (but not nested) defines. Compared to
R3 and the previous languages–where our program was a single
expression–our program now consists of a list of functions, followed
by a single “main” expression. When our program is compiled, we handle
this “main” expression a bit differently (we describe more below):
defn ::= (define (f x ...) e) ;; Fixed-arity top-level definitions
program ::= (program defn* e)
As an example of an R4 program…
(program
(define (fib x)
(if (eq? x 0)
1
(if (eq? x 1)
1
(+ (fib (- x 1)) (fib (- x 2))))))
(let ([x (read)])
(if (< x 8) (fib x) (fib 8))))
The major extensions are these:
(1) Definitions: to handle top-level functions, we need to extend all passes of the compiler to work on a per-definition basis. This is a pervasive change, but the amount of intellectual work is rather small: compilation happens on a per-function basis, and so it’s as simple as ensuring we map a per-function handler across each definition (more details about this below).
(2) Function calls: the syntax of expressions is extended to include
calls to functions. In the simplest case, this is something like
(f x), but in general the function could be computed like ((f g) x).
(3) Lambdas (R5): We support these via closure conversion, on which I have a video lecture. The testcases that include lambdas are bonus: if you’re not attempting the bonus, you can just make this pass the identity function.
Repository Layout
compile.rkt– Your pass implementations. You will edit functions provided here. -> This is the only file you will edit! The rest are read-onlyirs.rkt– IR definitions and predicates likeanf-program?,c1-program?, etc. (see also typed/shrunk variants)interpreters.rkt– Reference interpreters for several IRs (used by tests and for your own debugging).system.rkt– System/ABI configuration, pass names, runtime filenames, output paths, etc.main.rkt– Driver that runs all passes, can build a binary, and can launch a debug server.test.rkt– Test harness. Runs isolation tests or end-to-end native tests depending on-mmode.runtime.c– Minimal runtime (read_int64,print_int64, etc.).test-programs/– Example programs (.scm).input-files/– Input streams for programs (lines of integers).goldens/– Instructor goldens (IR snapshots, interpreter outputs, and stdout baselines).
Please do not rename files or directories (grading infra depends on them).
The Compiler Pipeline (passes)
The compiler is designed in passes, which go:
--> R4/R5? -- Source program (R5 is extra credit) <INPUT>
|
+-> shrunk-R5? -- Core R4/5 (removes syntactic sugar / extra forms) [shrink]
|
+-> unique-source-tree? -- Every bound identifier is written exactly once [uniqueify]
|
+-> revealed-functions-program? -- Makes all calls explicit via fun-ref and app [reveal-functions]
|
+-> assignment-converted-program? -- Eliminates set!; variables become 1-slot vectors [assignment-convert]
|
+-> closure-converted-program? -- Lift lambdas to top-level defines, allocate closures [lift-lambdas]
|
+-> limited-arity-program? -- Rewrites >6-arg functions to pass the rest in a vector [limit-functions]
|
+-> anf-program? -- A-Normal form (flattening nested expressions) [anf-convert]
|
+-> blocks-program? -- (Formerly C2) blocks of sequences of commands, if, gotos, calls [explicate-control]
|
+-> locals-program? -- Uncovering local variables for each function [uncover-locals]
|
+-> instr-program? -- Pseudo-x86, flattened blocks of instructions over pseudo-vars [select-instructions]
|
+-> homes-assigned-program? -- Assigns variables to stack locations (rbp-relative homes) [assign-homes]
|
+-> patched-program? -- Patches illegal x86 forms (e.g., mem-mem moves, bad leaq forms) [patch-instructions]
|
+-> x86-64? -- Final x86-64 IR with prologue/epilogue and printing logic [prelude-and-conclusion]
|
+-> string? -- Rendered as GAS assembly text suitable for writing to a .s file [dump-x86-64]
The following passes are NEW:
-
reveal-functions, which exposes calls to functions like(f ...), differentiating it from built-in functions like+by rewriting it to(app (fun-ref f) ...). -
lift-lambdas– this is an extra credit (required for PhD students) pass which performs closure conversion, enabling us to use lambdas anywhere we could otherwise call a function, and even return lambdas from other functions. -
limit-functions– this pass limits functions to six arguments, which avoids having to look in the callee for the rest of the variables. We do this by taking functions with greater than six arguments and putting the rest in a vector, which is constructed at each callsite and deconstructed upon function entry.
The rest of the passes are updated in a very systematic manner: generally speaking, you need to map your previously-single-function logic across multiple functions.
Implementing defines and applications
The major innovation of this project is to enable functions at the top level and (optionally) to perform lambda lifting. The most common change that our compiler will face is this: in our previous IR, we had an IR like `(program ,info ,code), where info and code got filled in throughout the compilation to be increasingly lower-level–but at each pass, we always matched with this pattern. But now, we can’t do that anymore–we need to generalize every pass. This sounds like a lot of work, but generally it ends up amounting to changing our code from something like this:
(match p
[`(program ,info ,e)
`(program ,info ,(process e))))
to something like this:
(define (per-defn defn)
(match-defne defn `(define ,info (,f ,formals ...) ,code))
`(define ,info (,f ,@formals) ,(process code)))
(match p
[`(program ,info ,defns ...)
`(program ,info ,(map per-defn defns))])
See what we did? We wrote a per-definition handler, (per-defn ...),
which does the bulk of the processing that we did before–then, we
mapped over that per-definition function for each of the
definitions. This trick is very common, and basically I suggest you
use this idea for every single pass. It seems like a lot of work, but
it’s not so bad once you get used to it, and almost all of the logic
from the previous project is isolated to the per-function
translation. That makes sense, because in general compilation happens
on a per-function basis.
The shrink pass assumes that the program has the structure (program
(define (f ...) ...) ... e-main), and transforms it so that e-main
is wrapped in a special top-level main block (you should use
(entry-symbol), as before). e may then call the functions being
defined, and its result is finally printed to the screen. uniqueify
works in mostly the expected way, adapted to map a per-defn function
across each. The new stuff is the following three passes:
-
reveal-functions– Syntactically distinguishes calls to user-defined functions like(f x ...)from builtin functions like(+ ...). We often want to handle these different ways, soreveal-functionscollects up all of the function names, forms a big set, and walks over the program to replace any call tofwith an(app (fun-ref f) ...). Later on,(fun-ref f)will be translated into aleainstruction to get the function pointer (which has to be determined once the program is loaded) -
lift-lambdas(EXTRA CREDIT) – Does closure conversion. Closure conversion is a topic we have covered a lot in class, there is also a course video, so I will skip the explanation here: closure conversion eliminates lambdas by lifting them to top-level defines, but it also allocates / unpacks closures: syntactic lambdas in the input language get translated into closure (vector) allocations in the output. -
limit-functions– Converts functions with greater than six arguments into functions of exactly six arguments, by passing the rest of the arguments through a vector. This is easier than the typical calling convention, which locates the rest of the arguments on the stack
Assignment conversion performs boxing and unboxing, eliminating set!
as usual. It sits between reveal-functions and lift-lambdas. There
are some concerns in assignment-convert to be mindful of, in this
project: we need to make sure that arguments to functions (and
lambdas) also get copied into a box.
ANF conversion is mostly the same, now dealing with only top-level
definitions (no need to handle lambdas). Explicate-control does not
significantly change: we handle calls to functions just as we handled
calls to things like read before. In emitting code at the lower
passes, we generally only need to handle new instructions. The main
significant intellectual change happens in select-instructions,
which needs to move all of the function’s arguments into the
appropriate registers. Arguments (at most six, due to
limit-functions) are passed in order in the registers
(argument-registers-list) (defined in system.rkt). You should use
that function, rather than hardcoding a list of registers (but they
are rdi, rsi, rdx, rcx, r8, and r9). At the select-instructions
phase, I instruct you to emit a function named conclusion in every
function of the form '((retq)) (i.e., a single return
instruction). In the pass prelude-and-conclusion, we fix this up by
calculating the requisite stack space to allocate, allocating it and
uniqueifying the block name from conclusion to conclusion183...
(i.e., a gensym).
One last point: check out the new irs.rkt. Notice how there is no
top-level metadata in the program anymore: no (program ,info
...). That’s because in this project, all information is either (a)
function-specific or (b) can easily be figured out by looking at the
definitions (e.g., collecting up all function names, since you know
they’re all of the form (define (,f ...) ...)). So in general, the
IRs will look like:
'(program (define (f x0 x1 ...) e-b0) (define (g y0 ...) e-b1) ...)
As mentioned above (this is such a key point I feel I have to repeat it), the general idea is to either map / fold over this list of definitions.
Pass-by-Pass Implementation Guide
Looking at my reference implementation, compared to my solution for the previous project, I will call out the major things that I see as relevant to know on a pass-by-pass basis.
shrink
First, shrink needs to support both function calls and (R5 only) lambdas. Second, this pass expects an input like:
(program (define (f x ...) e) ... e-main)
I.e., the final item in the list is the main expression in the
program. Every program in R4 is a list of definitions (possibly empty)
followed by a final “main” expression. The shrink pass needs to wrap
this in a main function:
(program (define (f x ...) e) ... (define (main) e-main))
uniqueify
This pass needs to change in two ways: it needs to map the rename
function across each definition, taking the variable names into
account. Second, it needs to accomodate function calls (simple: just
map rename across each argument / the function expression) and
lambdas: be sure to be mindful that lambdas establish bindings and
thus need to be handled similarly to let.
reveal-functions
This is a new pass: the purpose of this pass is to change function
applications like (f x y z) into (app (fun-ref f) x y z). The
reason this is important is basically this: we will sometimes want to
compile user-written functions in a specific manner (specifically:
tacking on an extra closure parameter, in the case of closure
conversion), and so we need to have some way to look at every function
call site and determine: is this a call to a user function, or a
builtin function? Also, at runtime, we represent functions using
function pointers, and we look up their values using the load
effective address (lea) instruction: we add (fun-ref ,f) as a new
kind of atom at this IR level.
This pass is pretty straightforward to write, it has two basic steps:
- Collect all of the top-level function names into a big set. You can do this by pattern matching:
(match p
[`(program (define (,names ,params ...) ,bodies) ...)
;; names is a list of function names
(define all-names (list->set names))
- Walk over every expression (in every body) and rewrite every plain
variable to
(fun-ref ...)when it is in this set: re-usages of the same name will already have been uniquified, so this is not too hard to write as a recursive function.
assignment-convert
-
Supporting
appis easy: we mapa-cacross all the things being applied. -
Supporting
lambdais a bit harder, because you need to ensure that we box all formal arguments passed into the function. I do this using a helper function
;; box-formals: helper function (use for defines and lambdas)
;; transform (lambda (x y z) ...) => (lambda (x123 y235 z523) (let ([x (vector x12)]) ...)
;; genformals and realformals are lists of symbols
;; e-body is an expression which will be used when no vars left
(define (box-formals genformals realformals e-body)
(match genformals
['() e-body]
[`(,x . ,rst)
`(let ([,(first realformals) (make-vector 1)])
(let ([_ (vector-set! ,(first realformals) 0 ,x)])
,(box-formals rst (rest realformals) e-body)))]))
lift-lambdas
This is the closure conversion pass. The closure conversion pass is
explained in this video: https://www.youtube.com/watch?v=zMtaXO_xHYU
Additionally, the course slides give some detailed coverage of the
ideas. The basic idea is this: every function in your program is
annotated with an extra clo argument. We will implement flat
closures, where a closure is a vector of (a) a function pointer
followed by (b) the closure’s free variables in some canonical
order. I recommend implementing bottom-up closure conversion, you walk
over each expression (in each definition, don’t forget that) to emit a
set of “lifted” lambdas (top-level defines which accept a clo
argument along with the rest of the lambda’s arguments) which are then
spliced on to the program. Additionally, lambda lifting ensures that
each syntactic lambda in the program is compiled into an allocation of
a vector and then population of it with the function pointer and free
variables (properly allocating / building the closure).
limit-functions
This pass is relatively small as well, it has a very simple purpose:
-
Every function that accepts >6 arguments should be rewritten so that its last argument is a vector, and the function’s body should be rewritten so that the first thing it does is unpack the variables via this vector.
-
Every function call that uses >6 arguments should be rewritten to allocate a vector, populate it, and pass it as the last argument.
anf-convert
- ANF conversion needs to be extended to support application:
(app ,e0 ,e-rest ...)needs to first converte0to an atom, then convert each one ofe-restto a list of atomse-vs. Finally, yougensyma new symbolxand return(let ([,x (app ,a0 ,@a-vs)]) ,(k x)). To walk over the expressions, I used a helper function that looks like:
(define (handle-rest e-rest as)
(match e-rest
['()
;; base case: gensym x and generate an app
[`(,hd . ,rest)
(convert-expr
hd
(lambda (a) (handle-rest rest (cons a as))))]))
explicate-control
This pass has almost no changes:
-
Map over each top-level definition using a
(per-defn ...), that appliesexpr->blocksto each function body. -
expr->blockssimply turns the(let ([x (app ...)]) ...)into(assign ,x (app ,a-f ,@a-args))
uncover-locals
- I gave this pass to you again, it is fairly simply, I have updated it in the expected manner.
select-instructions
-
This pass needs to be updated to support
(seq (assign ,lhs (app ,a-f ,args ...)) ,next). This should:- Copy all arguments into registers. Use the list
(argument-registers-list)insystem.rkt, which gives the list of registers<->arguments in order. - Make an
(indirect-callq ...)to the function atom - Move the result register
%raxintolhs
- Copy all arguments into registers. Use the list
-
This pass also needs to be updated to support
fun-ref, which will only over occur in the form(seq (assign ,x (fun-ref ,f)) ,rest):- This should be translated to a
leaqinstruction using thefun-refas an argument
- This should be translated to a
-
In this pass, we also generate a
conclusionblock. In our last project, we just left this empty. In this project, you should set it to just((retq))(i.e., a list with a single instruction,(retq)). In theprelude-and-conclusionpass we will generate the prelude / conclusion.
assign-homes
- This pass needs to be updated to support
leaq(which will always have a fun-ref as its argument) andindirect-callq
patch-instructions
- Also needs to be updated to handle
leaq, which requires its destination is a register.
prelude-and-conclusion
This pass needs a bit more work, due to the following reason: in our
previous pass, we used a block name conclusion. We rename the block
from conclusion to a globally-unique name. Perhaps it is exemplary
of bad design that we need to do it this way: it might be something we
change in future semesters, but for now I did something like this:
- I implemented a function,
rename-conclusion:
(define (rename-conclusion blocks name)
(define (h instr)
(match instr
[`(jmp ,blk) #:when (equal? blk (conclusion-block-name))
`(jmp ,name)]
[i i]))
(foldl (lambda (k acc) (hash-set acc k (map h (hash-ref blocks k))))
(hash)
(hash-keys blocks)))
-
For each definition, I do one of two things:
- If the function is the special function
(entry-symbol), then I make the conclusion block move the result into%rdiandprint_int64it, finally returning from main. In a future language, I might consider saying that main doesn’t print its final value unless the user chooses to do it: but I wanted to be maximally consistent with the previous projects, this time.
- If the function is the special function
So, I have code that looks something like this:
(define start-block (hash-ref blocks f))
(define new-start-block
`((pushq (reg rbp))
(movq (reg rsp) (reg rbp))
(addq (imm ,space-needed) (reg rsp))
,@start-block))
(define conclusion-block
(if (equal? f (entry-symbol))
`(;; move result into %rdi and print_int64 it
(movq (reg rax) (reg rdi))
(callq print_int64 0)
;; 0 return value (to the terminal/system) into %rax
(movq (imm 0) (reg rax))
;; reinstate stored %rbp
(movq (reg rbp) (reg rsp))
(popq (reg rbp))
;; transfer back to caller
(retq))
;; else, just return...
`((movq (reg rbp) (reg rsp))
(popq (reg rbp))
(retq))))
(define my-conclusion-block (gensym 'conclusion))
(define blocks+ ;; build the blocks here...)
;; now, return updated function
`(define ,locals (,f ,@args) ,blocks+)
dump-x86-64
-
As all other passes do, this needs to map across all definitions. But in this case, all of the blocks in every definition are all globally unique, so all we do is run the
render-blockfunction for each block in every function. There is no requirement that the blocks that get generated sit next to each other in the emitted source code (though it may make debugging easier). -
We need to physically render two new assembly instructions:
leaqandindirect-callq. Both of these are fairly easy, forleaqyou want to use(format "leaq ~a(%rip), ~a" (rt-sym f) ...). -
Here you need to be especially careful about how you call
rt-sym: in short, everything that is a function name needs to be rendered by callingrt-sym, because functions on OSX need a_prepended. I do this in the following manner: inrender-block, I check if the block name I’m spitting out is known to be a function name: if so, I callrt-symon it, if not I don’t:
(define txt-label (if (set-member? functions name) (format "~a:\n" (rt-sym name)) (format "~a:\n" name)))
- In the
leaqinstruction, I callrt-symas well (as I do in the regularcallinstruction for things fromruntime.c).
IR Predicates and Interpreters
-
Predicates (
irs.rkt): Each pass has a precise predicate (e.g.,anf-program?,c1-program?, etc.). Tests check both input and output predicates at every step to catch malformed IR early. You should read and understand each predicate. -
Interpreters (
interpreters.rkt): For many stages we run the interpreter over your IR to check semantic consistency—different IRs must compute the same result on the same input stream. The harness reports whether outputs match and whether stdout across passes is consistent. Please do not modify this file.
How to Compile and Run
To directly compile a single file using main.rkt:
racket main.rkt test-programs/<prog>.scm
The compiler dumps a long summary of the work done at each pass, and
ultimately yields a program ./output which it compiles using
system-specific infrastructure (works on Linux/Mac).
Example:
$ racket main.rkt test-programs/example.scm
Compiling IR (using *your* compile.rkt) …
'(pushq (reg rbp))
...
🏗️ Now building a binary... 🏗️
-> Host: macosx/x86_64 Target: x86_64-apple-darwin Entry: main
/usr/bin/clang -target x86_64-apple-darwin -Wall -O2 -c ./output.s -o ./output.o
/usr/bin/clang -target x86_64-apple-darwin -Wall -O2 -c ./runtime.c -o ./runtime.o
/usr/bin/clang -target x86_64-apple-darwin -Wall -O2 ./output.o ./runtime.o -o ./output
Success! Executable produced at: ./output
Done!
$ ./output
42
Programs that (read) input will prompt or consume stdin. You can use
input redirection:
./output < input-files/1.in
Testing Infrastructure
I’ve cut out the middleend option in this project.
frontend– runsshrink→anf-convert, runs the frontendbackend– runsexplicate-control→patch-instructions, checks viainterpret-instrnative– full compile → build binary → run with stdin → compare stdout to golden
Usage:
racket test.rkt -m <mode> -i <comma-separated input files> -g <comma-separated goldens> <program.scm>
Example:
racket test.rkt -m native -i ./input-files/1.in -g goldens/example_1_uniqueify.stdout test-programs/example.scm
Goldens: What They Are
We use goldens to verify correctness:
- Isolation goldens – for non-native modes:
- Compare serialized ASTs and interpreter stdout
- Native goldens – for full pipeline:
- Run your compiled binary, compare stdout
Layout:
goldens/<prog>_<n>_<pass>.in-astgoldens/<prog>_<n>_<pass>.out-astgoldens/<prog>_<n>_<pass>.interpgoldens/<prog>_<n>_<pass>.stdout
Instructor-only gengoldens mode regenerates these.
Autograder Tests
The autograder invokes test.rkt in JSON mode. Your job is to make
each pass satisfy its predicates and produce semantically equivalent
IRs until final native code passes all tests. Autograder tests can be
tough to debug, so if one is failing, you should use tools like
main.rkt or even debug-server.