Project 3: Scheme Interpreter

Due: Monday, March 24, 11:59PM

Submit via autograde.org. If you are not enrolled in the course, you can still work through this project using the public starter code and tests above.

Lecture notes: Closure-Creating Interpreters

Project video: Closure-Creating Interpreters

In this project you will build an interpreter for a Turing-complete subset of Racket/Scheme, including lambdas (of fixed and variable arity), if, let, let*, and/or, and a set of ten primitives over numbers and lists.

The central challenge in building an interpreter for a higher-order (or object-oriented) language is the run-time representation of functions (or objects, in the OOP setting). In this project, you will follow the “closure-creating interpreter” lecture–ensure you have a firm understanding of that lecture/video before attempting this project. We include a bit of starter code here, as inspiration, in the form of the function interp-ce-lambda. Make sure you fully understand how interp-ce-lambda works before you try to construct your solution: the way it uses environments is subtle and important.

For this project in particular, it will be crucial you know how to read the public tests and run them on your own. I recommend starting with the easiest tests you can possibly find: things like (and #t #f), (+ 1 2), etc… before moving on to the larger test cases.

Our reference solution is roughly 40 lines of Racket, all in the interp-ce function, but you may consider writing various “helper” functions to do various things (e.g., bind variables in preparation to call a function). Reasonable solutions would be expected to be around 50 to 150 lines of code–if you’re doing way more work than that, it probably indicates a mistaken approach. For example, multiple parts of this project will be made easier by implementing a bulk assignment operator to set multiple keys/values of a hash, which may be implemented using foldl:

(define (assign-keys-vals h keys vals)
  (foldl (λ (k v h) (hash-set h k v)) h keys vals))
;; e.g., (assign-keys-vals (hash 'x 2 'y 3) '(x y z) '(5 2 3))

The Language: Syntax

Below is the full syntax of the language you must support:

Primitives
prim = {+, -, *, =, equal?, cons, car, cdr, list, null?}

x is a metavariable
exp ::= (lambda (x ...) exp)         -- Fixed-arity lambda
      | (if exp exp exp)             -- Conditional
      | (let ([x0 exp0] ...) exp)    -- let-binding
      | (let* ([x0 exp0] ...) exp)   -- "Sequenced" let
      | (and e0 ...)                 -- Polyadic and (0 or more args)
      | (or e0 ...)                  -- Polyadic or (0 or more args)
      | (exp0 ...)                   -- Function application
      | datum                        -- Datums

Extra credit (encouraged, see below):
      | (lambda x exp)               -- Arbitrary-arity lambda
      | (apply exp0 exp1)            -- Apply variadic lambda to arg list

datum:
  | '()                              -- Empty list
  | n                                -- Literal numbers
  | #t | #f                          -- Literal booleans

Your interpreter, interp-ce, must correctly interpret any valid scheme-ir? program and yield the same value as it would in DrRacket, except for closures (generated by lambdas) which must be represented as `(closure ,lambda ,environment). For example, (+ 1 2) must evaluate to 3, (cons 1 (cons 2 '())) must yield '(1 2), and (lambda (x) (+ x 1)) must yield something like '(closure (lambda (x) (+ x 1)) #hash((+ . ...))). For programs that result in a runtime error, you should return `(error ,message)—giving some reasonable string error message. Handling errors and some trickier cases will give bonus points.

Note that while your language does not explicitly include recursion in its syntax, it does allow untyped lambdas, which allow recursion via a fixed-point combinator. For example, public-factorial uses this functionality to implement the factorial function using your interpreter.

Note that to match the empty list '(), your match pattern should use ''():

(match (first '('())) ['() "wrong"] [''() "right"])

Builtin Functions

Supporting built-in functions may be done in a variety of ways. My reference solution takes the following strategy: construct the initial environment so that it maps each builtin to a closure (with an empty environment) that looks something like '(lambda primargs (apply-prim ,op primargs), where op is the relevant primitive. This then naturally gets applied by interp-ce, where I add a special case for apply-prim (not a scheme-ir? expression), which uses Racket’s apply to apply the necessary function (obtained via either (eval op (make-base-namespace)), or something like (hash-ref (hash '+ + ...) op), this is the one place using eval is acceptable).

You may be tempted to specifically recognize each built-in function and handle them separately. This will kind of work, but we have some tests that do things like this:

((lambda (+) (+ 1)) (lambda (x) 25))

Naive solutions will not account for this kind of shadowing–handling shadowing requires that you allow builtins to live in the environment, just as other bindings. However, you may want to start at a naive solution if you cannot figure out a more general strategy–as this will allow you to still pass most of the tests.

WARNINGS and Hints

Do not try to use Racket’s eval to write the whole interpreter. It will not correctly produce or work with closures, and will thus not provide a general solution.
Be very careful to understand the difference a quote makes. Calling (interp-ce (let ([x (+ 1 2)]) x)) is essentially the same as calling (interp-ce 3), i.e., you are not even running your interpreter. Instead, you want (interp-ce '(let ([x (+ 1 2)]) x))
Make sure you handle cases where various bindings may be empty! Things like (and), (or), (let* () 1), etc… Following Racket’s behavior, (and) evaluates to #t and (or) evaluates to #f.
You may want to call interp-ce on synthetic syntax (not present in the original program). For example, our reference implementation defines let* by interpreting “one let at a time,” and finishing up in the (let () ,body) case. Doing this can avoid a lot of work!
You don’t need to worry about errors unless you want to pass some of the harder bonus tests. If you want to try those, consider using Racket’s with-handlers.
Use foldl to accumulate an environment and bind variables. You will likely need to do this in let*, let, and application (but likely not with apply)
To evaluate a function, evaluate each of its arguments, bind them into variables, and then evaluate the body with the new environment (updated with bindings for each argument).
To apply a function, evaluate each of its arguments, and then bind them (as a list) to the necessary variable.
If you support apply: be careful to match the closure’s lambda to determine whether you need to construct a list or bind named arguments.
Be careful to think about the difference between let and let*. At some point, you will need to iterate over the bindings to translate them into values before you create a new environment: in the case of let you need to reference the original environment, in the case of let* you need to accumulate an environment. Both of these can be done with foldl, but some care must be taken to ensure the correct behavior.

Grading

Tests prefixed with public- and server- (excluding bonus) are standard. Tests with bonus in their name (e.g., server-bonus-apply, server-bonus-runtime-error) are extra credit.