Project 4: Church Encoder
Project due date:
Due: Tuesday, April 21, 11:59PM
Starter code: p4-public.tar.gz
Submit via autograde.org. If you are not enrolled in the course, you can still work through this project using the public starter code and tests above.
Lecture notes: Church Encoding
In this project, you will write a Church encoder: a compiler from a
large subset of Scheme to lambda-calculus. In other words, you will
transform inputs like '(let ([x 0]) x) to outputs like '((lambda (x)
x) (lambda (f) (lambda (x) x))). Your output will be quoted data in
the lambda-calculus: the test infrastructure will then interpret this into
a Racket lambda by using eval.
It is crucial to understand that you are not building an interpreter, as in project 3. Instead, you are building a compiler. Your project will translate the input language into the lambda-calculus by removing each Racket form, one-by-one, and encoding them in the lambda-calculus.
To understand the encodings, please read through the Church Encoding lecture notes, the posted course slides, the course videos, and the recorded class lectures on Blackboard. We have covered many of the forms already: the main challenge in this project is translating the “on paper” knowledge into a working compiler.
The central challenge with this project is debugging. You should try to learn a debugging methodology that works for you. I recommend calling your compiler on smaller and smaller programs until you recognize which part is incorrect. Work to gradually whittle down the test cases until you isolate precisely what is wrong.
Good luck!
The Language: Syntax
Below is the full syntax of the language you must support:
e ::= (letrec ([x (lambda (x ...) e)]) e)
| (let ([x e] ...) e)
| (let* ([x e] ...) e)
| (lambda (x ...) e)
| (e e ...)
| x
| (and e ...) | (or e ...)
| (if e e e)
| (prim e) | (prim e e)
| datum
datum ::= nat | (quote ()) | #t | #f
nat ::= 0 | 1 | 2 | ...
x is a symbol
prim is a primitive operation in list prims
(define prims '(+ * - = add1 sub1 cons car cdr null? not zero?))
The following are *extra credit*: -, =, sub1
This input language has semantics identical to Scheme / Racket, except:
- You will not be provided code that yields any kind of error in Racket
- You do not need to treat non-boolean values as #t at if, and, or forms (Racket allows “truthy” and “falsy” values – we won’t handle those)
- primitive operations are either strictly unary (add1 sub1 null? zero? not car cdr), or strictly binary (+ - * = cons)
- There will be no variadic functions or applications — but any fixed arity is allowed
(e.g.,
((lambda l (first l)) 1 2 3)is not allowed but((lambda (x y z) x) 1 2 3)is)
Output language
Your output must be a quoted expression in the lambda-calculus:
e ::= (lambda (x) e)
| (e e)
| x
Functions and Builtins
You will primarily implement the churchify function, which accepts
an input in the input language above and produces an output in the
lambda-calculus. churchify recognizes each form in the input language
(using a match pattern) and then calls itself.
To make things easier for you, we have used a trick: the function
church-compile calls churchify with a large let block to bind each
of the builtins. This allows you to handle builtin operators in a
relatively simple way: you can just translate (+ e0 e1) to use
currying ((+ e0') e1'), where e0' and e1' are the Church
encodings of e0 and e1. Then, + will be given a binding because
it is wrapped in the top-level let block in church-compile.
Please think about the above paragraph until it makes sense to you: it makes handling builtins overwhelmingly simpler.
Implementation Roadmap
This project is designed so you can implement it in stages, passing
more tests at each stage. The early tests (stages 1-8) test
churchify directly, so you do not need church-compile or any
builtins to start passing tests. Later tests (stages 9-11) go through
church-compile and require builtins.
Stage 1: Variables and Natural Numbers
Tests: public-01a-nat-zero, public-01b-nat-nonzero
Start here. Implement these two match cases:
-
Variables
(? symbol? x): Just return the symbol as-is. This is trivial but required for everything else. -
Natural numbers
(? natural? nat): Encode as Church numerals. A Church numeralnis(lambda (f) (lambda (x) (f^n x)))wheref^nmeansfappliedntimes. For example:0->(lambda (f) (lambda (x) x))3->(lambda (f) (lambda (x) (f (f (f x)))))
Write a helper function to produce the
(f (f ... x))part, then wrap it in the two lambdas. Remember to churchify the result.See the Church Encoding lecture notes for a full explanation.
Stage 2: Single-arg Lambda and Application
Tests: public-02a-lambda-id, public-02b-lambda-const
These are needed for almost everything else, so do them early.
-
Single-arg lambda
(lambda (x) e): Recursively churchify the body:`(lambda (,x) ,(churchify e0)) -
Single-arg application
(fun arg): Churchify both parts:`(,(churchify fun) ,(churchify arg))
Stage 3: Multi-arg Lambda and Application (Currying)
Tests: public-03a-curry-lam, public-03b-curry-app
In lambda calculus, all functions take exactly one argument. Multi-arg functions must be curried.
-
Multi-arg lambda
(lambda (x y z ...) e): Curry into nested single-arg lambdas. Hint:(churchify '(lambda (x . rest) e))->`(lambda (,x) ,(churchify `(lambda ,rest ,e))) -
Zero-arg lambda
(lambda () e): Add a dummy parameter_. -
Multi-arg application
(f a b c ...): Curry into nested applications.(f a b c)->(((f a) b) c). Hint: recursively churchify((f arg) rest...). -
Zero-arg application
(f): Apply with a dummy argument(lambda (_) _).
See the Church Encoding lecture notes for the background on currying.
Stage 4: Boolean Literals
Tests: public-04a-true, public-04b-false
A Church boolean takes two thunks and calls one of them:
#t->(lambda (tt ft) (tt))(calls the true-thunk)#f->(lambda (tt ft) (ft))(calls the false-thunk)
Hint: just churchify the lambda expression shown above. Since these use multi-arg lambdas and zero-arg application, you need stage 3 working first.
See the Church Encoding lecture notes for the full explanation of the thunk-based encoding.
Stage 5: If Expressions
Tests: public-05a-if-true, public-05b-if-false
Since a Church bool selects between two thunks:
(if e0 e1 e2)->(e0 (lambda () e1) (lambda () e2))
Just churchify that expression – it’s a 3-arg application.
See the Church Encoding lecture notes
for how if works with the boolean encoding.
Stage 6: Let Expressions
Tests: public-06a-let-simple, public-06b-let-empty, public-06c-let-multi
Let is syntactic sugar for lambda + application:
(let ([x e0]) body)->((lambda (x) body) e0)(let ([x e0] [y e1]) body)->((lambda (x y) body) e0 e1)(let () body)-> justbody
Churchify the desugared form.
Stage 7: Let* and Null Literal
Tests: public-07a-letstar, public-07b-null
-
let*: Desugar to nested let:
(let* ([x e0] rest...) body)->(let ([x e0]) (let* (rest...) body)) -
Null
'(): Encode like a boolean pair callback that calls the when-null case:(lambda (when-cons when-null) (when-null))
See the Church Encoding lecture notes for the list encoding.
Stage 8: And / Or
Tests: public-08a-and-true, public-08b-or-true, public-08c-and-false
Desugar to nested if expressions:
(and e0 e1 ...)->(if e0 (and e1 ...) #f)(or e0 e1 ...)->(if e0 #t (or e1 ...))- Base cases:
(and)=#t,(or)=#f(already provided)
See the Church Encoding lecture notes for the desugaring.
Stage 9: Arithmetic Builtins (in church-compile)
Tests: public-09a-add1, public-09b-plus, public-09c-mult,
public-09d-zero-true, public-09e-zero-false, public-09f-not
Now fill in the Church-encoded definitions in church-compile:
- add1:
(lambda (n) (lambda (f) (lambda (x) (f ((n f) x))))) - +:
(lambda (n0 n1) (lambda (f) (lambda (x) ((n1 f) ((n0 f) x))))) - *:
(lambda (n0 n1) (lambda (f) (lambda (x) ((n0 (n1 f)) x))))- Warning: heed the note in the course videos: the video gets
*wrong, but the slides get it right.
- Warning: heed the note in the course videos: the video gets
- zero?:
(lambda (n) ((n (lambda (_) #f)) #t)) - not:
(lambda (b) (if b #f #t))
Note: these definitions themselves use forms like #t, #f, if,
and multi-arg lambdas, which churchify will process. So you need
stages 1-8 working for these to work.
See the Church Encoding lecture notes for the derivation of each of these definitions.
Stage 10: List Builtins
Tests: public-10a-cons-basic, public-10b-car, public-10c-cdr,
public-10d-null-true, public-10e-null-false
- cons:
(lambda (a b) (lambda (when-cons when-null) (when-cons a b))) - car:
(lambda (p) (p (lambda (a b) a) (lambda () (lambda (x) x)))) - cdr: Follow the same pattern as
car– what changes? - null?: Think about what each callback should return.
See the Church Encoding lecture notes for the observer pattern behind these definitions.
Stage 11: Letrec (Y-combinator)
Tests: public-11a-letrec-basic, public-11b-letrec-recursive
To handle letrec, use the Y combinator for recursion:
(letrec ([f (lambda (args...) body)]) e1)->(let ([f (Y-comb (lambda (f) (lambda (args...) body)))]) e1)
You need to define Y-comb in church-compile:
Y = ((lambda (u) (u u)) (lambda (y) (lambda (mk) (mk (lambda (x) (((y y) mk) x))))))
See the Church Encoding lecture notes
for the full explanation of the Y combinator and how it enables recursion.
The answer is surprisingly simple: you apply the Y combinator to
(lambda (f) ...), then bind the result with let.
Integration Tests
The remaining tests (public-add, public-arith0, server-append,
etc.) combine multiple features. Once you have stages 1-11 working,
most of these should pass automatically.
Extra Credit: sub1, -, =
The sub1 (predecessor) function is significantly more complex than
the other builtins. Look it up on Wikipedia if you’d like to attempt
it. Interesting history:
“Be this as it may, Kleene did find a way to lambda define the predecessor function in the untyped lambda calculus, by using an appropriate data type (pairs of integers) as auxiliary device. In [69], he described how he found the solution while being anesthetized by laughing gas for the removal of four wisdom teeth.” – Henk Barendregt’s “Impact of the Lambda calculus”
Debugging Tips
- Call your compiler on smaller and smaller programs to isolate which part is broken.
- Use the staged tests: if
public-01a-nat-zerofails, focus on your natural number encoding. Ifpublic-05a-if-truefails, look at yourif/boolean encoding. - You can test
churchifydirectly in the REPL:> (churchify '0) '(lambda (f) (lambda (x) x)) > (churchify '#t) '(lambda (tt) (lambda (ft) (tt (lambda (_) _)))) - You can evaluate and convert back:
> (church->nat (eval (churchify '3) (make-base-namespace))) 3