Simply-Typed λ-Calculus
In our Church encoding lecture, we saw that the λ-calculus can encode numbers, booleans, lists,
and recursion. But we also saw a problem: there’s nothing stopping you
from writing nonsense like (+ 1 #f). It encodes just fine, but
reduces to garbage. In this lecture, we introduce type systems: a
formal mechanism for ruling out programs that would exhibit runtime
type errors.
Why Type Systems?
Consider the following expressions. Which ones should be allowed to run?
(λ (g) (g 5))
((λ (f x) (f x)) g)
(λ (f) (+ (f 1) (if (= 0 (string-length (f 1))) 1 0)))
The first looks fine: nothing obviously wrong, as long as g is a
function that can accept 5. The second is problematic: it takes two
arguments, so ((λ (f x) (f x)) g) only passes one, and (f x)
tries to call g as a function with whatever x is. The third is
deeply broken: (f 1) would need to return both a number (for +)
and a string (for string-length).
A type system assigns each subexpression a type, a rough
approximation of how it will behave at runtime. For example, int
represents integers, and int → int represents functions from
integers to integers. The goal: rule out programs that would crash
with a type error, before we even run them.
QuickAnswer
In Racket, what happens if you evaluate
(+ 1 #t)? At what point does the error occur: compile time or runtime?
The Simply-Typed λ-Calculus
The Simply-Typed Lambda Calculus (STLC) is a restriction of the untyped λ-calculus. Not all untyped terms are well-typed: STLC rejects programs that could go wrong. This is deliberate. By restricting which programs can be written, we gain a guarantee: every well-typed program is safe.
Term syntax
STLC extends the λ-calculus with type annotations on lambda parameters:
e ::= (λ (x : τ) e) – annotated lambda
| (e e) – application
| x – variable
| n – integer literal
| b – boolean literal (#t, #f)
| (+ e e) – addition
| (- e) – negation
| (if0 e e e) – conditional (if first arg is 0)
| (let ([x e]) e) – let binding
The key difference from the untyped λ-calculus: every lambda must annotate its parameter with a type. This is what makes type checking straightforward.
Type syntax
Types in STLC are simple:
τ ::= int – integers
| bool – booleans
| τ → τ – function types
The arrow type τ₁ → τ₂ describes functions that accept an argument
of type τ₁ and return a result of type τ₂. Arrow is
right-associative: int → int → int means int → (int → int).
QuickAnswer
What is the type of a function that takes an integer and returns a function from integers to integers? Write it using the arrow notation.
QuickAnswer
What is the type of
(λ (x : int) (λ (y : bool) y))? (Hint: the outer lambda takes an int, the inner returns a bool.)
The Typing Judgement
We express “expression e has type τ” using a typing judgement:
Γ ⊢ e : τ
This reads: “under the typing environment Γ, expression e has type τ.” The environment Γ maps variables to their types, just as our interpreter environments mapped variables to values. For a closed term (no free variables), Γ is empty.
For example:
⊢ 42 : intsays “42 has type int” (empty environment)x : int ⊢ x : intsays “assuming x is an int, x has type int”⊢ (λ (x : int) x) : int → intsays “the identity function on ints has type int → int”
We prove these judgements using natural deduction rules, just as we did for evaluation in our closures lecture. Each rule tells us how to conclude that an expression has a certain type, based on the types of its subexpressions.
Typing Rules
Literals: T-Int and T-Bool
The simplest rules. A literal integer has type int, a literal
boolean has type bool, regardless of the environment:
These rules have no premises (nothing above the line). They are axioms: you can always conclude that a number is an int.
QuickAnswer
Using the T-Int rule, what is the type of
42? What about-7? Does the environment matter?
Variables: T-Var
To type a variable, we look it up in the environment:
Side condition: (x : τ) ∈ Γ
This says: if the environment maps x to type τ, then x has type
τ. The side condition is critical: if x is not in Γ, the
variable is unbound and the term is ill-typed.
QuickAnswer
Using T-Var, complete this judgement:
{x : int, y : bool} ⊢ y : ???
QuickAnswer
Can we derive a type for
xunder the empty environment{}? Why or why not?
Functions: T-Lam
This is the central rule. It says: if, assuming x has type τ₁,
you can show the body e has type τ₂, then the whole lambda has
type τ₁ → τ₂:
The notation Γ, x:τ₁ means “extend Γ with the binding x : τ₁.”
Because the lambda is annotated with τ₁, we don’t have to guess the
type of x: it’s written right there in the syntax.
Let’s prove that (λ (x : int) x) has type int → int. Click to
load this into the proof explorer and try building the derivation:
⊢ (λ (x : int) x) : int → int
The completed derivation looks like this:
Reading bottom-up: we want to show ⊢ (λ (x : int) x) : int → int.
By T-Lam, we need to show x : int ⊢ x : int. By T-Var (since x :
int is in the environment), this holds.
QuickAnswer
What is the type of
(λ (x : int) 42)? Build the typing derivation.
Application: T-App
To type a function application, we need the function to have an arrow type, and the argument to match the input type:
This is where type checking catches errors. If e₁ has type int →
bool but e₂ has type bool, the types don’t match (we need e₂ :
int) and the application is ill-typed.
Let’s prove that ((λ (x : int) x) 5) has type int. Click to try:
⊢ ((λ (x : int) x) 5) : int
The completed derivation:
Notice how T-App requires two premises: one for the function, one
for the argument. The key constraint is that the argument type τ₁
must match in both premises.
QuickAnswer
Try building the derivation for the following in the proof explorer:
⊢ ((λ (f : int → int) (f 1)) (λ (x : int) x)) : int
Arithmetic: T-Add and T-Neg
Addition requires both arguments to be integers:
Negation takes one integer argument:
QuickAnswer
Build the derivation for:
x : int ⊢ (+ x 1) : int
Conditionals: T-If
Our conditional is if0: it tests whether its first argument is zero.
Both branches must have the same type:
The condition e₁ must be an integer (we test if it’s zero), and
both branches e₂ and e₃ must have the same type τ. This
prevents the problem from the motivation: we can’t have one branch
return an int and the other return a bool.
QuickAnswer
Why does STLC require both branches of
if0to have the same type? What would go wrong if we allowed(if0 0 42 #t)?
Let bindings: T-Let
The T-Let rule binds a variable for use in the body:
First, type check e₁ to get its type τ₁. Then, type check the
body e₂ with x : τ₁ added to the environment.
QuickAnswer
Build the derivation for:
⊢ (let ([x 5]) (+ x 1)) : int
Building Complete Derivations
Now that we have all the rules, let’s practice building complete derivations. The key skill is working bottom-up: start with the conclusion you want to prove, identify which rule applies (based on the outermost form of the expression), and generate the premises.
Example: a higher-order function
Let’s type (λ (f : int → int) (f 1)). This takes a function f and
applies it to 1. Click to try:
⊢ (λ (f : int → int) (f 1)) : (int → int) → int
The derivation:
Notice the type of the whole expression: (int → int) → int. This is
a function that takes a function as input, making it higher-order.
QuickAnswer
What is the type of
(λ (f : int → int) (λ (x : int) (f (f x))))? This is a function that appliesftwice. Build the derivation in the proof explorer.
⊢ (λ (f : int → int) (λ (x : int) (f (f x)))) : (int → int) → int → int
What STLC Can’t Type
STLC is deliberately less expressive than the untyped λ-calculus. Some important terms cannot be typed:
Self-application is untypable. Consider (λ (f) (f f)). For this
to type check, f would need to have type τ → τ' (so we can apply
it), but also type τ (since it’s the argument). So we’d need τ = τ
→ τ', which is an infinite type: no finite type works.
Ω is untypable. Since (λ (x) (x x)) can’t be typed, neither
can ((λ (x) (x x)) (λ (x) (x x))). This means the term that loops
forever is rejected by STLC.
The Y combinator is untypable. Since the Y combinator uses self-application internally, STLC rejects it. This means we cannot write general recursion in STLC.
This is a feature, not a bug. In fact, every well-typed STLC program is guaranteed to terminate. The price we pay for this safety is that we can’t express some useful programs (like recursive functions). More advanced type systems, such as System F and those in OCaml or Haskell, recover expressiveness while maintaining safety by adding controlled forms of polymorphism and recursion.
QuickAnswer
Why can’t
(λ (f : ???) (f f))be typed in STLC, no matter what type we write for????
Type Soundness (Preview)
The payoff of type systems is a strong guarantee:
Theorem (Type Soundness): Well-typed programs don’t go wrong.
This is traditionally split into two properties:
-
Progress: If
⊢ e : τ(i.e.,eis well-typed and closed), then eithereis already a value, orecan take a step of evaluation. -
Preservation: If
⊢ e : τandesteps toe', then⊢ e' : τ. Evaluation preserves the type.
Together, these say: a well-typed program never reaches a stuck state (like trying to add a boolean to a number). It either produces a value or keeps computing, and its type never changes along the way.
We will not prove type soundness in this class, but it is the foundational result that justifies type systems.
QuickAnswer
What does “progress” guarantee about the expression
((λ (x : int) (+ x 1)) 5)if it type checks?
From Rules to Code
A beautiful property of STLC: the typing rules are syntax-directed. Each rule applies to exactly one syntactic form (T-Int for integers, T-Lam for lambdas, T-App for applications, etc.). This means we can implement type checking as a simple recursive function over the expression. In this section we will build a nearly-complete type checker for STLC in Racket.
Representing types and expressions
First, we need predicates that recognize valid STLC types and
expressions. Types are either base types (int, bool) or arrow
types:
(define (type? t)
(match t
['int #t]
['bool #t]
[`(,(? type? t0) -> ,(? type? t1)) #t]
[_ #f]))
So 'int is a type, '(int -> bool) is a type, and
'((int -> int) -> (bool -> (int -> int))) is a type. Arrow is
represented as a two-element list with -> in the middle.
Expressions include variables, literals, annotated lambdas, applications, and type annotations on arbitrary subexpressions:
(define (expr? e)
(match e
[(? symbol? x) #t]
[(? boolean? b) #t]
[(? integer? i) #t]
[`(,(? expr? e0) ,(? expr? e1)) #t]
[`(,(? expr? e) : ,(? type? t)) #t]
[`(lambda (,(? symbol? x) : ,(? type? t)) ,(? expr? e)) #t]))
Notice two things. First, lambda parameters must carry type
annotations: (lambda (x : int) x) rather than (lambda (x) x).
This is what makes type checking (as opposed to type inference)
straightforward. Second, any subexpression can be annotated with a
type via (e : t), which serves as an explicit type ascription that
the checker must verify.
Type synthesis
The core of our type checker is synthesize-type: given a typing
environment env (a hash mapping variable names to types) and an
expression e, it returns the type of e, or raises an error if the
expression is ill-typed.
Because the rules are syntax-directed, each match case corresponds
to exactly one typing rule:
(define (synthesize-type env e)
(match e
;; T-Int: integer literals have type int
[(? integer? i) 'int]
;; T-Bool: boolean literals have type bool
[(? boolean? b) 'bool]
;; T-Var: look up the variable in the environment
[(? symbol? x) (hash-ref env x)]
;; T-Lam: extend the environment and synthesize the body
[`(lambda (,x : ,A) ,e-body)
(define B (synthesize-type (hash-set env x A) e-body))
`(,A -> ,B)]
;; Type ascription: synthesize, then check it matches
[`(,e : ,t)
(let ([e-t (synthesize-type env e)])
(if (equal? e-t t)
t
(error (format "types ~a and ~a are different" e-t t))))]
;; T-App: application
[`(,e1 ,e2)
'todo]))
The literal cases are axioms: integers are int, booleans are bool.
The variable case is T-Var: look up the variable in env. The lambda
case is T-Lam: extend the environment with x : A and recursively
synthesize the body’s type B, returning the arrow type (A -> B).
The annotation case synthesizes a type for the inner expression and
verifies it matches the annotation.
QuickAnswer
The application case is left as
???. Using the T-App rule as a guide, implement it. You need to: (1) synthesize the type ofe1, (2) check that it is an arrow type, (3) synthesize the type ofe2, (4) check that it matches the arrow’s input type, and (5) return the arrow’s output type. Raise an error on a mismatch.
Wiring up the initial environment
Our language includes two built-in operations: + (curried addition)
and is-zero? (integer test). Rather than adding special typing rules
for these, we simply put them in the initial environment with the
right types:
(define (synthesize-stlc-type e)
(synthesize-type (hash '+ '(int -> (int -> int))
'is-zero? '(int -> bool))
e))
Notice that + is curried: it takes an int, returns a function
(int -> int). So (+ 3) has type (int -> int), and ((+ 3) 4)
has type int. This means addition is written as ((+ e1) e2) in our
syntax.
Let’s test it:
(synthesize-stlc-type '((lambda (x : int) x) 23))
;; => 'int
(synthesize-stlc-type '(lambda (x : int)
(lambda (y : bool)
(lambda (z : int) x))))
;; => '(int -> (bool -> (int -> int)))
Type checking as a wrapper
Once we can synthesize types, checking is trivial: synthesize and compare:
(define (check-stlc-type e t)
'todo)
QuickAnswer
Implement
check-stlc-type. It takes an expressioneand a typet, and returns#tifehas typet,#fotherwise. (Hint: this is a one-liner usingsynthesize-stlc-typeandequal?.)
Type erasure
Here is the key insight connecting our type checker to actual execution: STLC terms are just λ-calculus terms with extra annotations. Once we know a program is well-typed, we can erase all the type annotations and run the result as plain Racket:
(define (erase-types e)
(match e
;; + is curried in STLC but not in Racket
[`((+ ,e0) ,e1) `(+ ,(erase-types e0) ,(erase-types e1))]
;; Drop the type annotation from lambdas
[`(lambda (,x : ,t) ,e-body) `(lambda (,x) ,(erase-types e-body))]
;; Variables and literals pass through unchanged
[(? symbol? x) x]
[(? boolean? b) b]
[(? integer? i) i]
;; Drop type ascriptions
[`(,e : ,t) (erase-types e)]
;; Recurse into applications
[`(,e0 ,e1) `(,(erase-types e0) ,(erase-types e1))]))
The lambda case drops : t from the parameter. The ascription case
(e : t) strips the annotation entirely. The + case uncurries
((+ e0) e1) back into Racket’s (+ e0 e1).
The full pipeline: STLC to Racket
Putting it all together, we can compile STLC to Racket by first type
checking, then erasing types. If the type checker raises an error, we
catch it and return 'type-error:
(define (stlc->racket e)
(with-handlers ([exn:fail? (lambda (exn) 'type-error)])
(synthesize-stlc-type e)
(erase-types e)))
This gives us a complete pipeline. Well-typed programs get compiled to executable Racket, ill-typed programs are rejected:
(stlc->racket '((lambda (x : int) x) 23))
;; => '((lambda (x) x) 23)
;; evaluates to 23
(stlc->racket '(lambda (x : (int -> int))
(lambda (y : bool)
(lambda (z : int) (x y)))))
;; => 'type-error
;; because y : bool but x expects an int
The structure of the code mirrors the structure of the proof. This is a recurring theme in programming languages: proofs and programs are deeply connected.
QuickAnswer
Trace through what
stlc->racketdoes on'(lambda (x : int) ((+ x) 1)). What doessynthesize-stlc-typereturn? What doeserase-typesproduce?
Summary
| Concept | Key idea |
|---|---|
| Types | Rough approximation of runtime behavior |
| Typing judgement | Γ ⊢ e : τ: “under Γ, e has type τ” |
| Environment Γ | Maps variables to their types |
| Typing rules | Natural deduction rules, one per syntactic form |
| Type soundness | Well-typed programs don’t go wrong |
| STLC limitations | Can’t type self-application, Ω, or Y |
| Syntax-directed | One rule per form, so type checking is a recursive function |
| Type erasure | Once checked, strip annotations and run as untyped code |
The STLC is the foundation of type systems in modern functional languages. In our next lecture, we’ll study type inference: how to automatically deduce types without requiring annotations, using Hindley-Milner type inference as implemented in languages like OCaml, Haskell, and Rust.