Project 1: Stack Interpreter / Compiler
Project due date:
CIS531 – Project 1 – Stack Machine
In this project, we will not use the Autograder. It is due Sep 23,
- Please have your team (CC team members) submit to me via email. Projects up to 3 days late incur a -15%; Projects up to the end of the semester incur a -25%. Please submit promptly–I request you do not get behind quite yet.
In this project, you will implement an interpreter for a stack-oriented language. A program is an A-expression, giving a list of commands wrapped in a ‘(program …) block. The list of commands enumerates a straight-line sequence of actions to be taken on a stack, a LIFO data structure that is used for processing.
Stack-oriented languages have been common throughout the history of computing, offering a simple alternative to named variables. Notable examples include JVM bytecode, PostScript (used to generate graphics), and Forth. Our language will have the following commands:
push i # i is an integer
pop # remove the top item from the stack
dup # duplicate the last item on the stack
mul # multiply the top two elements of the stack
add # add the top two elements of the stack
sub # subtract the second-to-top element from the top element
neg # negate the top stack element
print # print the top stack element to stdout
read # read an integer into the top stack element
Part 1 [20%] – Parse files of these instructions to S-expressions
In this first part, you will take a file like this
push 5
push 7
mul
push 8
sub
print
read
read
mul
print
And you should turn them into S-expressions that correspond to the
program?
predicate defined.
Grading: based on equal? with golden outputs. See all-tests.sh
.
Part 2 [40%] – Stack-based interperter of program?
For this part, you will implement the interp-cmds
function. This
function takes a list of commands, along with a current stack. The
stack is a Racket list, such that the top of the stack is the first
element of the list, and so on. Thus, you can always use car
(or
first
) to inspect the first element of the stack, and cdr
or
rest
to pop the first element. This function should be written to
(a) if the list of commands is empty, simply return #t
, (b) if the
list of commands is not empty, then the function should execute one
command and then recursively call itself to handle the rest.
This function should be implemented in a tail-recursive style. In
other words, to make changes to the stack, you do not make any
mutations: instead, you use pure functions to construct a new stack
(e.g., the restult of (rest ...)
) as a parameter passed to a
recursive invocation of the interp-cmds
function. The final return
value will be #t
, but the side effect will be reading from stdin/out
(by calling (read)
and (displayln ...)
).
Grading: Completely based on objective tests, given in all-tests.sh
Part 3 [40%] – Open-ended: Compiling infix programs to stack programs.
In this last part, you will build a parser which accepts infix strings
as its input and returns valid program?
s as its output. You will
also build testing infrastructure to convince me of the correctness of
your solution.
The input grammar is as follows:
Primary ::= INT | "(read)" | "(" "print" Expr ")" | "(" Expr ")"
Unary ::= "-" Unary | Primary
# Important: *left* associativity for Prod/Sum (see class for trick)
Prod ::= Unary ( "*" Unary )*
Sum ::= Prod ( ("+" | "-") Prod )*
Expr ::= Sum
Your job is to write a function: (infix->program infix-string)
which:
-
Accepts a
string?
as input, written in the infix language described above (seeinfix-programs
for details). -
Produces a
program?
, which can easily be rendered to the .sp language
I have provided testcases for this part, which may be invoked via
test.rkt (see notes below). There are example infix test programs in
the infix-programs/
subdirectory.
Note that Prod and Sum need to left associate, i.e., 5 - 2 - 3
is
(5 - 2) - 3
rather than 5 - (2 - 3)
. This is one place where LL(k)
parsers do not excel, and so the typical solution is to use a hack:
recognize a list of subordinate atoms (e.g., ("*" Unary )*
) and
then–once you have the list–build the right syntax tree. Ask if you
are confused about this, please.
Testing your code
There is a file, test.rkt, which can be called at the command line in one of three modes:
(define modes '("parse-stackprog" "interp-stackprog" "translate-infix"))
Testing Part 1 (Parsing .sp
files)
To test parsing, you use parse-stackprog
, like this:
racket test.rkt -m parse-stackprog -g goldens/parse-pop-behavior.gld stackprogs/pop-behavior.sp
The -m
flag specifies the parsing mode. The -g
flag specifies the
golden (instructor-provided) output, and the last argument is the
program to test.
Testing Part 2 (Interpreting program?
)
To test interpretation, you also need an input stream, which is
basically an stdin
that will be fed into the program. This is
important because our programs can read
input.
To test interpretation, you do this:
racket test.rkt -m interp-stackprog -g goldens/interp-sum-then-mul-5.gld -i input-streams/5.in stackprogs/sum-then-mul.sp
Here, the -g
flag specifies the golden (in this case, the expected
output of the program) and the -i
the requisite input stream,
followed by the program.
Testing Part 3 (Translating Infix -> program?)
For this part, you need to specify an infix file. Examples are given
in the infix-programs
directory.
Run all tests
Run the file all-tests.sh
, which is simply a shell script which runs
each test with an expected golden input, etc.
Submitting your solution
-
Please use GitHub – PRIVATE REPOS ONLY. You can collaborate with each other on GitHub.
-
Please send me a .zip file containing your solution, and also showing the results of your testcases.
-
Grading will be based on tests, using
alltests.sh
. -
I reserve the right to use my own tests (e.g., for due diligence purposes), but do not plan to add any extras at this time.
-
Email me: kkmicins@syr.edu
Hints and Advice
- I have tried to make the project testable, debuggable, etc. But it is the first time I have given it out. If you are concerned there is a bug, feel free to email me early, I can look into it.
Code
If you are a student in the class, you will get a link to the code from me. I provide this reference (possibly buggy–email me if you find any issues!) for posterity:
#lang racket
(require parser-tools/lex)
(require (prefix-in : parser-tools/lex-sre))
(provide (all-defined-out))
(define (command? c)
(match c
[`(push ,(? integer?)) #t]
['pop #t]
['mul #t]
['add #t]
['sub #t]
['neg #t]
['print #t]
['read #t]
[_ #f]))
(define (program? p)
(match p
[`(program ,(? command?) ...) #t]
[_ #f]))
;; Parses a string representation of the program (containing newlines)
;; and translate it into a program? for subsequent interpretation.
;;
;; listof string? -> program?
;;
;; Assume the program is well formed--no need to handle error states
(define (parse-stackprog p)
(define cmds
'todo)
`(program ,@cmds))
;; Given a list of remaining commands, and a stack, run each of the
;; commands, possibly reading from stdin (via (read)) and writing to
;; stdout (via displayln).
;;
;; Hint: write this function in a tail-recursive style, meaning that
;; you first match on cmds: when there are no commands left, the
;; program is done and you should exit (no output needed) by returning
;; #t from this function.
(define (interp-cmds cmds stack)
'todo)
;; Write a translator which parses an infix program (see the
;; README.md) for the acceptable grammar and translates the program
;; into a program in the stackprog `.sp` language. Your output should
;; be written to stdout, and it should be able to be run with some
;; input stream (examples are in `input-streams/`) to produce some
;; output stream.
;;
;; Please *also* note that part of your grade is developing test
;; infrastructure in main.rkt. See the readme for the requirements.
;; I will provide a lexer for you--my solution builds a
;; recursive-descent parser which pulls from this token stream. You
;; may also consider using racket's yacc. However, I strongly
;; recommend writing a basic recursive-descent parser (I will cover
;; the tricks in class) for this assignment.
;;
;; You should simply use my lexer by calling the function
;; tokenize-string, which will return a list of tokens.
(define-lex-abbrev WS (:+ (char-set " \t\r\n")))
(define-lex-abbrev DIGITS (:+ numeric))
(define expr-lexer
(lexer
[WS (expr-lexer input-port)]
["+" 'PLUS]
["-" 'MINUS]
["*" 'TIMES]
["(" 'LPAREN]
[")" 'RPAREN]
["read" 'READ]
["print" 'PRINT]
[DIGITS `(INT ,(string->number lexeme))]
[(eof) 'EOF]))
(define (tokenize-port in-port)
(let loop ([acc '()])
(define t (expr-lexer in-port))
(if (eq? t 'EOF)
(reverse (cons t acc))
(loop (cons t acc)))))
;; Tokenize the string, turning it into a list of tokens.
(define (tokenize-string str) (tokenize-port (open-input-string str)))
;; (pretty-print (tokenize-port (open-input-string "3 + 3 * 5")))
;; (pretty-print (tokenize-string "3 + 3 * 5"))
;; Primary ::= INT | "(read)" | "(" "print" Expr ")" | "(" Expr ")"
(define (parse-Primary stream)
'todo)
;; Unary ::= "-" Unary | Primary
(define (parse-Unary stream)
'todo)
;; Product and Sum (left associate!)
;; Prod ::= Unary ( ("*" | "/") Unary )*
(define (parse-Prod stream)
'todo)
;; Sum ::= Prod ( ("+" | "-") Prod )*
(define (parse-Sum stream)
'todo)
(define (parse-Expr stream) (parse-Sum stream))
(define (parse-infix input-string)
(define toks (tokenize-string input-string))
;; now call parse-Expr on it, etc..
'todo)
;; Convert a string? written in the infix style
(define (infix->program infix-string)
;; I recommend writing a function here, h, which translates the
;; result of parsing and lineraizes that into a sequence of
;; commands. Basic idea: for an operator like (+ e0 e1), generate a
;; list where you append the translation of e0 (a list of commands)
;; to the translation of e1, followed by an add instruction (which
;; consumes the previous two).
(define (h expr)
(match expr
[(? number? n)
`((push ,n))]
[`(- ,expr)
'todo]
;; other cases here...
))
;; Parse the infix program, which results in some intermediate
;; S-expression (like (+ 2 (* 3 4))) and then uses h to translate it
;; into a list of commands...
`(program ,@(h (parse-infix infix-string))))
and test.rkt
#lang racket
(require "stack-machine.rkt")
(define mode (make-parameter #f))
(define input-file (make-parameter #f))
(define golden (make-parameter #f))
(define modes '("parse-stackprog" "interp-stackprog" "translate-infix"))
;; Interpret a program, starting with the empty stack Assume that p
;; satisfies program?, read inputs (one integer followed by newline)
;; from stdin (i.e., using (read)). Write output using displayln.
;;
;; Returns stdout
(define (test-interp-cmds p input-file)
(define in-port (open-input-file input-file))
(define out-port (open-output-string))
(parameterize ([current-input-port in-port]
[current-output-port out-port])
(match p
[`(program ,cmds ...)
;; interpret in the empty stack
(interp-cmds cmds '())]))
(get-output-string out-port))
;; Convert an infix program into a program? by calling infix->program
;; in stack-machine.rkt. That function
(define (test-infix ifx input-ints)
(define p (infix->program ifx))
(if (program? p)
(displayln "YES -- translation satisfies program?")
(displayln "NO -- translation *not* a program?"))
(pretty-print p)
(displayln "Interpreting...")
;; now, just call test-interp-cmds on the converted program
(test-interp-cmds p input-ints))
;; parse the command line (using a racket library)
(define prog-file
(command-line
#:once-each
[("-m" "--mode") m "Run tests in mode <m>" (mode m)]
[("-i" "--in") input "input file" (input-file input)]
[("-g" "--gld") g "Golden (expected) output file" (golden g)]
#:args rest-args (if (empty? rest-args) 'no-file (first rest-args))))
;; get all files in a directory as a list
(define (list-files dir)
(for/list ([p (in-directory dir)]
#:when (file-exists? p))
p))
;; Main entrypoint, run a test
(define (tests)
(cond
[(not (mode)) (displayln "error: must specify a mode")]
;;
;; Golden inputs (instructor use--but no reason you cannot read it)
;;
[(equal? (mode) "gengoldens")
;; generate goldens for parse-stackprog -- used by the instructor
(for ([file (list-files "stackprogs")])
(define parts (string-split (path->string file) "/"))
(define fname (last parts))
(define test-prog (first (string-split fname ".")))
(define golden-out (parse-stackprog (file->lines file)))
(define fout (format "goldens/parse-~a.gld" test-prog))
;; write the golden for parse-stackprog
(with-output-to-file fout
(lambda () (displayln golden-out)) #:exists 'replace)
;; write the command line for parsing
(displayln (format "racket test.rkt -m parse-stackprog -g ~a ~a" fout file))
;; for each input stream...
(for ([input-stream (list-files "input-streams")])
(define in-stem
(first (string-split (last (string-split (path->string input-stream) "/")) ".")))
(define fout (format "goldens/interp-~a-~a.gld" test-prog in-stem))
(with-output-to-file fout
(lambda ()
(call-with-input-file input-stream
(lambda (in)
(parameterize ([current-input-port in])
;; actually *RUN* the program using interp-cmds
(match golden-out
[`(program ,cmds ...) (interp-cmds cmds '())])))))
#:exists 'replace)
;; generate call to test.rkt for this program/input combination
(define fmt-string "racket test.rkt -m interp-stackprog -g ~a -i ~a ~a")
(displayln (format fmt-string fout input-stream file))))
;; now, for each infix program...
(for ([infix-program (list-files "infix-programs")])
(define in-stem
(first (string-split (last (string-split (path->string infix-program) "/")) ".")))
;; for each input stream...
(for ([input-stream (list-files "input-streams")])
(define input-stream-stem
(first (string-split (last (string-split (path->string input-stream) "/")) ".")))
(define out-port (open-output-string))
(define golden-file (format "goldens/infix-interp-~a-~a.gld" in-stem input-stream-stem))
(displayln (format "racket test.rkt -m translate-infix -g ~a -i ~a ~a" golden-file input-stream infix-program))
;; Translate
(parameterize ([current-input-port (open-input-file input-stream)]
[current-output-port (open-output-file golden-file #:exists 'replace)])
;; now run...
(match (infix->program (file->string infix-program))
[`(program ,cmds ...)
(interp-cmds cmds '())]))))]
;;
;; Parsing tests
;;
[(equal? (mode) "parse-stackprog")
(unless prog-file
(displayln "need a program file (last argument), input, and golden")
(exit))
(displayln (format "[test: parse-stackprog] Parsing stackprog ~a" prog-file))
(let ([p (parse-stackprog (file->lines prog-file))]
[gold (with-input-from-file (golden) read)])
(if (equal? gold p)
(displayln "✅ PASSED -- matches golden")
(begin
(displayln "❌ FAILED -- does not match golden")
(displayln "You said the answer was:")
(pretty-print p)
(displayln "But this is wrong. The answer is:")
(pretty-print gold))))]
;;
;; Interp tests
;;
[(equal? (mode) "interp-stackprog")
(unless prog-file
(displayln "need a program file (last argument), input, and golden")
(exit))
;; INTERP TESTS
(displayln (format "[test: interp-stackprog] Parsing stackprog ~a, Input file: ~a" prog-file (input-file)))
(define p (parse-stackprog (file->lines prog-file)))
(match p
[`(program ,cmds ...)
;; get the program's output
(define output (map string->number (string-split (test-interp-cmds p (input-file)))))
;; and golden output
(define golden-output (map string->number (file->lines (golden))))
(displayln (format "Your output stream: ~a\tGolden output stream:~a" (pretty-format output) (pretty-format golden-output)))
(if (equal? output golden-output)
(displayln "✅ PASSED")
(displayln "FAILED"))])]
;;
;; Compile infix -> .sp mode
;;
[(equal? (mode) "translate-infix")
(displayln (format "[test: translate-infix] Parsing stackprog ~a, Input file: ~a, Golden: ~a" prog-file (input-file) (golden)))
;; Write a tester for your translator--see README.md
(displayln "Translating program...")
(define p (infix->program (file->string prog-file)))
(unless (program? p)
(displayln "ERROR: translated program does *not* satisfiy program?\nThe offending program is: ~a"
(pretty-format p)))
(match p
[`(program ,cmds ...)
;; get the program's output
(define output (map string->number (string-split (test-interp-cmds p (input-file)))))
;; and golden output
(define golden-out (with-input-from-file (golden) (λ () (for/list ([l (in-lines)]) (string->number l)))))
(displayln (format "Your output stream: ~a\tGolden output stream:~a" (pretty-format output) (pretty-format golden-out)))
(if (equal? output golden-out)
(displayln "✅ PASSED")
(displayln "FAILED"))])]))
(module+ main (tests))