Wednesday, December 27, 2017

Writing an interpreter for "X" in Racket

Interpreter vs Compiler

Let A, B and C are 3 programming languages. Then the Interpreter is a program written in B that executes a program written in A. The Compiler is a program written in B, which takes a program written in A and translates it to a program in C.

Example: add(2, 3) is a A statement. An interpreter would take this statement and produce 5. A compiler might produce a sequence of instructions such as,
mv r1, sp
addi sp, 1
mvi r1, 2
mv r2, sp
addi sp, 1
add r0, r1, r2
subi sp, 2

Goal: I have written an interpreter for language X in language Racket. The interpreter expects an AST built from a program written in A and evaluates the AST to a value.

Language X Description:
X is a functional language. It supports lexical scoping, higher order functions. Here are instruction types of the language X.

constants: 3
variables: var x = 3
if greater: ifgreater x y then e1 else e2
function: fun f(x) = x
lambda: \x => x
function call: f 5
let: let x = 3 in f(x) end
pair: (val1, val2)
empty pair: aunit
empty pair checker: isaunit?
fst: fst(p) produces val1
snd: snd(p) produces val2
add: add(v1, v2) produces v1+v2

A program in X is always an expression. Example program in X
let x = 3
 in let y = 3
   in let
    f = fn \t =>
         fn \s => add(s,t)
    in
     f x y
    end
   end
  end
 end

Designing Abstract Syntax Tree
We need to parse a program of X and represent the program using Racket data structructures and build an Abstract Syntax Tree.
Racket representation of X's constructs:
constant: (int 3)
variable: (var "x")
ifgreater: (ifgreater e1 e2 e3 e4) #if e1 and e2 evaluates to int and e1 > e2 then e3 gets evaluated else e4 gets evaluated.
function definition: (fun "f" "x" (var "x"))
lambda: (fun #f "x" (var "x")) ##lambda function, #f means the name field is false
function call: (call function arguments)
let: (mlet var-name e1 e2) # var-name is introduced in let, it's value is e1, var-name can be accessed from e2.
pair: (apair 2 3)
empty pair: (aunit)
empty pair checker: (isaunit e1) #evalutes e1, if it yields to (aunit) yields (int 1) else (int 0)
fst: (fst e)
snd: (snd e)
add: (add e1 e2)

Example AST:
(mlet "x" (int 3)
 (mlet "y" (int 3)
  (mlet "f" (fun #f "t"
        (fun #f "s" (add s t)))))
 (call (call (var "f") (var "x")) (var "y")))

Evaluation
Once we have AST we can proceed on evaluating it. We will need some more data structures and a few helper functions to do the evaluation.

Environment
To evaluate a program, the interpreter needs a table, that holds (binding-name, value) pairs. During evaluation of a X program, we can lookup this table to find value of a binding. We will call the table "environment".

Closure
To achieve lexical scoping, we will introduce closure. A closure is a pair (environment, function). Whenever a funtion is declared, we will take a snapshot of the current environment and produce a closure object with (environment, function).
Representation of closure (closure env function).

Values in Environment
Constant, pair, empty-pair and closure.

Evaluation Rules:
constant: (int 4) should return (int 4)
variable: If environemnt looks like
 ( ("x", (int 2)),
  ("f", (closure null (fun "f" "x" (add "x" (int 1)))))) then (var "x") would produce value (int 2).
apair: (apair e1 e2) would return (apair v1 v2) where e1 evaluates to v1 and e2 evalutes to v2.
aunit: (aunit) yields (aunit)
closure: (closure env f) yields (closure env f)

function: (fun "f" "x" (var "x")) yields to (closure env (fun "f" "x" (var "x"))) where current environment is env.
call:

ifgreater: (ifgreater e1 e2 e3 e4) if e1 => v1, e2 => v2  and v1 > v2 then yields to v3 => e3
     else yields to v4 where e4 => v4
isaunit: (isaunit e) yields (int 1) if e yields to (aunit)
           yields (int 0) otherwise
fst: (fst e) yields v1 if e yields to (v1, v2)
snd: (snd e) yields v2 if e yields to (v1, v2)
add: (add e1 e2) yields v3 where e1 yields v1, e2 yields v2, v3 = v1 + v2

call: (call (closure env f) e) evaluates to v where
 e=>v1
 new-env = (f.argument-name, v1) + env
 new-env => (f.fun-name, (closure env f)) + env if f is not a lambda
 evaluate(f.fun-expression, new-env) => v

mlet: (mlet var-name e1 e2) evulates to v where
 e1 => v1
 new-env = (var-name, v1) + env
 evaluate(e2, new-env) => v

Language Expansion using Macro:
Macro substitution is done by racket helper functions
(ifaunit e1 e2 e3) => (ifgreater isaunit(e1) (int 0) e2 e3)
(ifeq e1 e2 e3 e4) => (ifgreater e1 e2
            e3
            (ifgreater e2 e1 e3 e4))
(mlet* (list (var1 e2) (var2 e2)..) (en) ) : expands the following way
 (mlet var1 e2
  (mlet var2 e2
   (mlet var3 e3
    (en))))

(m-map m) takes a function m and returns a function that takes a list and returns a new list after applying m to each element of provided list and populating the new list with the value of function application:
 (fun "mapper-fn" "list"
  (apair (call (var "m") (fst (var "list"))) (call (var "mapper-fn") (snd (var "list"))))

Summary
The interpreter expects an AST built from parsing a X program and after macro substitution. We did not discuss parsing steps here. The interpreter maintains a table called environment as it goes through evaluating expressions in the AST.

No comments:

Post a Comment