The Spineless Tagless G Machine

Haskell’s evaluation model is interesting because it efficiently maps lazy evaluation of a functional language onto modern hardware with registers and memory pointers.

The virtual machine running both the code in the abstract, and on the hardware, is called the Spineless Tagless G Machine, and is explained in the following two papers:

Implementing lazy functional languages on stock hardware: the Spineless Tagless G-machine (PDF)
Making a fast curry: push/enter vs. eval/apply for higher-order languages (PDF)

To understand how it works I implemented a version in TypeScript. Below are my implementation notes. You can skip the notes and try the machine here.

Push/Enter VS Eval/Apply

The basic construction of the machine is a Heap, a Stack and something like an instruction pointer, pointing to the current expression being evaluated.

The initial paper (1) describes a model called Push/Enter, where the arguments to a function application (think map double [1, 2, 3]) are pushed to the stack, and then the function itself checks whether it got passed

too many arguments (return a new function),
exactly the right arguments (run code),
or too few (curry).

The implementation of that model is apparently complex, so the second paper (2) describes a model that’s easier to implement with roughly the same performance, called Eval/Apply. I’m only implementing Eval/Apply.

For Eval/Apply we always push an Apply Continuation with specific arguments to the stack, and then move the instruction pointer to the function expression. If the expression returns something that can be applied to arguments, we pop the Apply Continuation of the stack, and then decide what to do based on the function arity and the arguments available.

Substitution

Neither paper really describes substitution in any detail, and that’s where my code is ugliest. I made things up as I went along. There is a Haskell implementation called MiniSTG which is infinitely nicer because Haskell’s pattern matching is great for writing state-machine transition code.

The basic idea behind substitution is this: When evaluating the following expression, allocating an integer on the heap with let, the x on the right needs to be replaced with the actual heap address of the allocated object.

let x = I 1 in x
-- becomes: let x = $0 in x
-- becomes: $0

The same happens e.g. in function applications: The passed arguments replace variables in the function body.

To substitute we need to walk expressions and objects on the heap recursively. This works fine in JavaScript where I have a dynamic type for everything, but it’s not clear to me yet how this works in machine code. The papers mention addressing memory relative to an allocation pointer so that’s probably part of it.