Compiling

An Empty Program

Let's start with something simple: an empty program. This will demonstrate the bare-bones layout of the generated WebAssembly.

Every snax program is encapsulated in a WebAssembly module.

WebAssembly modules consist of a series of declarations for the various things that your program might need in order to run.

Let's look at each line in the output.

This declares a global variable, called $g0:#SP. The SP stands for "Stack Pointer". We'll learn more about how this stack pointer variable is used when we get to functions.

This line declares a linear memory buffer with id $0, with an initial size of 100 pages and a maximum size of 100 pages. One page of memory is 64 KiB (65,536 bytes), so by default snax programs have 6,553,600 bytes (6.25 MiB) of memory to work with.
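The arithmetic behind the memory size can be checked in a few lines. This is only a sketch: the helper name linearMemoryBytes is made up for illustration and is not part of snax or of any WebAssembly API.

```typescript
// WebAssembly fixes the page size at 64 KiB (65,536 bytes).
const WASM_PAGE_BYTES: number = 64 * 1024;

// Hypothetical helper: total bytes available for a given number of pages.
function linearMemoryBytes(pages: number): number {
  return pages * WASM_PAGE_BYTES;
}

// 100 pages is the default described above.
const snaxDefaultBytes: number = linearMemoryBytes(100); // 6553600
const snaxDefaultMiB: number = snaxDefaultBytes / (1024 * 1024); // 6.25
```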

The next two lines make the stack pointer and the memory accessible to the host environment in which the WebAssembly is being executed. So, for example, when running the compiled WebAssembly in a JavaScript environment, the JavaScript would be able to access the stack pointer and the raw linear memory used by snax. This is useful for inspecting the state of a snax program at runtime.

A Simple Program

Ok, now let's look at a program that actually does something:

Things get a lot more complicated!

The _start function

WebAssembly itself is unopinionated about how web assembly programs are executed. Any function in web assembly can be exported to the host environment, allowing the host environment to call it whenever it wants. For libraries compiled to web assembly, the library will export a bunch of functions that can be called as needed. While snax can be used to make libraries, it can also be used like a script that just runs.

For the script scenario, there is a convention to export a function named _start, which the host environment will call. This "convention" is actually part of the WASI (WebAssembly System Interface) standard; see the WASI documentation for details about _start.

This _start function initializes the stack pointer to 6553600 (100 pages × 65,536 bytes), which is the end of our linear memory space. The stack grows downward from there: each function call moves the stack pointer toward lower addresses.

The main function

Functions in snax get compiled directly to functions in web assembly, with name mangling to handle namespaces, which are not part of web assembly. The naming convention is <${snaxNamespace}${snaxFuncName}>f${funcOffset}. Since these examples are all being compiled in your web browser, where there is no concept of file paths (which would normally be part of the namespace), the namespace is <root>::. Since this is the 0th function that's been declared, the funcOffset is 0, resulting in a function name $<<root>::main>f0.
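To make the naming convention concrete, here is a small sketch that builds a mangled name from its three parts. The function mangleFuncName is hypothetical and written only for this illustration; it is not taken from the snax compiler.

```typescript
// Hypothetical helper mirroring the naming convention described above:
// <${snaxNamespace}${snaxFuncName}>f${funcOffset}
function mangleFuncName(
  snaxNamespace: string,
  snaxFuncName: string,
  funcOffset: number
): string {
  return "<" + snaxNamespace + snaxFuncName + ">f" + funcOffset;
}

// In the WebAssembly text format, identifiers carry a leading "$".
const mangledMain: string = "$" + mangleFuncName("<root>::", "main", 0);
// mangledMain is "$<<root>::main>f0"
```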

At the beginning of every snax function, you'll see this code:

Here we are creating a "local" variable in web assembly where we store the current value of the stack pointer. This will come in handy when we want to access snax variables that are stored on the stack.

Local Variable Allocation

Snax supports defining local variables in your functions. Local variables come in two flavors: reg and let. reg variables are limited to data types that fit into 64 bits (numbers, booleans), while let variables can store larger values (arrays and structs, for example).

`reg` Variables

Let's look at how reg variables get translated into WebAssembly. In the code below, we have five reg variables, two of which are inside blocks.

You'll notice that the function in web assembly gained 4 local declarations:

These correspond to the reg declarations, though not one-to-one: five reg variables share four locals. Because blocks in snax create new scopes, it's impossible to access d and e outside of their respective blocks. The snax compiler keeps track of the lifetime of each reg variable and maps two of them onto the same (local) in WebAssembly if their types match and their lifetimes don't overlap. In this case, the lifetimes of d and e do not overlap and they have the same type, so local $4 is reused for both.
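The reuse rule can be sketched as a tiny allocator. Everything here is an assumption made for illustration: the names RegDecl, LocalSlot and assignLocals, the numeric scope positions, and the i32 types given to the five variables. It is not the snax compiler's actual algorithm, only one simple way to get the behavior described above.

```typescript
type WasmType = "i32" | "i64" | "f32" | "f64";

interface RegDecl {
  label: string;
  wasmType: WasmType;
  start: number; // position where the reg comes into scope
  end: number; // position where its scope ends (exclusive)
}

interface LocalSlot {
  index: number;
  wasmType: WasmType;
  freeFrom: number; // position from which the slot may be reused
}

// Give each reg the first local of the same type whose previous occupant
// is already out of scope; otherwise declare a new local.
function assignLocals(regs: RegDecl[], firstIndex: number): Record<string, number> {
  const slots: LocalSlot[] = [];
  const assignment: Record<string, number> = {};
  const ordered = regs.slice().sort((x, y) => x.start - y.start);
  for (const reg of ordered) {
    let chosen: LocalSlot | undefined = undefined;
    for (const slot of slots) {
      if (slot.wasmType === reg.wasmType && slot.freeFrom <= reg.start) {
        chosen = slot;
        break;
      }
    }
    if (chosen === undefined) {
      chosen = { index: firstIndex + slots.length, wasmType: reg.wasmType, freeFrom: 0 };
      slots.push(chosen);
    }
    chosen.freeFrom = reg.end;
    assignment[reg.label] = chosen.index;
  }
  return assignment;
}

// a, b, c live for the whole function; d and e live in separate blocks.
const exampleRegs: RegDecl[] = [
  { label: "a", wasmType: "i32", start: 0, end: 10 },
  { label: "b", wasmType: "i32", start: 1, end: 10 },
  { label: "c", wasmType: "i32", start: 2, end: 10 },
  { label: "d", wasmType: "i32", start: 3, end: 5 },
  { label: "e", wasmType: "i32", start: 6, end: 8 },
];

// Local $0 is assumed to hold the saved stack pointer, so regs start at $1.
const regLocals: Record<string, number> = assignLocals(exampleRegs, 1);
// regLocals.d and regLocals.e are both 4
```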

You'll also see a bunch of calls to local.set:

Every reg declaration initializes its corresponding local to 0. This prevents garbage results when the same local is used for multiple reg declarations. So while d and e both use local $4, we can rest assured that e won't accidentally take on the last value of d.

`let` Variables

let variables allow storing values that can't fit in a web assembly local, like arrays and structs. This is achieved by storing the values of let variables in linear memory. Using linear memory also means that all let variables have an address in memory, and can therefore be passed around with pointers.

Let's look at a simple example where we declare a couple of arrays.

There are two important bits that got added. The first is some code to update the stack pointer:

Recall that the stack grows from the end of linear memory. Before we do anything else in our function, we have to allocate all the stack space we'll need for the function's let variables and any other values we might need to store in linear memory. We do this by decrementing the stack pointer. We decrement by 52 bytes because the a variable takes up 12 bytes and the b variable takes up 40 bytes, for a total of 52 bytes.
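The frame arithmetic can be modeled as follows. This is a sketch under assumptions: the helper allocateFrame and the LetDecl type are invented for illustration, and the layout (a at the lowest address, b directly after it) is a guess, since the text above only gives the sizes.

```typescript
// Initial stack pointer: the end of a 100-page linear memory.
const LINEAR_MEMORY_END: number = 100 * 64 * 1024; // 6553600

interface LetDecl {
  label: string;
  byteSize: number;
}

// Hypothetical helper: reserve one frame for all let variables by
// decrementing the stack pointer once, then hand out addresses inside it.
function allocateFrame(
  stackPointer: number,
  lets: LetDecl[]
): { stackPointer: number; addresses: Record<string, number> } {
  let frameBytes = 0;
  for (const decl of lets) {
    frameBytes += decl.byteSize;
  }
  const newStackPointer = stackPointer - frameBytes;
  const addresses: Record<string, number> = {};
  let offset = 0;
  for (const decl of lets) {
    addresses[decl.label] = newStackPointer + offset; // assumed layout order
    offset += decl.byteSize;
  }
  return { stackPointer: newStackPointer, addresses: addresses };
}

const mainFrame = allocateFrame(LINEAR_MEMORY_END, [
  { label: "a", byteSize: 12 },
  { label: "b", byteSize: 40 },
]);
// mainFrame.stackPointer is 6553548 (6553600 - 52)
```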

The second interesting bit of code is:

Just like when we initialized all of our reg variables with (local.set $1 (i64.const 0)), we also initialize all of our stack variables. In this case we use the memory.fill instruction to bulk-write 0s to the range of bytes that we've set aside for each of our let variables.
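The effect of memory.fill can be mimicked with a byte array standing in for linear memory. This is only a model: memoryFill is a hypothetical wrapper, and the buffer size and addresses are arbitrary small numbers chosen for the demo.

```typescript
// A small stand-in for linear memory (the real one is 100 pages).
const demoMemory = new Uint8Array(64);
for (let i = 0; i < demoMemory.length; i++) {
  demoMemory[i] = 0xff; // pretend the memory holds leftover garbage
}

// Hypothetical wrapper with the same operand order as memory.fill:
// destination address, byte value, byte count.
function memoryFill(mem: Uint8Array, dest: number, value: number, count: number): void {
  mem.fill(value, dest, dest + count);
}

// Zero a 40-byte variable that starts at address 12.
memoryFill(demoMemory, 12, 0, 40);
```

Only bytes 12 through 51 are cleared; the bytes on either side keep their previous contents.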

Function Calls And Function Arguments

WebAssembly natively supports passing arguments to functions, so long as those arguments fit into the limited numeric data types that WebAssembly supports (i32, i64, f32, f64). Let's look at a simple program with an add function that takes two arguments:

First let's look at the add function. Here is the web assembly:

Note that our function declaration has been augmented with (param $0 i32) (param $1 i32) (result i32). This specifies the two parameter types and the return type of the function. The parameters have ids $0 and $1, and it's important to note that these are in the same namespace as local variables. That's why the local we use to save the stack pointer has id $2 in this function.
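The shared numbering of parameters and locals can be illustrated with a short sketch. The helper describeIndexSpace is hypothetical, and the i32 type for the saved stack pointer is an assumption.

```typescript
// Hypothetical helper: label every slot in a function's index space.
// Parameters are numbered first, and declared locals continue the count.
function describeIndexSpace(paramTypes: string[], localTypes: string[]): string[] {
  const slots: string[] = [];
  for (const t of paramTypes) {
    slots.push("$" + slots.length + " param " + t);
  }
  for (const t of localTypes) {
    slots.push("$" + slots.length + " local " + t);
  }
  return slots;
}

// add has two i32 parameters plus one local for the saved stack pointer.
const addIndexSpace: string[] = describeIndexSpace(["i32", "i32"], ["i32"]);
// addIndexSpace[2] is "$2 local i32"
```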

Next let's look at the function call:

Every function call is encapsulated inside a (block) instruction because there are several steps, read from inside out:

1. Call the add() function with our two parameters, 1 and 2.
2. Store the function's return value in a temporary variable.
3. Reset the stack pointer (remember that the stack pointer gets modified at the beginning of every function definition). This effectively reclaims the stack space that was used by the add() function.
4. Do something with the returned result.

It's important that the steps happen in this sequence. In particular, the stack pointer must be reset before the result is used, because the return value might be passed immediately to another function, which will allocate its own stack space.
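The call sequence above can be modeled in ordinary code. This is a simplified sketch, not generated output: modelStackPointer stands in for the global stack pointer, modelAdd for the compiled callee, and the 16-byte frame size is an arbitrary number picked for the demo.

```typescript
// Model of the shared stack pointer global.
let modelStackPointer: number = 6553600;

// Stand-in for a compiled callee: its prologue decrements the stack pointer.
function modelAdd(x: number, y: number): number {
  modelStackPointer -= 16; // arbitrary illustrative frame size
  return x + y;
}

function callAddAndRestore(x: number, y: number): number {
  // The caller's own stack pointer value, as seen before the call.
  const callerStackPointer = modelStackPointer;
  // Call the function and keep its return value in a temporary.
  const returnValue = modelAdd(x, y);
  // Reset the stack pointer, reclaiming the callee's stack space.
  modelStackPointer = callerStackPointer;
  // Only now is the result handed on to whatever uses it.
  return returnValue;
}

const sumResult: number = callAddAndRestore(1, 2);
```

After the call, sumResult holds 3 and the model stack pointer is back at its starting value, so a following call starts from a clean stack.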

Passing Large Values as Function Arguments

So what happens if we want to pass a larger value to a function, such as a struct or an array? In this scenario, we'll still be using the simple integer arguments that web assembly provides, but we'll be passing a pointer to the larger value instead of the value itself.
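The idea of passing a pointer instead of the value can be sketched with a DataView standing in for linear memory. The struct layout (two little-endian 32-bit integers at offsets 0 and 4), the address 48, and the helper names storePoint and sumPoint are all assumptions made for this illustration.

```typescript
// Stand-in for linear memory.
const ptrDemoBuffer = new ArrayBuffer(64);
const ptrDemoView = new DataView(ptrDemoBuffer);

// Caller side: write the struct into memory and keep only its address.
function storePoint(view: DataView, address: number, x: number, y: number): number {
  view.setInt32(address, x, true); // WebAssembly memory is little-endian
  view.setInt32(address + 4, y, true);
  return address;
}

// Callee side: receives a plain integer pointer and reads the fields through it.
function sumPoint(view: DataView, structPtr: number): number {
  return view.getInt32(structPtr, true) + view.getInt32(structPtr + 4, true);
}

const pointPtr: number = storePoint(ptrDemoView, 48, 3, 4);
const pointSum: number = sumPoint(ptrDemoView, pointPtr);
```

The function argument is just the integer 48; the 8 bytes of struct data never travel through the argument list.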

Let's look at an example:
