Dissecting a Minimum (Useful) WebAssembly Module

Let's write a WebAssembly function in the low-level text format, call it from JavaScript, and dissect every byte of the generated wasm file. Along the way we will get to know a few WebAssembly instructions, the module structure, and some of the WebAssembly tools. The generated wasm file is a Minimum (Useful) WebAssembly Module, and we will test it by embedding it inside an HTML file.

What Function?

A function that returns the XOR of two numbers. You could choose another function, such as ADD or SUB, but XOR may benefit a reader who is not familiar with it. Besides, I like XOR: it is like the child who wants to be different from the others. Hey ADD, what do you do? I add. Hey SUB, what do you do? I subtract. Hey XOR, what do you do? I, well…

WebAssembly Text Format

This is our XOR function in text format with comments:


(module                                                ;; It is a module
  (func $WasmXOR (param i32) (param i32) (result i32)  ;; WasmXOR function,
                                                       ;; it takes two i32 parameters and returns an i32
    get_local 0                                        ;; Gets first parameter and pushes it to the stack
    get_local 1                                        ;; Gets second parameter and pushes it to the stack
    i32.xor                                            ;; Pops two values from the stack, execute the XOR
                                                       ;; and pushes the result to the stack

                                                       ;; The return value is the value in the stack,
                                                       ;; which is the result of the XOR operation
                                                       ;; Note I did not add a return instruction! Comments later
  )
  (export "XOR" (func $WasmXOR))                       ;; Exports the WasmXOR function with the name XOR
)

Now we need to compile it, that is, convert the WebAssembly Text Format (WAT) into the binary format (WASM). WABT, the WebAssembly Binary Toolkit, is one way to go, but for this small experiment the wat2wasm demo page is more convenient. Copy the WAT code above and paste it into the WAT box; the resulting WASM module is displayed with comments in the BUILD LOG box:

[Image: wat2wasm-xor-demo]

One thing looks wrong, though: the build log shows the section sizes as zero, with a comment that says guess. The downloaded file is correct, so click the Download button and save it as xor.wasm.

Our XOR function generated a 69-byte wasm file. Later we will see that one section is unneeded; after removing it, the module size goes down to 41 bytes.

[Image: xor-wasm-binary]

Let's take a look at each section in detail. The Binary Encoding Specification is the official document that allows us to understand the sections in xor.wasm.

Module Preamble

A wasm module always starts like this:

Field        | Type   | Description
magic number | uint32 | Magic number 0x6d736100 (i.e., '\0asm')
version      | uint32 | Version number, 0x1

In xor.wasm:


0000000: 0061 736d    ; Magic number (here is little endian)
0000004: 0100 0000    ; version  (here is little endian)
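
To double check the little endian reading of these eight bytes, here is a small sketch that decodes them with a DataView. The variable names are mine, chosen only for illustration.

```javascript
// First 8 bytes of xor.wasm, in file order.
const preamble = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
const view = new DataView(preamble.buffer);

// Both fields are uint32 values stored little endian (second argument true).
const magic = view.getUint32(0, true);
const version = view.getUint32(4, true);

console.log(magic.toString(16)); // 6d736100
console.log(version);            // 1
```

Read byte by byte the file starts with 00 61 73 6d, but read as one little endian uint32 the same bytes give the 0x6d736100 from the table.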

Sections

All sections have the same high level format:

Field        | Type      | Description
id           | varuint7  | section code
payload_len  | varuint32 | size of this section in bytes
name_len     | varuint32 | length of name in bytes, present if id == 0
name         | bytes ?   | section name: valid UTF-8 byte sequence, present if id == 0
payload_data | bytes     | content of this section, of length payload_len - sizeof(name) - sizeof(name_len)
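
Because every section starts with an id and a payload_len, a module can be walked section by section without understanding the payloads. The following is a minimal sketch of that idea, run against the 41-byte version of our module (the one without the custom section). The helper names readVarUint32 and listSections are hypothetical, and I am assuming varuint32 is the usual unsigned LEB128 encoding.

```javascript
// The 41-byte module discussed in this post (xor.wasm without the custom section).
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01,
  0x7f, 0x03, 0x02, 0x01, 0x00, 0x07, 0x07, 0x01,
  0x03, 0x58, 0x4f, 0x52, 0x00, 0x00, 0x0a, 0x09,
  0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x73,
  0x0b]);

// Decode an unsigned LEB128 integer (assumed encoding of varuint32).
// Returns the value and the offset of the first byte after it.
function readVarUint32(bytes, offset) {
  let value = 0;
  let shift = 0;
  let byte;
  do {
    byte = bytes[offset++];
    value |= (byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return { value: value >>> 0, next: offset };
}

// Walk the sections that follow the 8-byte preamble.
function listSections(bytes) {
  const sections = [];
  let offset = 8;
  while (offset < bytes.length) {
    const id = bytes[offset];
    const len = readVarUint32(bytes, offset + 1);
    sections.push({ id, offset, payloadLen: len.value });
    offset = len.next + len.value; // jump over the payload
  }
  return sections;
}

const sections = listSections(wasmBytes);
// Sections 1, 3, 7 and 10, with payloads of 7, 2, 7 and 9 bytes.
console.log(sections.map(s => `id ${s.id}, ${s.payloadLen} bytes`));
```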

Type Section (0x01)

It declares the function signatures that are used by the module. The idea is that multiple functions can share the same signature:

Field   | Type      | Description
count   | varuint32 | count of type entries to follow
entries | func_type | repeated type entries as described above

Func Type:

Field        | Type        | Description
form         | varint7     | the value for the func type constructor
param_count  | varuint32   | the number of parameters to the function
param_types  | value_type  | the parameter types of the function
return_count | varuint1    | the number of results from the function
return_type  | value_type? | the result type of the function (if return_count is 1)

Value Type:

Value | Type
0x7f  | i32
0x7e  | i64
0x7d  | f32
0x7c  | f64
0x70  | anyfunc
0x60  | func
0x40  | pseudo type for representing an empty block_type

In xor.wasm, read the comments and map the following bytes to the documentation described above:


; section "Type" (1) 
0000008: 01    ; section: 1 
0000009: 07    ; section size: 7 bytes 

; Number of types, we have only 1 function 
000000a: 01    ; num types: 1 
; type 0 
000000b: 60    ; func 
000000c: 02    ; num params: 2 
000000d: 7f    ; i32 
000000e: 7f    ; i32 
000000f: 01    ; num results: 1 
0000010: 7f    ; i32 
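
The same mapping can be done mechanically. This is only a sketch: decodeTypeSection is a name I made up, and it assumes every count fits in a single byte (less than 128), which holds for our module but not in general.

```javascript
// Payload of the Type section: the 7 bytes that follow the section size.
const typePayload = [0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f];

const VALUE_TYPE_NAMES = { 0x7f: 'i32', 0x7e: 'i64', 0x7d: 'f32', 0x7c: 'f64' };

// Assumption: all counts are single bytes, so no LEB128 decoding is done here.
function decodeTypeSection(payload) {
  let pos = 0;
  const count = payload[pos++];
  const types = [];
  for (let i = 0; i < count; i++) {
    const form = payload[pos++];
    if (form !== 0x60) throw new Error('expected func form 0x60');
    const params = [];
    const paramCount = payload[pos++];
    for (let p = 0; p < paramCount; p++) params.push(VALUE_TYPE_NAMES[payload[pos++]]);
    const results = [];
    const resultCount = payload[pos++];
    for (let r = 0; r < resultCount; r++) results.push(VALUE_TYPE_NAMES[payload[pos++]]);
    types.push({ params, results });
  }
  return types;
}

const signatures = decodeTypeSection(typePayload);
console.log(JSON.stringify(signatures)); // [{"params":["i32","i32"],"results":["i32"]}]
```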

Function Section (0x03)

This section declares all functions in the module. Do not confuse it with the previous section where the signatures were declared. Functions are referenced by number, not by name:

Field | Type      | Description
count | varuint32 | count of signature indices to follow
types | varuint32 | sequence of indices into the type section

In xor.wasm:


; section "Function" (3) 
0000011: 03    ; section code: 3 
0000012: 02    ; section size: 2 
0000013: 01    ; num functions: 1 
0000014: 00    ; function 0 uses signature index 0 

Export Section (0x07)

A list of exported objects. Our XOR function will be listed here. The XOR function name appears here, and only here.

Field   | Type         | Description
count   | varuint32    | count of export entries to follow
entries | export_entry | repeated export entries as described below

Export entry:

Field     | Type          | Description
field_len | varuint32     | length of field_str in bytes
field_str | bytes         | field name: valid UTF-8 byte sequence
kind      | external_kind | the kind of definition being exported
index     | varuint32     | the index into the corresponding index space

External Kind:

Value | Description
0     | Function
1     | Table
2     | Memory
3     | Global

In xor.wasm:


; section "Export" (7) 
0000015: 07    ; section code: 7 
0000016: 07    ; section size: 7 
0000017: 01    ; num exports: 1 
0000018: 03    ; string length: 3, which is the size of our function name 
0000019: 584f 52 ; export name: XOR 
000001c: 00    ; export kind: function 
000001d: 00    ; export func index: XOR() is function number 0 
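
The browser can also tell us what a module exports, without instantiating it. The sketch below uses WebAssembly.Module.exports() on the 41-byte version of our module (without the custom section); the variable names are just for illustration.

```javascript
// The 41-byte module discussed in this post (xor.wasm without the custom section).
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01,
  0x7f, 0x03, 0x02, 0x01, 0x00, 0x07, 0x07, 0x01,
  0x03, 0x58, 0x4f, 0x52, 0x00, 0x00, 0x0a, 0x09,
  0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x73,
  0x0b]);

// Compile only; no instance is created.
const wasmModule = new WebAssembly.Module(wasmBytes);

// Returns one descriptor per entry of the Export section.
const exportsList = WebAssembly.Module.exports(wasmModule);
console.log(exportsList); // a single entry with name "XOR" and kind "function"
```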

Finally the Code Section, Where the Instructions are (0x0A)

Field  | Type          | Description
count  | varuint32     | count of function bodies to follow
bodies | function_body | sequence of Function Bodies

Function Body:

Field       | Type        | Description
body_size   | varuint32   | size of function body to follow, in bytes
local_count | varuint32   | number of local entries
locals      | local_entry | local variables
code        | byte        | bytecode of the function
end         | byte        | 0x0b, indicating the end of the body

Local Entry:

Field | Type       | Description
count | varuint32  | number of local variables of the following type
type  | value_type | type of the variables

In xor.wasm:


; section "Code" (10) 
000001e: 0a    ; section code: 0x0A 
000001f: 09    ; section size: 9 
0000020: 01    ; num functions: 1 
; function body 0 
0000021: 07    ; func body size: 7 
0000022: 00    ; local decl count: 0, XOR() does not have any local variables 
0000023: 20    ; get_local 0 
0000024: 00    ; (0) from get_local 0 
0000025: 20    ; get_local 1 
0000026: 01    ; (1) from get_local 1 
0000027: 73    ; i32.xor 
0000028: 0b    ; end of function 
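
To see how these six instruction bytes behave on a stack, here is a toy interpreter. It is a sketch, not how a real engine works: the function name run is made up, it only knows the three opcodes used by XOR(), and it assumes the local index fits in one byte.

```javascript
// Instruction bytes of the XOR function body (everything after the local decl count).
const code = [0x20, 0x00, 0x20, 0x01, 0x73, 0x0b];

// Toy stack machine that only understands get_local, i32.xor and end.
function run(code, params) {
  const stack = [];
  let pc = 0;
  while (pc < code.length) {
    const opcode = code[pc++];
    switch (opcode) {
      case 0x20: // get_local <index>: push a parameter onto the stack
        stack.push(params[code[pc++]]);
        break;
      case 0x73: { // i32.xor: pop two values, push their XOR
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a ^ b);
        break;
      }
      case 0x0b: // end: the value left on the stack is the result
        return stack.pop();
      default:
        throw new Error('unsupported opcode 0x' + opcode.toString(16));
    }
  }
  throw new Error('missing end opcode');
}

console.log(run(code, [0xff00, 0x21ad]).toString(16)); // dead
```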

Where is the return instruction? I did not add a return statement in the text format. My thinking was that it would not be necessary: I was defining a function with a start and an end, and the compiler would take care of adding the return. So I was expecting to see the 0x0f opcode:

Name   | Opcode | Description
return | 0x0f   | return zero or one value from this function

It is not there, and yet the XOR() function worked and returned the correct value. The reason is that a function body behaves like a block: when execution reaches the end opcode (0x0b), the value left on the stack becomes the function's result. An explicit return instruction is only needed to leave a function early.

The Unneeded Custom Section (0x00)

According to the Binary Encoding Documentation:

Custom sections all have the same id (0), and can be named non-uniquely (all bytes composing their names may be identical). Custom sections are intended to be used for debugging information, future evolution, or third party extensions. For MVP, we use a specific custom section (the Name Section) for debugging information. If a WebAssembly implementation interprets the payload of any custom section during module validation or compilation, errors in that payload must not invalidate the module.

So, as far as execution goes, and with the goal of looking for the minimal useful module, we are safe in ignoring this section.

In xor.wasm:


; section "name"
0000029: 00    ; section code: 0
000002a: 1a    ; section size: 26
000002b: 04    ; string length: 4
000002c: 6e61 6d65 name  ; custom section name: name
... etc
... etc
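
Dropping custom sections can be done by walking the module and keeping only the sections whose id is not 0. The following is a sketch under a few assumptions: stripCustomSections is a hypothetical helper, every section size fits in one byte (less than 128), and since the name section bytes are not listed in full above, I append a made-up custom section called "test" to the 41-byte module for the demonstration.

```javascript
// The 41-byte module discussed in this post (xor.wasm without the custom section).
const coreModule = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01,
  0x7f, 0x03, 0x02, 0x01, 0x00, 0x07, 0x07, 0x01,
  0x03, 0x58, 0x4f, 0x52, 0x00, 0x00, 0x0a, 0x09,
  0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x73,
  0x0b]);

// Hypothetical custom section for this demo: id 0, size 5,
// name_len 4, name "test", and no further payload data.
const customSection = [0x00, 0x05, 0x04, 0x74, 0x65, 0x73, 0x74];
const withCustom = new Uint8Array([...coreModule, ...customSection]);

// Assumption: every section size is a single byte (< 128).
function stripCustomSections(bytes) {
  const kept = Array.from(bytes.slice(0, 8)); // keep the preamble
  let offset = 8;
  while (offset < bytes.length) {
    const id = bytes[offset];
    const size = bytes[offset + 1];
    const end = offset + 2 + size;
    if (id !== 0) kept.push(...bytes.slice(offset, end));
    offset = end;
  }
  return new Uint8Array(kept);
}

const stripped = stripCustomSections(withCustom);
console.log(withCustom.length, stripped.length); // 48 41
console.log(WebAssembly.validate(stripped));     // true
```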

How do We Test it? Let's produce 0xdead 0xbeef

Let's use the iconic (or ironic) 0xdead and 0xbeef hex constants to verify that our WebAssembly function works.

Here is the entire (and small) xor.html file used to test the XOR() function; below we look at each of its parts.

Note that the WebAssembly module is embedded inside our HTML file. That way we don't need to fetch an external file, which spares us the pleasure of dealing with JavaScript promises. (Sorry, I just think this is a terrible name to use in software languages; we already have enough problems keeping our software promises…) The other advantage of this approach is that we don't need an HTTP server to run xor.html: just drop the file into your browser and it runs.


<html>
<head>
</head>
<body>
  <script>
    var xorBinWasm = new Uint8Array([ 0x00, 0x61, 0x73, 0x6D, 0x01, 0x00, 0x00, 0x00,
                                      0x01, 0x07, 0x01, 0x60, 0x02, 0x7F, 0x7F, 0x01,
                                      0x7F, 0x03, 0x02, 0x01, 0x00, 0x07, 0x07, 0x01,
                                      0x03, 0x58, 0x4F, 0x52, 0x00, 0x00, 0x0A, 0x09,
                                      0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x73,
                                      0x0B]);

    var WebAssemblyInstance = new WebAssembly.Instance(new WebAssembly.Module(xorBinWasm));

    var xorFunc=WebAssemblyInstance.exports.XOR;

    if(
        (xorFunc(0xFF00, 0x21AD) !== 0xdead ) ||
        (xorFunc(0xAA55, 0x14BA) !== 0xbeef )
        )
    {
        alert('Webassembly cant calculate XOR!');
    }
    else
    {
        alert('Webassembly XOR function is correct!');
    }

  </script>
</body>
</html>

OK, let's break it down into its parts.


    var xorBinWasm = new Uint8Array([ 0x00, 0x61, 0x73, 0x6D, 0x01, 0x00, 0x00, 0x00,
                                      0x01, 0x07, 0x01, 0x60, 0x02, 0x7F, 0x7F, 0x01,
                                      0x7F, 0x03, 0x02, 0x01, 0x00, 0x07, 0x07, 0x01,
                                      0x03, 0x58, 0x4F, 0x52, 0x00, 0x00, 0x0A, 0x09,
                                      0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x73,
                                      0x0B]);

This is the content of our xor.wasm file without the unneeded custom section. These are the 41 bytes of our module.


    var WebAssemblyInstance = new WebAssembly.Instance(new WebAssembly.Module(xorBinWasm));

A WebAssembly module and a WebAssembly instance are created; we need these objects in order to work with our module. Note that the instance is created with the synchronous WebAssembly.Instance() constructor, not with the WebAssembly.instantiate() function, which returns a promise.

What is the difference?

With a promise, the instantiation work (the second compilation phase) is deferred, so your JavaScript code keeps running without waiting for it to finish. This is particularly interesting for a large WebAssembly module, where that phase may take a long time. It also brings the complexity of dealing with a JavaScript promise, which doesn't pay off for our small test module.
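
For comparison, this is roughly what the promise-based version looks like. It is a sketch that is not used in xor.html; the names wasmBytes and xorResult are mine. When WebAssembly.instantiate() is given raw bytes, its promise resolves to an object holding both the module and the instance.

```javascript
// The 41-byte module discussed in this post (xor.wasm without the custom section).
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01,
  0x7f, 0x03, 0x02, 0x01, 0x00, 0x07, 0x07, 0x01,
  0x03, 0x58, 0x4f, 0x52, 0x00, 0x00, 0x0a, 0x09,
  0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x73,
  0x0b]);

// Compile and instantiate asynchronously, then call the exported function.
const xorResult = WebAssembly.instantiate(wasmBytes).then(({ instance }) => {
  return instance.exports.XOR(0xff00, 0x21ad);
});

// Code placed here runs before the promise resolves.
xorResult.then(value => console.log(value.toString(16))); // dead
```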


    var xorFunc=WebAssemblyInstance.exports.XOR;

This line is not strictly needed; it is here just for clarity. We create a variable that refers to the function we want to call. You could call the function as WebAssemblyInstance.exports.XOR(a, b), but I believe xorFunc(a, b) is clearer.


    if(
        (xorFunc(0xFF00, 0x21AD) !== 0xdead ) ||
        (xorFunc(0xAA55, 0x14BA) !== 0xbeef )
        )
    {
        alert('Webassembly cant calculate XOR!');
    }
    else
    {
        alert('Webassembly XOR function is correct!');
    }

Call the XOR() function and expect DEAD BEEF!
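
One thing to keep in mind if you test with larger constants: an i32 result arrives in JavaScript as a signed 32-bit number, so a strict comparison against a hex literal with the top bit set would fail. The sketch below illustrates this with inputs I picked for the purpose; they are not part of xor.html.

```javascript
// The 41-byte module discussed in this post (xor.wasm without the custom section).
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01,
  0x7f, 0x03, 0x02, 0x01, 0x00, 0x07, 0x07, 0x01,
  0x03, 0x58, 0x4f, 0x52, 0x00, 0x00, 0x0a, 0x09,
  0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x73,
  0x0b]);

const instance = new WebAssembly.Instance(new WebAssembly.Module(wasmBytes));
const xorFunc = instance.exports.XOR;

// All 32 bits end up set, which is -1 when read as a signed i32.
const signed = xorFunc(0xffff0000, 0x0000ffff);
console.log(signed);                                // -1
console.log(signed === 0xffffffff);                 // false
console.log(signed === (0xffff0000 ^ 0x0000ffff));  // true, JavaScript's ^ is signed too

// >>> 0 reinterprets the same bits as an unsigned number.
console.log((signed >>> 0).toString(16));           // ffffffff
```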

XOR Teaser

If this is your first encounter with XOR, try to work out how XORing 0xFF00 and 0x21AD results in 0xDEAD. Look up the answer here.

Post Mindmap

[Image: mindmap-webassembly-minimal-useful-module-webassemblycode.com]

Leave a message below. WebAssembly is evolving rapidly, so please let me know if this post becomes outdated.
