WebAssembly, Traps, Exceptions and other learning experiences

Trapping conversions came up in the answer to the last question of the December 2017 WebAssembly AMA (Ask Me Anything). This post tries to answer: What is trapping? What is a trapping condition? What about its history?

At 50 min 30 s into the video: Question: What is the next milestone in WebAssembly and when is it going to be reached? Answer (Brad Nelson): … we’ve made one mistake already in the format we had trapping conversions we’re gonna have non trapping versions so…

Then I found the following in Features to add after the MVP:

Trapping or non-trapping strategies. Presently, when an instruction traps, the program is immediately terminated. This suits C/C++ code, where trapping conditions indicate Undefined Behavior, …
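To make the “trapping conversions” from the AMA answer concrete: as far as I understand it, the MVP float-to-integer truncation traps when the input is NaN or does not fit the target type, while the non-trapping variants clamp the result (and turn NaN into 0). Here is a minimal C sketch of those two behaviors for a 32-bit signed target. The function names are hypothetical, and the code only models the semantics; it says nothing about how an engine implements them.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a trapping float -> int32 conversion.
   Returns false where the real instruction would trap. */
bool trunc_trapping(float x, int32_t *out)
{
    assert(out != NULL);
    /* NaN and values outside the int32 range are the trap conditions. */
    if (isnan(x) || x >= 2147483648.0f || x < -2147483648.0f)
        return false;
    *out = (int32_t)x;   /* safe: the truncated value fits */
    return true;
}

/* Hypothetical model of the non-trapping (saturating) variant. */
int32_t trunc_saturating(float x)
{
    if (isnan(x))
        return 0;
    if (x >= 2147483648.0f)
        return INT32_MAX;
    if (x < -2147483648.0f)
        return INT32_MIN;
    return (int32_t)x;
}
```

With the first function the caller has to stop when it gets false back; with the second one the program always continues with some value.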

What is trapping? What is a trapping condition? What about its history?

I started my research by typing “Trap instruction” into Google, which was not a good idea. I was naively expecting a reference to a processor instruction, but that is not quite what I got: Google showed me some dark places on the internet. Be cautious. That only increased my curiosity, though. Our “high tech” profession uses weird metaphors, like bugs, traps, trapdoors, etc.


Traps as a Catchall Concept

You find references to the word trap in books and on the internet, such as the following:

Traps are the general means for invoking the kernel from user code. We usually think of a trap as an unintended request for kernel service, say that caused by a programming error such as using a bad address or dividing by zero. However, for system calls, an important special kind of trap discussed below, user code intentionally invokes the kernel.

Trap stacks the PC and then loads it with an interrupt vector. Trap is a software instruction used for debugging. TRAP is instruction for software trap.

The userland code then issues a trap instruction. As with other types of interrupts, the system reacts, saving the state of the userland process by recording the value of the program counter and the other hardware registers, the process stack, and other process data. The hardware switches into kernel mode and starts running the system call trap handler function. This function typically looks up the requested service in a table, in order to figure out what subroutine to call to carry it out. This entire process is often called a context switch.

Two basic kinds of traps can occur: synchronous and asynchronous. Synchronous traps are caused by, or during the operation of, an instruction. These may be actual trap instructions or hardware errors, such as bad address alignments, bad addresses (bus timeouts), illegal instructions, floating-point coprocessor errors, and so on.
… Asynchronous traps can be requested at any time but will be recognized and processed only after an instruction has completed. These traps are due to external events such as interrupts

In computing and operating systems, a trap, also known as an exception or a fault, is typically a type of synchronous interrupt typically caused by an exceptional condition (e.g., breakpoint, division by zero, invalid memory access).

I see controversy. According to these references, a trap can be synchronous or asynchronous, and can be used for debugging, to call system functions, or to initiate a context switch. That is a broad range of uses for something I thought would have a very specific definition.

What does history tell us?

Trapped in History

This is where my earlier warning applies: searching Google for trap can be unhealthy.

Norman Hardy was working with computers in the fifties, and this is what they called traps back then:

My first experience with a trap was the IBM 704 (1957) feature called transfer trapping

A program comes to a presumably rare situation where the CPU cannot obey the program given the unusual state of the computation. This is a situation that is either unanticipated by the programmer, or anticipated but inefficient to include instructions in the program to test for. An example of the latter is division by zero where the application author has not yet decided what to do in such an unlikely event. I call these traps here and say that the program has been trapped.

OK, so now we know that traps were already around in the fifties. Mark Smotherman says that an overflow trap was already part of UNIVAC-I. According to Wikipedia, “UNIVAC-I was the first commercial computer produced in the United States.”

Traps are old.

Mark Smotherman’s page points to another page where you find UNIVAC-I’s searchable PDF manuals!

Let’s pause for a second.

The other day I was looking on the internet for the manual of a microwave I bought two years ago. Yes, sometimes I lose paper manuals. Did I find it on the internet? No.

How Does UNIVAC-I Define Trap?

I was unable to find the word trap in its manuals, but they certainly describe the procedure that Norman Hardy called a trap.

From univa1-Preliminary-Description:

  1. Overflow. Should overflow occur in the summing operation, the computer will automatically insert the pair of instructions located in the first memory location (000). Consequently, an instruction pair to initiate the desired corrective action must be placed in 000, prior to performing an addition.

That is it: the behavior is already there, and it allows the programmer to take corrective action.
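The overflow rule quoted from the manual can be sketched in C. This is only a rough model under my own assumptions: the function pointer location_000 stands in for the instruction pair stored at address 000, and plain int arithmetic stands in for the UNIVAC-I word format. All names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the instruction pair stored at location 000:
   a routine the programmer installs before doing any addition. */
typedef int (*overflow_handler)(int a, int b);

static overflow_handler location_000 = NULL;

void install_overflow_handler(overflow_handler h)
{
    location_000 = h;
}

/* Add two ints; if the sum would overflow, run whatever was placed
   in "location 000" instead of producing a wrong result. */
int trapping_add(int a, int b)
{
    bool overflow = (b > 0 && a > INT_MAX - b) ||
                    (b < 0 && a < INT_MIN - b);
    if (overflow) {
        /* As in the manual, the handler must be in place beforehand. */
        assert(location_000 != NULL);
        return location_000(a, b);
    }
    return a + b;
}

/* One possible corrective action: clamp to the nearest representable value. */
int saturate(int a, int b)
{
    (void)a;
    return b > 0 ? INT_MAX : INT_MIN;
}
```

After install_overflow_handler(saturate), a call such as trapping_add(INT_MAX, 1) runs the handler and yields INT_MAX, while ordinary sums are untouched. A handler that stops the program instead would model the “stop the computer” choice.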

In the UNIVAC1_Programming_1959.pdf:

UNDESIRED OVERFLOW There are many uses of arithmetic instructions in which the unplanned occurrence of overflow would result in an incorrect solution. Although the occurrence of overflow can not be prevented, a minus sign coded in the second instruction digit of an instruction on which overflow occurs will stop the computer on the completion of the execution of the instruction.

Presumably, you could stop the computer to debug your code, write a new set of magnetic tapes, and then load them into UNIVAC-I.

Overstatement of Trap

After all of the above, it appears to me that a trap should be what UNIVAC-I implemented, perhaps extended to cover special debug instructions. Debugging can be seen as running your code and stopping it, or trapping it, when something goes wrong, so that it does not get lost.

Are syscalls traps? Are context switches traps?

I don’t think so. It appears to me that trap became a catchall term for hard-to-define mechanisms.

And how is WebAssembly using the word trap?

WebAssembly and Trap

Again, from Features to add after the MVP:

Trapping or non-trapping strategies. Presently, when an instruction traps, the program is immediately terminated. This suits C/C++ code, where trapping conditions indicate Undefined Behavior, …

This is very well aligned with UNIVAC-I concepts, although WebAssembly seems to be a little behind. UNIVAC-I already allowed “corrective action” and would stop the computer only if that was what the instructions at address 000 did. Currently WebAssembly always “stops the computer” by immediately terminating the program.

The possible “corrective action”, already present in the UNIVAC-I, will be a new feature for WebAssembly.

Next, let’s try to create a small app that is supposed to generate a trap condition, and see what happens with WebAssembly.

First WebAssembly App, Let’s Trap It!

The WebAssembly apps I see on the internet are either hello world apps, factorial apps, or something of the sort. Boring, and not very useful, right? In my life, I have written more apps that generate a NULL pointer dereference or a segmentation fault than hello world apps. Writing a NULL pointer dereference app early is very useful, because we will know how the browser reacts when the real NULL pointer dereference comes along in the future.

See the following example of very basic bad code:


#include <stdio.h> 

int main() 
{ 
    char *p=0;  // declare a pointer to char and make it point to 0 
     
    *p=1;   // Store 1 into address 0, traps in Linux and Webassembly 
} 

Kind of Expected, not Really

This is what Firefox gives me when I run this code:

[Screenshot: Firefox WebAssembly null reference]

Well, Firefox clearly trapped our application’s wrongdoing, but the message just confirms the old saying that “Nothing is so certain as the unexpected”. I am so used to seeing the segmentation fault message that “The application has corrupted its heap memory area (address zero)!” threw me off balance for a minute. Here is the same code running on Cygwin:



~ 
$ gcc null.c 

~ 
$ ./a.exe 
Segmentation fault (core dumped) 

~ 
$ 

Reading Address 0

Judging by Firefox’s error message, reading address 0 in WebAssembly should be no problem, because a read does not modify the heap.

Let’s try the following code in Firefox, where we read address 0, write back what we read, and then modify the content of address 0, with printfs in between so we can check what happens:


#include <stdio.h> 

int main() 
{ 
    char *p=0;  // declare a pointer to char and make it point to 0 
    char tmp=*p; // Read the content of address 0, enough to trap in Linux, but not in webassembly 
     
    printf("Here is what I have in address 0: %x\n", tmp); 
     
    *p=tmp; // Store the previously read character into 0, does not trap in webassembly, Linux already trapped earlier 
    printf("Wrote the same value back\n"); 
     
    (*p)++; // Modify the content of address 0, have late trap in Webassembly, unexpected. 
    tmp=*p; // Read the content of address 0 
    printf("Address 0 value was modified to %x\n", tmp); 
} 

[Screenshot: Firefox WebAssembly null read/write]

Whoa! Now I am really confused. The “Address 0 value was modified to…” line should not be there because the trap condition (modifying the heap memory) happened prior to that print statement. I got exactly the same behavior with Chrome.

With the UNIVAC computer, and I would guess with any recent processor, trap conditions are checked while instructions are executed. The hardware executing the instruction A=A+1 would generate a trap if A happens to overflow or, in our case, if we write into memory that is write protected. Each instruction that could generate a trap checks for trap conditions.

WebAssembly is generating the trap, but not at the moment the trap condition happens, as processors do.

What is the problem with delaying the trap? The earlier the trap happens, the better; otherwise the stray code could mess up the environment so much that it would be very hard to trace the bad behavior back to its root cause.
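The difference between trapping at the offending instruction and noticing the damage later can be shown with a toy model. The sketch below is an illustration under my own assumptions (a 16-byte memory where address 0 must never change, and hypothetical function names). It is not a claim about how Emscripten or the browsers actually detect the problem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MEM_SIZE 16

/* Hypothetical toy memory in which address 0 must never change. */
typedef struct {
    unsigned char bytes[MEM_SIZE];
    unsigned char guard;      /* expected content of address 0 */
} toy_memory;

void mem_init(toy_memory *m)
{
    memset(m->bytes, 0, sizeof m->bytes);
    m->guard = m->bytes[0];
}

/* Eager strategy: the check is part of the store itself, so the
   offending store is refused and reported on the spot. */
bool store_eager(toy_memory *m, size_t addr, unsigned char value)
{
    assert(addr < MEM_SIZE);
    if (addr == 0)
        return false;         /* trap: nothing after this should run */
    m->bytes[addr] = value;
    return true;
}

/* Deferred strategy: stores are not checked at all ... */
void store_unchecked(toy_memory *m, size_t addr, unsigned char value)
{
    assert(addr < MEM_SIZE);
    m->bytes[addr] = value;
}

/* ... and the damage is only noticed when somebody looks later. */
bool guard_intact(const toy_memory *m)
{
    return m->bytes[0] == m->guard;
}
```

With store_eager the caller learns about the bad store before anything else runs. With store_unchecked followed by a later guard_intact check, every statement in between still executes, which matches the late message I saw in the browser, whatever the real mechanism turns out to be.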

What is happening?

I am going to apply the first law of holes: “if you find yourself in a hole, stop digging”. This trapping behavior of WebAssembly will need to be addressed in a new post (stay tuned!).

Conclusions

The most important conclusion I take from all of this: the shift key was very important in the fifties. UNIVAC’s keyboard had a “shift lock” key, an “un-shift” key and a “single shift” key:

[Image: UNIVAC keyboard]

Seriously now, the trap mechanism was already implemented in the first commercial computer produced in the United States (UNIVAC-I), in the fifties, and I would say the concept has not changed since then.

The current WebAssembly trap implementation appears to be less tight than UNIVAC’s and left me with more questions than answers. To be continued…

Software version
emscripten 1.37.26
Firefox 57.0.2
Chrome 63.0.3239.84
Windows 7
Cygwin 1.7.27

