This document describes a convention for implementing C setjmp/longjmp using the WebAssembly exception-handling proposal.
This document also briefly mentions another convention based on JavaScript exceptions.
This convention uses a few structures on the WebAssembly linear memory.
The first 6 words of the C jmp_buf are reserved for use by the runtime. ("Words" here are C pointer types as specified in the C ABI.) The jmp_buf should have large enough alignment to store C pointers. The actual contents of this area are private to the runtime implementation.
Emscripten has been using 6 unsigned longs (unsigned long [6]).
GCC and Clang use intptr_t [5] for their setjmp/longjmp builtins. This isn't relevant right now though, because LLVM's WebAssembly target doesn't provide these builtins.
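The size and alignment requirements above can be checked at compile time. The following is a minimal sketch, not the layout of any particular libc: the names sketch_jmp_buf and sketch_jmp_buf_impl are hypothetical, and the private fields are only an assumption about what a runtime might want to store.

```c
#include <stdint.h>

/* Hypothetical public type: 6 pointer-sized words reserved for the runtime. */
typedef struct {
    void *reserved[6];
} sketch_jmp_buf[1];

/* Hypothetical private view of those words, known only to the runtime. */
struct sketch_jmp_buf_impl {
    void *func_invocation_id; /* identity of the function invocation */
    uint32_t label;           /* non-zero setjmp call-site identifier */
};

/* The private data must fit in the reserved area. */
_Static_assert(sizeof(struct sketch_jmp_buf_impl) <= sizeof(sketch_jmp_buf),
               "private state must fit in the reserved words");

/* The reserved area must be aligned well enough to hold C pointers. */
_Static_assert(_Alignof(sketch_jmp_buf) >= _Alignof(void *),
               "jmp_buf must be able to store C pointers");
```

Because the contents are private to the runtime, user code only ever sees the opaque 6-word buffer.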
An equivalent of the following structure is used to associate necessary data to the WebAssembly exception.
struct __WasmLongjmpArgs {
    void *env; // a pointer to jmp_buf
    int val;
};
The lifetime of this structure is rather short. It lives only during a single longjmp execution.
A runtime can use a part of jmp_buf for this structure. It's also OK to place this structure in separate thread-local storage. A runtime without multi-threading support can simply place this structure in a global variable.
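As an illustration of the thread-local option, here is a minimal sketch. The names sketch_longjmp_args, tls_longjmp_args and fill_longjmp_args are hypothetical; a runtime without threads could drop _Thread_local and use a plain global instead.

```c
/* Mirrors the fields of __WasmLongjmpArgs described in the text. */
struct sketch_longjmp_args {
    void *env; /* a pointer to jmp_buf */
    int val;
};

/* One instance per thread is enough, because the structure only lives
 * for the duration of a single longjmp. */
static _Thread_local struct sketch_longjmp_args tls_longjmp_args;

/* Fill the per-thread structure and return the address that would be
 * carried by the exception. */
static struct sketch_longjmp_args *fill_longjmp_args(void *env, int val)
{
    tls_longjmp_args.env = env;
    tls_longjmp_args.val = val;
    return &tls_longjmp_args;
}
```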
This convention uses a WebAssembly exception to perform a non-local jump for C longjmp.
The exception is created with an exception tag named __c_longjmp. The name is used for both static linking and dynamic linking.
The type of the exception tag is (param i32), or (param i64) for memory64. The parameter is the address of the __WasmLongjmpArgs structure in linear memory.
void __wasm_setjmp(jmp_buf env, uint32_t label, void *func_invocation_id);
uint32_t __wasm_setjmp_test(jmp_buf env, void *func_invocation_id);
void __wasm_longjmp(jmp_buf env, int val);
__wasm_setjmp records the necessary data in env so that it can be used by __wasm_longjmp later.
label is a non-zero identifier that distinguishes setjmp call sites within the function. Note that a C function can contain multiple setjmp() calls.
func_invocation_id is an identifier that distinguishes invocations of this C function. Note that, when a C function which calls setjmp() is invoked recursively, setjmp/longjmp needs to distinguish the invocations.
__wasm_setjmp_test tests whether the longjmp target belongs to the current function invocation. If it does, this function returns the label value saved by __wasm_setjmp. Otherwise, it returns 0.
__wasm_longjmp is similar to C longjmp. If val is 0, it's __wasm_longjmp's responsibility to convert it to 1. It performs a long jump by filling a __WasmLongjmpArgs structure and throwing an exception with its address. The exception is created with the __c_longjmp exception tag.
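The behaviour of the three functions can be sketched in portable C as follows. This is an illustration under stated assumptions, not the code of any particular runtime: all sketch_ names and the private jmp_buf layout are hypothetical, the arguments are stored inside the jmp_buf (one of the options mentioned earlier), and the actual throw is left out so that the sketch also compiles on non-WebAssembly targets.

```c
#include <stdint.h>

/* Mirrors __WasmLongjmpArgs. */
struct sketch_longjmp_args {
    void *env;
    int val;
};

/* Hypothetical private layout of the reserved jmp_buf words. */
struct sketch_jmp_buf_impl {
    void *func_invocation_id;
    uint32_t label;
    struct sketch_longjmp_args args; /* arguments kept inside jmp_buf */
};

/* Stand-in for __wasm_setjmp: remember who called setjmp and where. */
static void sketch_setjmp(void *env, uint32_t label, void *func_invocation_id)
{
    struct sketch_jmp_buf_impl *jb = env;
    jb->func_invocation_id = func_invocation_id;
    jb->label = label;
}

/* Stand-in for __wasm_setjmp_test: return the saved label if env belongs
 * to this function invocation, 0 otherwise. */
static uint32_t sketch_setjmp_test(void *env, void *func_invocation_id)
{
    struct sketch_jmp_buf_impl *jb = env;
    return jb->func_invocation_id == func_invocation_id ? jb->label : 0;
}

/* Stand-in for the first half of __wasm_longjmp: normalize val and fill
 * the arguments. A real runtime would then throw an exception with the
 * __c_longjmp tag carrying the returned address. */
static struct sketch_longjmp_args *sketch_longjmp_prepare(void *env, int val)
{
    struct sketch_jmp_buf_impl *jb = env;
    if (val == 0) {
        val = 1; /* setjmp must not appear to return 0 after a longjmp */
    }
    jb->args.env = env;
    jb->args.val = val;
    return &jb->args;
}
```

Comparing func_invocation_id values is the only use of that pointer, which is why it never needs to be dereferenced.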
The C compiler detects setjmp and longjmp calls in a program and converts them into the corresponding WebAssembly exception-handling instructions and calls to the runtime ABI described above.
On function entry, the compiler generates logic to create the identifier of this function invocation, typically by performing an equivalent of alloca(1). Note that the alloca size is not important because the pointer is merely used as an identifier and is never dereferenced.
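To see why a stack address works as an invocation identifier, consider the following minimal sketch. It uses the address of a local variable as a stand-in for alloca(1); the function name ids_are_distinct is hypothetical. Since the caller's local is still alive while the callee runs, the two addresses are guaranteed to differ, even for recursive invocations of the same function.

```c
#include <stddef.h>

/* Returns 1 if every nested invocation gets an identifier different from
 * its caller's, 0 otherwise. */
static int ids_are_distinct(int depth, const void *caller_id)
{
    char marker;                 /* stand-in for alloca(1); never accessed */
    const void *my_id = &marker; /* used only as an identity */

    if (my_id == caller_id) {
        return 0;
    }
    if (depth == 0) {
        return 1;
    }
    return ids_are_distinct(depth - 1, my_id);
}
```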
Also, the compiler converts C setjmp calls to __wasm_setjmp calls.
For each setjmp call site, the compiler allocates a non-zero identifier called a "label". The label value passed to __wasm_setjmp is recorded by the runtime and returned later by __wasm_setjmp_test when processing a longjmp to the corresponding jmp_buf.
Also, for code blocks which might call longjmp directly or indirectly, the compiler generates instructions to catch and process exceptions with the __c_longjmp exception tag accordingly.
When catching the exception, the compiler-generated logic calls __wasm_setjmp_test to see if the exception is for this invocation of this function.
If it is, __wasm_setjmp_test returns the non-zero label value recorded by the last __wasm_setjmp call for the jmp_buf. The compiler-generated logic can use the label value to simulate a return from the corresponding setjmp.
Otherwise, __wasm_setjmp_test returns 0. In that case, the compiler-generated logic should rethrow the exception by calling __wasm_longjmp so that it can eventually be caught by the right function.
For example, a C function like this would be converted to something like the following pseudocode.
void f(void) {
    jmp_buf env;
    if (!setjmp(env)) {
        might_call_longjmp(env);
    }
}
$func_invocation_id = alloca(1)
;; 100 is a label generated by the compiler
call $__wasm_setjmp($env, 100, $func_invocation_id)
block
  block (result i32)
    try_table (catch $__c_longjmp 0)
      call $might_call_longjmp
    end
    ;; might_call_longjmp didn't call longjmp
    br 1
  end
  ;; might_call_longjmp called longjmp
  pop __WasmLongjmpArgs pointer from the operand stack
  $env = __WasmLongjmpArgs.env
  $val = __WasmLongjmpArgs.val
  $label = $__wasm_setjmp_test($env, $func_invocation_id)
  if ($label == 0) {
    ;; not for us. rethrow.
    call $__wasm_longjmp($env, $val)
  }
  ;; ours.
  ;; somehow jump to the block corresponding to the $label
  ...
  ...
end
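The decision made in the catch path of the pseudocode can also be written as plain C. The sketch below is only a model of that decision, with hypothetical names (sketch_env, sketch_args, sketch_on_catch and so on) and a simplified record in place of the real jmp_buf contents; it reports what the compiler-generated code would do instead of actually throwing or branching.

```c
#include <stdint.h>

/* Simplified stand-ins for the jmp_buf contents and __WasmLongjmpArgs. */
struct sketch_env {
    void *func_invocation_id;
    uint32_t label;
};

struct sketch_args {
    void *env;
    int val;
};

/* Stand-in for __wasm_setjmp_test. */
static uint32_t sketch_setjmp_test(void *env, void *func_invocation_id)
{
    struct sketch_env *e = env;
    return e->func_invocation_id == func_invocation_id ? e->label : 0;
}

enum sketch_action {
    SKETCH_RETHROW, /* not ours: call __wasm_longjmp(env, val) again */
    SKETCH_RESUME   /* ours: continue after the setjmp call site */
};

struct sketch_decision {
    enum sketch_action action;
    uint32_t label;    /* which setjmp call site to resume at */
    int setjmp_result; /* value setjmp should appear to return */
};

/* Model of the logic that runs after catching a __c_longjmp exception. */
static struct sketch_decision sketch_on_catch(const struct sketch_args *args,
                                              void *func_invocation_id)
{
    struct sketch_decision d = { SKETCH_RETHROW, 0, args->val };
    uint32_t label = sketch_setjmp_test(args->env, func_invocation_id);
    if (label != 0) {
        d.action = SKETCH_RESUME;
        d.label = label;
    }
    return d;
}
```

In generated code, the SKETCH_RESUME case would typically become a dispatch on the label to the block that follows the matching setjmp call, with setjmp_result as the apparent return value.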
The compiler converts C longjmp calls to __wasm_longjmp calls.
In the case of dynamic linking, it's the dynamic linker's responsibility to provide the exception tag for this convention with the name "env.__c_longjmp". Modules should import the tag so that cross-module longjmp works.
Emscripten has a mode that uses JavaScript-based exceptions instead of WebAssembly exceptions. In that mode, the emscripten_longjmp function, which throws a JavaScript exception, is used instead of __wasm_longjmp.
void emscripten_longjmp(uintptr_t env, int val);
The compiler translates C function calls which might end up calling longjmp into indirect calls via a JavaScript wrapper which catches the JavaScript exception.
- LLVM (19 and later) has a pass (WebAssemblyLowerEmscriptenEHSjLj.cpp) to perform the conversion mentioned above. It can be enabled with the -mllvm -wasm-enable-sjlj option.
  Note: as of this writing, LLVM produces a somewhat older version of the exception-handling instructions (try, delegate, etc.). binaryen has a conversion from the old instructions to the latest instructions (try_table, etc.).
- Emscripten (3.1.57 or later) has the runtime support (emscripten_setjmp.c) for the convention documented above.
- wasi-libc has the runtime support (wasi-libc rt.c) for the convention documented above.
- __WasmLongjmpArgs can be replaced with WebAssembly multivalue.
- Or, alternatively, we can make __wasm_setjmp_test take the __WasmLongjmpArgs pointer so that we can drop the __WasmLongjmpArgs structure layout from the ABI.
- It might be simpler for the compiler-generated catching logic to rethrow the exception with the rethrow/throw_ref instruction instead of calling __wasm_longjmp. Or, it might be simpler to make __wasm_setjmp_test rethrow the exception internally.
- If/when WebAssembly exceptions become more ubiquitous, we might want to move the runtime to compiler-rt.