tags : Web Development, Web Performance, Systems

Note: This is just some initial exploration. Idk shit, all of this is 90% wrong, there’s a 100% chance of that.

What?

It was initially meant for browsers, but people have built their own standalone runtimes to run it server-side.

What is it?

  • A specification
  • A universal compute platform: a computer that takes your code and executes it

Spec

  • The WebAssembly spec defines a bunch of semantics about how a computer that doesn’t exist should work.
    • It defines the virtual machine
    • It describes a machine, not an implementation
    • How the stack works
    • The instructions the machine can run
    • The format that compilers should target (how code is stored into and loaded from .wasm files)
    • Other details like that.
    • Interesting things about spec
      • No native string type, like C
      • WebAssembly machine definition supports C’s abstract machine — C, C++, Golang, and Rust can compile to this target — acting as a virtual instruction set architecture.
      • Inherent isolation
      • stack is external to WebAssembly linear memory
        • No stack pointer
        • Functions can manipulate the stack by pushing to it and popping from it, but they can’t actually move it around.
  • What it does not specify
    • It doesn’t specify an API that programs written to target WebAssembly can use to talk to the outside world.
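To make the "push and pop, but never move the stack around" point concrete, here is a toy interpreter in Python. It is only a sketch: the mnemonics (`i32.const`, `i32.add`, `i32.mul`) mirror real WebAssembly instructions, but the tuple encoding and the `run` helper are made up for illustration and are not how a real engine stores or executes code.

```python
# Toy operand-stack interpreter. Mnemonics mirror WebAssembly; the tuple
# encoding and run() are hypothetical, for illustration only.
MASK32 = 0xFFFFFFFF  # i32 arithmetic wraps modulo 2**32


def run(program):
    stack = []  # owned by the interpreter, not reachable from program memory
    for op, *args in program:
        if op == "i32.const":
            stack.append(args[0] & MASK32)
        elif op == "i32.add":
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & MASK32)
        elif op == "i32.mul":
            b, a = stack.pop(), stack.pop()
            stack.append((a * b) & MASK32)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack.pop()  # the result comes back from the stack


# (2 + 3) * 4
program = [
    ("i32.const", 2),
    ("i32.const", 3),
    ("i32.add",),
    ("i32.const", 4),
    ("i32.mul",),
]
result = run(program)
print(result)
```

Note that the program has no instruction that reads or writes a stack pointer; all it can do is push and pop values.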

Components of WASM

ISA < VM < Platform < Runtime (TODO: This shit is wrong, need to recheck)

ISA/CPU arch

The arch. Eg. x86_64, arm64

  • wasm32

    • wasm32 is what we mean when we say compile down to webassembly
    • wasm32 code currently runs on a vm on top of existing ISAs.
  • Webassembly vm

    • Essentially defines a virtual CPU with 32-bit addressing (it still has native 64-bit integer and float types)
    • The vm includes a “stack”, which records operand values and control constructs, and an abstract store containing global state.
    • Store results in linear memory, return results from stack
    • The vm becomes powerful when you use it to run external functions that get imported
      • External functions
        • Making an HTTP request with the JavaScript fetch() function
        • Read from and write to local storage
  • Webassembly Code

    • Compiling

      • emscripten : Initially used to compile to asm.js, now supports webassembly
      • zig : Supports freestanding, wasi
      • Rust, Go, and others have compilation flags/targets for it
      • Other supports/custom ways etc.
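Since there is no native string type, and results are stored in linear memory, a guest usually hands the host a pointer and a length, and an imported host function reads the bytes out. The sketch below models that in plain Python. The 64 KiB page size is the real WebAssembly page size, but `host_log`, `guest_main`, and the `"log"` import name are hypothetical and not part of any real ABI.

```python
# Sketch: linear memory as a flat byte array, plus one imported host function.
# host_log and guest_main are hypothetical names, not part of any real ABI.
PAGE_SIZE = 65536  # WebAssembly linear memory grows in 64 KiB pages

memory = bytearray(PAGE_SIZE)  # one page of linear memory
logged = []


def host_log(ptr, length):
    """Imported function: the host reads a UTF-8 string out of guest memory."""
    logged.append(bytes(memory[ptr:ptr + length]).decode("utf-8"))


def guest_main(imports):
    """Stands in for guest code: write bytes to memory, pass (ptr, len) out."""
    text = "hello from wasm".encode("utf-8")
    ptr = 16  # arbitrary offset chosen by the guest
    memory[ptr:ptr + len(text)] = text
    imports["log"](ptr, len(text))
    return len(text)  # plain numeric result, as if returned via the stack


written = guest_main({"log": host_log})
print(logged, written)
```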

Platform/ABI

These expose an ABI from the host to the WebAssembly VM via the runtime.

  • When we pick an ABI, we also need to indirectly decide “which kind of host is this targeted towards?”
  • Platforms specify how to talk to the runtime to provide higher-level things that normally something like an OS would provide, like reading a file.
  • These platforms then run on top of some runtime to execute WebAssembly.
  • What does this mean?
    • It dictates whether your WebAssembly code will work with another piece of WebAssembly code, since both need to target the same platform.
    • Dictates what your WebAssembly code can actually do, like have network access.
  • Emscripten

    • The web browser is one runtime, usually paired with the Emscripten platform. Emscripten defines its own ABI.
    • This is both the platform and the tool :)
  • Custom

    • E.g. Go has a custom ABI; Go’s WebAssembly support was also made before WASI even came out.
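One way to picture why the platform matters: a module declares the functions it expects to import, and it can only be instantiated on a host that supplies all of them. The Python sketch below checks that condition. `wasi_snapshot_preview1`, `fd_write`, and `proc_exit` are real WASI preview 1 names, but the import lists themselves are illustrative (not dumped from real binaries) and `("env", "js_fetch")` is a made-up import.

```python
# Sketch: a module only instantiates if the host supplies every import it
# declares. The import lists below are illustrative, not from real binaries.
def missing_imports(module_imports, host_exports):
    """Return the (namespace, name) pairs the host cannot satisfy."""
    return [imp for imp in module_imports if imp not in host_exports]


# A WASI-style host: wasi_snapshot_preview1 is the WASI preview 1 namespace.
wasi_host = {
    ("wasi_snapshot_preview1", "fd_write"),
    ("wasi_snapshot_preview1", "proc_exit"),
}

wasi_module = [("wasi_snapshot_preview1", "fd_write")]
# Hypothetical module built against some custom, JS-oriented ABI.
custom_module = [("env", "js_fetch")]

print(missing_imports(wasi_module, wasi_host))
print(missing_imports(custom_module, wasi_host))
```

The second module is perfectly valid WebAssembly, it just targets a different platform than the one this host implements.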

Host / Runtime

  • “This plays out precisely as the blog post details with the split between WASI and Web Platform. Say you want to compile a Rust + C codebase to Wasm and run in the browser. You have three targets: wasm32-unknown-emscripten, wasm32-unknown-unknown, and wasm32-wasi. Emscripten is relatively old and not maintained. It’ll work but you get some old JS that doesn’t play well with newer stuff. wasm32-unknown-unknown has an ABI incompatibility which means you cannot interoperate between C compiled to Wasm and Rust compiled to Wasm. wasm32-wasi works, but now you have to have a WASI implementation in the browser and that’s still very immature. Tools like wasm-bindgen or wasm-pack don’t work with wasm32-wasi either.”
  • “For all the work that’s gone into making Wasm portable across CPU architectures, we’ve ended up with modules that are not even portable across platform vendors. After all, wasn’t that the selling point of Docker?” We hope that the component model will solve this eventually.
  • See Performance of WebAssembly runtimes in 2023 (TODO: Comeback 2 this)
  • Technically, “Virtual Machine Runtime”
  • Runtimes execute WebAssembly Modules, which are most often binaries with a .wasm extension.
  • They facilitate all of the necessary interactions between the
    • VM: WebAssembly VM
    • Platform/ABI: Environment in which the VM exists (platform).
  • E.g. the WebAssembly runtimes in browsers (such as V8)
  • These connect to the underlying host environment to provide those interactions
  • There are 2 primary categories here. Since the popular ABIs are the web browser and WASI, runtimes are mostly divided along these two.
  • The browser side

    • This is usually V8 (See Javascript Runtime and Browser)
    • For the host environment that is the web browser
    • These cover both browsers and non-browser JS runtimes. Usually overseen by WinterCG.
    • In this case, you use it via a JS runtime/shim loading it, and you can also use it alongside JavaScript. Compilers will also let you generate standalone wasm, in which case you’ll be rolling the ball from the web side of things towards the WASI side.
  • The wasi side

    • WebAssembly code can also run outside browsers, either via WASI, directly, or via something like AssemblyScript.
    • Eg. wasmer, wasmtime, WAVM, wazero, nodejs etc. see list of webassembly runtimes
      • NOTE: From the news, there’s some bad rep about wasmer’s ceo
    • Vendors can extend wasi in their runtimes, and try to provide what’s missing in WASI such as HTTP requests etc.
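Whatever the runtime, the first thing it sees in a .wasm binary is an 8-byte preamble: the magic bytes `\0asm` followed by a little-endian 32-bit version (currently 1). A minimal Python sketch of that check follows; `read_wasm_version` is a hypothetical helper name, not an API of any runtime.

```python
import struct

WASM_MAGIC = b"\x00asm"  # first four bytes of every WebAssembly binary


def read_wasm_version(data):
    """Return the binary format version, or raise if this is not a wasm module."""
    if len(data) < 8 or data[:4] != WASM_MAGIC:
        raise ValueError("not a WebAssembly binary")
    (version,) = struct.unpack("<I", data[4:8])  # little-endian u32
    return version


# The smallest module: just the magic number and version 1, no sections.
empty_module = b"\x00asm\x01\x00\x00\x00"
version = read_wasm_version(empty_module)
print(version)
```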

Host environment / Runtime targets

  • Web Browsers (v8)
  • Unix systems (wasi)
  • Edge
    • These can either be running a wasi or v8 runtime
    • Does cf worker fit here?
      • cf workers run on v8
      • cf might be either mocking existing browser APIs or creating their own interfaces that are compatible with the JavaScript shim generator.
      • not sure if cf supports standalone wasm

Toolchain

  • Compilers

    • Some languages are harder to compile because they have GC and wasm itself does not etc.
    • Compilers compile code in a given language to target specific platforms. So in some cases, you’d need to write code keeping the platform in mind.
    • Example: Go’s original port compiles down to the Go ABI platform (which requires a JS execution environment), while TinyGo can compile things down to WASI. Go itself added native WASI support (GOOS=wasip1) in Go 1.21.
    • https://github.com/leaningtech/cheerp-meta
  • wasm-bindgen

    • For the web platform
    • wasm-bindgen or wasm-pack don’t work with wasm32-wasi either

History

  • In 2010, work kicked off on Emscripten, a C/C++ to JavaScript compiler.
  • In 2011, Fabrice Bellard released JSLinux: a Linux operating system and virtual machine compiled to JavaScript using a patched version of his QEMU software.
  • Finally, in 2013, Alon Zakai released asm.js
  • Emscripten then started compiling to asm.js
  • In 2015, all interested parties concluded that asm.js pointed in the right direction, that a language like asm.js should be encoded as distributable bytecode. Google got on board with the effort, dropping the NaCl/Pepper/PNaCl project, and WebAssembly was born.

Languages & WebAssembly

See Cross Compilation | appcypher/awesome-wasm-langs

Javascript

  • See Javascript
  • To run JavaScript code, the runtime is compiled to WebAssembly, with your code running within the WebAssembly-hosted interpreter. This approach, which might sound inefficient, is surprisingly practical and increasingly popular.
  • You sacrifice speed, but gain isolation.

Python

  • See Python
  • Python code itself does not compile down to WebAssembly. You compile a Python interpreter like CPython to WebAssembly and have that run your Python code. So when I talk about compiling in this blog post, I’m referring to compiling CPython to WebAssembly, not your personal Python code.
  • As mentioned in PEP 11, WebAssembly has tier 3 support via wasm32-unknown-emscripten and wasm32-unknown-wasi

Golang

  • See The Go WebAssembly ABI at a Low Level - Xe Iaso
  • Go defines its own ABI.
  • Go’s WebAssembly support was also made before WASI even came out
  • If you want to adapt a Go program to use WebAssembly or make a new program with WebAssembly in mind
    1. Compile go to WebAssembly
  • Take a reasonable subset of Go programs and run them in browsers alongside JavaScript.
  • Before Go 1.21, Go didn’t support WASI at all, though TinyGo did. Go’s js/wasm port mostly targets browsers, not Unix systems.
  • The ABI is described at
    • syscall/js: Gives access to the WebAssembly host environment when using the js/wasm architecture. This gives references to JS objects, in analogy to how file descriptors are opaque handles to kernel objects in Unix.
    • Go to JavaScript interoperability uses NaN-space numbers to encode object ids in the same way that Unix uses numerical file descriptors to encode kernel objects.
  • “WebAssembly has a stack, but it’s not compatible with how goroutine stacks work. Go works around this by putting goroutine stacks in memory and passing around the stack pointer as a hot potato.”
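The NaN-space trick mentioned above can be sketched in a few lines of Python: a float64 NaN has spare payload bits, so a 32-bit object id can ride in the low half while the value still looks like a NaN to ordinary float code. This is a simplified illustration, not Go's exact encoding (which also carries type flags); `box_id` and `unbox_id` are hypothetical names.

```python
import math
import struct

# Sketch of NaN-boxing: hide a 32-bit object id in the payload of a float64 NaN.
# 0x7FF80000 is the high half of a quiet NaN; the layout is a simplification.
NAN_HEAD = 0x7FF80000


def box_id(object_id):
    """Pack a 32-bit id into a float64 whose bit pattern is a NaN."""
    bits = (NAN_HEAD << 32) | (object_id & 0xFFFFFFFF)
    (value,) = struct.unpack("<d", struct.pack("<Q", bits))
    return value


def unbox_id(value):
    """Recover the id from the low 32 bits of the float's bit pattern."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    return bits & 0xFFFFFFFF


handle = box_id(42)
print(math.isnan(handle), unbox_id(handle))
```

Like a file descriptor, the handle is opaque: the guest never sees the JS object, only a number the host knows how to map back.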

FAQ

Tips on writing code to support webassembly

  • Just wrap the core functionality into a ‘pure’ WASM module which doesn’t need to access ‘system APIs’, and then if needed write two thin wrappers, one for the ‘web personality’ and one for the ‘WASI personality’.
  • You can instantiate and load the WASM binary for every call, similar to a CLI call, but it would be an expensive operation. The best way is to run it in the background (observe the use of a channel to keep it running) while developers call it through a JavaScript interface.
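The "pure core plus two thin wrappers" layout from the first tip might look roughly like this. It is a Python analogy rather than real wasm glue: in practice the core would be the compiled module and the wrappers would be JS or WASI shims. Every name here (`count_words`, `web_personality`, `wasi_personality`) is hypothetical.

```python
# Sketch of the "pure core, thin wrappers" layout. In a real project the core
# would be the compiled wasm module; every name here is hypothetical.
def count_words(data: bytes) -> int:
    """Pure core: bytes in, number out, no files, network, or clock."""
    return len(data.split())


def web_personality(event: dict) -> dict:
    """Thin wrapper for a browser-like host: JSON-ish object in and out."""
    return {"words": count_words(event["text"].encode("utf-8"))}


def wasi_personality(argv: list) -> str:
    """Thin wrapper for a CLI-like host: arguments in, a stdout line out."""
    words = count_words(" ".join(argv[1:]).encode("utf-8"))
    return f"{words}\n"


print(web_personality({"text": "compile once run anywhere"}))
print(wasi_personality(["wc", "compile", "once"]), end="")
```

Only the wrappers know which host they run on, so the core can be built once and reused for both targets.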

Security

RE

  • See Reverse Engineering
  • wasm is harder to RE than Java bytecode but much easier than native code. I did some WASM reverse engineering prior to cloudflare and I wouldn’t put anything sensitive in a binary destined to be run client side.

Proposals

Threads support

  • Currently in the browser, among other problems, each worker has to instantiate all the host javascript objects and instantiate its own copy of the wasm module, which has its own unique copy of the imports and the function pointer table. It’s definitely way less elegant than real threads and potentially creates performance and stability issues.
  • Don’t know about the server side, but I’ve been using threads in the browser for ~2 months, and I haven’t hit any bug specific to it yet. I use it both with Rust (wasm-bindgen with async/await) and C (Emscripten with pthread support). HTTPS with some headers is required for `SharedArrayBuffer`. I still build a single-threaded binary for Firefox, and fall back to it if `SharedArrayBuffer` is `undefined` or if the `WebAssembly.Memory` constructor fails (some iOS devices might throw when `shared` is `true` due to a bug).
  • For this case it’s complicated because some runtimes support https://github.com/WebAssembly/threads, which mostly contains things like the spec for atomics but not the actual “threads” spec, and then some runtimes (e.g. wasmtime) also support https://github.com/WebAssembly/wasi-threads, which is one version of threads. But a new proposal came into play, https://github.com/abrown/thread-spawn, so … it’s complicated.

Garbage Collection

With WASM GC, a language won’t need to ship its own GC inside the module.

Usecases

See Pay attention to WebAssembly | Harshal Sheth

  • Sandboxing
    • “I use WASM instances as isolated containers for processing data in my pet project. Once you compiled the WASM module and kept it in memory, spawning instances is incredibly fast!”
    • Firefox used a library called RLBox to convert common libraries into wasm code, then convert them back into heavily sandboxed C code. This lets Mozilla ship potentially dangerous versions of libraries like hunspell or ogg without a flaw in them carrying over to the Firefox codebase, since they’ve been converted into a more secure form.
  • Plugin system
    • One use case to run JS inside a Wasm VM is Shopify Functions. Shopify allows their customers to customize things like checkout flow by writing code compiled to Wasm which gets executed during the checkout process. They want their customers to be able to write JS as well as other languages. (See Bringing Javascript to WebAssembly for Shopify Functions (2023))
  • Interop
    • You are running JS in the browser but there’s a library that does something in Go but that library doesn’t have a JS port.
    • “The main difference that makes me excited is not having to change languages. I was able to take a developer CLI tool written in Rust, split it into a library and CLI tool, and then compile the library into wasm and make a web form which served the same purpose as the CLI tool so that SREs didn’t need to download, build, and run the CLI tool or need to know how to do any of that.”
    • Figma makes use of a low-level C++ library called Skia for some graphics algorithms rather than building their own or porting them to JavaScript.
    • My favorite chess server, lichess.org, runs the world-class Stockfish chess engine in users’ browsers, saving them the computational burden of running it server-side.
    • Google Earth and Adobe Photoshop ported their C++ codebases to the web using Wasm.
  • Virtualization
    • CheerpX, a WebAssembly powered Virtual Machine whose goal is to safely and efficiently run unmodified X86 binary code in the browser.