tags : Web Development, Web Performance, Systems, Javascript, Golang, Computer Architecture
Note: This is just some initial exploration. Idk shit, all of this is 90% wrong, there’s a 100% chance of that.
What?
It was initially meant for browsers but ppl build own stand-alone runtimes to run it server-side
What it is?
- A specification
- A Universal compute platform: A computer that takes ur code and executes it
Spec
- Spec about WebAssembly is a specification that defines a bunch of semantics about how a computer that doesn’t exist should work.
- It defines the virtual machine
- It describes a machine, not an implementation
- How the stack works
- The instructions the machine can run
- The format that compilers should target (how code is stored into and loaded from dot WASM files)
- Other details like that.
- Interesting things about spec
- No native string type, like C
- WebAssembly machine definition supports C’s abstract machine — C, C++, Golang, and Rust can compile to this target — acting as a virtual instruction set architecture.
- Inherent isolation
- stack is external to WebAssembly linear memory
- No stack pointer
- Functions can manipulate the stack by pushing to it and popping from it, but they can’t actually move it around.
- What it does not specify
- It doesn’t specify an API that programs written to target WebAssembly can use to talk to the outside world.
Components of WASM
ISA < VM < Platform < Runtime
(TODO: This shit is wrong, need to recheck)
ISA/CPU arch
The arch. Eg. x86_64, arm64
-
wasm32
wasm32
is what we mean when we say compile down to webassemblywasm32
code currently runs on avm
on top of existing ISAs.
-
Webassembly vm
- Essentially defines a virtual 32-bit CPU
- The
vm
includes a “stack”, which records operand values and control constructs, and an abstract store containing global state. - Store results in
linear memory
, return results fromstack
- The
vm
becomes powerful when you use it to run external functions that get imported- External functions
- Making an HTTP request with the JavaScript fetch() function
- Read from and write to local storage
- External functions
-
Webassembly Code
- See Raw WebAssembly — surma.dev
- This is what consists a webassembly module (
.wasm
file) that runtimes execute. - It’s a binary format and also has a text representation. (See Custom Protocols)
-
Compiling
emscripten
: Initially used to compile toasm.js
, now supportswebassembly
- zig : Supports
freestanding
,wasi
- Rust, go and others has compilation flags
- Other supports/custom ways etc.
-
Writing by hand
- WebAssembly binary format has a text representation — the two have a 1:1 correspondence.
- You can write or generate this format by hand and then convert it into the binary format.
- AssemblyScript TS based programming language that compiles to webassembly. Recently removed support for wasi
- wasmati
Platform/ABI
These expose an ABI to the from host
to the webassembly vm
via the runtime
- When we pick a ABI, we also need to indirectly decide “which kind of
host
is this targeted towards?” - Platforms specify how to talk to the runtime to provide higher-level things that normally something like an OS would provide, like reading a file.
- These platforms then run on top of some runtime to execute WebAssembly.
- What this means?
- This dictates whether your WebAssembly code will (not) work with another bit as it all needs to target the same platform.
- Dictates what your WebAssembly code can actually do, like have network access.
-
WASI
wasi
is another platform.wasi
specifically is barebones, it’s like a headerfile, it needs aruntime
.- One of the goals of WASI however is to define a strict sandbox such that it becomes simple for a host environment to decide exactly what resources to share with a WebAssembly module.
- WASI also doesn’t support system calls like “open network socket” yet
- Some ppl say jvm = wasi+wasm. I am not so sure of this argument, have to see for myself.
- stealthrocket/wasi-go: A Go implementation of wasi
- WASIX, the Superset of WASI Supporting Threads, Processes and Sockets | Lobsters
-
Emscripten
- Web Browser is one runtime, usually using
emscripten
platform.emscripten
defines its own ABI. - This is both the platform and the tool :)
- Web Browser is one runtime, usually using
-
Custom
- Eg. Golang has a custom ABI, Go’s WebAssembly support also was made before WASI even came out
Host / Runtime
- “This plays out precisely as the blog post details with the split between WASI and Web Platform. Say you want to compile a Rust + C codebase to Wasm and run in the browser. You have three targets: wasm32-unknown-emscripten, wasm32-unknown-unknown, and wasm32-wasi. Emscripten is relatively old and not maintained. It’ll work but you get some old JS that doesn’t play well with newer stuff. wasm32-unknown-unknown has an ABI incompatibility which means you cannot interoperate between C compiled to Wasm and Rust compiled to Wasm. wasm32-wasi works, but now you have to have a WASI implementation in the browser and that’s still very immature. Tools like wasm-bindgen or wasm-pack don’t work with wasm32-wasi either.”
- “For all the work that’s gone into making Wasm portable across CPU architectures, we’ve ended up with modules that are not even portable across platform vendors. After all, wasn’t that the selling point of Docker?”, we hope that the component-model will solve this eventually.
- See Performance of WebAssembly runtimes in 2023 (TODO: Comeback 2 this)
- Technically, “Virtual Machine Runtime”
- Runtimes execute WebAssembly Modules, which are most often binaries with a
.wasm
extension. - They facilitates all of the necessary interactions between the
VM
: Webassembly VMPlatform/ABI
: Environment in which the VM exists(platform).
- The webassembly runtimes in browsers (Eg.
V8
) - These connect to the underlying
host environment
to falicitate what they falicitate - There are 2 primary category here. Since the popular ABI’s are web-browser and wasi, the runtimes are mostly devided by these two.
-
The browser side
- This is usually
V8
(See Javascript Runtime and Browser) - For the
host environment
that is the web-browseR - These are for both browser & non-browser but JS runtimes. Usually overseen by WinterCG.
- In this case, you use it via a js runtime/shim loading it, you can also use it alongside javascript. Compilers will also allow you to generate wasm standalone, in which case you’ll be rolling the ball towards the WASI side from the web side of things.
- This is usually
-
The wasi side
- webassembly code can also run outside browsers as-well either via
wasi
ordirectly
or something likeassemblyscript
. - Eg. wasmer, wasmtime, WAVM, wazero, nodejs etc. see list of webassembly runtimes
- NOTE: From the news, there’s some bad rep about wasmer’s ceo
- Vendors can extend wasi in their runtimes, and try to provide what’s missing in WASI such as HTTP requests etc.
- webassembly code can also run outside browsers as-well either via
Host environment / Runtime targets
- Web Browsers (v8)
- Unix systems (wasi)
- Edge
- These can either be running a wasi or v8 runtime
- Does cf worker fit here?
- cf workers run on v8
- cf might be either mocking existing browser APIs, or create their own interfaces that are compatible with the JavaScript shim generator.
- not sure if cf supports standalone wasm
Toolchain
-
Compilers
- Some languages are harder to compile because they have GC and wasm itself does not etc.
- Compilers compile the code of certain language to support certain
platforms
. So in some cases, you’d need to write code keeping theplatform
in mind. - Example. Go compiles down to the
Go ABI platform
(which required a js execution environment) but with TinyGo you can compile things down toWASI
. But recently go added native WASI support. - https://github.com/leaningtech/cheerp-meta
-
Emscripten (Tool)
- It leverages LLVM (Low Level Virtual Machine) as a backend for the compilation process and generates LLVM IR and then uses Binaryen for the final WebAssembly (Wasm) code generation.
- Why do we need Binaryen-IR? · Issue #1520 · WebAssembly/binaryen · GitHub
-
wasm-bindgen
- For the web platform
- wasm-bindgen or wasm-pack don’t work with wasm32-wasi either
History
- In 2010, work kicked off on Emscripten, a C/C++ to JavaScript compiler.
- In 2011, Fabrice Bellard released JSLinux: a Linux operating system and virtual machine compiled to JavaScript using a patched version of his QEMU software.
- Finally, in 2013, Alon Zakai released asm.js
- Emscripten then started compiling to
asm.js
- In 2015, all interested parties concluded that
asm.js
pointed in the right direction, that a language like asm.js should be encoded as distributable bytecode. Google got on board with the effort, dropping the NaCl/Pepper/PNaCl project, and WebAssembly was born.
Languages & WebAssembly
See Cross Compilation | appcypher/awesome-wasm-langs
Javascript
- See Javascript
- To run JavaScript code, the runtime is compiled to WebAssembly, with your code running within the WebAssembly-hosted interpreter. This approach, which might sound inefficient, is surprisingly practical and increasingly popular.
- You sacrifice speed, but gain isolation.
Python
- See Python
- Python code itself does not compile down to WebAssembly. You compile a Python interpreter like CPython to WebAssembly and have that run your Python code. So when I talk about compiling in this blog post, I’m referring to compiling CPython to WebAssembly, not your personal Python code.
- As mentioned in PEP11, webassembly is in tier 3 support with
wasm32-unknown-emscripten
andwasm32-unknown-wasi
Golang
- See The Go WebAssembly ABI at a Low Level - Xe Iaso
- Go defines it own ABI.
- Go’s WebAssembly support also was made before WASI even came out
- If you want to adapt a Go program to use WebAssembly or make a new program with WebAssembly in mind
-
- Compile go to WebAssembly
- Take a reasonable
subset of Go programs
and run them in browsers alongside JavaScript. - Go doesn’t support WASI at all. TinyGo does though. Go’s WebAssembly port mostly targets browsers, not Unix systems.
- The ABI is described at
/syscalls/js
: Gives access to the WebAssembly host environment when using the js/wasm architecture. This gives references to JS object in analogy to how File Descriptors are opaque handles to kernel objects in Unix.- Go to JavaScript interoperability uses NaN-space numbers to encode object ids in the same way that Unix uses numerical file descriptors to encode kernel objects.
- “WebAssembly has a stack, but it’s not compatible with how goroutine stacks work. Go works around this by putting goroutine stacks in memory and passing around the stack pointer as a hot potato.”
FAQ
Tips on writing code to support webassembly
- Just wrap the core functionality into a ‘pure’ WASM module which doesn’t need to access ‘system APIs’, and then if needed write two thin wrappers, one for the ‘web personality’ and one for the ‘WASI personality’.
- You can instantiate and load the WASM binary for every call, similar to a CLI call, but it would be an expensive operation. The best way is to run it in the background (observe the use of a channel to keep it running) while developers call it through a JavaScript interface.
Security
RE
- See Reverse Engineering
- wasm is harder to RE than Java Bytecode but much easier than native code. I did some WASM reverse engineering prior to cloudflare and I wouldn’t put anything sensitive in a binary destined to be ran client side.
Proposals
Threads support
- Currently in the browser, among other problems, each worker has to instantiate all the host javascript objects and instantiate its own copy of the wasm module, which has its own unique copy of the imports and the function pointer table. It’s definitely way less elegant than real threads and potentially creates performance and stability issues.
- Don’t know about the server side, but I’ve been using threads on the browser for ~2 months, I didn’t hit any bug specific to it yet. I use it both with Rust (wasmbindgen with async/await) and C (Emscripten with pthread support). HTTPS with some headers is required for `SharedArrayBuffer`. I still build a single-threaded binary for Firefox, and fallback to it if `SharedArrayBuffer` is `undefined` or if the `WebAssembly.Memory` constructor fails (some iOS devices might throw when `shared` is `true` due to a bug).
- For this case it’s complicated because some runtime supports https://github.com/WebAssembly/threads which mostly contains things like the spec for atomic but not the actual “threads” specs and then some runtimes (i.e wasmtime) also supports https://github.com/WebAssembly/wasi-threads which is one version of the threads. But a new proposal came into play https://github.com/abrown/thread-spawn so … it’s complicated.
Garbage Collection
After WASM GC, the language won’t need to ship it’s own GC
Usecases
See Pay attention to WebAssembly | Harshal Sheth
- Sandboxing
- “I use WASM instances as isolated containers for processing data in my pet project. Once you compiled the WASM module and kept it in memory, spawning instances is incredibly fast!”
- Firefox used a library called RLBOX to convert common libraries into wasm code, then reconvert them into heavily sandboxed c code. It allows mozilla to ship potentially dangerous versions of libraries like hunspell or ogg without having a flaw in them carrying over to the firefox codebase, since they’ve been converted into a more secure form.
- Plugin system
- One use case to run JS inside a Wasm VM is Shopify Functions. Shopify allows their customers to customize things like checkout flow by writing code compiled to Wasm which gets executed during the checkout process. They want their customers to be able to write JS as well as other languages. (See Bringing Javascript to WebAssembly for Shopify Functions (2023))
- Interop
- You are running JS in the browser but there’s a library that does something in Go but that library doesn’t have a JS port.
- “The main difference that makes me excited is not having to change languages. I was able to take a developer CLI tool written in Rust, split it into a library and CLI tool, and then compile the library into wasm and make a web form which served the same purpose as the CLI tool so that SREs didn’t need to download, build, and run the CLI tool or need to know how to do any of that.”
- Figma makes use of a low-level C++ library called Skia for some graphics algorithms rather than building their own or porting them to JavaScript.7
- My favorite chess server, lichess.org, runs the world-class Stockfish chess engine in users’ browsers, saving them the computational burden of running it server-side.
- Google Earth and Adobe Photoshop ported their C++ codebases to the web using Wasm.
- Virtualization
- CheerpX, a WebAssembly powered Virtual Machine whose goal is to safely and efficiently run unmodified X86 binary code in the browser.