Ported my C game to WASM, here's every bug that I hit
Posted by birdculture 4 days ago
Comments
Comment by hexer303 1 day ago
It was a 20-year-old codebase from my old game in win32 and DirectX 9.
I first ported it to native and also switched to bgfx for rendering. This was the bulk of the work - converting all of the old DirectX fixed function pipeline code to shaders. Luckily all modern shaders can simulate all of the old fixed-function DX pipeline features with little effort. Including the coordinate system. Loading DDS textures didn't present a major challenge either.
Had similar native asset loading as yours - no deserializer. It loaded an entire asset file into a preallocated memory block, used packed structures and converted file offsets to pointers after loading. I had to convert it to 64bit for native first.
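A minimal sketch of one way to keep that kind of asset block portable: leave the 32-bit file offsets as offsets and resolve them at the point of use, rather than patching pointers into the loaded data. The `AssetHeader` layout and the function names are hypothetical, not taken from either game, and the sketch assumes the file and the host are both little-endian.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical file header: fixed-width offsets only, never raw pointers,
   so the layout is the same on wasm32 and on 64-bit native builds. */
typedef struct {
    uint32_t name_offset;  /* byte offset of a NUL-terminated name */
    uint32_t data_offset;  /* byte offset of the payload */
    uint32_t data_size;    /* payload size in bytes */
} AssetHeader;

/* Copy the header out of the loaded block; returns 0 if the block is too small.
   Assumes the file and the host share the same (little-endian) byte order. */
int asset_read_header(const uint8_t *block, size_t block_size, AssetHeader *out)
{
    assert(block != NULL && out != NULL);
    if (block_size < sizeof *out)
        return 0;
    memcpy(out, block, sizeof *out);  /* memcpy avoids alignment assumptions */
    return 1;
}

/* Turn a file offset into a pointer at the point of use.
   Returns NULL if [offset, offset + size) falls outside the block. */
const uint8_t *asset_at(const uint8_t *block, size_t block_size,
                        uint32_t offset, uint32_t size)
{
    assert(block != NULL);
    if (offset > block_size || size > block_size - offset)
        return NULL;
    return block + offset;
}
```

Because nothing in the block is ever overwritten with a pointer, the same file can be loaded unchanged by 32-bit and 64-bit builds.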
The most surprising thing: I had no idea WASM is 32bit until I read your article! Once I ported to 64bit, I then ported to WASM and I didn't even encounter any arch related bugs. In hindsight I guess it's because most of the original code was 32bit and the asset file format is still 32bit format. When I ported to 64bit I used a deserializer, so I guess that's why it all worked out in the end.
For native audio I ended up using SoLoud library, but for emscripten I #ifdef'd it out to use inline JS instead. I figured there is no point in having all that extra audio library code compiling to WASM when modern browsers natively support playing audio, oggvorbis, etc. It worked out ok, but there's still a minor bug where the music doesn't loop perfectly. You can hear a split second gap between end/start. I haven't looked deeply into it yet.
Originally when we wrote the game we had banned ourselves from using C++ Exception handling and RTTI. The decision likely paid off as it makes the generated binary smaller and faster. Although I haven't had time to measure. Supposedly C++ exceptions introduce a much heavier overhead in Emscripten.
You can see the port in action at https://scorchedplanets.com
Comment by flohofwoe 1 day ago
WASM(32) is a hybrid 32/64 bit architecture. The address range (and thus pointer size) is 32 bits, but it has native 64-bit integers. E.g. it's similar to the Linux x32 ABI.
There is also a 'true' 64-bit wasm, but that's still too recent to be used in real-world code:
https://caniuse.com/wf-wasm-memory64
(but wasm64 doesn't really make sense unless you really need an address space greater than 32 bits, because the downside is slower performance)
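To make the pointer-size point concrete, here is a small sketch (the struct names are made up for illustration). Fixed-width integers keep their size on every target, while any struct that embeds a pointer changes size between wasm32 and a 64-bit native build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit integers are a native wasm32 value type; their size never changes. */
static_assert(sizeof(uint64_t) == 8, "uint64_t is 8 bytes on every target");

/* Hypothetical record that embeds a pointer: 8 bytes on wasm32,
   typically 16 bytes (with padding) on a 64-bit native build. */
typedef struct {
    uint32_t id;
    void *next;
} NodeWithPointer;

/* Same record using an index instead of a pointer: 8 bytes everywhere. */
typedef struct {
    uint32_t id;
    uint32_t next_index;
} NodeWithIndex;

static_assert(sizeof(NodeWithIndex) == 8, "index-based layout is portable");

size_t pointer_bytes(void)           { return sizeof(void *); }
size_t node_with_pointer_bytes(void) { return sizeof(NodeWithPointer); }
```

Built with Emscripten for wasm32, `pointer_bytes()` would be expected to return 4; on a typical 64-bit desktop build it returns 8.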
Comment by jltsiren 1 day ago
Or unless you need to use integer types that depend on pointer size (such as size_t or usize), but your integers are too large to fit in 32 bits. That's a pretty common occurrence in bioinformatics. I've been waiting for years for Wasm to become usable, but it looks like Apple is still holding it back.
Comment by senfiaj 1 day ago
Comment by diath 1 day ago
Comment by jstimpfle 1 day ago
Comment by leni536 1 day ago
Comment by jstimpfle 1 day ago
Comment by jcranmer 1 day ago
In practice, C doesn't do any padding shenanigans, but C++ does (but only for non-POD structs, and then you discover there's several slightly different definitions that mean basically "POD", so have fun predicting which one is the one that actually matters for your use case).
Comment by RossBencina 1 day ago
C++ "standard layout type" is the modern equivalent of "POD" I think.
Comment by leni536 16 hours ago
Comment by flohofwoe 1 day ago
Technically that's not true at least for booleans and enums, the C standard doesn't define specific sizes for those (bools are commonly 1 byte though, but for enums at least MSVC likes to disagree with Clang and GCC).
Using a direct struct memory layout for persistency and then expecting it to work across compilers, CPUs and ABIs is almost guaranteed to cause problems.
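As a rough illustration of the alternative, the sketch below writes each field explicitly with a fixed width and byte order instead of dumping the struct. `SaveRecord` and the helper names are hypothetical; the point is only that the on-disk size (5 bytes here) no longer depends on what the compiler does with padding, bools or enums.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory record; its sizeof and padding are up to the compiler. */
typedef struct {
    uint8_t  alive;   /* written as exactly one byte on disk */
    uint32_t score;
} SaveRecord;

enum { SAVE_RECORD_BYTES = 5 };  /* on-disk size: 1 + 4, no padding */

/* Store a 32-bit value as four little-endian bytes. */
void put_u32_le(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v);
    out[1] = (uint8_t)(v >> 8);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 24);
}

/* Read four little-endian bytes back into a 32-bit value. */
uint32_t get_u32_le(const uint8_t *in)
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

void save_record_encode(const SaveRecord *r, uint8_t out[SAVE_RECORD_BYTES])
{
    assert(r != NULL && out != NULL);
    out[0] = r->alive ? 1 : 0;
    put_u32_le(out + 1, r->score);
}

void save_record_decode(const uint8_t in[SAVE_RECORD_BYTES], SaveRecord *r)
{
    assert(in != NULL && r != NULL);
    r->alive = in[0] != 0;
    r->score = get_u32_le(in + 1);
}
```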
Comment by DanielHB 1 day ago
You really need a serializer for this sort of thing because it can also include forwards compatibility of your data structures.
Comment by jstimpfle 1 day ago
Comment by edflsafoiewq 1 day ago
Comment by arcadialeak 1 day ago
Comment by pjmlp 1 day ago
Comment by pajko 17 hours ago
Comment by genxy 1 day ago
Comment by yjftsjthsd-h 1 day ago
Comment by DonHopkins 1 day ago
UCSD Pascal:
https://archive.org/details/UCSD_Pascal_1.1_1
Wizardry:
Comment by DonHopkins 1 day ago
https://en.wikipedia.org/wiki/SWEET16
https://techwithdave.davevw.com/2024/05/running-sweet-16-ste...
Comment by casey2 1 day ago
Is ActiveX platform independent? No, it's exclusive to Windows. Is it sandboxed? Nope, digital signing and prayer. Does it implement a virtual machine? Nope. Compromises out the wazoo? efficiency, data orientation, or predictable performance? You betcha. ActiveX is closer to a DOM sandbox escape exploit than a real piece of engineering. Why do we need WASM when we've had GET since 1990?
Don't confuse the map for the territory, implementation details matter, just labeling something "Mars Colonial Transporter" doesn't mean it actually flew to mars.
Comment by pjmlp 1 day ago
All those "look Python on the browser!" were already done by ActiveState with Perl, Python and Tcl.
Comment by frollogaston 1 day ago
Comment by flohofwoe 1 day ago
That's currently only not possible because nobody wants to do the work to create something like wasi-gfx (https://wasi-gfx.dev/), but for native UI frameworks instead of 3D APIs.
The inconvenient truth is that even "native" cross-platform applications hardly ever go through the trouble to target the platform-native UI framework (and instead they go through non-native frameworks like Qt or a webview wrapper).
Comment by tracker1 1 day ago
Would be cool to get some standardization on at least a few APIs for default fonts, light/dark mode, background and accent colors, etc... so that apps are a little less alien in practice. I'm really not even the idea of Tauri or similar to use a native browser engine, but better skinning APIs so you can get something like Material, but tuned to better match the desktop you're on.
For that matter, a wasi component package would be nice as well. Harder for accessibility though.
Comment by frollogaston 1 day ago
Comment by gspr 1 day ago
I'm a bit disappointed though:
* There's still no way to do DOM manipulation. So then it's tempting to just grab a canvas and draw everything yourself, which of course wreaks havoc on things like accessibility. I'm no fan of the web, but at least it comes with a somewhat agreed-upon way to display graphical stuff – it's a bit of a shame if we're all gonna just treat it like a surface for pixels.
* WASI still leaves something to be desired. Why can't I have raw sockets and file access and stuff, in a POSIX-like way? I understand that sandboxing is important, so this can all be on a per-request-basis, but still. This "just another platform" is still too far from just that.
* The amount of JS glue needed to actually load WASM stuff in the browser is annoying. The idea of needing a bunch of magic "bundlers" is sad.
Comment by flohofwoe 1 day ago
In the end the web is just another platform, but a platform that is quite a bit different from the UNIX/Windows duopoly we're used to.
Comment by samiv 1 day ago
Of course architecturally (also regarding your file access) it's better to use the wasm for logic as much as possible where the web (HTML/JS) provides the UI and IO, data flows into wasm for work and results flow back to the web.
This also has the benefit that you can keep your original C/C++ source code much more platform agnostic which helps reusability and testing.
Comment by frollogaston 1 day ago
Comment by gspr 1 day ago
Well sure. But for me, the promise of WASM was to make the browser "just another platform". Now it's "this special platform where you have to access some of the most important functionality through FFI interop with a very high-level, very opinionated language".
> Of course architecturally (also regarding your file access) it's better to use the wasm for logic as much as possible where the web (HTML/JS) provides the UI and IO, data flows into wasm for work and results flow back to the web.
OK, but like, I wanted the browser to be "just another platform". I don't want to use JS, and I consider HTML orthogonal to my logic. I realize that's not where we're at, but that's what I dreamt of. Hence my disappointment. Which is OK, I don't matter :)
> This also has the benefit that you can keep your original C/C++ source code much more platform agnostic which helps reusability and testing.
It feels the opposite to me.
Comment by jayd16 1 day ago
Is it just a matter of WASM being too new to have full featured wrappers and APIs for your language of choice?
Comment by frollogaston 1 day ago
Web is "just another platform" with its own specifics, and the advantage is multiple OSes can run that platform pretty much the same way.
Comment by trumpdong 1 day ago
Comment by postalrat 1 day ago
Comment by tracker1 1 day ago
Something akin to raw sockets over a host interface (or WSS bridge) could be cool... similar for sandboxed FS access, which browsers are starting to improve upon.
Yes, fully WASI/WASM would be nicer than some of the JS glue... but it's still useful all the same.
Comment by muvlon 1 day ago
FWIW, that's exactly what they shipped first, with WASI preview 1 (wasip1). You can still use this today, and all runtimes with any level of WASI support will be able to run it.
Comment by phickey 1 day ago
Comment by muvlon 1 day ago
Notably, listen and connect are missing. But sockets themselves were in there.
Comment by trumpdong 1 day ago
Comment by gspr 1 day ago
At any rate: this doubly makes my point.
Comment by thewavelength 1 day ago
> Web is 32-bit. Your 64-bit structs will break. This was the root cause of most of my bugs. WASM is 32-bit address space, pointers are 4 bytes not 8.
Comment by whizzter 1 day ago
2: iirc WASM was initially designed to be shimmable via Asm.JS to force laggards(Apple, Google) to implement it, Asm.JS in turn relied on specific rules in JS to get reliable 32bit arithmetic (but impossible for 64bit).
Wasm64 is implemented and works in Chrome and Firefox. Apple is lagging again with Safari.
Comment by thewavelength 1 day ago
1: True, although it also limits the addressable memory and the typical 4GB limit seems less these days. I’m thinking of large apps like Figma running in the browser.
2: Will existing 32-bit WASM binaries break on WASM64 engines or does the binary have a flag for compatibility?
Comment by whizzter 1 day ago
2: Most runtimes are 64bit already. A runtime detecting a wasm32 binary will just continue to generate code with the current JIT compiler, whilst WASM64 will require another JIT (and perhaps another memory system, since WASM32 runtimes are often based on "hacks" where 4GB of address space is reserved but not given real memory, so that the JIT compiler gets an easier job without security implications).
Comment by dathinab 1 day ago
the thing is in WASM "memory" is more or less a resizable ArrayBuffer
and while each has an effective 4GiB limit, wasm does allow passing more than one such buffer to any specific wasm "execution/thread"(1); you can then reference them in load/store instructions to load/store from "memories" other than the default one
As general purpose languages tend to not model that this isn't that easy to take advantage of but it is still useful for all kind of "tricks", like (non exhaustive):
- working around 4GiB size limit
- persistent memory between otherwise clean restarts and/or software updates (like what you can get from systemds file descriptor store and other means)
- easier handling of pre-populated memory (think large perfect hashmaps, trie, or similar)
- memory isolation, WASM memory can be shared, but for security and fault tolerance reasons it is often preferable if different workers have their own memory array as well as an additional shared memory array.
- This also allows stuff like security proxies where A->B have a shared memory IPC mechanism and B->C have that too, but A->C cannot directly communicate at all. Not that relevant in the browser and more for server side WASM usage.
- and more
Anyway, IMHO the main point of WASM64 is more the convenience benefits than the 32-bit memory limitations. Porting is easier because most software is 64-bit today; it's what people are used to. There are a lot of ways overflows can happen with 32 bits that are practically impossible with 64 bits. E.g. overflowing 0u64 with +=1 at 6e9 ops/s takes decades, but for 0u32 it's <1s. Stuff like that means you need far more sanity and safety checks in 32-bit code, and it's easier to mess up edge cases.
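The overflow estimate can be checked with a few lines of arithmetic. This is just a sketch of the calculation (the function names are made up): a counter with n bits has 2^n states, so at a fixed increment rate it wraps after 2^n / rate seconds.

```c
#include <assert.h>

/* Seconds until a counter of the given bit width wraps when it is
   incremented ops_per_second times per second, starting from zero. */
double seconds_to_wrap(unsigned bits, double ops_per_second)
{
    assert(ops_per_second > 0.0);
    double states = 1.0;
    for (unsigned i = 0; i < bits; i++)
        states *= 2.0;              /* 2^bits, exact in a double for bits <= 64 */
    return states / ops_per_second;
}

/* Convert seconds to years, using 365.25 days per year. */
double seconds_to_years(double seconds)
{
    return seconds / (365.25 * 24.0 * 60.0 * 60.0);
}
```

At 6e9 increments per second, `seconds_to_wrap(32, 6e9)` comes out at roughly 0.72 seconds, while `seconds_to_years(seconds_to_wrap(64, 6e9))` is roughly 97 years.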
Comment by koolala 1 day ago
Comment by PhilipRoman 1 day ago
Comment by Findecanor 1 day ago
Comment by flohofwoe 1 day ago
https://spidermonkey.dev/blog/2025/01/15/is-memory64-actuall...
TL;DR: wasm64 requires explicit heap bounds checks, while in wasm32 the memory mapping hardware does it for free.
E.g. quote:
"The only reason to use Memory64 is if you actually need more than 4GB of memory.
Memory64 won’t make your code faster or more “modern”. 64-bit pointers in WebAssembly simply allow you to address more memory, at the cost of slower loads and stores."
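As a conceptual model only (this is not how any particular engine is implemented), the explicit check that wasm64 needs on each access looks something like the hypothetical `load_u8_checked` below. With wasm32, the linked post explains that the memory mapping hardware can do this job instead, so the comparison disappears from the generated code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Conceptual model of a wasm64 load: the index is 64 bits wide, so it has to
   be compared against the current memory size before memory is touched.
   Returns 1 on success and 0 where a real engine would trap. */
int load_u8_checked(const uint8_t *mem, uint64_t mem_size,
                    uint64_t index, uint8_t *out)
{
    assert(mem != NULL && out != NULL);
    if (index >= mem_size)
        return 0;               /* out of bounds: trap in a real engine */
    *out = mem[index];
    return 1;
}
```

The extra compare-and-branch on every load and store is where the slowdown described in the quote comes from.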
Comment by ape4 1 day ago
Comment by koolala 1 day ago
Comment by senfiaj 1 day ago
Comment by koolala 1 day ago
Comment by groundzeros2015 1 day ago
The real mistake is requiring pointer to be 64 bit when most programs don’t use it.
Comment by DonHopkins 1 day ago
Comment by groundzeros2015 1 day ago
For reference 4 GB is 8x more than a ps3.
Comment by frollogaston 1 day ago
Comment by frollogaston 1 day ago
Comment by unwind 2 days ago
Since this is one of the bugs, I always recommend writing
game->boardPieces = swAlloc(sizeof(ThingHandle*) * row * column);
Like this instead: game->boardPieces = swAlloc(sizeof *game->boardPieces * row * column);
It's not 100% better, but it cuts out a few tokens, which helps readability, and moves the significant asterisk further left where I think it's easier to spot.
Comment by jstimpfle 1 day ago
But ACSHUALLY, how you write allocation is like this
#define sane_alloc(type, count) ((type *) malloc(sizeof (type) * (count)))
game->boardPieces = sane_alloc(BoardPiece, row * column);
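A small sketch of how the spellings discussed in this subthread differ. The `BoardPiece` fields and the helper names are hypothetical (the real definition is not shown in the thread); the macro is the one quoted above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical element type; the real BoardPiece definition is not shown. */
typedef struct {
    int kind;
    int owner;
    int hp;
} BoardPiece;

/* The typed allocation macro from the comment above: the cast makes the result
   a `type *`, so assigning it to a pointer of another type gets diagnosed. */
#define sane_alloc(type, count) ((type *) malloc(sizeof (type) * (count)))

/* The original bug pattern: this is the size of a pointer, not of an element. */
size_t bytes_per_pointer(void) { return sizeof(BoardPiece *); }

/* sizeof *p is the size of the pointee; p is not evaluated, so NULL is fine. */
size_t bytes_per_element(const BoardPiece *p) { return sizeof *p; }

/* Allocate rows * cols pieces; the caller releases the result with free(). */
BoardPiece *board_alloc(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);
    return sane_alloc(BoardPiece, rows * cols);
}
```

On a 64-bit build the pointer-sized allocation can happen to be large enough for small element types, which hides the bug until the code runs on a 32-bit target such as wasm32.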
The kernel people seem to finally have figured out this one in 2026.
Comment by unwind 1 day ago
Comment by jstimpfle 1 day ago
Comment by DonHopkins 1 day ago
Array indexing in C is just pointer arithmetic wearing Groucho Marx Glasses.
C combines the flexibility and power of assembly language with the user-friendliness of assembly language.
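For anyone who has not seen the disguise come off: `a[i]` is defined as `*(a + i)`, and since the addition commutes, `i[a]` is the same expression. A tiny sketch (function names made up for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* The usual spelling. */
char index_normal(const char *s, size_t i)
{
    assert(s != NULL);
    return s[i];
}

/* What the compiler treats it as: pointer arithmetic plus a dereference. */
char index_pointer(const char *s, size_t i)
{
    assert(s != NULL);
    return *(s + i);
}

/* Legal but unusual: i[s] is *(i + s), which is the same element. */
char index_swapped(const char *s, size_t i)
{
    assert(s != NULL);
    return i[s];
}
```

All three return the same character for the same arguments; for example, index 3 of "Foo!" is '!' and index 4 is the terminating NUL.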
Comment by jstimpfle 1 day ago
I just had a look at your HN profile page and was struck by the irony of seeing your Forth vs Lisp vs Postscript code examples there. Now consider that I've never written code like 4["Foo!"], even though I know it's possible, but in other languages you constantly have to do mental gymnastics to get any real work done, and those are allegedly so much saner !???
Comment by DonHopkins 1 day ago
Comment by quietbritishjim 1 day ago
game->boardPieces = swAlloc(sizeof game->boardPieces * row * column);
Maybe I find this harder to parse because I'm not used to sizeof without brackets (though I know it's valid). But I think the bigger deal is that your version has a bug if the star is missing whereas theirs has a bug if the star is present; it's easier to spot something extra than it is to spot something missing.
Comment by ErroneousBosh 1 day ago
I like the word "everybug" :-D
Comment by Joker_vD 1 day ago
Yes, I know that C technically allows rather heterogenous representations for pointers to different types, but in practice there is difference only between object pointers and function pointers.
Comment by Someone 1 day ago
I’m surprised that that works in WASM. Wouldn’t a tiny change in your memory usage (say if you toggle your “log startup progress” flag) load data at a different address?
Comment by mwkaufma 1 day ago
Comment by Panzerschrek 1 day ago
Comment by xydone 1 day ago
Comment by sestep 1 day ago
Comment by xydone 1 day ago
Comment by senfiaj 1 day ago
Comment by flohofwoe 1 day ago
This would be similar to how NaCl/PNaCl communicated with the JS side (via message passing), and that really sucked and would also be prohibitively slow for talking to 'high frequency APIs' like WebGL2 or WebGPU (or the DOM heh).
Comment by trumpdong 1 day ago
Comment by Narishma 1 day ago
I don't think that ever had much, if any, adoption and it looks like it will be removed in the next few releases.
Comment by flohofwoe 1 day ago
https://spidermonkey.dev/blog/2025/01/15/is-memory64-actuall...
TL;DR: wasm64 has slower memory load/store operation because it requires 'software bounds checking', so unless you absolutely need more than 4 GB RAM, wasm32 is the better choice.
Comment by hiccuphippo 1 day ago
Comment by fyrn_ 1 day ago
Comment by flohofwoe 1 day ago
https://marketplace.visualstudio.com/items?itemName=ms-vscod...
This allows setting up an IDE-like 'press F5 to build and start into a debug session' workflow in VSCode, with the debuggee running in Chrome.
E.g. see:
Comment by nhinck3 1 day ago
Comment by deadbabe 1 day ago
Comment by pioh 2 days ago
Comment by rvz 1 day ago
[0] https://soft.vub.ac.be/Publications/2022/vub-tr-soft-22-02.p...
Comment by koolala 1 day ago
Comment by rvz 1 day ago
Comment by koolala 1 day ago
Comment by jedisct1 1 day ago
Comment by pjmlp 1 day ago
Comment by yjftsjthsd-h 1 day ago
Comment by pjmlp 1 day ago
The bounds checking story is only on the external limits of linear memory segments.
If memory gets corrupted inside a linear memory segment, it can equally well be exploited to change execution behaviour, which for many scenarios is already good enough for the attacker.
Yet these kind of attack vectors usually are dropped from blog posts selling WebAssembly as a revolutionary bytecode.
It is only yet another one since various others that came and went since UNCOL became an idea.
Comment by koolala 1 day ago
Comment by pjmlp 1 day ago
Comment by koolala 1 day ago
Comment by pjmlp 1 day ago
Burroughs (1961),
https://en.wikipedia.org/wiki/Burroughs_Large_Systems
"In fact, all unsafe constructs are rejected by the NEWP compiler unless a block is specifically marked to allow those instructions. Such marking of blocks provide a multi-level protection mechanism."
"NEWP programs that contain unsafe constructs are initially non-executable. The security administrator of a system is able to "bless" such programs and make them executable, but normal users are not able to do this. (Even "privileged users", who normally have essentially root privilege, may be unable to do this depending on the configuration chosen by the site.) While NEWP can be used to write general programs and has a number of features designed for large software projects, it does not support everything ALGOL does."
CLR (2001)
https://learn.microsoft.com/en-us/dotnet/framework/tools/pev...
"Normally, code that is not verifiably type safe cannot run, although you can set security policy to allow the execution of trusted but unverifiable code."
IBM i (nee AS/400)
https://medium.com/@dhemanthc/ibm-i-architecture-how-timi-an...
"SLIC enforces IBM i’s unique object-based model. Rather than managing raw memory locations or file descriptors, all resources (programs, files, queues, data areas, libraries) are managed as named objects with properties, ownership, and permissions. This object model permeates everything in IBM i, from file systems to program calls."
Aka capabilities, and what CHERI project is pushing for as means to fix C and C++ code at hardware level.
Comment by koolala 1 day ago
Comment by pjmlp 18 hours ago
Now selling yet another bytecode format as some security wonder.
It gets the pass on the browser, as it replaced the existing plugins model, that's it.
Comment by DonHopkins 1 day ago
I agree with the article's main lessons: wasm32 pointer size, don't serialize structs with pointers, debug native 32-bit when you can, WebGL/WebGPU is stricter than desktop GL, Emscripten export flags still bite. I hit some of the same categories; the parts that were actually tricky for Micropolis are below.
Svelte 5 runes ($state, $derived, etc.) work in plain .ts modules, not just .svelte templates. That matters because the WASM bridge is a reactive module the HUD, command bus, and Vitest all import -- not a component-only trick. The file has to be MicropolisReactive.svelte.ts so runes compile under the same Vite/SvelteKit pipeline as the app; plain .ts breaks in Node with "$state is not defined".
Embind API surface -- what to expose and what to leave out:
https://github.com/SimHacker/MicropolisCore/blob/main/packag...
// This file uses emscripten's embind to bind C++ classes,
// C structures, functions, enums, and contents into JavaScript,
// so you can even subclass C++ classes in JavaScript,
// for implementing plugins and user interfaces.
//
// Wrapping the entire Micropolis class from the Micropolis (open-source
// version of SimCity) code into Emscripten for JavaScript access is a
// large and complex task, mainly due to the size and complexity of the
// class. The class encompasses almost every aspect of the simulation,
// including map generation, simulation logic, user interface
// interactions, and more.
The comments in that file go on to describe the strategy for wrapping: Core Simulation Logic, Memory and Performance Considerations, Direct Memory Access, User Interface and Rendering, Callbacks and Interactivity, and Optimizations.
The engine callback virtual interface bridged C++ to JS via JSCallback:
https://github.com/SimHacker/MicropolisCore/blob/main/packag...
In the old NeWS/Hyperlook, TCL/Tk/X11, SWIG/Python/PyGTK, and SWIG/Python/TurboGears/AMF/Flash versions, this callback interface used to be a stringly typed general purpose event callback interface, which I tightened up into a strict C++ interface and corresponding typescript interface, so embind could help me integrate it safely and cleanly with TypeScript and Svelte Runes.
TypeScript handlers that update rune-backed state (sendMessage, didTool, budget hooks, etc.):
https://github.com/SimHacker/MicropolisCore/blob/main/apps/m...
Simulator attach/detach, singleton engine load, wiring JSCallback into Micropolis:
https://github.com/SimHacker/MicropolisCore/blob/main/apps/m...
The pattern: C++ fires callbacks with enough context for the UI; TS updates $state; components read micropolisReactive (peek / poke / memory / getSnapshot) instead of calling Embind or touching HEAP* directly. That is where the rubber hits the road for interactivity.
Heap access is its own footgun. Emscripten may expose Module.wasmMemory, HEAPU16, or neither until init; some getters throw if you read too early. Centralized helper:
https://github.com/SimHacker/MicropolisCore/blob/main/apps/m...
Bridge design, Vitest against real WASM, teardown order with Embind lifetimes:
https://github.com/SimHacker/MicropolisCore/blob/main/docume...
Map rendering: WebGPU tile renderer with canvas fallback (legacy WebGL frozen, now reimplementing in WebGPU). The renderer reads 16 bit flags + tile indices from direct simulator memory views into WASM linear memory (mapData / mopData), not per-frame Embind copies.
https://github.com/SimHacker/MicropolisCore/blob/main/packag...
https://github.com/SimHacker/MicropolisCore/blob/main/docume...
City saves are a defined binary format (.cty), not fwrite of engine structs. Live map data is views into WASM linear memory (mapData / mopData), not embedded native pointers -- same idea as the article's side-table fix, but that is how this codebase is already structured.
Why I find this stack interesting: original SimCity engine lineage, narrow Embind surface on purpose, reactive TS facade so automation and UI share one sim without reviving the old Python/SWIG/pyGTK path. Sprites (trains, choppers, generic orange monsters wreaking chaos and havoc -- definitely not Godzilla [TM], but possibly Trump adjacent) simulate in C++; compositing them in the WebGPU path is still work in progress.
The WebGPU renderer is being built as a general stack with pluggable layers, including Sims content rendering (characters, animations, terrain, objects, walls, floors, ui effects, etc).
Character animation demo:
VitaMoo code:
https://github.com/SimHacker/MicropolisCore/tree/main/packag...
Unified WebGPU Renderer:
https://github.com/SimHacker/MicropolisCore/blob/main/docume...
Render Core Package:
https://github.com/SimHacker/MicropolisCore/blob/main/docume...
Renderer Plugin Roadmap:
https://github.com/SimHacker/MicropolisCore/blob/main/docume...
Live Micropolis tile renderer and simulator demo (no other ui yet, work in progress):
Demo of the simulator, cellular automata, and tile engine to Jerry Martin's music:
https://www.youtube.com/watch?v=319i7slXcbI
Repo:
Comment by haeseong 1 day ago
Comment by senfiaj 1 day ago