Using Rust to Build a $1 Handheld Gaming Console


Demo video of the console in action

The CH32V003 is an interesting little chip - not just because it’s based on the up-and-coming RISC-V architecture, or because it comes in a tiny package, but mostly because you can buy it for $0.09 in bulk. Hard to believe this is a 32-bit microcontroller... albeit one with only 2KiB of RAM and 16KiB of flash.

Those memory specs put the CH32V003 in the same ballpark as the classic Atmel ATmega328P - the heart of the original Arduino - but with significantly more grunt: a 48MHz clock (three times the ATmega's 16MHz) and a full 32-bit core.

I was inspired by this project, which built a full games console on the CH32V003 using a tiny SSD1306 OLED display (the same one you’ll find on countless cheap breakout boards).

I’d just gotten into Embedded Rust on the ESP32, using libraries like Embassy and esp-hal. When I found out there was an active effort (the ch32-hal project) to bring Rust support to the CH32V003, I had to try it. There was something deeply appealing about using a modern, safe language like Rust on such a brutally constrained device. Not to mention the sheer utility of having a chip cheaper than a 555 timer for quick, dirty, and fun hacks.

I set out to build a simple side-scrolling adventure game that would look great on a 128x64 pixel display. My first move? Build a platform-agnostic “game engine” that I could run as a desktop app using minifb. The idea: develop all the game logic on my Mac, then port over the rendering layer to the chip once everything else was solid.

It almost worked.

Getting the SSD1306 display driver working in Rust was surprisingly easy. But I had to skip embedded_graphics - there just wasn’t enough RAM for a framebuffer. Instead, I used the ssd1306 crate directly, sending raw draw commands to the display. The good news? The SSD1306 is persistent - it keeps its pixels between updates. So even without a RAM framebuffer, I effectively had one... built into the display itself. Nice.

I needed to understand how the display organizes its 1-bit-per-pixel buffer. Each byte holds 8 vertically stacked pixels - one 8-pixel-tall column - and the next byte is the column to its right. Covering the full 128-pixel width therefore takes exactly 128 bytes per strip.
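This layout makes per-pixel addressing cheap. Here's a minimal sketch of how a pixel lands in one 128-byte strip - the helper name is mine, not from the project, and it assumes the usual SSD1306 convention where bit 0 of each byte is the top pixel of the page:

```rust
// Sketch: addressing pixels within one 128x8 "page" of the SSD1306.
// Each byte covers an 8-pixel-tall column; bit 0 is the top pixel.
// (Helper name is illustrative, not from the original project.)

/// Set pixel (x, y) inside a single 128-byte page buffer.
/// `y` is the row within this page (0..8).
fn set_pixel_in_page(buf: &mut [u8; 128], x: usize, y: usize, on: bool) {
    let mask = 1u8 << y; // the bit position selects the vertical pixel
    if on {
        buf[x] |= mask;
    } else {
        buf[x] &= !mask;
    }
}

fn main() {
    let mut buf = [0u8; 128];
    set_pixel_in_page(&mut buf, 10, 3, true);
    assert_eq!(buf[10], 0b0000_1000); // bit 3 of column 10 is set
}
```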

I could spare 128 bytes out of my precious 2048-byte RAM budget - and still have room for game logic. Perfect.

let mut buf = [0u8; 128];   // Only 128 bytes of RAM for the entire display!

// Render the screen as eight 128x8 strips, reusing the same buffer.
for row in 0..8 {
    render_row(&mut scene, &mut buf, row);
    display.draw(&buf).unwrap();
}

Now, for each of those 8 strips (each 128x8 pixels), I needed to render only the objects that intersected that slice. Iterating over everything every frame would be too slow. So I implemented a bounding-box culling system: for each strip, check which objects’ hitboxes overlap it, then only render those.
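The overlap test itself is just interval arithmetic on the vertical axis. A minimal sketch under my own assumed types (the real project's object representation will differ):

```rust
// Sketch of per-strip bounding-box culling (names are illustrative).
// Each strip is 8 pixels tall; an object is rendered only if its
// vertical extent overlaps the strip's extent.

struct Object {
    y: i32,      // top of the object's hitbox in screen pixels
    height: i32, // hitbox height in pixels
}

/// True if the object's vertical span intersects strip `row` (0..8).
fn intersects_strip(obj: &Object, row: i32) -> bool {
    let strip_top = row * 8;
    let strip_bottom = strip_top + 8;
    obj.y < strip_bottom && obj.y + obj.height > strip_top
}

fn main() {
    let obj = Object { y: 12, height: 8 }; // spans pixels 12..20
    // Overlaps strips 1 (8..16) and 2 (16..24), but not 0 or 3.
    assert!(!intersects_strip(&obj, 0));
    assert!(intersects_strip(&obj, 1));
    assert!(intersects_strip(&obj, 2));
    assert!(!intersects_strip(&obj, 3));
}
```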

Each object held a reference to an 8x8 tile texture stored in ROM, which I’d blit into the buffer at the object’s position.
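When a tile is stored as 8 column bytes in the same layout as the display, blitting reduces to ORing bytes into the strip buffer. A simplified sketch (it assumes the tile is vertically aligned with the strip; the real code would also have to shift tiles that straddle two strips):

```rust
// Sketch: blit an 8x8 tile (stored as 8 column bytes, same layout as
// the display) into a 128-byte strip buffer at horizontal position x.
// Assumes the tile is vertically aligned with the strip.

fn blit_tile(buf: &mut [u8; 128], tile: &[u8; 8], x: usize) {
    for (col, &byte) in tile.iter().enumerate() {
        if x + col < 128 {
            buf[x + col] |= byte; // OR so overlapping sprites combine
        }
    }
}

fn main() {
    let mut buf = [0u8; 128];
    let tile = [0xFFu8; 8]; // a solid 8x8 block
    blit_tile(&mut buf, &tile, 120); // right at the screen edge
    assert_eq!(buf[120], 0xFF);
    assert_eq!(buf[127], 0xFF);
}
```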

It worked - but the first prototype on real hardware crawled along at 1 FPS. Was I asking too much?

Turns out, it wasn’t the rendering that was slow. It was the physics.

I’d assumed physics would be the easy part. Just a few collision checks per frame, right? A 48MHz chip should eat that for breakfast.

But I’d been using floating-point maths. Big mistake.

The CH32V003 runs the minimal RV32EC instruction set - no hardware floating-point unit. Every f32 operation gets emulated in software. That means up to a 100x slowdown.

How do you fix this? By ditching floats and using fixed-point maths.

You take your 32-bit integers and mentally split them: say, 22 bits for the integer part and 10 bits for the fractional part. Now you can represent values down to 1/1024th of a unit - plenty of precision for velocity, position, and acceleration.
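To make the trick concrete, here's what that 22.10 split looks like with plain i32 arithmetic - an illustrative sketch, not the project's actual code:

```rust
// Sketch: 22.10 fixed-point with plain i32 (illustrative only).
// The low 10 bits are the fraction, so 1.0 is 1 << 10 = 1024.

const FRAC_BITS: i32 = 10;
const ONE: i32 = 1 << FRAC_BITS; // 1.0 in fixed-point

fn from_int(v: i32) -> i32 { v << FRAC_BITS }
fn to_int(v: i32) -> i32 { v >> FRAC_BITS } // truncates toward -inf

/// Multiply two fixed-point values; widen to i64 so the intermediate
/// product can't overflow, then shift the extra fraction bits away.
fn mul(a: i32, b: i32) -> i32 {
    ((a as i64 * b as i64) >> FRAC_BITS) as i32
}

fn main() {
    let half = ONE / 2;      // 0.5
    let three = from_int(3); // 3.0
    // 3.0 * 0.5 = 1.5, all in integer instructions.
    assert_eq!(mul(three, half), from_int(1) + half);
    assert_eq!(to_int(mul(three, half)), 1);
}
```

Addition and subtraction need no shifting at all - only multiplication and division have to compensate for the doubled (or cancelled) fraction bits.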

And guess what? The fixed crate makes it stupidly easy. With operator overloading, you treat these fixed-point numbers like regular numbers. No magic, no weird syntax - just let a = b + c;.

Suddenly, I was hitting 30+ FPS.

Still needed a map. The player had to jump on platforms and hills, so collision and rendering had to be fast and efficient. Since I was already using 8x8 tiles for characters and objects, it made sense to use them for the terrain too.

So I encoded the map as an array of u64 values. Each 4-bit nibble represents a tile index, so one u64 holds a column of 16 tiles - the entire map is stored sideways, as an array literal:

pub static MAP: [u64; 218] = [
    0x0000000100000001,
    0x0000000100000001,
    0x0000000100000001,
    0x0000000000000008,
    // ...
];

Physics checks hit this map constantly - but because it’s stored in ROM and accessed via simple bit shifts and masks, it’s lightning fast.
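A lookup comes down to one shift and one mask. A minimal sketch against a shortened copy of the map above (the nibble ordering - whether the low nibble is the top or bottom tile - is my assumption, not taken from the project):

```rust
// Sketch: looking up a tile in the nibble-packed map. Each u64 column
// holds 16 tiles of 4 bits each. (Taking nibble 0 as the lowest 4
// bits is an assumption; the real project may order it the other way.)

static MAP: [u64; 4] = [
    0x0000000100000001,
    0x0000000100000001,
    0x0000000100000001,
    0x0000000000000008,
];

/// Tile index at map column `x`, tile row `y` (0..16).
fn tile_at(x: usize, y: usize) -> u8 {
    ((MAP[x] >> (y * 4)) & 0xF) as u8
}

fn main() {
    assert_eq!(tile_at(0, 0), 1); // nibble 0 of the first column
    assert_eq!(tile_at(0, 8), 1); // the other '1' in that column
    assert_eq!(tile_at(0, 1), 0); // empty tile
    assert_eq!(tile_at(3, 0), 8);
}
```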

The display is just a sliding window over this map. Using the player’s position, I calculate which map tiles are visible, then loop through them and blit the corresponding tile textures into the buffer.
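The visible range is simple arithmetic: with 8-pixel tiles and a 128-pixel-wide screen, at most 17 columns can be on screen at once (16, plus one extra when the camera isn't tile-aligned). A sketch with my own illustrative names:

```rust
// Sketch: the camera as a sliding window over the tile map
// (illustrative names; 8-pixel tiles, 128-pixel-wide screen).

/// Given the camera's left edge in pixels, return the inclusive range
/// of map columns that can appear on screen, clamped to the map size.
fn visible_columns(camera_x: i32, map_width: usize) -> (usize, usize) {
    let first = (camera_x / 8).max(0) as usize;
    // 128 px / 8 px per tile = 16 columns, plus one for partial overlap.
    let last = (first + 16).min(map_width.saturating_sub(1));
    (first, last)
}

fn main() {
    // Camera at x = 20: columns 2 through 18 can be on screen.
    assert_eq!(visible_columns(20, 218), (2, 18));
    // Near the right edge of a 218-column map, clamp to column 217.
    assert_eq!(visible_columns(1700, 218), (212, 217));
}
```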

Still getting ~25 FPS. And it’s playable.

On a $0.09 chip.

Rust wasn’t just possible - it was perfect. Safe, fast, expressive... and now I’ve got a working game on hardware cheaper than a chocolate bar.

Sometimes, the best projects are the ones you never thought you could build.

Source code for both the real device and the simulator is below. Give it a try!

https://github.com/cjdell/ch32-game-rust