An introduction

Zero Board Computer

Working printf, fopen, and a real filesystem
on any CPU — real, emulated, or yet to be designed.

On day one. Before any hardware exists.

First, the name

Why “Zero Board Computer”?

A Single Board Computer — Raspberry Pi, BeagleBone — is a whole computer on one circuit board.

A Zero Board Computer needs zero boards.

The board is nowhere

The peripherals aren't on a board — they're borrowed from a host.
The only “hardware” the guest sees is a tiny memory window.
It's the working computer you have before any board exists.

A console, a filesystem, a clock — and nothing to bring up.

Here's the answer. The peripherals aren't on a board at all — they're borrowed from a host machine. Your laptop. The emulator. The workstation driving your prototype. The guest doesn't own a disk; it borrows the host's disk. It doesn't own a terminal; it borrows the host's terminal. From the guest's side there is no peripheral hardware — just a small window of memory it reads and writes, and everything real happens on the far side of that window. So what you end up with is a fully usable computer — storage, a console, a clock — with zero boards of support hardware for you to design, fabricate, or debug. And that matters most precisely when you don't have any of that hardware yet. Which is the situation I want to talk about next, because every one of you has been stuck in it.

The six-week tax

Every new chip, board, or prototype starts the same way:

Can't debug firmware until the serial port works.
Can't test the filesystem until the flash driver works.
Can't validate the timer until you can read the clock back.

The first six weeks go to fighting your way to a working printf.

Not the product. Not the silicon. Just trying to see what your code is doing — redone by every team, on every project, for decades.

And it all collapses into one blunt sentence: the first six weeks go to fighting your way to a working printf. “A working printf” is really shorthand for the first moment the machine talks back — the first time you can see what your code is doing. Until then you're flying completely blind. And none of that six weeks is the product. It isn't validating your silicon, it isn't exercising your design, it isn't testing your algorithm. Worse, it doesn't get cheaper — it isn't a problem somebody solves once. It gets re-paid, in full, by every team, on every chip, for decades. So the honest question is: why are we all still paying this? What if printf, and fopen, and a clock were simply there, on the first power-on? To explain how, I have to introduce one old idea.

What is semihosting?

Borrow the host's I/O instead of building your own.

The program on the target doesn't open files or print characters itself — it asks a host machine (your laptop, the emulator, a debugger) to do it on its behalf.

That idea is semihosting. The definition is simple: instead of the target doing its own input and output, it borrows the host's. The program on the chip doesn't open the file itself, doesn't push characters to a screen itself — it asks a host machine, one that already has a real filesystem and a real console and a real clock, to do it on its behalf. The name tells you the shape of it: “semi,” as in half a host. The target is only half a computer. It runs the code, but it hands the I/O up to a full machine that does the actual work. Concretely, that covers exactly what the bottom of a C library needs: open, close, read, write, and seek a file; read and write the console; ask the time; exit with a status. And I want to be clear — this is not new or exotic. Semihosting is decades old and completely mainstream. ZBC didn't invent it. ZBC fixes how it's delivered.

ARM did this in 1993

It works — but it assumes:

only ARM chips,
only with a hardware debug probe attached,
only via ARM-specific trap instructions a debugger intercepts.

To see what needed fixing, look at the canonical version. ARM defined semihosting in the early nineties, and it's genuinely good — it's still carrying printf for Cortex-M developers today. But it makes three assumptions. First, it only works on ARM chips, because the mechanism is defined in terms of ARM's instruction set — it simply doesn't exist for a 6502 or a RISC-V core. Second, it needs a hardware debug probe physically attached, because the request gets caught by a debugger over JTAG. No probe, no semihosting. And third, the way the chip signals the host is a special trap instruction — a deliberate breakpoint that halts the CPU so the debugger can step in. Now, in 1993, every one of those was free. Every target was ARM, every engineer had a probe on the bench, and silicon came before simulation. The design didn't get worse over thirty years. The world changed around it — and every one of those three assumptions broke.

All three constraints come from how the target signals the host: trap instructions.

Replace the trap with a 32-byte memory-mapped device.

If the CPU can read and write memory — and every CPU can, by definition — it works.

So here's the insight that frees it. Look at where all three of ARM's constraints come from. They don't come from the I/O services — those are fine. They all come from one design choice: the signaling mechanism. The trap instruction is what drags in the architecture dependence, the debugger, and the privileged mode. So replace the trap. Instead of a special instruction, put a small device on the bus — thirty-two bytes of registers, sitting in the memory map like a UART or a clock chip. The chip signals the host by writing to memory instead of by executing a magic instruction. And why is that the right primitive? Because reading and writing memory is the one thing every CPU can do, by definition. An eight-bit micro, a sixty-four-bit server chip, a processor that won't tape out until next year — there is no CPU that runs code but can't do a load and a store. That's how an idea that was ARM-only becomes genuinely universal. This is the hinge of the whole talk.

What that buys you

No special instructions to define.

No debug probe to attach.

No privileged execution mode required.

And look at what falls out of that one substitution — it's the exact mirror of ARM's three constraints. There's nothing architecture-specific to define, so the same device works for every CPU family, and porting to a new architecture is zero protocol work. There's no debug probe, because the host isn't a debugger catching a trap — it's a peripheral answering a memory access, so it works in a plain software simulator with nothing attached, and on real hardware with an empty JTAG header. And there's no privileged mode, because loads and stores are just ordinary instructions — no supervisor state, no exception handler to make a request. The mental model to hold onto is this: the host is a device on the bus, not a debugger interrupting the CPU. The processor stays in control the entire time. Which is exactly why a single request is so simple — let me show you one.

The doorbell handshake

Guest builds a small message in its own RAM.
Writes the buffer address into RIFF_PTR.
Writes any value to DOORBELL — the signal.
Host does the real I/O, writes the response back into the same buffer.
Host sets RESPONSE_READY; guest polls, then reads its answer.

This is the entire protocol — five steps, and then it repeats. First, the guest builds a small message in its own RAM. That detail matters: the guest owns the buffer, the device never allocates anything, and that's how the client library can promise it never touches the heap. Second, it writes that buffer's address into the RIFF_PTR register, so the host knows where to look. Third, it writes any value at all to the DOORBELL register — that single write is the “go” signal; the value doesn't matter, the act of writing does. Fourth, the host reads the request out of guest memory, performs the real file or console operation, and writes the answer back into that same buffer. And fifth, the host sets a “response ready” bit in the status register; the guest, which has been polling that bit, sees it and reads its result. The thing to notice is that the host is completely passive. It does nothing until the doorbell rings, and it never seizes the CPU. Compare that to ARM, where the trap halts the processor. Here, the CPU is always driving.
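For anyone reading these notes afterwards, the guest's half of those five steps can be sketched in C roughly as follows. The register offsets come from the device table on the next slide. The bit position of RESPONSE_READY, the low-byte-first order of the pointer write, and the function names `zbc_submit`, `zbc_response_ready`, and `zbc_call_blocking` are assumptions made for illustration, not the project's actual API.

```c
#include <stddef.h>
#include <stdint.h>

/* Offsets from the ZBC register table. */
#define ZBC_RIFF_PTR 0x08u /* 16 bytes wide */
#define ZBC_DOORBELL 0x18u
#define ZBC_STATUS   0x19u

/* ASSUMPTION: the talk names the STATUS bits but not their positions. */
#define ZBC_STATUS_RESPONSE_READY 0x02u

/* Steps 2 and 3: publish the buffer address, then ring the doorbell. */
void zbc_submit(volatile unsigned char *dev, const void *msg)
{
    uintptr_t addr = (uintptr_t)msg;
    size_t i;

    /* ASSUMPTION: low byte first; bytes beyond the pointer width are zeroed. */
    for (i = 0; i < 16; i++) {
        if (i < sizeof addr) {
            dev[ZBC_RIFF_PTR + i] = (unsigned char)(addr & 0xFFu);
            addr >>= 8;
        } else {
            dev[ZBC_RIFF_PTR + i] = 0;
        }
    }
    dev[ZBC_DOORBELL] = 1; /* any value works: the write itself is the signal */
}

/* Step 5, one poll: has the host set RESPONSE_READY yet? */
int zbc_response_ready(const volatile unsigned char *dev)
{
    return (dev[ZBC_STATUS] & ZBC_STATUS_RESPONSE_READY) != 0;
}

/* Steps 2 to 5 together: submit, then spin until the host has answered. */
void zbc_call_blocking(volatile unsigned char *dev, void *msg)
{
    zbc_submit(dev, msg);
    while (!zbc_response_ready(dev)) {
        /* busy-wait; the CPU is never halted, it just keeps polling */
    }
}
```

On real hardware `dev` would be the device's base address in the memory map; in a unit test it can simply be a 32-byte array that the test plays host to.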

The whole device: 32 bytes

Offset	Register	Meaning
0x00	SIGNATURE	ASCII `"SEMIHOST"` — how guests find it
0x08	RIFF_PTR	16 bytes: address of the message buffer
0x18	DOORBELL	write any value → trigger
0x19	STATUS	bitmask: TIMER / RESPONSE_READY / PROTO_ERROR
0x1A	ERROR_CODE	protocol errors, without touching guest RAM

And here is that device in full. This table is the entire hardware surface — thirty-two bytes. That's the whole thing a chip designer or an emulator author has to build. Let me walk it. SIGNATURE is eight ASCII bytes spelling “SEMIHOST” — it's how a guest discovers the device: read this address, see the magic string, and you know ZBC is present. Remember that one; it comes back later. RIFF_PTR is where the message lives, and it's a full sixteen bytes wide on purpose — that's room for a 128-bit pointer. A 6502 just writes the low two bytes and ignores the rest. The register is sized for the widest CPU imaginable, not for the machine the implementer happens to own. DOORBELL is the trigger we just saw. STATUS is a bitmask, not one flag: a timer bit, a response-ready bit, and a protocol-error bit. And ERROR_CODE is a small safety feature — when a request is so broken the host can't even parse it, the host reports that here, in its own register, instead of scribbling into the guest's memory. A malformed request can't corrupt the guest. So that's the hardware. Now — what's actually inside that buffer?
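As a rough sketch, the table translates into a handful of constants plus a discovery check. The offsets and the signature string are from the table; the identifier names and the `zbc_probe` function are hypothetical, and the talk does not describe bytes 0x1B through 0x1F, so the sketch leaves them alone.

```c
/* Register offsets from the table above. Names are illustrative. */
enum {
    ZBC_REG_SIGNATURE  = 0x00, /* 8 bytes, ASCII "SEMIHOST" */
    ZBC_REG_RIFF_PTR   = 0x08, /* 16 bytes */
    ZBC_REG_DOORBELL   = 0x18,
    ZBC_REG_STATUS     = 0x19,
    ZBC_REG_ERROR_CODE = 0x1A,
    ZBC_DEVICE_SIZE    = 0x20  /* 32 bytes in total */
};

/* Returns 1 if the eight bytes at `base` spell "SEMIHOST", else 0. */
int zbc_probe(const volatile unsigned char *base)
{
    static const char magic[8] = { 'S', 'E', 'M', 'I', 'H', 'O', 'S', 'T' };
    int i;

    for (i = 0; i < 8; i++) {
        if (base[ZBC_REG_SIGNATURE + i] != (unsigned char)magic[i]) {
            return 0;
        }
    }
    return 1;
}
```

The check uses only byte loads, so it works the same way on an eight-bit guest as on a sixty-four-bit one.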

The message is RIFF

A tagged container — the same family as WAV and AVI files.


RIFF  ....  SEMI                 <- container, form type "SEMI"
  CNFG  int_size ptr_size endian  <- guest declares its architecture
  CALL  opcode = 0x05 (SYS_WRITE) <- the request
    PARM  fd = 1
    DATA  "Hello, world!\n"
    PARM  length = 14

Host overwrites CALL with a RETN (result + errno), in place.

The message is RIFF — the same tagged container format behind WAV and AVI files. I want to justify that choice for a second, because reusing a boring old format is the point. RIFF is just a stream of chunks: each one has a four-character tag, a length, and a payload. It's self-describing and trivially extensible — add a new chunk type and old parsers simply skip what they don't recognize. There's no clever new binary format to get subtly wrong. Read the example top to bottom. The outer RIFF container is tagged “SEMI,” for semihosting. The first chunk is always CNFG — that's the guest announcing who it is: my integer size, my pointer size, my byte order. Everything after gets interpreted through that declaration, and it's the setup for the most important idea in this talk, two slides from now. Then CALL carries the opcode — here, five, which is write. Inside it: a parameter for the file descriptor, a data chunk with the actual bytes, and a parameter for the length. When the host finishes, it overwrites that CALL chunk with a RETN chunk — the result and an error number — right there in the same buffer. Request in, response out, same memory, no allocation anywhere.
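To make "a tag, a length, and a payload" concrete, here is a minimal sketch of generic RIFF chunk handling: append a chunk, and find a chunk by tag while skipping ones the parser does not recognize. It relies only on standard RIFF rules (four-byte tag, 32-bit little-endian size, a pad byte after odd-sized payloads). It handles a flat run of chunks and does not model the internal layout of CNFG, CALL, or PARM, which the slide does not spell out; the function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

void riff_put_le32(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char)(v & 0xFFu);
    p[1] = (unsigned char)((v >> 8) & 0xFFu);
    p[2] = (unsigned char)((v >> 16) & 0xFFu);
    p[3] = (unsigned char)((v >> 24) & 0xFFu);
}

unsigned long riff_get_le32(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/* Append one chunk at `off`: tag, size, payload, pad byte if size is odd.
 * Returns the new end offset, or 0 if the chunk does not fit in `cap`. */
size_t riff_append(unsigned char *buf, size_t cap, size_t off,
                   const char *tag, const void *payload, size_t len)
{
    size_t padded = len + (len & 1u);

    if (off > cap || cap - off < 8 || cap - off - 8 < padded) {
        return 0;
    }
    memcpy(buf + off, tag, 4);
    riff_put_le32(buf + off + 4, (unsigned long)len);
    if (len > 0) {
        memcpy(buf + off + 8, payload, len);
    }
    if (len & 1u) {
        buf[off + 8 + len] = 0;
    }
    return off + 8 + padded;
}

/* Find the first chunk tagged `tag` in [off, end), skipping all others.
 * Returns its offset, or `end` if it is absent or the data is truncated. */
size_t riff_find(const unsigned char *buf, size_t off, size_t end,
                 const char *tag)
{
    while (off <= end && end - off >= 8) {
        size_t avail = end - off - 8;
        size_t len = (size_t)riff_get_le32(buf + off + 4);
        size_t padded;

        if (len > avail) {
            break; /* declared size runs past the buffer */
        }
        padded = len + (len & 1u);
        if (padded > avail) {
            break; /* missing pad byte */
        }
        if (memcmp(buf + off, tag, 4) == 0) {
            return off;
        }
        off += 8 + padded; /* unknown tag: step over it */
    }
    return end;
}
```

The skip in `riff_find` is the extensibility property from the slide: a parser that has never heard of a tag can still step over it by trusting the length field.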

One subtlety: endianness

RIFF headers

Always little-endian — that's RIFF's own rule.

Payload values

The guest's declared endianness, from CNFG.

The container is fixed. The contents speak the guest's language.

One subtlety is worth thirty seconds, because it's exactly where a cross-architecture protocol usually goes wrong: endianness. There are two different rules here, on purpose. The RIFF frame itself — the chunk tags and the length fields — is always little-endian. That's RIFF's own rule, inherited from WAV and AVI, and ZBC keeps it unchanged, so the structural bytes look the same no matter who's talking. But the values inside — the actual integers and pointers, the return values — use the guest's own declared endianness, the one it announced in CNFG. A big-endian 68000 encodes its file descriptor big-endian; a little-endian 6502 encodes its little-endian. The picture to hold is: the envelope is written in one fixed language so any host can route it, but the letter inside is written in the guest's native language. A big-endian guest byteswaps the envelope through shared helpers — it never has to reformat its own data. Container fixed, contents native. That's the whole rule.
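The two rules fit in a few lines of C. This is a sketch of the idea only: `put_chunk_size` and `put_guest_uint` are hypothetical names, and the `size` and `big_endian` arguments stand in for whatever the guest declared in CNFG.

```c
#include <stddef.h>

/* Write `v` as `size` bytes, in big- or little-endian order. */
static void put_uint(unsigned char *p, unsigned long v, size_t size,
                     int big_endian)
{
    size_t i;

    for (i = 0; i < size; i++) {
        size_t pos = big_endian ? size - 1 - i : i;
        p[pos] = (unsigned char)(v & 0xFFu);
        v >>= 8;
    }
}

/* Envelope: RIFF chunk sizes are always 4 bytes, little-endian. */
void put_chunk_size(unsigned char *p, unsigned long len)
{
    put_uint(p, len, 4, 0);
}

/* Letter: payload values use the width and byte order from CNFG. */
void put_guest_uint(unsigned char *p, unsigned long v, size_t size,
                    int big_endian)
{
    put_uint(p, v, size, big_endian);
}
```

With these, a little-endian guest with two-byte integers encodes file descriptor 1 as the bytes 01 00, a big-endian guest with four-byte integers encodes it as 00 00 00 01, and both write a chunk size of 14 as 0E 00 00 00.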

The thesis

Width neutrality is the product

The guest declares its int size, pointer size, and byte order.

The host honors what the guest says.

If you remember one slide from today, make it this one. Everything else — the device, the RIFF format — is mechanism. This is the point. The protocol never assumes a width. It doesn't bake in thirty-two bits, or sixty-four, or any number at all. Instead the guest declares its parameters — integer size, pointer size, byte order — and the host simply honors whatever the guest said. Width is negotiated, never assumed. And that's the reason this is a protocol and not just a library. A library you could fork and tweak. But declared widths mean a guest written today and a host written by someone else, years later, in another language, still interoperate — because the width travels on the wire, from the guest, not from anyone's source code. The project states this as a hard rule: any part of the system that assumes a specific bit width is treated as a defect to be removed. The only legitimate sources of a width are the wire format's own fields and the guest's declaration. Never the implementer's habits. Never “thirty-two bits is plenty.” Let me make that concrete.

Same protocol, different sizes

8-bit 6502

int = 2 · ptr = 2 · little-endian

64-bit, big-endian

int = 4 · ptr = 8 · big-endian

They speak the same wire protocol. They just fill in different fields.

Take two machines that could hardly be more different. On the left, a 6502: a 1975 eight-bit chip with sixteen-bit pointers, little-endian. Its CNFG says: integer two bytes, pointer two bytes, little-endian. On the right, a modern sixty-four-bit core that happens to be big-endian. Its CNFG says: integer four bytes, pointer eight bytes, big-endian. And notice there that the integer and the pointer are different sizes. The protocol handles that fine, because every parameter knows whether it's an integer or a pointer and is sized accordingly. Here's the payoff: those two machines, roughly fifty years apart, a factor of eight apart in register width, with opposite byte orders, speak the identical wire protocol. Not a small-chip dialect and a big-chip dialect. The same protocol. They just write different numbers into the same CNFG fields, and every host reads those fields and adapts. And this isn't hypothetical: the project runs live on-target tests at both ends of that range, so the interoperability is continuously proven, not just promised. And taking width seriously buys you something you might not expect.
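From the host's side, "reads those fields and adapts" might look like the sketch below. The struct and function names are hypothetical, and so is the in-memory shape of the declaration; only the three declared quantities come from the slide. The sketch decodes into a 64-bit host integer for brevity and rejects anything wider instead of silently truncating it.

```c
#include <stddef.h>
#include <stdint.h>

/* What the guest declared in CNFG. Field names are illustrative. */
struct zbc_cnfg {
    unsigned char int_size;   /* bytes per integer */
    unsigned char ptr_size;   /* bytes per pointer */
    unsigned char big_endian; /* 0 = little-endian, 1 = big-endian */
};

/* Decode `size` bytes in the declared byte order. Returns 0 on success,
 * -1 if the width cannot be represented here (never truncates). */
int zbc_decode_uint(const unsigned char *p, size_t size, int big_endian,
                    uint64_t *out)
{
    uint64_t v = 0;
    size_t i;

    if (size == 0 || size > sizeof v) {
        return -1;
    }
    for (i = 0; i < size; i++) {
        unsigned char b = big_endian ? p[i] : p[size - 1 - i];
        v = (v << 8) | b;
    }
    *out = v;
    return 0;
}

/* Integers and pointers are sized separately, as the guest declared. */
int zbc_decode_int(const struct zbc_cnfg *c, const unsigned char *p,
                   uint64_t *out)
{
    return zbc_decode_uint(p, c->int_size, c->big_endian, out);
}

int zbc_decode_ptr(const struct zbc_cnfg *c, const unsigned char *p,
                   uint64_t *out)
{
    return zbc_decode_uint(p, c->ptr_size, c->big_endian, out);
}
```

The two machines on the slide would be `{2, 2, 0}` and `{4, 8, 1}` in this representation, and the same decoder serves both.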

Why it matters, concretely

A 16-bit guest can stream sequentially through a file larger than its own address space.

That's a core use case — not an edge case.

Here's the surprise. Take a sixteen-bit machine. Its entire address space is sixty-four kilobytes — it physically cannot form a pointer to byte one hundred thousand. And yet it can open a multi-gigabyte file on the host and stream sequentially all the way through it. How? Because the file offset isn't a guest pointer — it's a value in the protocol, and the protocol carries it at full sixty-four-bit width no matter how small the guest is. The guest reads a chunk into its little buffer, processes it, asks for the next chunk at the next offset. It never has to address the whole file — only to name positions in it, and sixty-four-bit positions fit on the wire even when they don't fit in the CPU. This is exactly why narrowing is forbidden. If some implementer had stored that offset in a thirty-two-bit integer because “who has files that big,” they'd have silently capped what every guest can do — and the tiny guest, the one that needs streaming the most, is the first one to break. So a small chip walking a huge file isn't a clever edge case someone allowed. It's a core use case the design protects on purpose.
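Here is a small worked example of "name positions you cannot address". A compiler for a sixteen-bit target may have no 64-bit integer type at all, so this sketch keeps the file position as eight separate bytes and advances it with byte-wise carry. Storing the position least significant byte first is an assumption for illustration; on the wire it would follow the guest's declared byte order. The type and function names are hypothetical.

```c
/* A 64-bit file position held as eight bytes, least significant first. */
typedef struct {
    unsigned char b[8];
} off64;

/* pos += n, propagating the carry one byte at a time. Uses nothing wider
 * than `unsigned`, so it also works where `unsigned` is 16 bits.
 * Returns the carry out of the top byte (nonzero only on 64-bit overflow). */
unsigned off64_add(off64 *pos, unsigned n)
{
    unsigned carry = n;
    int i;

    for (i = 0; i < 8; i++) {
        unsigned sum = pos->b[i] + (carry & 0xFFu);
        pos->b[i] = (unsigned char)(sum & 0xFFu);
        carry = (carry >> 8) + (sum >> 8);
    }
    return carry;
}
```

A guest that reads 32 KiB at a time and calls `off64_add(&pos, 0x8000u)` after each read crosses the 4 GiB mark after 131,072 reads, with the position still exact, even though no pointer on that machine could ever hold the value.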

Standing on ARM's shoulders

Op	Name
0x01	SYS_OPEN
0x05	SYS_WRITE
0x06	SYS_READ
0x0A	SYS_SEEK
0x18	SYS_EXIT

ZBC keeps ARM semihosting's operation numbers.
picolibc and newlib ports work nearly unchanged.
Only the transport changed: traps → memory. The vocabulary is the same.

Now, earlier I made ARM's trap the villain. But ARM got something else exactly right, and ZBC keeps it: the vocabulary. ARM defined a numbering for the operations — open is one, write is five, read is six, and so on. That numbering is excellent, and ZBC uses it unchanged. And that's a strategic choice, not a lazy one. The bottom layer of real C libraries — picolibc, newlib — already knows how to speak these exact call numbers. So by keeping them, ZBC lets that battle-tested library code work nearly unchanged. You're not rewriting libc; you swap only the handful of instructions at the very bottom that actually deliver the request. So the one-line summary is: only the transport changed — traps became memory accesses — and the vocabulary stayed the same. You get the universality of the new transport and decades of accumulated library work, for free. And keep that opcode compatibility in mind, because it pays off again in a few slides in a way you won't see coming.
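The layering can be sketched like this. The opcode values are the ones in the table. Everything else is an assumption for illustration: `deliver_fn` is a hypothetical stand-in for "the handful of instructions at the very bottom" and is not the real signature of `zbc_call()`, and the three-word argument block follows ARM's SYS_WRITE convention (handle, data pointer, length; the result is the number of bytes not written), which is how trap-based library ports typically package the call.

```c
#include <stddef.h>
#include <stdint.h>

/* ARM semihosting operation numbers, as kept by ZBC. */
enum {
    SYS_OPEN  = 0x01,
    SYS_WRITE = 0x05,
    SYS_READ  = 0x06,
    SYS_SEEK  = 0x0A,
    SYS_EXIT  = 0x18
};

/* HYPOTHETICAL: the one piece that differs between transports. */
typedef long (*deliver_fn)(int op, void *args);

/* A libc-bottom write(). Returns bytes written, or -1 on failure. */
long sh_write(deliver_fn deliver, int fd, const void *buf, size_t len)
{
    uintptr_t args[3];
    long left;

    args[0] = (uintptr_t)fd;
    args[1] = (uintptr_t)buf;
    args[2] = (uintptr_t)len;
    left = deliver(SYS_WRITE, args); /* ARM rule: bytes NOT written */
    if (left < 0 || (size_t)left > len) {
        return -1;
    }
    return (long)(len - (size_t)left);
}

/* Demo transport: pretends the host wrote everything and records the op. */
int demo_last_op;

long demo_deliver(int op, void *args)
{
    (void)args;
    demo_last_op = op;
    return 0;
}
```

Swapping `demo_deliver` for a trap shim or a doorbell write changes how the request travels, while `sh_write` and the opcode it sends stay the same.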

It's not one wire

One binary, many hosts

The invariant is the guest-facing API, not the wire underneath it.

Build once against zbc_call() — the same binary runs across very different machines, each with its own transport.

Okay — change of gear. Everything so far has been one device, one wire: the RIFF doorbell peripheral. Now I want to widen the picture. The RIFF device is still the canonical wire, but it turns out to be just one of several ways a guest can reach a host. And here's the key idea: the thing that stays constant isn't the wire — it's the guest-facing API. Your firmware calls the same functions regardless. Underneath that call sits a swappable “transport” that decides how the request actually gets to a host. So the same compiled binary — not recompiled, the same bytes — can run on machines that have nothing in common, because at startup the library detects which transport is available and uses it. If you've ever written a program that opens “a file” without caring whether it's a local disk or a network share or a pipe — same call, different plumbing, chosen at runtime — that's exactly the move here. So what are the transports?

Three ways to reach the host

RIFF device — the 32-byte doorbell peripheral. MAME, a patched QEMU, real silicon.

virtio — the guest carries drivers. Stock QEMU, unmodified. (next slides)

native trap — ARM-style semihosting traps. Build-time option, where the target supports it.

There are three, and each one is at home somewhere different. The first is the RIFF device — the doorbell peripheral we just spent the whole talk on. That's what you get on MAME, on a QEMU that's been patched to include the device, and on real silicon. Whenever a native ZBC device exists, it wins. The second is virtio. When a host has no ZBC device at all, the guest can instead carry standard virtio drivers and talk to the host's existing virtio devices — and that is what makes plain, unmodified QEMU work, which is the next two slides. The third is native trap — on architectures that already have trap-based semihosting, a build option just uses it, bringing us full circle back to ARM. And one sign this is real engineering: the transports compose. A composite can send console traffic over one and file traffic over another, with a base layer underneath that safely reports “not implemented.” Now, with multiple wires you'd reasonably worry they behave differently — so there's an equivalence test suite. The same operation script runs over every transport, and the results, the error numbers, the file-descriptor behavior all have to match. It's the client-side twin of the host conformance test.
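One way to picture "the transports compose" is a table of function pointers with a composite in front. This is a sketch of the pattern, not the project's code: the struct layout, the names, and the choice to classify requests by ARM opcode (0x03, 0x04, 0x07 as console calls; open, close, write, read, seek, and file length as file calls) are all assumptions.

```c
#include <errno.h>
#include <stddef.h>

struct transport {
    int (*call)(struct transport *self, int op); /* 0 or a negative errno */
};

/* Base layer: every request fails immediately and cleanly. */
int base_call(struct transport *self, int op)
{
    (void)self;
    (void)op;
    return -ENOSYS;
}

struct composite {
    struct transport iface;    /* first member, so the cast below is valid */
    struct transport *console; /* may be NULL */
    struct transport *files;   /* may be NULL */
    struct transport *base;
};

static int is_console_op(int op)
{
    return op == 0x03 || op == 0x04 || op == 0x07;
}

static int is_file_op(int op)
{
    return op == 0x01 || op == 0x02 || op == 0x05 || op == 0x06 ||
           op == 0x0A || op == 0x0C;
}

/* Route each request to the transport that claims it, else to the base. */
int composite_call(struct transport *self, int op)
{
    struct composite *c = (struct composite *)self;
    struct transport *t = c->base;

    if (is_console_op(op) && c->console != NULL) {
        t = c->console;
    } else if (is_file_op(op) && c->files != NULL) {
        t = c->files;
    }
    return t->call(t, op);
}

/* Demo transport that just counts the requests it receives. */
struct counting {
    struct transport iface;
    int calls;
};

int counting_call(struct transport *self, int op)
{
    (void)op;
    ((struct counting *)self)->calls++;
    return 0;
}
```

Because every layer exposes the same `call` entry point, the firmware above it cannot tell a single transport from a composite, which is the property the equivalence test suite checks.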

Runs on stock QEMU — today

No ZBC device, no QEMU patch. The guest brings the drivers:

virtio-9p

File operations, via QEMU's built-in 9P file server. No external daemon.

virtio-console

Console I/O, on the same virtqueue core.


-fsdev local,id=fs0,path=$SHARE,security_model=none \
-device virtio-9p-device,fsdev=fs0,mount_tag=zbc

And this is the headline of the second half: it runs on the QEMU you already have installed. No patched fork, no custom build, no kernel. Stock upstream QEMU. Here's how. Stock QEMU has no ZBC device, so instead of the host carrying the device, the guest carries small standard drivers and talks to devices QEMU already ships. For files, that's virtio-9p. 9P is the Plan 9 filesystem protocol, and QEMU has a built-in 9P server, so the guest gets real access to a host directory with no external daemon running. For the console, it's virtio-console, riding the same underlying machinery. And the reason it's 9P specifically: every 9P request gets exactly one reply, and a client that keeps a single request in flight sees a strictly synchronous exchange, which maps almost perfectly onto ZBC's own synchronous call model, so the translation layer stays tiny. Look at the bottom of the slide: that's the entire host-side setup. One line to point at a directory, one line to expose it as a device. You can paste those two lines and be running in about two minutes.

If it speaks virtio, it works

Any machine exposing modern virtio-mmio — the library scans for it at startup.

QEMU machine	Arch	virtio-mmio window
`virt`	ARM / AArch64	0x0a000000
`virt`	RISC-V (rv32 & rv64)	0x10001000
`microvm`	x86	0xfeb00000

Now let me say that more broadly, because I said “QEMU” but the real claim is bigger: if a machine exposes a standard virtio device over memory-mapped I/O, ZBC's guest drivers can use it. virtio is the de-facto standard for paravirtual devices — so this is less a QEMU integration than a virtio integration that QEMU happens to provide. The table is just “the major architectures, covered”: ARM and AArch64 on the virt machine, thirty-two and sixty-four-bit RISC-V on virt, and x86 on microvm. The hex column is simply where each machine parks its virtio register window. And discovery ties right back to that SIGNATURE register from earlier. At startup the library probes in order: first it reads the device base and looks for the “SEMIHOST” signature; if that's missing, it scans the virtio window for a live console or filesystem device; and if it finds nothing at all, it falls back to a transport where every call fails immediately and cleanly. It never hangs, and it never guesses — the whole thing is signature-driven, which is exactly what that register was for. One quick note in case you're wondering why x86 says microvm and not the usual pc: virtio over PCI would drag in bus enumeration, exactly the per-platform complexity this project refuses to grow, so microvm — which exposes virtio over memory — is the supported x86 machine.
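The virtio half of that probe can be sketched as a scan over register slots. The constants are from the virtio specification's MMIO transport: the magic value 0x74726976 at offset 0x000, version 2 for the modern interface at 0x004, and the device ID at 0x008, where 3 is a console and 9 is the 9P transport. The number of slots and the stride between them depend on the machine and are passed in as parameters here (QEMU's `virt` machines place slots 0x200 bytes apart); the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define VIRTIO_MMIO_MAGIC   0x74726976u /* "virt" */
#define VIRTIO_MMIO_VERSION 2u          /* modern, non-legacy interface */
#define VIRTIO_ID_CONSOLE   3u
#define VIRTIO_ID_9P        9u

/* virtio-mmio registers are little-endian; swap on a big-endian CPU. */
uint32_t le32_to_cpu(uint32_t v)
{
    const uint16_t one = 1;

    if (*(const unsigned char *)&one == 1) {
        return v; /* this CPU is little-endian */
    }
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Returns the index of the first slot holding device `want_id`, or -1.
 * Registers are read as aligned 32-bit words, as the transport expects. */
long virtio_mmio_find(const volatile uint32_t *window, size_t slots,
                      size_t stride_bytes, uint32_t want_id)
{
    size_t words = stride_bytes / sizeof(uint32_t);
    size_t i;

    for (i = 0; i < slots; i++) {
        const volatile uint32_t *r = window + i * words;

        if (le32_to_cpu(r[0]) != VIRTIO_MMIO_MAGIC) {
            continue; /* 0x000 MagicValue */
        }
        if (le32_to_cpu(r[1]) != VIRTIO_MMIO_VERSION) {
            continue; /* 0x004 Version */
        }
        if (le32_to_cpu(r[2]) == want_id) {
            return (long)i; /* 0x008 DeviceID */
        }
    }
    return -1;
}
```

A result of -1 for both the console and the 9P device is the point where the library would fall through to the transport on which every call fails cleanly.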

Or QEMU's own trap semihosting

On ARM, AArch64, RISC-V, MIPS, m68k, Xtensa, a build-time shim routes everything through QEMU's -semihosting.

Covers the entire operation set with zero guest driver code.
~10 instructions per architecture — opcode + pointer in registers, then the trap.
Build-time only: a trap on a non-semihosting host faults, so it's never auto-probed.

And here's the full-circle moment I promised. We opened by treating ARM's trap as the villain. But because ZBC kept ARM's opcode numbers, QEMU's own built-in trap semihosting just becomes another transport we can plug in for free. The very thing we replaced, we can also reuse wherever it's already there. It applies to the architectures QEMU implements traps for — ARM, AArch64, RISC-V, MIPS, m68k, Xtensa. On those, you start QEMU with one flag, and a tiny per-architecture shim — about ten instructions — drops the opcode and a pointer into the right registers, executes the trap, and reads the result back. Its big advantage is coverage: it handles the entire operation set, files and console and time and exit, with zero guest driver code and nothing to configure beyond that flag. The one honest caveat: unlike the others, this transport can't be safely auto-detected. To test whether trap semihosting is present, you'd have to execute the trap — and if it isn't present, that raises a CPU exception. Probing for it could crash you. So it's a deliberate build-time choice, never part of the runtime probe.

This is not a sketch

Two reference hosts, proven equal

C90 (zero heap, statically verified) and C++17 — byte-for-byte conformance-tested against each other on every CI run.

200+ machines in MAME

Drop a binary into an emulated 6502, Z80, or 68000 and it reads files off your laptop. No driver work at all.

Let me switch from pitch to evidence, because you've heard the idea — now I want to show it's not vaporware. There are two reference host libraries: one in C90, one in C++17, written independently. And a conformance test feeds both the same requests and checks that their responses are identical, byte for byte, on every CI run. That's how you know the specification is real and unambiguous — two separate implementations come out provably interchangeable. The C one also allocates zero heap, statically verified, so it fits on an eight-bit micro with a few kilobytes of RAM. And here's the detail that tends to make people sit up: MAME already ships ZBC machines for over two hundred vintage systems — home computers, arcade boards, classic minis. You can take a cross-compiled binary, drop it into an emulated 6502 or Z80 or 68000, and that 1979 machine reads files off your 2026 laptop. No serial driver, no storage driver, no filesystem code — none of the bring-up tax from the start of this talk. And it's tested at both ends of the spectrum at once.

Tested at both extremes

One test binary, two emulators, many transports

The same program runs on MAME's 6502 & i386 (RIFF) and QEMU's RISC-V, ARM, AArch64 & x86 microvm (virtio) — live, every push.

Hardened

Continuous RIFF-parser fuzzing, ASan/UBSan, optional seccomp sandbox, directory-jail backends.

This is the slide that ties the whole talk together. One test program, written once against the client API, runs unmodified across all of this: MAME's eight-bit 6502 and thirty-two-bit i386 over the RIFF transport, and QEMU's RISC-V, ARM, AArch64, and x86 microvm over the virtio transport. One binary, two emulators, multiple transports, eight-bit through sixty-four-bit — and it runs live on every push, not as a one-time demo. That matrix is the thesis made executable: width neutrality and transport independence stop being slogans the moment the same program keeps passing across that whole spread automatically. Now I should address the obvious worry — security. Semihosting is powerful by definition; guest code is reaching into host files. So the project takes that seriously. The RIFF parser, which is the part that eats untrusted input, is continuously fuzzed. The sanitizers gate every push. There's an optional seccomp sandbox that strips the host process down to just the syscalls semihosting needs. And the file backend can be jailed to a single directory with path-traversal blocked. You pick a backend and a policy to match your threat model — from “trusted test code, full access” all the way to “untrusted guest, locked box.”
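To give a feel for what "path-traversal blocked" means, here is a deliberately small lexical check. It is not the project's jail implementation, and `path_stays_in_jail` is a hypothetical name. It only rejects absolute paths and `..` components in a guest-supplied name; a real backend would also have to deal with symbolic links and host-specific separators, which this sketch ignores.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `path` is relative and has no ".." component, so joining
 * it onto the jail root cannot climb out lexically; returns 0 otherwise. */
int path_stays_in_jail(const char *path)
{
    const char *p = path;

    if (*p == '\0' || *p == '/') {
        return 0; /* empty or absolute */
    }
    while (*p != '\0') {
        const char *slash = strchr(p, '/');
        size_t len = slash ? (size_t)(slash - p) : strlen(p);

        if (len == 2 && p[0] == '.' && p[1] == '.') {
            return 0; /* a ".." component would step upward */
        }
        p += len;
        while (*p == '/') {
            p++; /* collapse repeated separators */
        }
    }
    return 1;
}
```

A name like `logs/out.txt` passes, `../etc/passwd` and `/etc/passwd` are refused, and `a..b/c` passes because the dots are not a whole component.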

Is this for you?

Bringing up a new board

Add 32 bytes of decode to your bus. Your firmware has printf, fopen, time() before the board is back from the factory.

Hobbyist with an old chip

Soldered a 6502 to perfboard? Give it a real filesystem this afternoon — the same binary runs on the physical board later.

So — is this for you? Let me get specific, because there are a few different people in this room. If you're bringing up a new board, you're the one dreading the first month. You don't have to write the serial driver, or the flash driver, or the throwaway logging hack you always end up building. Add thirty-two bytes of address decode to your bus — or just link the host library into your emulator — and your firmware has printf and fopen and time of day before the physical board is even back from fabrication. Silicon stops blocking software. And if you're a hobbyist — you soldered a 6502 or a Z80 onto perfboard and you want it to do something real today — run it in MAME, give it a real filesystem this afternoon, and here's the part that matters: when your physical board is ready, you point the client library at the address you wired in, and the same binary just runs. There's no “emulator version” and “hardware version.” One binary, emulator first, hardware later, no divergence. But there's a third audience where the payoff is measured in calendar quarters.

Especially: a new chip program

Firmware starts day one against the simulator. Silicon arrival becomes an integration milestone, not a starting gun.

A quarter of calendar time recovered. Nothing proprietary introduced.

And this is the one for whoever owns the schedule. On a new-chip program, firmware is normally blocked on silicon for the first quarter. The team waits for first silicon to come back before they can really begin, and that's a quarter of idle calendar time baked right into the plan. ZBC changes that. Firmware development starts on day one, against the simulator, with full file access and a real console. Silicon arrival stops being the starting gun for firmware and becomes just an integration milestone: the same firmware, the same tests, the same logs you've already been running, now on real parts. And the reason that's low-risk is the most important technical point here: the entire interface is thirty-two bytes of memory-mapped I/O. The surface is so small that the simulator and the manufactured chip can't meaningfully diverge across it. You're not building an elaborate model that drifts from reality. You're matching a thirty-two-byte contract. So the business case is simple: about a quarter of calendar time recovered, and nothing proprietary introduced. It's MIT-licensed, the client is tiny C90, there's no vendor SDK and nothing to sign.

The wire format is the contract

One spec file is the single source of truth, byte for byte.
Future hosts — an FPGA core, a vendor's emulator — conform to the same format.
A binary written today still runs twenty years from now against any conforming host.

Semihosting stops being a per-platform reinvention.

Let me zoom out to the deepest point, because it's the one that outlasts everything else I've shown you. The C library, the C++ library, MAME, QEMU — all of those are implementations. The actual product is the contract: a single specification file that defines the wire format, byte for byte. The libraries are reference implementations of that spec, not the other way around. And that ordering is what makes this durable. Because the authority lives in a wire format and not in any one codebase, anyone can write a new conforming host — Rust bindings, an FPGA core burned into a prototype, a chip vendor's clean-room emulator — and a guest written today just works against it. Implementations come and go; the contract is the fixed point. Which lets me make a claim I actually believe: a guest binary you compile today, against this spec, will still run twenty years from now against whatever conforming host exists then. That's the line between a tool and infrastructure. And it's the real fix for the bring-up tax — the reason bring-up has been a perennial cost is that semihosting kept getting reinvented per platform. Pin it to one stable, width-neutral contract, and that reinvention stops.

Try it this afternoon

Vintage chip, in MAME


mame -listfull zbc*
mame zbcz80 -quik hello.bin

Modern target, in stock QEMU


qemu-system-riscv32 -machine virt \
  -device virtio-9p-device,...

Docs & spec: johnwbyrd.github.io/zbc · Wiki: zeroboardcomputer.com
License: MIT · client is C90, never calls malloc

If it can load and store, it can run ZBC.

So let me leave you with something you can actually do, because the barrier here really is one afternoon. There are two on-ramps. If you have MAME, list the ZBC machines with the first command there, pick one, and quickload a binary — you'll have a Z80 or a 6502 doing real file I/O in minutes. If you have QEMU, boot the virt or microvm machine with a virtio-9p device pointed at a directory — same client API, different transport. And to clear away the usual objections: it's MIT-licensed, about as permissive as it gets; the client is small C90 that never calls malloc; you don't sign anything and you don't take on a vendor SDK. The spec lives on the docs site, and the tutorials and compatibility tables live on the wiki. And I'll close where I started, because it's still the whole story in one line: if it can load and store, it can run ZBC. Thank you — I'd love to take your questions.