Chapter 12: The Minimal Machine Model

Every x86_64 virtual machine carries baggage from 1981. The IBM PC used an Intel 8253 timer, an Intel 8259A interrupt controller, and a National Semiconductor 8250 UART, the ancestor of the register-compatible 16550A. Later revisions cascaded a second 8259A for more IRQ lines and added an Intel 8042 to manage keyboards. These chips persisted through PC/AT, through the ISA bus, through PCI, through ACPI, and into the UEFI era. Modern Linux kernels probe for them on every x86_64 boot. They are not optional: the kernel's TSC calibration path gates on the presence of a legacy PIC, and the early console driver writes to the UART before any device tree or ACPI table has been parsed.

So the question for a VMM trying to offer the smallest safe attack surface is not "which devices do we want?" but "which devices can we remove without breaking the kernel?" For Firecracker the answer is: no PCI bus, no ACPI power management, no BIOS or firmware — but yes to a 16550A UART at COM1, yes to an i8042 stub that can reset the CPU, and yes to the three interrupt controllers KVM provides in-kernel for free. This chapter traces why those choices are forced, how the VMM routes I/O to the right emulator, and where QEMU's microvm machine type arrives at the same destination via a different road.


The Devices the Kernel Still Expects

16550A UART: The Console That Runs Before the Console

Linux's earlycon driver is the first piece of kernel code that can write to a human-readable output. It runs before the page allocator, before printk's ring buffer, and before any PCI or USB enumeration. The only hardware it can target is a memory-mapped or I/O-port UART at a fixed, known address. For x86_64, that address is 0x3F8 — COM1, as it has been since the IBM PC.

The kernel boot parameter earlycon=uart8250,io,0x3f8 tells the early console driver to use PIO mode at that address. The kernel's serial console, activated later by console=ttyS0,115200, uses the same register layout. Neither path negotiates the UART's location: they assume COM1, assume GSI 4, and start writing to the THR register at offset 0 from the base address.

Firecracker satisfies this expectation with the vm-superio crate (rust-vmm/vm-superio), which emulates the 16550A register file. The constants are in src/vmm/src/device_manager/legacy.rs: SERIAL_PORT_ADDRESS = 0x3f8, SERIAL_PORT_SIZE = 0x8, and COM1_GSI = 4.

The emulated UART presents itself as a 16550A by setting bits 7:6 of the Interrupt Identification Register (IIR) to 0b11 on every IIR read (IIR_FIFO_BITS = 0b1100_0000 in vm-superio/src/serial.rs). That is exactly the check serial8250_config_port() in 8250_port.c performs: it tests IIR bits 7:6 for 0b11 to classify the device as 16550A-compatible. The internal FIFO is 64 bytes (FIFO_SIZE = 0x40). The default baud divisor is DEFAULT_BAUD_DIVISOR_LOW = 0x0C with DEFAULT_BAUD_DIVISOR_HIGH = 0x00: divisor 12 at the 1.8432 MHz base clock gives 9600 bps. The programmed rate has no effect on throughput, because the emulator hands each byte to the host side as soon as it is written and models no line timing.
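The divisor arithmetic above can be checked with a few lines of Rust. This is a minimal sketch, not code from vm-superio: the function name baud_from_divisor and the constant UART_CLOCK_HZ are illustrative, and the factor of 16 is the standard 16x oversampling of 8250-family UARTs.

```rust
/// Base clock of the PC UART in Hz (1.8432 MHz, as stated in the text).
const UART_CLOCK_HZ: u32 = 1_843_200;

/// Baud rate implied by the two divisor-latch bytes (hypothetical helper).
/// Returns None for a zero divisor, which has no meaningful rate.
fn baud_from_divisor(low: u8, high: u8) -> Option<u32> {
    let divisor = u32::from(u16::from_le_bytes([low, high]));
    if divisor == 0 {
        return None;
    }
    // The UART samples each bit 16 times, hence the extra factor of 16.
    Some(UART_CLOCK_HZ / (16 * divisor))
}
```

With the default divisor bytes 0x0C and 0x00 this yields 9600; a divisor of 1 yields 115200, the rate requested by console=ttyS0,115200.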

The register map is eight bytes wide:

| Offset | Name | Direction | Notes |
|---|---|---|---|
| 0 | RBR / THR | R / W | Receive buffer / transmit holding; DLAB_LOW when LCR bit 7 set |
| 1 | IER / DLAB_HIGH | R / W | Interrupt Enable (bits 3:0 valid); divisor high byte when DLAB set |
| 2 | IIR / FCR | R / W | IIR on read: bits 7:6 = 11 signals 16550A |
| 3 | LCR | R / W | Line Control; bit 7 = DLAB (Divisor Latch Access Bit) |
| 4 | MCR | R / W | Modem Control (5 bits) |
| 5 | LSR | R | Line Status; bit 5 = THR empty (earlycon polls this before each byte) |
| 6 | MSR | R | Modem Status: CTS=0x10, DSR=0x20, RI=0x40, DCD=0x80 |
| 7 | SCR | R / W | Scratch Register |
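Because three offsets are overloaded by direction or by the DLAB bit, an emulator has to decode each access against all three inputs. The following is a minimal sketch of that decode, following the register map above; the Access enum and decode_register function are hypothetical names, not the vm-superio API.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Access {
    Read,
    Write,
}

/// Name of the register an access selects, per the register map.
/// `dlab` is the current value of LCR bit 7. Offsets past 7 are outside
/// the eight-byte window and yield None.
fn decode_register(offset: u8, access: Access, dlab: bool) -> Option<&'static str> {
    let name = match (offset, access, dlab) {
        (0, _, true) => "DLAB_LOW",
        (0, Access::Read, false) => "RBR",
        (0, Access::Write, false) => "THR",
        (1, _, true) => "DLAB_HIGH",
        (1, _, false) => "IER",
        (2, Access::Read, _) => "IIR",
        (2, Access::Write, _) => "FCR",
        (3, _, _) => "LCR",
        (4, _, _) => "MCR",
        (5, _, _) => "LSR", // read-only in the table; writes are not modelled here
        (6, _, _) => "MSR", // read-only in the table; writes are not modelled here
        (7, _, _) => "SCR",
        _ => return None,
    };
    Some(name)
}
```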

The earlycon path never enables interrupts. It polls LSR bit 5 ("THR empty") before writing each byte to offset 0. The interrupt-driven path — used once the full serial driver loads — sets IER_RDA_BIT = 0b0000_0001 to be notified of received data and IER_THR_EMPTY_BIT = 0b0000_0010 to be notified when the transmit holding register drains. Interrupts are delivered via a Trigger trait backed by a Linux eventfd; KVM translates the eventfd signal into an IRQ injection on GSI 4.
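The polled transmit path is simple enough to model in a few lines. This sketch assumes a hypothetical PortIo trait standing in for the kernel's inb/outb and a ToyUart that is always ready to transmit; neither is kernel or Firecracker code.

```rust
const THR_OFFSET: u16 = 0;
const LSR_OFFSET: u16 = 5;
const LSR_THR_EMPTY: u8 = 1 << 5;

/// Hypothetical port-I/O interface; a real kernel would use inb/outb.
trait PortIo {
    fn inb(&mut self, port: u16) -> u8;
    fn outb(&mut self, port: u16, value: u8);
}

/// Polled transmit: spin until LSR bit 5 is set, then write one byte to THR.
fn early_putc<P: PortIo>(io: &mut P, base: u16, byte: u8) {
    while (io.inb(base + LSR_OFFSET) & LSR_THR_EMPTY) == 0 {
        std::hint::spin_loop();
    }
    io.outb(base + THR_OFFSET, byte);
}

/// Toy UART that always reports "THR empty" and records what it is sent.
struct ToyUart {
    base: u16,
    sent: Vec<u8>,
}

impl PortIo for ToyUart {
    fn inb(&mut self, port: u16) -> u8 {
        if port == self.base + LSR_OFFSET { LSR_THR_EMPTY } else { 0 }
    }
    fn outb(&mut self, port: u16, value: u8) {
        if port == self.base + THR_OFFSET {
            self.sent.push(value);
        }
    }
}
```

No interrupt is involved at any point, which is why earlycon works before the interrupt fabric has been set up by the guest.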

One production wrinkle: Firecracker's design document notes that the serial console is disabled in production builds because it may expose guest data through timing side channels. The emulated register file remains, but guest-side output is not forwarded to the host. Development and debugging images re-enable it via the boot configuration. From Firecracker v1.8.0 onward, the UART is also described in the ACPI DSDT as _SB_.COM1 with EISA HID PNP0501, so an ACPI-aware guest can discover it without a kernel command-line hint.

i8042: A Controller Present Only to Reboot

The Intel 8042 was the keyboard controller in every PC/AT and, later, the PS/2 controller. It relayed scan codes from the keyboard and raised IRQ 1 (GSI 1) when keyboard data arrived; the PS/2 mouse, where present, used a separate line, IRQ 12. In a microVM, none of that matters. No human is typing into a Firecracker instance. The i8042 is emulated for one reason: the Linux kernel, when asked to reboot with the kernel parameter reboot=k, resets the CPU by writing command 0xFE to the i8042 command port at 0x64.

Firecracker's i8042 implementation is in src/vmm/src/devices/legacy/i8042.rs. It occupies five bytes starting at I8042_KDB_DATA_REGISTER_ADDRESS = 0x060: the data register at 0x60 and the status/command register at offset 4 (OFS_STATUS = 4), which maps to 0x64. IRQ GSI 1 (KBD_EVT_GSI = 1) is wired for the keyboard port; the PS/2 mouse (IRQ 12) is not emulated.

The command set is minimal:

| Constant | Value | Effect |
|---|---|---|
| CMD_READ_CTR | 0x20 | Read the control register |
| CMD_WRITE_CTR | 0x60 | Write the control register |
| CMD_READ_OUTP | 0xD0 | Read the output port |
| CMD_WRITE_OUTP | 0xD1 | Write the output port |
| CMD_RESET_CPU | 0xFE | Signal CPU reset; triggers the reset_evt eventfd |

When CMD_RESET_CPU arrives, Firecracker signals a reset_evt eventfd. The VMM's event loop receives that signal and shuts down the VM process gracefully. From the guest's perspective, the CPU stops; from the host's perspective, the firecracker binary exits cleanly. The control register keeps two bits set permanently: CB_POST_OK = 0x04 (Power-On Self Test passed) and CB_KBD_INT = 0x01 (keyboard interrupt enabled), which is sufficient to prevent the Linux keyboard driver from looping waiting for POST completion.
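The command handling can be summarised as a small decode step. This is a sketch under assumptions, not the code in i8042.rs: the constants are the ones quoted above, while the Action enum, the function names, and the choice to treat unknown commands as Ignored are illustrative.

```rust
const CMD_READ_CTR: u8 = 0x20;
const CMD_WRITE_CTR: u8 = 0x60;
const CMD_READ_OUTP: u8 = 0xD0;
const CMD_WRITE_OUTP: u8 = 0xD1;
const CMD_RESET_CPU: u8 = 0xFE;

const CB_KBD_INT: u8 = 0x01;
const CB_POST_OK: u8 = 0x04;

/// What a write to the command port (0x64) asks the stub to do.
#[derive(Debug, PartialEq)]
enum Action {
    ReadControl,
    WriteControl,
    ReadOutputPort,
    WriteOutputPort,
    ResetCpu, // the VMM would signal reset_evt here
    Ignored,  // assumption: unknown commands are dropped
}

fn decode_command(cmd: u8) -> Action {
    match cmd {
        CMD_READ_CTR => Action::ReadControl,
        CMD_WRITE_CTR => Action::WriteControl,
        CMD_READ_OUTP => Action::ReadOutputPort,
        CMD_WRITE_OUTP => Action::WriteOutputPort,
        CMD_RESET_CPU => Action::ResetCpu,
        _ => Action::Ignored,
    }
}

/// Control register value with the two permanently set bits.
fn initial_control_register() -> u8 {
    CB_POST_OK | CB_KBD_INT
}
```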

The only scan codes the emulator produces are for Ctrl+Alt+Del: KEY_CTRL = 0x0014, KEY_ALT = 0x0011, KEY_DEL = 0xE071. The emulator is not a general PS/2 keyboard; it cannot type. From v1.8.0 onward, it appears in the ACPI DSDT as _SB_.PS2_ with HID PNP0303, with I/O resources at 0x0060 (size 1) and 0x0064 (size 1, the latter described in the source as "Fake a command port so Linux stops complaining").

PICs, PIT, and LAPIC: KVM's In-Kernel Devices

Three more legacy devices are present in every Firecracker VM, but Firecracker does not emulate them itself — KVM does, in the kernel, before any userspace instruction executes. Firecracker's own design document describes them this way: "In addition to the Firecracker provided device models, guests also see the Programmable Interrupt Controllers (PICs), the I/O Advanced Programmable Interrupt Controller (IOAPIC), and the Programmable Interval Timer (PIT) that KVM supports."

Creating the interrupt fabric. A single ioctl, KVM_CREATE_IRQCHIP (_IO(0xAE, 0x60)), instantiates two cascaded i8259A PICs — master at ports 0x20/0x21, slave at ports 0xA0/0xA1 — and an IOAPIC. It also arranges for every subsequently-created vCPU to have a Local APIC. After this call, GSIs 0–15 are routed to both the PIC and the IOAPIC; GSIs 16–23 go to the IOAPIC only. Critically, KVM_CREATE_IRQCHIP must be called before KVM_CREATE_VCPU on x86_64; the ordering is enforced by KVM and documented in the KVM API under §4.24.

Firecracker checks for the required KVM capabilities — KVM_CAP_PIT2 and KVM_CAP_PIT_STATE2 — during startup in src/vmm/src/arch/x86_64/kvm.rs. Missing either capability aborts VM creation. Then setup_irqchip() in src/vmm/src/arch/x86_64/vm.rs calls create_irq_chip() followed immediately by create_pit2(kvm_pit_config { flags: KVM_PIT_SPEAKER_DUMMY, ..Default::default() }).

The PIT. KVM_CREATE_PIT2 (_IOW(0xAE, 0x77, struct kvm_pit_config)) creates the i8254 Programmable Interval Timer. Counter 0 at port 0x40 drives system timer IRQ 0; counter 1 at 0x41 is vestigial (originally DRAM refresh); counter 2 at 0x42 gates the PC speaker via port 0x61. The tick rate is PIT_TICK_RATE = 1193182 Hz (defined in include/linux/timex.h). The KVM_PIT_SPEAKER_DUMMY flag in the flags field tells KVM to emulate the speaker port at 0x61 in-kernel, avoiding a userspace VM exit every time Linux probes it.

Why the PIT matters for TSC calibration. The PIT is not just a timer; it is the ruler against which the kernel measures the TSC's frequency. arch/x86/kernel/tsc.c contains two calibration functions, pit_calibrate_tsc() and quick_pit_calibrate(). Both gate counter 2 via port 0x61, program the measurement latch via port 0x43, then poll the PIT MSB in a tight loop, measuring elapsed TSC ticks against a 10 ms window (CAL_MS = 10). CAL_PIT_LOOPS = 1000 and CAL2_PIT_LOOPS = 5000 are the minimum iteration counts the kernel requires before it trusts a measurement, not upper bounds. The formula is: kHz = ((t2 - t1) * PIT_TICK_RATE) / (latch * 1000) where CAL_LATCH = PIT_TICK_RATE / (1000 / CAL_MS).

Both functions guard on has_legacy_pic(). quick_pit_calibrate() returns 0 immediately when no legacy PIC is present. pit_calibrate_tsc() instead falls back to a udelay-based wait but still skips the PIT measurement loop. Either way, removing the PIT pushes the kernel onto CPUID-based or udelay-based frequency estimation — slower and less precise. Preserving the PIT preserves the fast, accurate TSC calibration path.
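The calibration formula quoted above is easy to check numerically. The sketch below is a direct transcription of that formula into Rust, not the kernel's code; the function name tsc_khz is illustrative, and the TSC readings in the example are made up.

```rust
const PIT_TICK_RATE: u64 = 1_193_182; // Hz
const CAL_MS: u64 = 10;
/// PIT ticks in one calibration window (integer division, as in C).
const CAL_LATCH: u64 = PIT_TICK_RATE / (1000 / CAL_MS);

/// TSC frequency in kHz from two TSC readings taken `latch` PIT ticks apart.
/// Assumes t2 was read after t1; wrapping_sub tolerates counter wrap.
fn tsc_khz(t1: u64, t2: u64, latch: u64) -> u64 {
    (t2.wrapping_sub(t1) * PIT_TICK_RATE) / (latch * 1000)
}
```

CAL_LATCH evaluates to 11931. If the TSC advances by 30,000,000 ticks across that window, tsc_khz(0, 30_000_000, CAL_LATCH) returns 3000206, roughly 3 GHz; the small excess over 3,000,000 comes from the truncated latch making the window slightly shorter than 10 ms.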

Snapshot and restore. Because the interrupt controllers are in-kernel, saving their state requires ioctls. Firecracker's save_state() calls get_irqchip() with KVM_GET_IRQCHIP three times — once each for KVM_IRQCHIP_PIC_MASTER, KVM_IRQCHIP_PIC_SLAVE, and KVM_IRQCHIP_IOAPIC — writing into a struct kvm_irqchip { chip_id; chip }. The restore path calls set_irqchip() with KVM_SET_IRQCHIP three times in the same order. This symmetric protocol is one reason Firecracker's snapshotting is fast: no in-process state machine needs serializing, only three kernel structures.

Fixed MMIO addresses. The IOAPIC sits at IOAPIC_ADDR = 0xFEC0_0000 and the LAPIC at APIC_ADDR = 0xFEE0_0000 (from src/vmm/src/arch/x86_64/layout.rs). KVM also needs a protected TSS at 0xFFFB_D000 (the KVM TSS region), which is placed outside any guest-accessible memslot.

GSI allocation. With the full interrupt fabric in place, Firecracker allocates GSIs as follows (from layout.rs):

| Range | Owner |
|---|---|
| GSI 0–4 | Reserved; COM1 = GSI 4, i8042 keyboard = GSI 1, timer IRQ 0 = GSI 0 |
| GSI 5–23 | virtio-mmio device slots |
| GSI 24–4095 | MSI (PCIe, since v1.13.0) |
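The allocation table and the routing rule from KVM_CREATE_IRQCHIP can be expressed as two small predicates. This is an illustrative sketch only; GsiOwner, gsi_owner, and reaches_pic are hypothetical names, not identifiers from layout.rs.

```rust
#[derive(Debug, PartialEq)]
enum GsiOwner {
    Legacy,     // GSI 0-4: timer, keyboard, COM1, reserved
    VirtioMmio, // GSI 5-23
    Msi,        // GSI 24-4095
}

/// Owner of a GSI according to the allocation table; None if out of range.
fn gsi_owner(gsi: u32) -> Option<GsiOwner> {
    match gsi {
        0..=4 => Some(GsiOwner::Legacy),
        5..=23 => Some(GsiOwner::VirtioMmio),
        24..=4095 => Some(GsiOwner::Msi),
        _ => None,
    }
}

/// GSIs 0-15 are routed to both the PIC and the IOAPIC by default;
/// GSIs 16-23 reach the IOAPIC only.
fn reaches_pic(gsi: u32) -> bool {
    gsi <= 15
}
```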

The MMIO Bus and Device Routing

Having enumerated the legacy devices, the next question is mechanical: when the guest executes IN 0x3F8, AL or writes to a virtio queue-notify register, how does the instruction reach its emulator?

Exits to Userspace

Under VMX, two classes of instruction produce exits that land in the VMM. An IN or OUT instruction generates exit reason 30 (EXIT_REASON_IO_INSTRUCTION). A guest memory access to an address with no backing EPT mapping generates an EPT violation (exit reason 48) or EPT misconfiguration (exit reason 49); if no in-kernel handler claims the fault, KVM promotes it to a KVM_EXIT_MMIO and returns to userspace.

Both classes surface through KVM_RUN (_IO(0xAE, 0x80)). After KVM_RUN returns, the VMM reads kvm_run.exit_reason. For I/O port accesses it is KVM_EXIT_IO = 2; for MMIO accesses it is KVM_EXIT_MMIO = 6. The sub-structs in the kvm_run page describe what happened:

/* KVM_EXIT_IO */
struct {
    __u8  direction;   /* KVM_EXIT_IO_IN=0, KVM_EXIT_IO_OUT=1 */
    __u8  size;        /* operand size: 1, 2, or 4 bytes */
    __u16 port;        /* I/O port number */
    __u32 count;       /* repetition count for INS/OUTS */
    __u64 data_offset; /* byte offset into kvm_run mapping to the data buffer */
} io;

/* KVM_EXIT_MMIO */
struct {
    __u64 phys_addr;  /* guest physical address */
    __u8  data[8];    /* up to 8 bytes of data */
    __u32 len;        /* access width in bytes */
    __u8  is_write;   /* 1 = write, 0 = read */
} mmio;

Source: include/uapi/linux/kvm.h in torvalds/linux.
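The data_offset field is the least obvious part of the io sub-struct: the bytes are not in the struct itself but elsewhere in the shared kvm_run mapping. The sketch below shows how a VMM might locate them on an OUT exit. It is an assumption-laden illustration: IoExit mirrors the fields shown above, the run_page slice stands in for the mmap'd kvm_run region, and out_data is a hypothetical helper.

```rust
const KVM_EXIT_IO_OUT: u8 = 1;

/// Mirror of the `io` sub-struct (field names from kvm.h).
struct IoExit {
    direction: u8,
    size: u8,
    port: u16,
    count: u32,
    data_offset: u64,
}

/// Borrow the bytes the guest wrote on an OUT exit. They start
/// `data_offset` bytes into the kvm_run mapping and span `size * count`
/// bytes. Returns None for IN exits or out-of-bounds offsets.
fn out_data<'a>(run_page: &'a [u8], io: &IoExit) -> Option<&'a [u8]> {
    if io.direction != KVM_EXIT_IO_OUT {
        return None;
    }
    let start = usize::try_from(io.data_offset).ok()?;
    let count = usize::try_from(io.count).ok()?;
    let len = usize::from(io.size).checked_mul(count)?;
    run_page.get(start..start.checked_add(len)?)
}
```

For an IN exit the VMM writes into the same region instead, and KVM copies the bytes into the guest register when KVM_RUN is re-entered.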

MMIO space is not configured explicitly. Any guest physical address (GPA) range not covered by a KVM_SET_USER_MEMORY_REGION memslot (_IOW(0xAE, 0x46, struct kvm_userspace_memory_region)) produces KVM_EXIT_MMIO on access. Firecracker calls KVM_SET_USER_MEMORY_REGION to map RAM (and, from v1.8.0, the page holding the RSDP); everything else is MMIO by omission.

Firecracker's Software Bus

The exit reason alone does not route the access. The VMM needs a data structure that maps port or address ranges to device emulators. Firecracker implements its own Bus in src/vmm/src/vstate/bus.rs. It is a RwLock<BTreeMap<BusRange, Weak<dyn BusDeviceSync>>>: a sorted tree mapping half-open address ranges to device references. Lookup uses the B-tree's predecessor operation — range(..= BusRange::new(addr, 1)).next_back() — and then checks that the address falls within the range's end; the whole lookup is O(log n) in the number of registered devices.
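The predecessor lookup is worth seeing in isolation. The following is a simplified sketch, not Firecracker's Bus: it keys the map by base address alone, stores a device name instead of a Weak reference, omits the lock, and does not reject overlapping ranges. ToyBus and its methods are hypothetical.

```rust
use std::collections::BTreeMap;

/// Simplified bus: base address -> (length, device name).
struct ToyBus {
    ranges: BTreeMap<u64, (u64, &'static str)>,
}

impl ToyBus {
    fn new() -> Self {
        ToyBus { ranges: BTreeMap::new() }
    }

    fn insert(&mut self, base: u64, len: u64, name: &'static str) {
        self.ranges.insert(base, (len, name));
    }

    /// Find the range with the greatest base <= addr (the predecessor),
    /// then confirm addr lies before that range's end.
    /// Returns the device name and the offset into its range.
    fn lookup(&self, addr: u64) -> Option<(&'static str, u64)> {
        let (&base, &(len, name)) = self.ranges.range(..=addr).next_back()?;
        let offset = addr - base;
        if offset < len { Some((name, offset)) } else { None }
    }
}
```

Registering the UART at 0x3f8 (length 8) and the i8042 at 0x60 (length 5), a lookup of 0x3fd resolves to the UART at offset 5, the LSR, while 0x65 falls just past the i8042 range and resolves to nothing.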

Each vCPU holds two buses: pio_bus: Option<Arc<Bus>> for I/O port accesses (x86_64 only) and mmio_bus: Option<Arc<Bus>> for MMIO accesses. The vCPU run loop dispatches exits like this:

VcpuExit::IoIn(addr, data)      -> pio_bus.read(u64::from(addr), data)
VcpuExit::IoOut(addr, data)     -> pio_bus.write(u64::from(addr), data)
VcpuExit::MmioRead(addr, data)  -> mmio_bus.read(addr, data)
VcpuExit::MmioWrite(addr, data) -> mmio_bus.write(addr, data)

(Source: src/vmm/src/vstate/vcpu.rs.)

Devices are stored as Weak<dyn BusDeviceSync> — the bus does not keep devices alive; the VMM's owner struct holds the Arc. The BusDevice trait exposes read(&mut self, base: u64, offset: u64, data: &mut [u8]) and write(&mut self, base: u64, offset: u64, data: &[u8]) -> Option<Arc<Barrier>>. An unresolved address is logged as a warn! but returns VcpuEmulation::Handled, so an out-of-range access does not crash the VM.

The relationship between the exit path, the bus, and the device emulators looks like this:

flowchart TD
  guest["Guest instruction<br/>(IN/OUT or memory access)"]
  subgraph kernel_trap["KVM in-kernel (no exit to userspace)"]
    pic["i8259 PIC<br/>ports 0x20/0x21, 0xA0/0xA1"]
    pit["i8254 PIT<br/>ports 0x40–0x43"]
    ioapic["IOAPIC<br/>0xFEC0_0000"]
    apic["LAPIC<br/>0xFEE0_0000"]
  end
  kvm["KVM_RUN returns<br/>exit_reason = KVM_EXIT_IO or KVM_EXIT_MMIO"]
  vcpu["vCPU run loop<br/>in vcpu.rs"]
  subgraph pio_bus["pio_bus (PIO), userspace emulation"]
    uart["16550A UART<br/>0x3F8–0x3FF"]
    i8042["i8042 stub<br/>0x60–0x64"]
  end
  subgraph mmio_bus["mmio_bus (MMIO), userspace emulation"]
    vtmmio["virtio-mmio slots<br/>0xC000_0000+"]
  end
  guest -- "PIC/PIT ports, IOAPIC/LAPIC MMIO" --> kernel_trap
  guest -- "other IN/OUT or MMIO" --> kvm
  kvm --> vcpu
  vcpu -- "IoIn / IoOut" --> pio_bus
  vcpu -- "MmioRead / MmioWrite" --> mmio_bus

The Fast Path: KVM_IOEVENTFD

Virtio queue kick notifications would generate a KVM_EXIT_MMIO on every VIRTIO_MMIO_QUEUE_NOTIFY write (register offset 0x050 from the device's base address). At high I/O rates, round-tripping through KVM_RUN for each kick is expensive. KVM_IOEVENTFD (_IOW(0xAE, 0x79, struct kvm_ioeventfd)) bypasses this. It registers an eventfd with KVM for a specific address range; when the guest writes to that range, KVM signals the eventfd in-kernel and immediately re-enters the guest. The device backend thread wakes from a read(eventfd_fd) and processes the queue without the VMM loop ever seeing the exit. This is how Firecracker wires virtio queue kicks: driver writes VIRTIO_MMIO_QUEUE_NOTIFY → KVM signals eventfd → backend thread drains the queue → no userspace round-trip.
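The argument to KVM_IOEVENTFD is a fixed 64-byte structure. The sketch below mirrors struct kvm_ioeventfd from include/uapi/linux/kvm.h and fills it in for one virtio queue. The helper name queue_notify_ioeventfd is hypothetical, and matching on the queue index via the datamatch flag is an assumption about how a VMM might give each queue its own eventfd; the ioctl call itself is omitted.

```rust
/// Layout of struct kvm_ioeventfd (include/uapi/linux/kvm.h).
#[repr(C)]
struct KvmIoEventFd {
    datamatch: u64,
    addr: u64, // guest physical (MMIO) or port (PIO) address
    len: u32,  // 1, 2, 4, or 8 bytes; 0 ignores the access length
    fd: i32,
    flags: u32,
    pad: [u8; 36],
}

const KVM_IOEVENTFD_FLAG_DATAMATCH: u32 = 1 << 0;
const VIRTIO_MMIO_QUEUE_NOTIFY: u64 = 0x050;

/// Describe "signal `eventfd` when the guest writes `queue_index` to the
/// QueueNotify register of the device at `device_base`".
fn queue_notify_ioeventfd(device_base: u64, queue_index: u32, eventfd: i32) -> KvmIoEventFd {
    KvmIoEventFd {
        datamatch: u64::from(queue_index),
        addr: device_base + VIRTIO_MMIO_QUEUE_NOTIFY,
        len: 4, // QueueNotify is a 32-bit register
        fd: eventfd,
        flags: KVM_IOEVENTFD_FLAG_DATAMATCH,
        pad: [0; 36],
    }
}
```

With datamatch set, a write of any other value to the same register still exits to userspace, so one register can fan out to several eventfds.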

The Physical Address Map

With RAM, MMIO, and the fixed IOAPIC and LAPIC addresses laid out, the guest physical address space for a typical Firecracker VM looks like this:

flowchart TB
  subgraph gpa["Guest Physical Address Space (x86_64)"]
    lo["0x0000_0000 – RAM<br/>(up to ~3 GiB)"]
    mmio32["0xC000_0000 – 32-bit MMIO window (1 GiB)<br/>virtio-mmio slots, 4 KiB each"]
    ioapic_block["0xFEC0_0000 – IOAPIC"]
    lapic_block["0xFEE0_0000 – LAPIC"]
    hi_ram["0x1_0000_0000 – High RAM<br/>(if guest > 3 GiB)"]
    mmio64["256 GiB – 64-bit MMIO window (256 GiB)"]
  end

Each virtio-mmio slot is 4 KiB (MMIO_LEN = 0x1000), starting at BOOT_DEVICE_MEM_START = 0xC000_0000 for the first (boot) device and MEM_32BIT_DEVICES_START = 0xC000_1000 for subsequent ones. There is no PCI configuration space, no PCIe ECAM window, and no Option ROM area — just RAM, virtio slots, and the two APIC regions.


What Firecracker Drops

No BIOS

Classical x86 boot requires the CPU to start in 16-bit real mode, read a boot sector from disk, hand control to a bootloader, and eventually transition to 64-bit long mode, all mediated by a BIOS ROM that QEMU or other VMMs typically provide as bios.bin or OVMF.fd. Firecracker skips the entire stack.

Instead, Firecracker constructs a struct boot_params (the "zero page" of the Linux x86 boot protocol, currently version 2.15 as of kernel 5.5) and places it at guest physical address 0x7000. It sets the %rsi register to point to that address and jumps directly to the 64-bit kernel entry point at the load address plus 0x200 — the standard offset for a bzImage's 64-bit entry stub. No real-mode code executes, no BIOS ROM is mapped, and the boot_params fields for legacy I/O devices are not used; the serial console, for instance, is configured entirely via kernel command line, not via setup_header.

Alternatively, Firecracker can use the Xen PVH Direct Boot ABI (added in Firecracker v1.12.0). In the PVH path, Firecracker writes an hvm_start_info structure at PVH_INFO_START = 0x6000, with XEN_HVM_START_MAGIC_VALUE = 0x336e_c578 in its magic field, and passes the structure's address in %rbx. The kernel must be compiled with CONFIG_PVH=y, available since Linux 5.0; the ELF binary then contains a PVH entry point in a PT_NOTE segment. Both paths eliminate firmware entirely; they differ only in the handshake structure the kernel expects to find before its first instruction.

No PCI Bus

Firecracker's virtio devices use the MMIO transport defined in virtio specification §4.2 (OASIS virtio 1.2, Committee Specification 01), not the PCI transport. There is no PCI host bridge, no PCI configuration space mechanism (neither CF8/CFC port-IO nor PCIe ECAM MMIO), and no PCI enumeration.

The MMIO transport has no self-describing enumeration: a device at a given address does not announce its type or existence on the bus. Discovery happens through side channels. Before Firecracker v1.8.0, this meant kernel command-line slugs of the form virtio_mmio.device=4K@0xC0001000:6 (size, base address, and GSI, with the size matching the 4 KiB slot), one per device, injected into the kernel command line by the VMM at boot. From v1.8.0 onward, an ACPI DSDT table enumerates virtio devices with their MMIO addresses and assigned GSIs, so the guest kernel does not need the command-line hint.

PCI was added as an opt-in in Firecracker v1.13.0 via --enable-pci. When enabled, VirtIO devices use a PCI VirtIO transport instead; MMIO remains the default. Skipping the PCI host bridge and configuration space mechanism removes a substantial slice of emulated attack surface and eliminates the enumeration overhead that PCI scanning adds to early boot.

No ACPI Power Management

Firecracker added basic ACPI table support in v1.8.0: an FADT, XSDT, MADT (for the LAPIC and IOAPIC), and DSDT describing virtio and legacy devices. The RSDP pointer sits at RSDP_ADDR = 0x000E_0000. But the FAQ states the boundary explicitly: "Firecracker does not virtualize power management (e.g. there is no ACPI PM support)."

ACPI S3 (suspend to RAM), S4 (hibernate), and S5 (soft-off) are not available. Reboot is handled by the i8042 CMD_RESET_CPU = 0xFE path described above. Shutdown initiated from inside the guest — for instance, poweroff — does not trigger a clean power-off sequence. The Firecracker process continues running until an external caller sends the SendCtrlAltDel API event, which injects a Ctrl+Alt+Del scan code sequence into the guest, ultimately causing the kernel to reboot via the i8042 path. Before v1.8.0, device enumeration used an MPTable; from v1.8.0 that path is deprecated, with removal planned for v2.0.

The attack surface that remains is deliberately auditable: the UART and i8042 emulators each fit in a few hundred lines of Rust, and the PICs, IOAPIC, and PIT are KVM's in-kernel models rather than VMM code. Chapter 13 covers the jailer process, which further constrains what the VMM can reach even if one of those emulators is compromised.


QEMU microvm: The Same Destination, Different Road

QEMU's microvm machine type (-machine microvm) was introduced in QEMU 4.2 in late 2019. The QEMU documentation describes it as "a machine type inspired by Firecracker and constructed after its machine model," and the structural similarity is clear: a single ISA bus, no PCI by default, no ACPI in the original release, legacy devices kept only where necessary. The differences are mostly of degree, and understanding them sharpens what is genuinely necessary in any minimal machine model versus what is a Firecracker-specific choice.

The Bus Fabric

QEMU microvm's only bus is a single ISA bus. Onto that bus, a set of legacy devices can be optionally attached: the i8259 PIC pair, the i8254 PIT, an MC146818 RTC, and one ISA serial port. The LAPIC and IOAPIC are always present when KVM is in use; kernel-irqchip=split is the default KVM irqchip mode for microvm, meaning the LAPIC lives in KVM's kernel module while the IOAPIC is handled by QEMU userspace.

virtio-mmio Slots

QEMU microvm provides 8 virtio-mmio transport slots by default (mms->virtio_num_transports = 8 in hw/i386/microvm.c). Each slot is 512 bytes wide (smaller than Firecracker's 4 KiB per slot) at a base address of VIRTIO_MMIO_BASE = 0xfeb00000. Slot i sits at 0xfeb00000 + i * 512. The default IRQ base is mms->virtio_irq_base = 5, so slots 0–7 use GSIs 5–12.

With a secondary IOAPIC (ioapic2), the virtio IRQ base moves to IO_APIC_SECONDARY_IRQBASE (24), the slot count grows to IOAPIC_NUM_PINS (24), and PCIe (when enabled) takes IRQs 12–15. Other fixed MMIO addresses from include/hw/i386/microvm.h: the ACPI Generic Event Device (GED) at 0xfea00000 on IRQ 9, optional xHCI USB at 0xfe900000 on IRQ 10, and the PCIe ECAM window at 0xe0000000 (size 256 MiB) with a MMIO window at 0xc0000000 (size 512 MiB) when PCIe is on.

Firmware

This is where QEMU microvm and Firecracker diverge most sharply. QEMU microvm supports direct kernel loading via -kernel — the QEMU documentation describes it as a machine type that "needs to be run using a host-side kernel and, optionally, an initrd image." But the firmware stub still executes. In hw/i386/microvm.c, x86_bios_rom_init() is called unconditionally as long as IGVM mode is not active: with ACPI disabled it maps qboot.rom, with ACPI enabled it maps bios-microvm.bin. The -kernel flag tells QEMU where to load the kernel image, but it does not suppress the ROM. The guest CPU starts in the firmware stub, which then hands off to the kernel.

Firecracker takes the opposite approach: the VMM constructs boot_params directly, sets %rsi to point to it, and jumps to the 64-bit kernel entry point. No ROM is mapped, no firmware code executes, and the guest CPU's first instruction is the kernel's own. qboot is purpose-built for speed — it typically adds only a few tens of milliseconds — but it still represents firmware-controlled code running in the guest before the kernel. Firecracker eliminates that phase entirely.

ACPI

QEMU 4.2 shipped microvm without ACPI. QEMU 5.2 added it. The tables are compact: APIC at 78 bytes, DSDT at 482 bytes, FACP at 268 bytes — under 1 KiB total, growing to roughly 3,130 bytes for the DSDT when PCIe is enabled (per Gerd Hoffmann's 2020 blog post on kraxel.org). The DSDT declares each active virtio-mmio slot with its MMIO address and GSI, so no command-line slugs are needed.

When ACPI is disabled, QEMU handles device discovery by patching the guest kernel command line automatically: microvm_get_mmio_cmdline() in hw/i386/microvm.c appends virtio_mmio.device=512@0x<addr>:<irq> for each active slot. This behavior is controlled by the machine option auto-kernel-cmdline (on by default).
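The slot arithmetic and the resulting command-line fragment can be reproduced in a few lines. This is a sketch of the computation described above, not a port of microvm_get_mmio_cmdline(); the constants are the defaults quoted in this section, and slot_cmdline is a hypothetical helper that assumes the single-IOAPIC configuration.

```rust
const VIRTIO_MMIO_BASE: u64 = 0xfeb0_0000;
const SLOT_SIZE: u64 = 512;
const VIRTIO_IRQ_BASE: u32 = 5;

/// Command-line fragment for virtio-mmio slot `slot` (0-7 by default):
/// size @ base address : IRQ.
fn slot_cmdline(slot: u32) -> String {
    let addr = VIRTIO_MMIO_BASE + u64::from(slot) * SLOT_SIZE;
    let irq = VIRTIO_IRQ_BASE + slot;
    format!("virtio_mmio.device={}@0x{:x}:{}", SLOT_SIZE, addr, irq)
}
```

Slot 0 produces virtio_mmio.device=512@0xfeb00000:5 and slot 7 produces virtio_mmio.device=512@0xfeb00e00:12, matching the GSI 5–12 range given earlier.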

Shutdown

Without ACPI PM and without a PS/2 keyboard (both of which are optional in microvm), there is no standard shutdown path. QEMU microvm's recommended approach is a CPU triple-fault, which QEMU treats as a reboot or shutdown trigger. The kernel parameter reboot=t prioritizes the triple-fault path. This is the mirror image of Firecracker's reboot=k strategy: both avoid ACPI PM, but Firecracker routes through the i8042 while QEMU microvm, when the i8042 is absent, routes through a deliberate fault.

Side by Side

| Property | QEMU microvm | Firecracker |
|---|---|---|
| PCI bus | None by default; optional PCIe | None by default; optional PCIe (v1.13.0+) |
| ACPI | Added in QEMU 5.2; includes PM framework | Added in v1.8.0; no PM |
| Firmware | qboot.rom (no ACPI) or bios-microvm.bin (with ACPI) | None; kernel loaded directly |
| virtio transport | virtio-mmio; 8 slots at 0xfeb00000, 512 B each | virtio-mmio; 4 KiB slots from 0xC000_0000 |
| ISA serial | Optional; always firmware-visible | Present; disabled in production builds |
| Shutdown | CPU triple-fault (reboot=t) | i8042 CMD_RESET_CPU=0xFE (reboot=k) |
| Device enumeration | Command-line injection or ACPI DSDT | Command-line injection, MPTable (deprecated), or ACPI DSDT (v1.8.0+) |
| Introduced | QEMU 4.2, late 2019 | Open-sourced November 2018 (v0.11.0) |

The structural gap is the firmware stub — a few tens of milliseconds of vendor-controlled code that QEMU microvm runs before every kernel, and that Firecracker never maps at all. The next chapter examines what the jailer does with the attack surface that remains after the machine model has been stripped this far down.


Sources And Further Reading