Chapter 1: The Virtualization Stack Map

Running lsmod | grep kvm on a modern Linux host shows two modules: kvm, plus either kvm_intel or kvm_amd. Those two lines are a complete hypervisor. Knowing what each one does, and what each one deliberately does not do, is the prerequisite for everything that follows.

People describe KVM as a type-2 hypervisor because a host OS is present; others call it type-1 because it runs in ring 0. Firecracker is described as "lightweight QEMU," which is technically wrong in almost every dimension that matters. QEMU is treated as inevitable, when it is one of four VMMs that share the same bottom two layers. The stack has a precise shape, and the vocabulary gets much less slippery once you can see the shape.

The Hardware Foundation

The first problem with running a guest operating system on a host is rings. An operating system assumes it owns ring 0. It writes control registers, flushes TLBs, modifies the GDT. If two operating systems try to do those things on the same hardware simultaneously, they collide. Classic x86 made this worse: many instructions behave differently in ring 0 than in ring 3 without trapping (they silently fail or return host state to the guest), which means a naive hypervisor cannot intercept them. The 1974 Popek-Goldberg paper in Communications of the ACM proved that a trap-and-emulate VMM can be constructed when all sensitive instructions (those that read or change privileged machine state, or whose behavior depends on privilege level) are a subset of privileged instructions (those that trap in user mode). Classic x86 violated that condition for eighteen instructions.[^robin-irvine]

[^robin-irvine]: John D. Robin and Cynthia E. Irvine, "Analysis of the Intel Pentium's Ability to Support a Secure Virtual Machine Monitor," USENIX Security Symposium, 2000. The paper identifies eighteen sensitive, unprivileged IA-32 instructions.
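The subset condition can be written as a few lines of set logic. The sketch below is a toy model, not an x86 decoder: the struct, the function name, and the five-entry sample table are hypothetical, chosen only to show the shape of the check. POPF and SGDT are two of the instructions Robin and Irvine list as sensitive but unprivileged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction descriptor (hypothetical, for illustration only). */
struct insn {
    const char *name;
    bool sensitive;   /* touches or depends on privileged machine state */
    bool privileged;  /* traps when executed outside the most privileged mode */
};

/* True when every sensitive instruction is also privileged, i.e. the
 * sensitive set is a subset of the privileged set. */
bool trap_and_emulate_ok(const struct insn *set, size_t n)
{
    assert(set != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (set[i].sensitive && !set[i].privileged)
            return false;   /* sensitive but silent: cannot be intercepted */
    }
    return true;
}

/* Illustrative sample only; the first two entries trap in ring 3,
 * POPF and SGDT do not. */
static const struct insn classic_x86_sample[] = {
    { "HLT",  true,  true  },
    { "LGDT", true,  true  },
    { "POPF", true,  false },
    { "SGDT", true,  false },
    { "ADD",  false, false },
};
```

Passing the whole sample table to trap_and_emulate_ok returns false because of POPF and SGDT; passing only the first two entries returns true.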

Intel VT-x and AMD-V broke through that impasse not by fixing the ring semantics but by adding an orthogonal dimension above them. Intel's answer is two execution contexts: VMX root mode, where the host OS and KVM run, and VMX non-root mode, where the guest OS runs. Rings 0 through 3 continue to exist inside each mode. A guest kernel executes at VMX non-root ring 0; its applications at VMX non-root ring 3. The silicon, not software, enforces the boundary. When guest code executes something that requires host attention, the hardware performs a VM exit and transfers control to KVM's exit handler. The guest resumes when KVM issues VMRESUME. The Intel SDM (Vol. 3, Chapter 23) is the primary specification; AMD's equivalent mode is SVM (Secure Virtual Machine), enabled by the EFER.SVME bit.

Per-vCPU state on Intel lives in the VMCS (Virtual Machine Control Structure), a 4 KiB in-memory structure accessible only through the VMREAD and VMWRITE instructions, which are executable only in VMX root mode. The VMCS has six logical regions: a guest-state area saved on every VM exit and restored on every VM entry; a host-state area restored on exit; VM-execution control fields that specify which guest events trigger exits; VM-exit control fields; VM-entry control fields; and VM-exit information fields that record what happened. The field VM_EXIT_REASON lives at VMCS encoding 0x00004402. The EPT page-table pointer is at EPT_POINTER = 0x0000201a. These are not abstractions — they are the numeric encodings in arch/x86/include/asm/vmx.h, read by KVM with VMREAD after every exit.
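The two encodings quoted above are not arbitrary numbers. The Intel SDM defines a bit layout for every VMCS field encoding: bit 0 selects the high half of a 64-bit field, bits 9:1 are an index, bits 11:10 give the field type, and bits 14:13 give the width. The decoder below is a small sketch of that layout; the struct and function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of a VMCS field encoding (names are illustrative). */
struct vmcs_encoding {
    unsigned high_access; /* bit 0: 1 = high 32 bits of a 64-bit field */
    unsigned index;       /* bits 9:1 */
    unsigned type;        /* bits 11:10: 0 control, 1 exit info, 2 guest state, 3 host state */
    unsigned width;       /* bits 14:13: 0 16-bit, 1 64-bit, 2 32-bit, 3 natural width */
};

struct vmcs_encoding vmcs_decode(uint32_t enc)
{
    struct vmcs_encoding d;

    /* Bit 12 and bits 31:15 are reserved and must be zero. */
    assert((enc & 0xFFFF9000u) == 0);

    d.high_access = enc & 1u;
    d.index       = (enc >> 1) & 0x1FFu;
    d.type        = (enc >> 10) & 0x3u;
    d.width       = (enc >> 13) & 0x3u;
    return d;
}
```

Decoding 0x00004402 (VM_EXIT_REASON) yields a 32-bit, read-only exit-information field with index 1; decoding 0x0000201a (EPT_POINTER) yields a 64-bit control field with index 13.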

AMD's counterpart is the VMCB (Virtual Machine Control Block), a packed struct defined in arch/x86/include/asm/svm.h:

struct __attribute__((__packed__)) vmcb {
    struct vmcb_control_area control;
    struct vmcb_save_area    save;
};

The entry instruction is VMRUN, which takes the guest VMCB's physical address in RAX. On #VMEXIT, only RIP, RSP, and RAX are automatically restored to host values; all other general-purpose registers still hold guest values, and the hypervisor must save them itself with ordinary stores. VMSAVE and VMLOAD move a further block of hidden state (FS, GS, TR, LDTR, and the SYSCALL/SYSENTER MSRs) to and from a VMCB; they do not touch the general-purpose registers. The control area carries exit_code, exit_info_1, exit_info_2, and next_rip; the save area holds the guest's segment, control-register, and RIP/RSP/RAX state, loaded by VMRUN and written back on #VMEXIT.

Memory presents a parallel problem. A guest manages page tables that map its virtual addresses to what it believes are physical addresses. Those "physical" addresses are not host physical addresses — the guest does not own the machine's memory. Intel EPT (Extended Page Tables) and AMD NPT (Nested Page Tables) solve this by adding a second hardware translation: guest-physical address → host-physical address, walked by the MMU automatically on every guest memory access. KVM calls this TDP, two-dimensional paging. Without EPT or NPT, KVM falls back to shadow page tables: a struct kvm_mmu_page containing 512 shadow PTEs with role.level (1 = 4 KiB, 2 = 2 MiB, 3 = 1 GiB) and a role.direct flag. Shadow tables require KVM to write-protect all guest page tables and synchronize shadow PTEs on every guest modification — expensive enough that the fallback is for ancient or constrained environments. The KVM MMU documentation at docs.kernel.org/virt/kvm/x86/mmu.html documents both paths; on any server CPU built after roughly 2008, EPT or NPT is the operative path.

The KVM Kernel Module

The hardware extensions provide the mechanism. KVM makes it programmable.

KVM ships as three kernel modules. kvm.ko is the common base: it creates /dev/kvm, owns the ioctl dispatch table, manages memory slots, runs the in-kernel irqchip (LAPIC, IOAPIC, PIC, and PIT, reached through kvm_io_bus), and sets up EPT or NPT. kvm-intel.ko implements the VMX code path and requires the vmx CPU feature bit. kvm-amd.ko implements the SVM path and requires svm. Architecture-specific operations flow through the kvm_x86_ops function-pointer struct (struct kvm_x86_ops); the common module calls whichever vendor implementation is loaded. Minimum requirements: Linux 4.14 or later with CONFIG_KVM=m and CONFIG_KVM_INTEL=m or CONFIG_KVM_AMD=m.

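/dev/kvm is a character device, typically created by udev with mode 0660 and ownership root:kvm, though distributions differ. It is registered as a misc device, so its numbers are fixed rather than dynamic: major 10 (the misc major) and minor 232 (KVM_MINOR), both visible with ls -l /dev/kvm. Any process in the kvm group can open it; everything else is mediated through the ioctl API above it.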

The Three-Level File Descriptor API

The KVM API is organized as three nested file descriptors, each narrowing scope from the kernel-wide to the per-CPU:

flowchart TD
    A["open('/dev/kvm') → system fd"] --> B["ioctl(kvmfd, KVM_CREATE_VM, 0) → VM fd"]
    B --> C["ioctl(vmfd, KVM_CREATE_VCPU, n) → vCPU fd"]

The system fd talks to KVM as a whole. KVM_GET_API_VERSION (_IO(0xAE, 0x00)) returns the constant 12, a value frozen at Linux 2.6.22; any value other than 12 indicates a broken kernel and must be rejected. KVM_CREATE_VM (_IO(0xAE, 0x01)) returns a VM fd; the parameter is the machine type, 0 for the default x86 type.
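The _IO(0xAE, nr) notation expands to a plain integer. As a small sketch, assuming a Linux host on x86 (where a direction-less, size-less ioctl encodes as type shifted left by 8, OR the command number), the helper below computes the request values; kvm_io and KVMIO_TYPE are names invented for this example.

```c
#include <assert.h>
#include <sys/ioctl.h>   /* provides _IO() on Linux */

#define KVMIO_TYPE 0xAE  /* the KVM ioctl "type" byte quoted in the text */

/* Compute the request number for a KVM ioctl that carries no argument struct. */
unsigned long kvm_io(unsigned nr)
{
    assert(nr <= 0xFF);  /* the command number occupies one byte */
    return _IO(KVMIO_TYPE, nr);
}
```

Under that assumption kvm_io(0x00) is 0xAE00 (KVM_GET_API_VERSION) and kvm_io(0x01) is 0xAE01 (KVM_CREATE_VM), which are the values that appear in strace output for a VMM.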

The VM fd talks to one virtual machine. KVM_SET_USER_MEMORY_REGION (_IOW(0xAE, 0x46, ...)) maps a range of host virtual memory into the guest's physical address space:

struct kvm_userspace_memory_region {
    __u32 slot;            /* region index, bits 0–15 */
    __u32 flags;           /* KVM_MEM_LOG_DIRTY_PAGES | KVM_MEM_READONLY */
    __u64 guest_phys_addr;
    __u64 memory_size;     /* bytes; set to 0 to delete the slot */
    __u64 userspace_addr;  /* host virtual address of backing memory */
};

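The userspace_addr field is the key insight about guest "physical" memory: it is backed by a host virtual address, typically obtained with mmap. When the guest touches a page, KVM resolves that host virtual address to a host-physical page through the host's own page tables and installs the resulting guest-physical → host-physical mapping in EPT or NPT; the hardware walk never sees the host virtual address. Slots must not overlap in guest physical space. KVM_CREATE_VCPU (_IO(0xAE, 0x41)) takes a vCPU ID (which becomes the APIC ID on x86) and returns a vCPU fd.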
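A VMM usually validates its memory layout before handing slots to KVM. The following sketch checks the no-overlap rule for guest-physical ranges. The gpa_range struct mirrors just two fields of struct kvm_userspace_memory_region; the struct and both function names are hypothetical, and the code assumes that no range wraps past the end of the 64-bit address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The two fields of a memory slot that matter for overlap checking. */
struct gpa_range {
    uint64_t guest_phys_addr;
    uint64_t memory_size;   /* bytes; 0 means "deleted" and overlaps nothing */
};

static bool ranges_overlap(const struct gpa_range *a, const struct gpa_range *b)
{
    if (a->memory_size == 0 || b->memory_size == 0)
        return false;
    /* Half-open intervals [start, start + size) intersect. */
    return a->guest_phys_addr < b->guest_phys_addr + b->memory_size &&
           b->guest_phys_addr < a->guest_phys_addr + a->memory_size;
}

/* True when no two slots share any guest-physical byte. */
bool layout_is_valid(const struct gpa_range *slots, size_t n)
{
    assert(slots != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (ranges_overlap(&slots[i], &slots[j]))
                return false;
    return true;
}
```

The quadratic scan is a deliberate simplification; with the handful of slots a microVM uses, sorting the ranges first would add code without a measurable benefit.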

The vCPU fd talks to one virtual CPU. KVM_SET_REGS (_IOW(0xAE, 0x82, ...)) initializes general-purpose registers; KVM_SET_SREGS (_IOW(0xAE, 0x84, ...)) sets segment registers, control registers, and EFER. KVM_RUN (_IO(0xAE, 0x80)) enters the guest and blocks the calling thread until a VM exit requires userspace attention.
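Put together, the three levels amount to one open and two ioctls. The function below is a minimal sketch that walks the chain and then tears it down again without configuring memory or registers. It assumes the <linux/kvm.h> UAPI header is installed; the function name and its negative step-number return convention are inventions of this example, not part of the KVM API.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Walk system fd -> VM fd -> vCPU fd on the given device node.
 * Returns 0 on success, or a negative step number for the first failure. */
int kvm_fd_chain_probe(const char *path)
{
    int rc = 0, vmfd = -1, vcpufd = -1;

    assert(path != NULL);
    int kvmfd = open(path, O_RDWR);
    if (kvmfd < 0)
        return -1;                          /* no KVM, or no permission */

    if (ioctl(kvmfd, KVM_GET_API_VERSION, 0) != 12) {
        rc = -2;                            /* anything but 12 is rejected */
        goto out;
    }

    vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);  /* 0 = default machine type */
    if (vmfd < 0) {
        rc = -3;
        goto out;
    }

    vcpufd = ioctl(vmfd, KVM_CREATE_VCPU, 0);  /* vCPU ID 0 */
    if (vcpufd < 0)
        rc = -4;

out:
    if (vcpufd >= 0)
        close(vcpufd);
    if (vmfd >= 0)
        close(vmfd);
    close(kvmfd);
    return rc;
}
```

Calling kvm_fd_chain_probe("/dev/kvm") returns 0 on a host where KVM is usable by the caller and -1 where the device node is missing or inaccessible.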

kvm_run and Exit Reasons

Before the first KVM_RUN, the VMM maps the struct kvm_run shared region:

mmap(NULL, mmap_size, PROT_READ|PROT_WRITE, MAP_SHARED, vcpufd, 0)

where mmap_size comes from KVM_GET_VCPU_MMAP_SIZE on the system fd. When KVM_RUN returns, kvm_run.exit_reason tells the VMM why. The exit codes are defined in include/uapi/linux/kvm.h and are part of the stable ABI:

Code	Name	Meaning
0	`KVM_EXIT_UNKNOWN`	Unhandled exit
2	`KVM_EXIT_IO`	Guest I/O port access; `io.direction`, `io.port`, `io.size`, `io.count`, `io.data_offset` populated
5	`KVM_EXIT_HLT`	Guest executed `HLT` with no pending work
6	`KVM_EXIT_MMIO`	Guest MMIO access; `mmio.phys_addr`, `mmio.data`, `mmio.len`, `mmio.is_write` populated
8	`KVM_EXIT_SHUTDOWN`	Guest triple-faulted or requested shutdown
17	`KVM_EXIT_INTERNAL_ERROR`	KVM internal error
24	`KVM_EXIT_SYSTEM_EVENT`	Reboot or poweroff

Not every VM exit reaches userspace. KVM's exit handler in arch/x86/kvm/vmx/vmx.c dispatches on EXIT_REASON through a table of per-reason handlers. EXIT_REASON_CPUID = 10 is handled entirely in kernel; the handler synthesizes the correct leaf values and returns 1, causing KVM to re-enter the guest immediately without returning to the VMM process. EXIT_REASON_IO_INSTRUCTION = 30 goes in-kernel if the kvm_io_bus has a handler registered for that port; otherwise KVM_RUN returns with KVM_EXIT_IO. EXIT_REASON_EPT_VIOLATION = 48 is in-kernel for page-table faults KVM can resolve itself; if the access targets a region the VMM must emulate, KVM_RUN returns with KVM_EXIT_MMIO. The handler returning 1 is the fast path; 0 or a negative value sends the exit to userspace.
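The return-value convention is easier to see in a toy model than in vmx.c itself. The sketch below imitates the shape of the dispatch (a table of function pointers indexed by exit reason) using the numeric reasons quoted above. Everything else, including the toy_vcpu struct, the handler bodies, and the policy for unknown reasons, is invented for illustration and is not KVM's code.

```c
#include <assert.h>
#include <stddef.h>

#define EXIT_REASON_CPUID           10
#define EXIT_REASON_IO_INSTRUCTION  30
#define NR_TOY_EXIT_REASONS         64   /* toy table size */

/* Minimal stand-in for per-vCPU state (hypothetical). */
struct toy_vcpu {
    int port_has_kernel_device;  /* models "kvm_io_bus has a handler for this port" */
    unsigned long cpuid_exits;
};

typedef int (*exit_handler_t)(struct toy_vcpu *);

static int toy_handle_cpuid(struct toy_vcpu *v)
{
    v->cpuid_exits++;   /* pretend the leaf values were synthesized */
    return 1;           /* handled in kernel: re-enter the guest */
}

static int toy_handle_io(struct toy_vcpu *v)
{
    return v->port_has_kernel_device ? 1 : 0;  /* 0: userspace must emulate */
}

static const exit_handler_t toy_handlers[NR_TOY_EXIT_REASONS] = {
    [EXIT_REASON_CPUID]          = toy_handle_cpuid,
    [EXIT_REASON_IO_INSTRUCTION] = toy_handle_io,
};

/* Returns 1 to stay in the kernel and re-enter the guest,
 * 0 to return from KVM_RUN to the VMM. */
int toy_handle_exit(struct toy_vcpu *v, unsigned reason)
{
    assert(v != NULL);
    if (reason >= NR_TOY_EXIT_REASONS || toy_handlers[reason] == NULL)
        return 0;       /* toy policy: unknown reasons go to userspace */
    return toy_handlers[reason](v);
}
```

In this model a CPUID exit never costs a trip to userspace, while an I/O exit does only when no in-kernel device claims the port, which is the distinction the paragraph above draws.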

What KVM Owns, What the VMM Owns

The split is enforced by KVM, not advisory:

KVM owns: CPU context save and restore through the VMCS or VMCB; hardware interrupt injection; EPT and NPT management; the VM-exit dispatch loop; and the in-kernel irqchip (LAPIC, IOAPIC, PIC, PIT).

The VMM owns: device models; guest memory layout and the mmap calls that back it; firmware or its deliberate absence; and policy for vCPU thread scheduling. One threading rule shapes every VMM: KVM_RUN and the other vCPU ioctls should be issued from a single thread per vCPU. The KVM API documentation warns that the first ioctl after a vCPU changes threads can carry a performance penalty, so in practice each vCPU gets a dedicated thread. VM-level ioctls must come from the same process (address space) that created the VM, and that rule KVM does enforce: an ioctl from a different address space fails.

Type 1, Type 2, and Why the Labels Break Down

The type-1/type-2 taxonomy that opens every virtualization talk traces to Robert P. Goldberg's 1972 Harvard PhD thesis, not to Popek and Goldberg's 1974 Communications of the ACM paper. The 1974 paper defines three VMM properties — equivalence/fidelity, resource control/safety, and efficiency/performance — and proves the formal conditions for VMM construction. It never uses the terms "Type 1" or "Type 2." Those labels came from the thesis: a Type I VMM runs on bare hardware; a Type II VMM runs on top of a host OS, an "extended machine."

KVM satisfies both definitions simultaneously, which is the problem. The KVM module runs at VMX root ring 0, the same privilege level as the host Linux kernel — both are in VMX root mode at ring 0. That is Goldberg's Type I condition: supervisor privilege on hardware. But a host OS runs alongside and beneath the VMM; the same machine runs host applications concurrently with guest VMs. That is the structural situation the Type II label was invented to describe.

KVM maintainer Paolo Bonzini addressed the question directly in a thread on LKML: "I would just ignore it. To some extent, the modern usage of the type-1 and type-2 terms is really more about VMware and Xen trying to bash KVM, than anything else." Kernel developer Christoph Hellwig, on the same thread, called the arbitrary classification something that "doesn't make any sense at all."

The structural reason both are right: VMX root mode is orthogonal to the ring hierarchy. There is no longer a meaningful "above the OS" or "below the OS" position. The host kernel and the KVM module coexist at VMX root ring 0; guests run at VMX non-root ring 0. The 1972 taxonomy predates this hardware entirely. Xen with Dom0 is structurally identical to KVM with QEMU — hypervisor core at bare-metal privilege, a privileged user-mode component for device emulation — and neither label fits either system cleanly.

The only accurate description is hybrid: Type I in hardware privilege, Type II in host coexistence. This book treats it that way and does not return to the question.

The VMM Landscape

With the hardware and KVM layers established, the VMMs are easier to place. All of them are userspace processes that open /dev/kvm, call the three-level ioctl API, back guest memory with mmap, and loop on KVM_RUN. What they differ in is what they model after that — which devices, which transports, which security boundaries, which guest configurations they serve.

flowchart TB
    hw["Host hardware\n(VT-x / AMD-V + EPT / NPT)"]
    kvm["kvm.ko + kvm-intel.ko or kvm-amd.ko\n/dev/kvm"]
    qemu["qemu-system-x86_64\n~1.4M lines C"]
    fc["firecracker\n~50k lines Rust"]
    ch["cloud-hypervisor\nRust / KVM + MSHV"]
    cr["crosvm\nRust / per-device sandbox"]
    guest["Guest kernel + workload\nVMX non-root ring 0 / ring 3"]
    hw --> kvm
    kvm --> qemu
    kvm --> fc
    kvm --> ch
    kvm --> cr
    qemu --> guest
    fc --> guest
    ch --> guest
    cr --> guest

QEMU

Fabrice Bellard released the first QEMU preview in 2003. The codebase now stands at approximately 1.4 million lines of C, supports more than 30 guest ISAs, and runs under two acceleration modes. TCG (Tiny Code Generator) is a software JIT that cross-compiles guest instructions to host instructions at runtime; it is the default and requires no hardware support. Hardware acceleration — KVM on Linux, Hypervisor.Framework on macOS, WHPX on Windows, Xen — is selected explicitly. When you run qemu-system-x86_64 -accel kvm, the guest vCPUs execute directly on hardware through /dev/kvm; TCG does none of the work.

QEMU's breadth is its defining characteristic. It emulates PCI buses, USB controllers, AHCI storage, GPU variants, ISA buses, legacy interrupt controllers, BIOS and UEFI firmware, and hundreds of individual device models. That breadth is also the attack surface. QEMU does ship a minimal machine type called microvm (selected with -machine microvm), directly inspired by Firecracker: it drops PCI and ACPI, exposes up to eight virtio-MMIO devices, one optional ISA serial port, a LAPIC, an IOAPIC, kvmclock (when using KVM), and fw_cfg. It cannot hotplug devices or live-migrate across QEMU versions.

Intel's earlier attempt at the same goal, the NEMU project (github.com/intel/nemu), was archived on April 14, 2021. Its archived README reads: "Cloud Hypervisor is the successor."

Firecracker

AWS open-sourced Firecracker on November 27, 2018, at version 0.11.0. It is written in approximately 50,000 lines of Rust — a 96% reduction from QEMU's codebase — and is licensed Apache 2.0. It supports x86_64 and aarch64, and runs exclusively on KVM with no TCG fallback.

Firecracker is a direct descendant of crosvm. The crosvm documentation states this explicitly: "Soon after crosvm's code was public, Amazon used it as the basis for their own VMM named Firecracker." Cloud Hypervisor's README likewise acknowledges that "a large part of Cloud Hypervisor code is based on either the Firecracker or crosvm implementations."

The process structure is fixed: one process per microVM. Inside that process, three thread types exist: an API thread running an in-process HTTP/REST server over a Unix domain socket, a VMM thread handling device emulation and I/O rate limiting, and one KVM_RUN-looping vCPU thread per guest vCPU. The device set is intentionally small. Firecracker emulates virtio-net, virtio-block, virtio-vsock, virtio-balloon (added in v1.4.0), a serial console (16550A UART), and an i8042 keyboard controller stub. All virtio devices use the virtio-over-MMIO transport defined in OASIS virtio spec §4.2, not virtio-over-PCI. Interrupt controllers — two 8259 PICs and an IOAPIC — are required for x86 interrupt delivery and are emulated via KVM. On aarch64, a PL031 RTC is also present. No USB, no display, no audio, no GPU.

That fixed structure enables Firecracker's SPECIFICATION.md to make enforceable quantitative claims — bounds that run as integration tests on every pull request and main-branch merge. VMM start to API socket availability: at most 8 CPU milliseconds. Guest /sbin/init reachable from InstanceStart: at most 125 ms. VMM memory overhead for a 1-vCPU, 128 MiB guest: at most 5 MiB. Guest CPU compute performance: above 95% of bare-metal equivalent. These are not marketing targets; they are pass/fail CI gates.

Cloud Hypervisor

Intel announced Cloud Hypervisor in May 2019, grown from NEMU. It targets cloud VM workloads — not ultra-minimal serverless — and its feature set reflects that: CPU and memory hotplug, VFIO device passthrough for SR-IOV NICs and NVMe drives, virtio-fs, virtio-pmem, vhost-user for device backend offload, and Intel TDX confidential-VM support. It supports 64-bit Linux guests and Windows 10 and Windows Server 2019 guests.

Cloud Hypervisor runs on both Linux KVM and the Microsoft Hypervisor (MSHV), enabling deployment on Azure and Windows Hyper-V hosts. Primary architectures are x86-64 and AArch64 (production, requiring GICv3); riscv64 support is experimental.

crosvm

Google's crosvm was built for ChromeOS (the Crostini Linux container runtime) and Android (ARCVM). It is written primarily in Rust and runs primarily on KVM.

crosvm's defining architectural choice is a process-per-device security model. Each virtio device backend runs as a sandboxed child process, jailed with minijail — Google's library wrapping Linux namespaces and seccomp-BPF. A compromised device process is contained by a seccomp policy restricting its available syscalls; it cannot reach the rest of the VMM. This is structurally distinct from Firecracker's model, where seccomp-BPF filters are installed per-thread within a single process. crosvm's device set is broader: block, networking, memory ballooning, virtio-fs, vsock, virtio-pmem, USB, Wayland display output, experimental video, and vhost-user.

The rust-vmm Common Foundation

Firecracker, Cloud Hypervisor, and crosvm all independently implement the same pieces: KVM ioctl wrappers, guest memory management, virtio queue logic, legacy device models. The rust-vmm project, founded in December 2018 by engineers from Amazon, Google, Intel, and Red Hat — with Alibaba, CrowdStrike, and Linaro joining later — exists to factor those pieces into shared crates. All three Rust VMMs are named primary consumers.

The key crates, all published on crates.io: kvm-ioctls (safe Rust wrappers over the three-level KVM fd API); kvm-bindings (FFI bindings to the KVM UAPI headers); vm-memory (v0.17.1, October 2025, providing GuestMemoryMmap and GuestAddress, decoupling memory consumers from the allocation strategy); linux-loader (loads vmlinux ELF, bzImage, and PE images into guest memory; parses boot_params and hvm_start_info); vm-superio (emulates the UART 16550A, the i8042 PS/2 controller, and the ARM PL031 RTC); seccompiler (compiles seccomp-BPF filter policies and installs them per-thread via apply_filter()); virtio-queue (split-ring virtqueue implementation from the virtio spec). When you read Firecracker source and find the serial port coming from vm-superio and the KVM calls going through kvm-ioctls, you are looking at the shared layer.

Where Firecracker Sits and What It Leaves Out

Firecracker's design document states the purpose directly: "purpose-built for running serverless functions and containers safely and efficiently, and nothing more." Every omission reduces either boot latency, memory footprint, or attack surface — usually all three.

No BIOS, No Firmware

A conventional x86 boot starts in 16-bit real mode with firmware code from a ROM. The firmware probes hardware, initializes memory, runs the bootloader, and eventually hands off to the kernel. On a good day this takes hundreds of milliseconds and is necessary only to accommodate the diversity of hardware a general-purpose machine might encounter. A microVM has none of that diversity. Firecracker bypasses firmware entirely using the Linux x86 direct-boot protocol.

The protocol, documented at docs.kernel.org/arch/x86/boot.html, requires the VMM to write a populated struct boot_params (the "zero page") into guest memory, set %rsi to point to it, and jump directly to the kernel's 64-bit entry point with the vCPU already in long mode. No 16-bit real-mode code executes. The setup_header embedded in boot_params must carry the magic value 0x53726448 ("HdrS") at byte offset 0x202, the protocol version at 0x206, the bootloader type at 0x210, load flags at 0x211, the physical address of the kernel command line at 0x228, and the linear memory required during init at 0x260. Firecracker places the zero page at guest physical address 0x7000; the linux-loader rust-vmm crate handles the mechanics of populating it.
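The header offsets above translate directly into byte writes. The sketch below fills a handful of those fields in a 4 KiB buffer standing in for the zero page. It is far from a complete boot_params: a real VMM starts from the setup_header embedded in the kernel image and also supplies the memory map, both omitted here. The function and macro names are hypothetical; 0xFF is the boot protocol's loader ID for a bootloader with no assigned ID.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ZERO_PAGE_SIZE   4096
#define OFF_HEADER_MAGIC 0x202        /* "HdrS" */
#define OFF_VERSION      0x206        /* boot protocol version */
#define OFF_LOADER_TYPE  0x210        /* type_of_loader */
#define OFF_CMDLINE_PTR  0x228        /* cmd_line_ptr */
#define HDRS_MAGIC       0x53726448u

/* boot_params fields are little-endian; write them bytewise so the
 * code does not depend on host endianness or alignment. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (uint8_t)((v >> (8 * i)) & 0xFF);
}

/* Zero the page and fill the subset of header fields named above. */
void fill_zero_page(uint8_t *page, uint16_t version, uint32_t cmdline_gpa)
{
    assert(page != NULL);
    memset(page, 0, ZERO_PAGE_SIZE);
    put_le32(page + OFF_HEADER_MAGIC, HDRS_MAGIC);
    put_le16(page + OFF_VERSION, version);
    page[OFF_LOADER_TYPE] = 0xFF;     /* loader with no assigned ID */
    put_le32(page + OFF_CMDLINE_PTR, cmdline_gpa);
}
```

After a call such as fill_zero_page(page, 0x020c, cmdline_gpa), the four bytes at offset 0x202 spell "HdrS", which is the first thing the kernel's boot code checks.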

Firecracker v1.12.0 (PR #5048) added support for the Xen PVH boot protocol. In PVH mode, the kernel must be compiled with CONFIG_PVH=y (available since Linux 5.0); the ELF binary carries a PVH entry point in a PT_NOTE segment, and the VMM passes an hvm_start_info structure with the memory map. The ACPI RSDP is placed at guest physical 0x000E_0000. PVH is a second firmware-free boot path for kernels that prefer the cleaner memory-map handoff it provides.

Amazon's own description is blunt: "Firecracker doesn't implement traditional devices like a BIOS or PCI bus."

No PCI Bus

Firecracker has no PCI bus controller. All virtio devices use the virtio-over-MMIO transport (virtio 1.2 spec §4.2), not virtio-over-PCI. Each device occupies a fixed MMIO address range. The MMIO register map is defined by the spec: a MagicValue register at offset 0x000 containing 0x74726976 ("virt" in little-endian ASCII), a Version register at 0x004 holding 2 (the non-legacy virtio 1.x value), DeviceID at 0x008, VendorID at 0x00C, DeviceFeatures at 0x010, and QueueNotify at 0x050.

Unlike PCI, the MMIO transport has no generic discovery mechanism. Device locations must be communicated to the guest out-of-band. Historically Firecracker used kernel command-line parameters of the form virtio_mmio.device=<size>@<addr>:<irq>, which require CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES in the guest kernel. Since Firecracker v1.8.0, ACPI DSDT AML tables describe each virtio device, making the command-line approach unnecessary for ACPI-capable guests. A community initiative for PCIe passthrough exists (GitHub discussions #4845) but has no production release date; internal Firecracker team meetings on the topic were paused as of early 2025.
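Generating the command-line form is a one-line formatting job. The sketch below builds one virtio_mmio.device parameter; the function name is hypothetical, and the size is written in hexadecimal on the assumption that the guest kernel's size parser accepts a 0x prefix.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Format one virtio_mmio.device=<size>@<addr>:<irq> parameter into buf.
 * Returns the length snprintf reports, which is >= len if buf was too small. */
int format_virtio_mmio_param(char *buf, size_t len,
                             uint64_t size, uint64_t addr, unsigned irq)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "virtio_mmio.device=0x%llx@0x%llx:%u",
                    (unsigned long long)size,
                    (unsigned long long)addr,
                    irq);
}
```

For an illustrative 4 KiB window at 0xd0000000 on IRQ 5, the call produces virtio_mmio.device=0x1000@0xd0000000:5, and one such parameter is appended to the kernel command line per device.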

ACPI: Added Late, Not Original

ACPI was absent from Firecracker's original design. The first hardware topology mechanism was the legacy MultiProcessor Table (MPTable), sufficient for SMP discovery. Firecracker v1.8.0 (PR #4428) added basic ACPI tables for x86_64: a MADT describing vCPU and interrupt topology, and a DSDT with AML describing virtio and legacy devices. MPTable is now deprecated, with removal planned for v2.0 or later.

Even with ACPI present, there is no ACPI power management. Firecracker's FAQ states this directly: "Firecracker does not virtualize power management (e.g. there is no ACPI PM support)." Running poweroff inside a Firecracker guest shuts the guest OS down but leaves the firecracker process running, because the guest has no channel to signal power-off to the host.

The i8042 Stub

The i8042 keyboard controller appears in Firecracker's device list not to accept keyboard input but to field guest reboot requests. The FAQ specifies that reboot works only when the guest boots with reboot=k, which instructs Linux to signal reset through the i8042 reset line. It is on the path to removal once an ACPI S5 shutdown path is fully wired. Its presence illustrates the pattern: every legacy device in Firecracker is present because the guest kernel expects it at that address, not because the use case requires it.

The Guest Side

The guest runs in VMX non-root mode and needs no awareness of VMX itself: sensitive instructions either trap to KVM or, for most memory accesses, are handled transparently by EPT without a VM exit at all. It can still tell that it is virtualized, because KVM sets the CPUID hypervisor-present bit and publishes its signature at CPUID leaf 0x40000000, which is how paravirtual features such as kvmclock are discovered. Firecracker's guest requirements are: 64-bit Linux 4.14 or later, with CONFIG_VIRTIO_MMIO=y for device drivers. Even though no PCI bus is present, CONFIG_PCI=y and CONFIG_PCI_MMCONFIG=y may be required for ACPI initialization code paths; the kernel command line should include pci=off to suppress PCI enumeration. Firecracker's kernel policy document tracks the full set of required and recommended configuration options.

The next chapter works through the KVM ioctl API by building the smallest possible VMM that can boot a kernel — which is also the fastest way to see exactly where the hardware layer ends and the VMM layer begins.

Sources And Further Reading

Popek & Goldberg, "Formal Requirements for Virtualizable Third-Generation Architectures," Communications of the ACM 17(7), July 1974. Proves the formal construction conditions; does not define Type 1/2. https://dl.acm.org/doi/10.1145/361011.361073
Wikipedia: Popek and Goldberg virtualization requirements — confirms the 1974 paper does not use "Type 1" or "Type 2"; traces the labels to Goldberg's 1972 Harvard PhD thesis. https://en.wikipedia.org/wiki/Popek_and_Goldberg_virtualization_requirements
Intel Software Developer's Manual, Vol. 3, Chapter 23: VMX non-root operation, VMCS structure, VM-exit behavior.
KVM API documentation — definitive reference for the three-level fd model, ioctl encodings, kvm_run, and exit codes. https://docs.kernel.org/virt/kvm/api.html
"Using the KVM API," LWN.net — C-code walkthrough of the three-level fd API and struct kvm_run. https://lwn.net/Articles/658511/
include/uapi/linux/kvm.h — KVMIO ioctl encodings and the full KVM_EXIT_* enum. https://github.com/torvalds/linux/blob/master/include/uapi/linux/kvm.h
arch/x86/include/asm/vmx.h — VMCS field encodings (EPT_POINTER = 0x0000201a, VM_EXIT_REASON = 0x00004402). https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/vmx.h
arch/x86/include/uapi/asm/vmx.h — EXIT_REASON_* numeric constants used by KVM's exit handler. https://github.com/torvalds/linux/blob/master/arch/x86/include/uapi/asm/vmx.h
arch/x86/include/asm/svm.h — VMCB struct layout (vmcb_control_area, vmcb_save_area). https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/svm.h
arch/x86/include/uapi/asm/svm.h — SVM exit codes (SVM_EXIT_NPF = 0x400). https://github.com/torvalds/linux/blob/master/arch/x86/include/uapi/asm/svm.h
KVM shadow MMU vs. EPT/NPT TDP documentation. https://docs.kernel.org/virt/kvm/x86/mmu.html
KVM nested VMX documentation — vmcs01/vmcs12/vmcs02 model; enabled by default since Linux v4.20. https://docs.kernel.org/virt/kvm/x86/nested-vmx.html
LKML thread — Bonzini and Hellwig on the type-1/type-2 classification. https://lkml.kernel.org/kvm/[email protected]/T/
Firecracker open-source announcement (November 27, 2018, v0.11.0). https://aws.amazon.com/blogs/opensource/firecracker-open-source-secure-fast-microvm-serverless/
Amazon Science: how Firecracker virtual machines work — omissions (BIOS, PCI, USB, display, audio), ~50k lines Rust. https://www.amazon.science/blog/how-awss-firecracker-virtual-machines-work
Firecracker design document — thread model (API thread, VMM thread, vCPU threads), device set, isolation layers, seccomp threat model. https://github.com/firecracker-microvm/firecracker/blob/main/docs/design.md
Firecracker SPECIFICATION.md — quantitative SLIs enforced as CI integration tests. https://raw.githubusercontent.com/firecracker-microvm/firecracker/main/SPECIFICATION.md
Firecracker FAQ — device list, ACPI PM absent, i8042 reboot, reboot=k. https://raw.githubusercontent.com/firecracker-microvm/firecracker/main/FAQ.md
Firecracker CHANGELOG — v1.4.0 virtio-balloon, v1.8.0 ACPI (PR #4428), v1.12.0 PVH boot (PR #5048), MPTable deprecation. https://github.com/firecracker-microvm/firecracker/blob/main/CHANGELOG.md
Firecracker jailer documentation — cgroup v1/v2, CLONE_NEWPID, setns(CLONE_NEWNET), pivot_root. https://github.com/firecracker-microvm/firecracker/blob/main/docs/jailer.md
Linux x86 boot protocol — zero page, HdrS magic at offset 0x202, cmd_line_ptr at 0x228. https://docs.kernel.org/arch/x86/boot.html
LWN Firecracker article — direct boot, no BIOS, no firmware. https://lwn.net/Articles/775736/
OASIS virtio 1.2 specification, §4.2 — MMIO transport register map (MagicValue, Version, QueueNotify). https://docs.oasis-open.org/virtio/virtio/v1.2/cs01/virtio-v1.2-cs01.html
QEMU documentation — TCG vs. KVM, supported guest ISAs. https://www.qemu.org/docs/master/system/introduction.html
QEMU microvm machine type — "inspired by Firecracker." https://www.qemu.org/docs/master/system/i386/microvm.html
NEMU archived repository (archived April 14, 2021) — "Cloud Hypervisor is the successor." https://github.com/intel/nemu
Cloud Hypervisor repository — KVM and MSHV backends, VFIO, hotplug, TDX, architecture support. https://github.com/cloud-hypervisor/cloud-hypervisor
Cloud Hypervisor announcement, May 2019 — Intel, NEMU lineage. https://thenewstack.io/intel-releases-cloud-hypervisor-based-on-same-components-as-amazons-firecracker/
crosvm official documentation — per-device minijail sandbox, device set, Android ARCVM. https://crosvm.dev/book/introduction.html
crosvm rust-vmm documentation — Firecracker lineage from crosvm. https://chromium.googlesource.com/chromiumos/platform/crosvm/+/master/docs/rust-vmm.md
rust-vmm community README — founding (December 2018), member organizations, crate inventory. https://github.com/rust-vmm/community/blob/main/README.md
vm-memory crate (v0.17.1, October 2025): https://github.com/rust-vmm/vm-memory
kvm-ioctls crate documentation. https://docs.rs/kvm-ioctls/latest/kvm_ioctls/
Intel VT-x, KVM, and QEMU internals — VMX instructions, VMCS structure, EPT walkthrough. https://binarydebt.wordpress.com/2018/10/14/intel-virtualisation-how-vt-x-kvm-and-qemu-work-together/