Commit graph

3 commits

Author SHA1 Message Date
6b543cb17f
firecracker: adopt firecracker-jailer for VM launch (Phase B)
Each VM's firecracker now runs inside a per-VM chroot dropped to the
registered owner UID via firecracker-jailer. Closes the broad ambient-
sudo escalation surface that survived Phase A: the helper still needs
caps for tap/bridge/dm/loop/iptables, but the VMM itself no longer
runs as root in the host root filesystem.

The host helper stages each chroot up front: hard-links the kernel
and (optional) initrd, mknods block-device drives + /dev/vhost-vsock,
copies in the firecracker binary (jailer opens it O_RDWR so a ro bind
fails with EROFS), and bind-mounts /usr/lib + /lib trees read-only so
the dynamic linker can resolve. Self-binds the chroot first so the
findmnt-guarded cleanup can recurse safely.

AF_UNIX sun_path is 108 bytes; the chroot path easily blows past that.
Daemon-side launch pre-symlinks the short request socket path to the
long chroot socket before Machine.Start so the SDK's poll/connect
sees the short path while the kernel resolves to the chroot socket.
--new-pid-ns is intentionally disabled — jailer's PID-namespace fork
makes the SDK see the parent exit and tear the API socket down too
early.

CapabilityBoundingSet for the helper expands to add CAP_FOWNER,
CAP_KILL, CAP_MKNOD, CAP_SETGID, CAP_SETUID, CAP_SYS_CHROOT alongside
the existing CAP_CHOWN/CAP_DAC_OVERRIDE/CAP_NET_ADMIN/CAP_NET_RAW/
CAP_SYS_ADMIN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 14:38:07 -03:00
d73efe6fbc
firecracker: drop sudo sh -c, race chown against SDK probe in Go
Replace the shell-string launcher in buildProcessRunner with a direct
exec.Command. The previous sh -c wrapper relied on shellQuote escaping
for every MachineConfig field that flowed into the launch script; any
future field that ever carried an attacker-controlled value would have
become RCE-as-root. The new path passes binary path and flags as
separate argv entries, so there is no shell to interpret anything.

The wrapper also did two things the shell can no longer do for us:

  1. umask 077 — moved to syscall.Umask in cmd/bangerd/main.go so every
     firecracker child (and any other file the daemon creates) inherits
     0600 by default. Single-user dev sandbox state should be private.

  2. chown_watcher — the SDK's HTTP probe inside Machine.Start connects
     to the API socket the moment it appears. Under sudo the socket is
     created root-owned and the daemon's connect(2) gets EACCES, so the
     post-Start EnsureSocketAccess never runs. The shell papered over
     this with a backgrounded chown loop. Replaced by
     fcproc.EnsureSocketAccessForAsync: same race-window guarantee, in
     pure Go, kicked off in LaunchFirecracker right before Start and
     awaited right after.

Tests updated: shell-substring assertions replaced with cmd-arg
assertions, plus a new fcproc test pinning the async chown sequence.
Smoke (full systemd two-service install + KVM scenarios) passes.
2026-04-27 20:14:01 -03:00
59e48e830b
daemon: split owner daemon from root helper
Move the supported systemd path to two services: an owner-user bangerd for
orchestration and a narrow root helper for bridge/tap, NAT/resolver, dm/loop,
and Firecracker ownership. This removes repeated sudo from daily vm and image
flows without leaving the general daemon running as root.

Add install metadata, system install/status/restart/uninstall commands, and a
system-owned runtime layout. Keep user SSH/config material in the owner home,
lock file_sync to the owner home, and move daemon known_hosts handling out of
the old root-owned control path.

Route privileged lifecycle steps through typed privilegedOps calls, harden the
two systemd units, and rewrite smoke plus docs around the supported service
model.

Verified with make build, make test, make lint, and make smoke on the
supported systemd host path.
2026-04-26 12:43:17 -03:00