Replace the shell-string launcher in buildProcessRunner with a direct
exec.Command. The previous sh -c wrapper relied on shellQuote escaping
for every MachineConfig field that flowed into the launch script; any
future field that ever carried an attacker-controlled value would have
become RCE-as-root. The new path passes binary path and flags as
separate argv entries, so there is no shell to interpret anything.
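
As a rough illustration of the argv approach (not the actual
buildProcessRunner code), the sketch below builds a command from a
hypothetical launchConfig struct. The struct and its field names are
assumptions made for the example; --api-sock and --id are real
firecracker flags, but the full flag set the daemon passes is not
shown here.

```go
package main

import (
	"fmt"
	"os/exec"
)

// launchConfig is a hypothetical stand-in for the MachineConfig fields
// that end up on the firecracker command line.
type launchConfig struct {
	BinaryPath string
	APISocket  string
	ID         string
}

// buildCommand passes every value as its own argv entry. No shell is
// involved, so metacharacters in a field are never interpreted.
func buildCommand(cfg launchConfig) *exec.Cmd {
	return exec.Command(cfg.BinaryPath, "--api-sock", cfg.APISocket, "--id", cfg.ID)
}

func main() {
	cfg := launchConfig{
		BinaryPath: "/usr/bin/firecracker",
		APISocket:  "/run/banger/vm.sock",
		ID:         "vm1; rm -rf /", // hostile value stays one literal argument
	}
	// The command is only constructed here, never started.
	cmd := buildCommand(cfg)
	for i, arg := range cmd.Args {
		fmt.Printf("argv[%d] = %q\n", i, arg)
	}
}
```

Because each field is one element of cmd.Args, quoting helpers such
as shellQuote are no longer part of the trust boundary.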

The wrapper also did two things the shell can no longer do for us:

1. umask 077: moved to syscall.Umask in cmd/bangerd/main.go so every
   firecracker child inherits the mask and any file the daemon or a
   child creates has no group or other permissions (0600 for regular
   files). Single-user dev sandbox state should be private.
2. chown_watcher: the SDK's HTTP probe inside Machine.Start connects
   to the API socket the moment it appears. Under sudo the socket is
   created root-owned and the daemon's connect(2) gets EACCES, so the
   post-Start EnsureSocketAccess never runs. The shell papered over
   this with a backgrounded chown loop. Replaced by
   fcproc.EnsureSocketAccessForAsync: same race-window guarantee, in
   pure Go, kicked off in LaunchFirecracker right before Start and
   awaited right after.
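
The kick-off-then-await shape of the chown watcher can be sketched
roughly as follows. ensureAccessAsync is a hypothetical stand-in, not
the real fcproc.EnsureSocketAccessForAsync: it uses os.Chown and
os.Chmod directly where the daemon goes through its own runner, and a
plain file plays the role of the API socket.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// ensureAccessAsync starts polling for path right away and fixes
// ownership and mode as soon as the file exists. The returned function
// blocks until that work has finished and reports its error.
func ensureAccessAsync(ctx context.Context, path string, uid, gid int) func() error {
	done := make(chan error, 1)
	go func() {
		ticker := time.NewTicker(5 * time.Millisecond)
		defer ticker.Stop()
		for {
			if _, err := os.Lstat(path); err == nil {
				if err := os.Chown(path, uid, gid); err != nil {
					done <- fmt.Errorf("chown %s: %w", path, err)
					return
				}
				done <- os.Chmod(path, 0o600)
				return
			}
			select {
			case <-ctx.Done():
				done <- fmt.Errorf("waiting for %s: %w", path, ctx.Err())
				return
			case <-ticker.C:
			}
		}
	}()
	return func() error { return <-done }
}

// demo plays the role of the launch path: start the watcher, let the
// "process" create its socket, then await the watcher.
func demo() (os.FileMode, error) {
	dir, err := os.MkdirTemp("", "fc-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "api.sock")

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	wait := ensureAccessAsync(ctx, sock, os.Getuid(), os.Getgid())

	// Stand-in for Machine.Start creating the socket a little later.
	time.Sleep(20 * time.Millisecond)
	if err := os.WriteFile(sock, nil, 0o666); err != nil {
		return 0, err
	}

	if err := wait(); err != nil {
		return 0, err
	}
	info, err := os.Stat(sock)
	if err != nil {
		return 0, err
	}
	return info.Mode().Perm(), nil
}

func main() {
	mode, err := demo()
	fmt.Println(mode, err)
}
```

Starting the goroutine before the socket exists is what closes the
window: the ownership fix lands as soon as the file appears instead of
after Start has returned.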
Tests updated: shell-substring assertions replaced with cmd-arg
assertions, plus a new fcproc test pinning the async chown sequence.
Smoke (full systemd two-service install + KVM scenarios) passes.

Move the supported systemd path to two services: an owner-user bangerd for
orchestration and a narrow root helper for bridge/tap, NAT/resolver, dm/loop,
and Firecracker ownership. This removes repeated sudo from daily vm and image
flows without leaving the general daemon running as root.
Add install metadata, system install/status/restart/uninstall commands, and a
system-owned runtime layout. Keep user SSH/config material in the owner home,
lock file_sync to the owner home, and move daemon known_hosts handling out of
the old root-owned control path.
Route privileged lifecycle steps through typed privilegedOps calls, harden the
two systemd units, and rewrite smoke plus docs around the supported service
model.
Verified with make build, make test, make lint, and make smoke on the
supported systemd host path.

Before this change, no failure branch in fcproc had any test coverage.
EnsureSocketAccess gates the firecracker control plane on the daemon's
ability to chown the API and vsock sockets away from root, so those
failure paths are exactly the ones we need pinned.
New file internal/daemon/fcproc/fcproc_test.go adds a local scripted
Runner (fcproc is a leaf package — can't pull the daemon's
scriptedRunner in) and six tests:

waitForPath:
  - TestWaitForPathReturnsDeadlineExceededWhenSocketNeverAppears:
    timeout branch wraps context.DeadlineExceeded with the label,
    and waits at least one poll tick before giving up
  - TestWaitForPathReturnsOnceSocketAppears: happy path with a
    mid-wait file creation via goroutine
  - TestWaitForPathRespectsContextCancellation: ctx.Done() beats
    the poll interval so a cancelled request doesn't stall

EnsureSocketAccess:
  - TestEnsureSocketAccessChownFailureBubbles: chown error surfaces
    untouched; chmod not attempted when chown fails
  - TestEnsureSocketAccessChmodFailureBubbles: chmod error surfaces
    after chown succeeds
  - TestEnsureSocketAccessTimesOutBeforeTouchingRunner: ordering
    contract: no sudo calls when the socket never materialises

Package function coverage moved 55.2% → 62.1%.

An integration-level chown-race test was considered (run a real shell
that exercises buildProcessRunner's script with a fake firecracker
binary) but skipped: it requires `sudo -n` in the test environment and
would make CI fragile. The socket-ownership regression this slice is
meant to guard against is covered at the unit level here; the manual
smoke run in the plan's verification section remains the end-to-end
check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Previously the daemon socket, per-VM firecracker API socket, and vsock
socket were transiently world-exposed on hosts without XDG_RUNTIME_DIR:
the runtime directory landed in /tmp at 0755, Firecracker ran with
umask 000 (mode 0666 sockets), and only a follow-up chown/chmod in
EnsureSocketAccess tightened them. A local attacker could race into
bangerd.sock or the firecracker API socket during that window.

Three changes:

- internal/paths/paths.go: RuntimeDir is now created (and re-chmod'd if
  stale) at 0700 unconditionally. When XDG_RUNTIME_DIR is unset and we
  fall back to /tmp/banger-runtime-<uid>, Ensure() now verifies the
  parent dir is owned by the current uid and has mode 0700, refusing
  to place sockets inside a directory someone else created. Symlink
  swaps are rejected via Lstat.
- internal/firecracker/client.go: launch firecracker with umask 077
  instead of umask 000 so the API socket is mode 0600 from birth. The
  chown in fcproc.EnsureSocketAccess still transfers ownership from
  root to the invoking user afterwards.
- internal/daemon/fcproc/fcproc.go: EnsureSocketDir now creates (and
  re-chmod's) the runtime socket directory at 0700.

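
The effect of the umask change can be seen in isolation with a small
Unix-only sketch. It uses syscall.Umask and a regular file purely for
demonstration; how internal/firecracker/client.go applies the mask to
the firecracker process is not shown here, and modeUnderUmask is a
name made up for the example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// modeUnderUmask creates a file requesting 0666 while the given umask
// is in effect and returns the permission bits actually applied.
func modeUnderUmask(mask int) (os.FileMode, error) {
	old := syscall.Umask(mask) // process-wide, so restore it on return
	defer syscall.Umask(old)

	dir, err := os.MkdirTemp("", "umask-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "api.sock")
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY, 0o666)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		return 0, err
	}
	return info.Mode().Perm(), nil
}

func main() {
	for _, mask := range []int{0o000, 0o077} {
		mode, err := modeUnderUmask(mask)
		fmt.Printf("umask %03o -> %v (err: %v)\n", mask, mode, err)
	}
}
```

With the mask set before creation there is no interval in which the
file carries group or other bits, which is the point of tightening at
birth rather than in a follow-up chmod.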
Tests cover the tightening path — an existing 0755 RuntimeDir is
re-chmod'd on Ensure.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Moves the host-side firecracker primitives — bridge setup, socket dir,
binary resolution, tap creation, socket chown, PID lookup, resolve,
ctrl-alt-del, wait-for-exit, SIGKILL — plus the shared
ErrWaitForExitTimeout sentinel and a small waitForPath helper into
internal/daemon/fcproc.

Manager is stateless beyond its runner + config + logger. The daemon
package keeps thin forwarders (d.ensureBridge, d.createTap, etc.), so
no call sites or tests change. A d.fc() helper builds a Manager on
demand from Daemon state, which lets tests keep constructing
&Daemon{...} literals without wiring fcproc explicitly.
This unblocks Phase 4 (imagemgr extraction): imagebuild.go's dependence
on d.createTap/d.firecrackerBinary/etc. can now be satisfied by
importing fcproc instead of reaching back to *Daemon.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>