banger

Author	SHA1	Message	Date
Thales Maciel	3805b093b4	roothelper: tie kill/signal authorization to banger-launched firecracker validateFirecrackerPID was a substring check on /proc/<pid>/cmdline: "contains 'firecracker'". Good enough to refuse init/sshd/the test binary, but on a shared host where multiple users run firecracker the helper would happily SIGKILL someone else's VM. The owner-UID daemon could weaponise the helper as an arbitrary "kill any firecracker on this box" primitive. Replace the substring gate with two stronger acceptance modes: * Cgroup match (the supported path): /proc/<pid>/cgroup contains bangerd-root.service. systemd assigns every direct child of the helper unit into that cgroup at fork; the kernel keeps it there for the process's lifetime, so no daemon-UID code can forge it. Other users' firecracker processes live in different cgroups (user@<uid>.service, foreign service slices) and fail this check. Also robust across helper restarts: KillMode=control-group on the unit kills children when the service goes down, so an "orphan banger firecracker in some other cgroup" is rare by construction. * --api-sock fallback: cmdline carries `--api-sock <path>` with the path under banger's RuntimeDir. Covers the legacy direct (no-jailer) launch path, and gives daemon reconcile a way to clean up the rare orphan that lands outside the service cgroup after a hard helper crash. Tried /proc/<pid>/root first — pivot_root semantics make jailer'd firecracker read its root as "/" from any namespace, so the symlink is useless as a banger-managed fingerprint. Cgroup is the right signal. Also added a signal allowlist: priv.signal_process now rejects anything outside {TERM, KILL, INT, HUP, QUIT, USR1, USR2, ABRT} (case-insensitive, with or without SIG prefix). STOP/CONT, real-time signals, and numeric forms are refused — the helper running as root must not be a generic "send arbitrary signal to my pid" primitive. priv.kill_process is unaffected (it always sends KILL). 
Tests: validateSignalName covers allowlist + numeric/STOP/RTMIN rejection; extractFirecrackerAPISock pins the three flag forms (--api-sock VAL, --api-sock=VAL, -a VAL); pathIsUnder gets a small table; existing TestValidateFirecrackerPID still rejects PID 0, PID 1, and the test process itself. Doctor's non-system-mode test gained a t.TempDir-backed install path so it stops being environment-dependent on machines that happen to have /etc/banger/install.toml. Smoke at JOBS=4 still green — every banger-launched firecracker sails through the cgroup match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 16:00:41 -03:00
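A minimal sketch of the signal allowlist described in the commit above, assuming a plain map lookup after upper-casing and stripping an optional SIG prefix. The helper name `normalizeSignalName` is hypothetical; the real `validateSignalName` is not shown in this log and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedSignals mirrors the set named in the commit message:
// {TERM, KILL, INT, HUP, QUIT, USR1, USR2, ABRT}.
var allowedSignals = map[string]bool{
	"TERM": true, "KILL": true, "INT": true, "HUP": true,
	"QUIT": true, "USR1": true, "USR2": true, "ABRT": true,
}

// normalizeSignalName is a hypothetical helper. It upper-cases the input,
// strips one optional "SIG" prefix, and returns the short name only when it
// is on the allowlist. Numeric forms, STOP/CONT, and real-time signals fall
// through to the error because they are simply not in the map.
func normalizeSignalName(name string) (string, error) {
	s := strings.ToUpper(strings.TrimSpace(name))
	s = strings.TrimPrefix(s, "SIG")
	if !allowedSignals[s] {
		return "", fmt.Errorf("signal %q is not allowed", name)
	}
	return s, nil
}

func main() {
	for _, in := range []string{"sigterm", "KILL", "STOP", "9", "SIGRTMIN"} {
		got, err := normalizeSignalName(in)
		fmt.Printf("%-9s -> %q err=%v\n", in, got, err)
	}
}
```

An allowlist keeps the default at "refuse": any signal name not listed explicitly is rejected without needing a separate deny rule.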
Thales Maciel	4a56e6c7d6	roothelper: walk validateManagedPath components, reject symlinks validateManagedPath was textual-only: filepath.Clean + dest-prefix match. That stopped `..` escapes but not the symlink-bypass attack that motivated this fix — a daemon-UID attacker can write into StateDir/RuntimeDir (it's their UID), so they can plant `<StateDir>/redirect -> /etc` and any helper RPC that then operates on `<StateDir>/redirect/...` resolves through the symlink at the kernel and lands at /etc/... on the host. Concretely the leaks this closed: * priv.create_dm_snapshot: rootfs/cow paths fed to losetup — losetup follows the symlink and attaches a host block device. * priv.launch_firecracker: kernel/initrd paths hard-linked into the chroot via `ln -f` — link(2) on Linux follows source symlinks, hard-linking host files into the jail. * priv.read_ext4_file / priv.write_ext4_files: image paths fed to debugfs / e2cp as root. * validateLaunchDrivePath: drive paths mknod'd or hard-linked. * validateJailerOpts: chroot base. Fix: after the existing prefix match, walk every component below the matched root with Lstat. Any existing symlink — leaf or intermediate — fails the validator. ENOENT is tolerated because several callers pass paths firecracker/the helper materialise later (sockets, log files, kernel hard-link targets); whoever materialises them goes through the same validation when the helper-side primitive runs. Subsumes most of validateNotSymlink's coverage but the explicit call sites (methodEnsureSocketAccess, methodCleanupJailerChroot) keep their belt-and-braces check — those paths must EXIST and not be symlinks, which validateNotSymlink enforces strictly while the broadened validateManagedPath tolerates ENOENT. Race-free in practice: helper RPCs are short and the validator fires on the same kernel state the next syscall sees. 
The helper loop processes RPCs serially per-connection, and the validator plus the syscall both run as root within microseconds of each other. Four new tests cover symlink leaf, symlink intermediate, missing leaf (must pass), and the plain happy path. Smoke at JOBS=4 still green — every legitimate daemon-supplied path passes the walk. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 15:26:56 -03:00
Thales Maciel	0a079277ef	imagepull: reject symlink ancestors during OCI flatten safeJoin previously did textual cleaning + dest-prefix check only. That's enough to catch `../escape`, but not the symlink-ancestor attack: a malicious OCI layer plants `etc -> /tmp/probe`, a later layer writes/deletes/hardlinks against `etc/anything`, and the kernel silently dereferences the symlink so the operation lands at `/tmp/probe/anything` on the host. The daemon runs flatten as the owner UID, so anywhere that UID can write becomes a write target; anywhere it can delete (e.g. its own home) becomes a delete target. Whiteouts and hardlinks make this worse — a whiteout for `etc/.wh.victim` would `RemoveAll` the host file `/tmp/probe/victim`, and a TypeLink would expose host files inside the extracted rootfs. safeJoin now Lstat-walks every intermediate component of the joined path against the already-extracted tree, refusing if any ancestor is a symlink. Walking is race-free against the extraction loop because we process tar entries serially. Leaf components stay caller-owned (TypeSymlink writes legitimately want a symlink leaf; TypeReg RemoveAll's any prior leaf before opening; etc.). Three new tests pin the protection: write through a symlinked ancestor, whiteout through a symlinked ancestor, and hardlink target through a symlinked ancestor — each must fail and leave the host probe path untouched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 15:20:46 -03:00
Thales Maciel	8bfa525568	test: cover imagemgr + dmsnap helpers Both packages had zero tests before this change. The helpers in them are pure (imagemgr) or scripted-runner-friendly (dmsnap), so they're cheap to pin and worth catching regressions on. imagemgr/paths_test.go: * DebianBasePackages returns a defensive copy (mutating the result can't poison subsequent calls — important because hashPackages digests this list). * BuildMetadataPackages stays in lockstep with DebianBasePackages. * hashPackages is order-sensitive and includes a trailing newline in its canonical join (regression guard for any future "sort the list before hashing" temptation that would invalidate every on-disk hash). * StageOptionalArtifactPath returns "" for empty/whitespace input and joins by name otherwise. * WritePackagesMetadata writes <rootfs>.packages.sha256 with the expected hash, no-ops on empty rootfs path or empty package list. * DebianBasePackages contains the small critical-package floor (ca-certificates, curl, git) so a future apt-list trim can't silently drop them. dmsnap/dmsnap_test.go: * Create runs losetup base, losetup cow, blockdev getsz, dmsetup create in that order, with a snapshot table referencing the loops in (base, cow) order — a swap would corrupt every VM. * Create's failure path unwinds with losetup -d on cow then base. * Cleanup tears down dmsetup before losetup (otherwise dmsetup sees EBUSY against vanished backing devices). * Cleanup falls back to DMDev when DMName is empty. * Cleanup tolerates "No such device" on losetup -d (idempotent re-run after a partial cleanup). * Cleanup surfaces non-missing losetup errors (the tolerance is narrow on purpose). * Remove returns nil on a missing target and surfaces non-retryable errors immediately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 15:13:49 -03:00
Thales Maciel	45826f0db0	docs: add config.md reference for the daemon TOML schema README previously punted on the config schema with a "full key list in internal/config/config.go" pointer. New docs/config.md walks every TOML key the daemon reads — top-level, [vm_defaults], [[file_sync]] — with type, default, and a one-sentence description per row, plus a copy-pasteable example at the bottom. Sourced 1:1 from internal/config/config.go's fileConfig (and the defaults in load() + internal/model/types.go), so it stays accurate as long as those structs are the schema source of truth. README's existing config section now points at docs/config.md, and the "Further reading" list gets it as the first bullet. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 15:11:18 -03:00
Thales Maciel	7d7c15a370	docs: fix config-file path in privileges.md The filesystem-mutations table referred to `~/.config/banger/banger.toml`, but the daemon reads `~/.config/banger/config.toml` (per internal/config/config.go and README.md). Bring privileges.md in line. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 15:11:06 -03:00
Thales Maciel	0c77b042ed	build: add pre-commit hook gating lint + test + build `.githooks/pre-commit` runs `make lint test build` on every commit, catching unformatted Go (`gofmt -l`), `go vet` regressions, shellcheck errors on scripts/, broken unit tests, and broken builds before they reach the index. Activate per-clone with `make install-hooks`, which points `core.hooksPath` at `.githooks/`. Bypass for in-flight WIP commits with `git commit --no-verify`. The hook directory is tracked in git (unlike .git/hooks/) so a clone + `make install-hooks` is enough to opt in; no per-machine hand-installation. .PHONY and the help line both list the new target. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 15:08:41 -03:00
Thales Maciel	6b4e1922b0	model: gofmt VMRecord struct alignment Stats and Workspace fields landed in `6b543cb` with column alignment that gofmt wants to pull tighter; rerun gofmt so the new pre-commit hook's `gofmt -l` gate passes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 15:08:12 -03:00
Thales Maciel	3e6d0cee89	doctor: surface security-posture drift in `banger doctor` `docs/privileges.md` now documents what the install promises (helper + daemon services active, sockets at 0600 ownerUID, units carrying the hardening directives, firecracker root-owned + non-writable). Doctor verifies the running install matches: drift between the doc and the filesystem would silently weaken the trust model otherwise. In system mode (install.toml present): * helper service / owner daemon service: `systemctl is-active`. * helper socket / daemon socket: stat-and-compare mode + uid against the registered owner. * helper unit hardening / daemon unit hardening: scan the rendered unit for NoNewPrivileges, ProtectSystem=strict, ProtectHome (=yes for the helper, =read-only for the daemon), RestrictSUIDSGID, LockPersonality, and the helper's CapabilityBoundingSet line. The daemon unit also pins User=<registered owner>. * firecracker binary ownership: regular file, not a symlink, mode not group/world writable, executable, owned by uid 0 — same constraints validateRootExecutable enforces at launch, surfaced once at doctor time so a misconfigured binary fails fast with a clearer error than the helper's open-time rejection. In non-system mode (no /etc/banger/install.toml) doctor emits a single WARN row pointing at docs/privileges.md > 'Running outside the system install'. A PASS would imply guarantees the install isn't actually providing. Tests cover both branches: the non-system warn pins its message substrings; system-mode pins that every check name shows up; and the helpers (socket-perms, unit-hardening, executable-ownership) have direct table-style negative tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 14:58:34 -03:00
Thales Maciel	853249dec2	roothelper: tighten input validation across privileged RPCs Defence-in-depth pass over every helper method that touches the host as root. Each fix narrows what a compromised owner-uid daemon could ask the helper to do; many close concrete file-ownership and DoS primitives that the previous validators didn't reach. Path / identifier validation: * priv.fsck_snapshot now requires /dev/mapper/fc-rootfs-* (was "is the string non-empty"). e2fsck -fy on /dev/sda1 was the motivating exploit. * priv.kill_process and priv.signal_process now read /proc/<pid>/cmdline and require a "firecracker" substring before sending the signal. Killing arbitrary host PIDs (sshd, init, …) is no longer a one-RPC primitive. * priv.read_ext4_file and priv.write_ext4_files now require the image path to live under StateDir or be /dev/mapper/fc-rootfs-*. * priv.cleanup_dm_snapshot validates every non-empty Handles field: DM name fc-rootfs-*, DM device /dev/mapper/fc-rootfs-*, loops /dev/loopN. * priv.remove_dm_snapshot accepts only fc-rootfs-* names or /dev/mapper/fc-rootfs-* paths. * priv.ensure_nat now requires a parsable IPv4 address and a banger-prefixed tap. * priv.sync_resolver_routing and priv.clear_resolver_routing now require a Linux iface-name-shaped bridge name (1–15 chars, no whitespace/'/'/':') and, for sync, a parsable resolver address. Symlink defence: * priv.ensure_socket_access now validates the socket path is under RuntimeDir and not a symlink. The fcproc layer's chown/chmod moves to unix.Open(O_PATH|O_NOFOLLOW) + Fchownat(AT_EMPTY_PATH) + Fchmodat via /proc/self/fd, so even a swap of the leaf into a symlink between validation and the syscall is refused. The local-priv (non-root) fallback uses `chown -h`. * priv.cleanup_jailer_chroot rejects symlinks at both the leaf (os.Lstat) and intermediate path components (filepath.EvalSymlinks + clean-equality). 
The umount sweep was rewritten from shell `umount --recursive --lazy` to direct unix.Unmount(MNT_DETACH | UMOUNT_NOFOLLOW) per child mount, deepest-first; the findmnt guard remains as the rm-rf safety net. Local-priv mode falls back to `sudo umount --lazy`. Binary validation: * validateRootExecutable now opens with O_PATH|O_NOFOLLOW and Fstats through the resulting fd. Rejects path-level symlinks and narrows the TOCTOU window between validation and the SDK's exec to fork+exec time on a healthy host. Daemon socket: * The owner daemon now reads SO_PEERCRED on every accepted connection and refuses any UID that isn't 0 or the registered owner. Filesystem perms (0600 + ownerUID) already enforced this; the check is belt-and-braces in case the socket FD is ever leaked to a non-owner process. Docs: * docs/privileges.md walked end-to-end. Each helper RPC's Validation gate row reflects what the code actually enforces. New section "Running outside the system install" calls out the looser dev-mode trust model (NOPASSWD sudoers, helper hardening bypassed) so users don't deploy that path on shared hosts. Trust list updated to include every new validator. Tests added: validators (DM-loop, DM-remove-target, DM-handles, ext4-image-path, iface-name, IPv4, resolver-addr, not-symlink, firecracker-PID, root-executable variants), the daemon's authorize path (non-unix conn rejection + unix conn happy path), the umount2 ordering contract (deepest-first + --lazy on the sudo branch), and positive/negative cases for the chown-no-follow fallback. Verified end-to-end via `make smoke JOBS=4` on a KVM host. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 14:39:41 -03:00
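As a rough illustration of the interface-name and IPv4 gates listed in this commit (1 to 15 characters, no whitespace, '/' or ':'; a parsable IPv4 address), a sketch might look like the following. The function names `validIfaceName` and `validIPv4` and the sample inputs are assumptions made for this example; the helper's own validators are not reproduced in the log.

```go
package main

import (
	"fmt"
	"net/netip"
	"unicode"
)

// validIfaceName (hypothetical) accepts names of 1 to 15 bytes that contain
// no whitespace, '/' or ':'. The 15-byte bound matches Linux's IFNAMSIZ of
// 16 including the trailing NUL.
func validIfaceName(name string) bool {
	if len(name) < 1 || len(name) > 15 {
		return false
	}
	for _, r := range name {
		if unicode.IsSpace(r) || r == '/' || r == ':' {
			return false
		}
	}
	return true
}

// validIPv4 (hypothetical) accepts only addresses that parse as plain IPv4.
func validIPv4(s string) bool {
	addr, err := netip.ParseAddr(s)
	return err == nil && addr.Is4()
}

func main() {
	fmt.Println(validIfaceName("br-banger0"), validIfaceName("eth0:1"), validIfaceName("a-very-long-bridge-name"))
	// prints: true false false
	fmt.Println(validIPv4("172.16.0.2"), validIPv4("::1"), validIPv4("not-an-ip"))
	// prints: true false false
}
```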
Thales Maciel	6b543cb17f	firecracker: adopt firecracker-jailer for VM launch (Phase B) Each VM's firecracker now runs inside a per-VM chroot dropped to the registered owner UID via firecracker-jailer. Closes the broad ambient- sudo escalation surface that survived Phase A: the helper still needs caps for tap/bridge/dm/loop/iptables, but the VMM itself no longer runs as root in the host root filesystem. The host helper stages each chroot up front: hard-links the kernel and (optional) initrd, mknods block-device drives + /dev/vhost-vsock, copies in the firecracker binary (jailer opens it O_RDWR so a ro bind fails with EROFS), and bind-mounts /usr/lib + /lib trees read-only so the dynamic linker can resolve. Self-binds the chroot first so the findmnt-guarded cleanup can recurse safely. AF_UNIX sun_path is 108 bytes; the chroot path easily blows past that. Daemon-side launch pre-symlinks the short request socket path to the long chroot socket before Machine.Start so the SDK's poll/connect sees the short path while the kernel resolves to the chroot socket. --new-pid-ns is intentionally disabled — jailer's PID-namespace fork makes the SDK see the parent exit and tear the API socket down too early. CapabilityBoundingSet for the helper expands to add CAP_FOWNER, CAP_KILL, CAP_MKNOD, CAP_SETGID, CAP_SETUID, CAP_SYS_CHROOT alongside the existing CAP_CHOWN/CAP_DAC_OVERRIDE/CAP_NET_ADMIN/CAP_NET_RAW/ CAP_SYS_ADMIN. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 14:38:07 -03:00
Thales Maciel	d73efe6fbc	firecracker: drop sudo sh -c, race chown against SDK probe in Go Replace the shell-string launcher in buildProcessRunner with a direct exec.Command. The previous sh -c wrapper relied on shellQuote escaping for every MachineConfig field that flowed into the launch script; any future field that ever carried an attacker-controlled value would have become RCE-as-root. The new path passes binary path and flags as separate argv entries, so there is no shell to interpret anything. The wrapper also did two things the shell can no longer do for us: 1. umask 077 — moved to syscall.Umask in cmd/bangerd/main.go so every firecracker child (and any other file the daemon creates) inherits 0600 by default. Single-user dev sandbox state should be private. 2. chown_watcher — the SDK's HTTP probe inside Machine.Start connects to the API socket the moment it appears. Under sudo the socket is created root-owned and the daemon's connect(2) gets EACCES, so the post-Start EnsureSocketAccess never runs. The shell papered over this with a backgrounded chown loop. Replaced by fcproc.EnsureSocketAccessForAsync: same race-window guarantee, in pure Go, kicked off in LaunchFirecracker right before Start and awaited right after. Tests updated: shell-substring assertions replaced with cmd-arg assertions, plus a new fcproc test pinning the async chown sequence. Smoke (full systemd two-service install + KVM scenarios) passes.	2026-04-27 20:14:01 -03:00
Thales Maciel	c4e1cb5953	daemon: tighten concurrency around pulls, cleanup, and handle persistence Four targeted fixes from a race-condition audit of the daemon package. None change behaviour on the happy path; each closes a window where a concurrent or interrupted RPC could strand state on the host. - KernelDelete now holds the same per-name lock as KernelPull / readOrAutoPullKernel. Without it, a delete racing a concurrent pull could remove files mid-write or land between the pull's manifest write and its first use. - cleanupRuntime no longer early-returns on an inner waitForExit failure; DM snapshot, capability, and tap teardown always run and every error is folded into the returned errors.Join. EBUSY against a still-alive firecracker is benign and surfaces in the joined error rather than stranding kernel state across daemon restarts. - Per-name image / kernel pull locks switch from *sync.Mutex to a 1-buffered chan struct{}. Acquire is a select on ctx.Done(), so a peer waiting behind a pull whose RPC was cancelled can bail out instead of blocking forever on a pull nobody is consuming. - setVMHandles writes the per-VM scratch file before updating the in-memory cache. A daemon crash between the two now leaves disk ahead of memory (recoverable: reconcile re-seeds the cache from the file on next start) rather than memory ahead of disk (lost handles → stranded DM/loops/tap). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 19:32:43 -03:00
Thales Maciel	777b597a1e	smoke: smol VMs by default + JOBS auto-detects nproc Three quality-of-life improvements now that the daemon-side races that gated parallel mode are fixed: 1. Smol VMs by default. Smoke installs a tuned config.toml at /etc/banger/config.toml between `system install` and `system restart` so the respawned daemon picks up: vcpu = 2 memory_mib = 1024 disk_size = "2G" system_overlay_size = "2G" Smoke scenarios assert behavior, not capacity — they don't need 4 vCPU / 8 GiB / 8 GiB / 8 GiB. Per-VM RAM cost drops from 8 GiB to 1 GiB; nominal disk drops from 16 GiB to 4 GiB (sparse, so actual use is small either way, but the new ceiling is gentler on hosts that can't overcommit). Scenarios that test reconfiguration (vm_set's --vcpu 2 → 4) still pass --vcpu explicitly, so this default doesn't perturb their assertions. 2. JOBS defaults to nproc. The Makefile resolves JOBS to `$(shell nproc)` if unset; the smoke script's existing cap of 8 keeps the parallel pool sane on bigger hosts. The script always passes --jobs N now, so behavior is consistent. Override with `make smoke JOBS=1` for a fully serial run. 3. Help text catches up. --help no longer flags parallelism as experimental (the underlying daemon races are fixed) and now describes the small-VM default. `make help` mentions the new default and how to override. Verified: `make smoke` (no JOBS) on a 32-core box auto-runs with JOBS=8, smol VMs, 21/21 PASS in 172s. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 17:36:17 -03:00
Thales Maciel	72882e45d7	daemon: serialise concurrent image/kernel pulls + atomic-rename seed refresh Three concurrency bugs surfaced by `make smoke JOBS=4` that all stem from `vm.create` paths assuming single-caller semantics: 1. Kernel auto-pull manifest race. Parallel `vm.create` calls that each need to auto-pull the same kernel ref both run kernelcat.Fetch in parallel against the same /var/lib/banger/kernels/<name>/. Fetch writes manifest.json non-atomically (truncate + write); the peer reads it back mid-write and trips "parse manifest for X: unexpected end of JSON input". Fix: per-name `sync.Mutex` map on `ImageService` (kernelPullLock). `KernelPull` and `readOrAutoPullKernel` both acquire it and re-check `kernelcat.ReadLocal` after the lock so a peer who finished while we waited is treated as success — `readOrAutoPullKernel` does NOT call `s.KernelPull` because that path errors with "already pulled" on a peer-success, which would be wrong for auto-pull. Different kernels stay parallel. 2. Image auto-pull race. Same shape as the kernel race but on the image side: parallel `vm.create` calls both run pullFromBundle / pullFromOCI for the missing image (each ~minutes of OCI fetch + ext4 build). The publishImage atom under imageOpsMu only protects the rename + UpsertImage commit, so the loser does all the work only to fail at the recheck with "image already exists". Fix: per-name `sync.Mutex` map on `ImageService` (imagePullLock). `findOrAutoPullImage` acquires it, re-checks FindImage, and only then calls PullImage. Loser short-circuits with the freshly-published image instead of redoing minutes of work. PullImage's own publishImage recheck stays as defense-in-depth for callers that bypass the auto-pull path. 3. Work-seed refresh race. When the host's SSH key has rotated since an image was last refreshed, `ensureAuthorizedKeyOnWorkDisk` triggers `refreshManagedWorkSeedFingerprint`, which rewrote the shared work-seed.ext4 in place via e2rm + e2cp. 
Peer `vm.create` calls doing parallel `MaterializeWorkDisk` rdumps observed a torn ext4 image — "Superblock checksum does not match superblock". Fix: stage the rewrite on a sibling tmpfile (`<seed>.refresh.<pid>-<ns>.tmp`) and atomic-rename. Concurrent readers either have the file open (kernel keeps the pre-rename inode alive) or open after the rename (see the new inode) — never observe a partial state. Two parallel refreshes are idempotent (same daemon, same SSH key) so unique tmp names are enough; whichever rename lands last wins, with identical content. UpsertImage runs after the rename so the recorded fingerprint always matches what's on disk. Plus one smoke harness fix: reclassify `vm_prune` from `pure` to `global`. `vm prune -f` removes ALL stopped VMs system-wide, not just the ones the scenario created — so a parallel peer scenario that happens to have its VM in `created`/`stopped` momentarily gets wiped. Moving prune to the post-pool serial phase keeps it from racing with in-flight scenarios. After all four fixes, `make smoke JOBS=4` passes 21/21 in 174s (serial baseline 141s; the small overhead is the buffered-output and `wait -n` semaphore cost — well worth the parallelism for fast-iter work on a 32-core box). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 17:24:11 -03:00
Thales Maciel	115eec8576	smoke: discoverable scenarios + selectable runs + parallel dispatch `scripts/smoke.sh` was a 600-line linear script: no way to see what it covers without reading the whole thing, and no way to run a single scenario when iterating. Every iteration paid the full ~5-10 min suite, which made fast feedback loops painful enough to avoid the suite. Refactor into a registry + per-scenario functions: - Top-of-file SMOKE_SCENARIOS (ordered) + SMOKE_DESCS (one-line desc per scenario) + SMOKE_CLASS (pure / repodir / global) drive both listing and dispatch. The 21 existing scenario blocks become scenario_<name> functions. Bodies are the inline blocks verbatim, modulo the workspace fixture move described below. - New CLI: --list (cheap discovery, no install / no env-vars), --scenario NAME (or NAME,NAME,...), --jobs N (parallel dispatch), -h / --help. - New setup_fixtures runs once after the install/doctor/restart preamble and produces the throwaway git repo at $repodir that 'repodir'-class scenarios consume. Lifted out of scenario_workspace_run so single- scenario invocations (e.g. --scenario workspace_dryrun) get the fixture even when the scenario that historically built it isn't selected. - Wipe ~/.local/state/banger/ssh/known_hosts in the install preamble. `system uninstall --purge` clears /var/lib/banger but the user-side known_hosts persists by design — and smoke creates VMs that reuse guest IPs (172.16.0.2 etc.) with fresh host keys every run, so a leftover entry trips StrictHostKeyChecking and the daemon's wait- for-ssh sees only timeouts. This was the real cause of the "guest ssh did not come up" flakes that surface across smoke iterations. Parallel dispatch: - --jobs N opts into a slot-limited pool: 'pure' scenarios fan out as individual jobs; 'repodir' scenarios fuse into a single serial chain (since they mutate $repodir in registry order); 'global' scenarios run serially after the pool, one at a time. 
- Cap is min(N, 8) — each parallel slot runs an 8 GiB VM, so RAM is the binding constraint. - Parallel-mode stdout/stderr per scenario buffer to per-scenario logs and emit one PASS/FAIL line on completion; on FAIL the buffer is dumped. Serial mode (--jobs 1, the default) keeps stdout unbuffered exactly as before. - Parallelism is documented as experimental in --help: it surfaces real daemon-side concurrency bugs (image auto-pull manifest race, work-seed-refresh race on the shared work-seed.ext4) that don't appear in serial mode and that need their own fix in the daemon. Serial (--jobs 1) is the reliable path; --jobs N is for fast- iteration dev work where occasional re-runs are acceptable. Exit codes: 0 ok, 1 assertion failed, 2 usage error (unknown scenario, missing SCENARIO=), 77 explicit selection skipped (NAT when sudo iptables is unavailable AND nat is the only selected scenario; soft-skip otherwise). Makefile additions: - `make smoke-list` — cheap discovery, no smoke-build dep, no env vars. - `make smoke-one SCENARIO=name` — single-scenario run, full preamble. MAKECMDGOALS guard catches missing SCENARIO= before any rebuild. - `make smoke JOBS=N` — passes through to the script's --jobs N. - Help text covers all three. Verified: serial full suite passes 21/21 in ~140s on this host; make smoke-one SCENARIO=workspace_restart runs the recently-added regression test alone in ~50s. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 16:56:57 -03:00
Thales Maciel	c9358ab390	daemon: sync guest over ssh before stop to preserve workspace writes VM stop has been quietly losing data freshly written via `vm workspace prepare`: stop+start of a workspace-prepared VM would come back with /root/repo wiped on the work disk. Root cause is firecracker + Debian's systemd defaults. FC's SendCtrlAltDel (the only "graceful shutdown" action FC exposes) just delivers the keystroke; what the guest does with it is its choice. Debian routes ctrl-alt-del.target -> reboot.target, so the guest reboots, FC stays alive, the daemon's 10s wait_for_exit window expires, and the SIGKILL fallback drops anything still in FC's userspace I/O path. For an idle VM that's invisible. For one that just took 100s of small writes through a workspace prepare, it's data loss. Fix is to dial the guest over SSH inside StopVM and run `sync; systemctl --no-block poweroff || /sbin/poweroff -f &` before the existing SendCtrlAltDel path. The synchronous `sync` is the load-bearing piece — it blocks until every dirty page hits virtio-blk and lands in the on-host root.ext4. Whether poweroff completes before SIGKILL fires is incidental; sync has already run. SSH unreachable falls back to the old SendCtrlAltDel behaviour so a broken-network guest can't make stop hang. Bounded by a 5s SSH-dial timeout so a half-broken guest can't extend the overall stop window past gracefulShutdownWait. Also adds two smoke scenarios: - `workspace + stop/start`: prepare -> stop -> start -> assert marker survives. This is the regression that caught the bug. - `vm exec`: end-to-end coverage for `d59425a` — auto-cd into the prepared workspace, exit-code propagation, dirty-host warning, --auto-prepare resync, refusal on stopped VM. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 15:41:32 -03:00
Thales Maciel	d59425adb9	feat(vm): add vm exec command with workspace dirty detection Introduces three interconnected features for persistent VM workflows: 1. `banger vm exec <vm> -- <cmd>`: runs a command in the prepared workspace, automatically cd-ing into the guest path and wrapping via `mise exec --` so mise-managed tools are on PATH. Falls back to a plain exec when mise isn't available. Exit code propagates verbatim. 2. Workspace persistence: workspace.prepare now stores the guest path, host source path, and HEAD commit into a new `workspace_json` column on the vms table (migration 3). This state survives daemon restarts and informs both dirty-checking and auto-prepare. 3. Dirty detection: `vm exec` compares the stored HEAD commit against the current host repo HEAD. When stale it warns and, with --auto-prepare, re-syncs the workspace before running. Also: - WORKSPACE column added to `banger ps` / `vm list` - `banger vm` quick reference updated with `vm exec` entry	2026-04-26 23:53:45 -03:00
Thales Maciel	c8637b0fe4	daemon: auto-trust mise configs on workspace prepare vm run ./repo (and the explicit vm workspace prepare) imports the host user's own checkout. Any .mise.toml that lands in the guest would otherwise prompt on the first guest command — 'mise trust: hash mismatch, run "mise trust"' — and stall what should be a zero-friction sandbox launch. The repo just came from the host, the guest is single-tenant root@<vm>.vm, the user already trusts this checkout: auto-trust is the right default here. After workspaceImportHook succeeds, run `if command -v mise >/dev/null 2>&1; then mise trust --quiet --all <guest_path> || true; fi` inside the guest. Best effort: a missing mise binary, a non-zero exit, or a no-op trust all log at debug only and never fail prepare. The path is shell-quoted via ws.ShellQuote so guest paths with spaces or quotes don't break the argument. Tests pin the script shape (command -v guard + --quiet --all flag + trailing `|| true`) and assert the script actually fires after a successful import. A path with an apostrophe round-trips via ws.ShellQuote without truncation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 23:08:41 -03:00
Thales Maciel	fa4292756d	daemon: surface previously-swallowed errors at warn Three recovery-path errors were silently dropped: - vm_lifecycle.go startVMLocked persisted the VMStateError record with `_ = s.store.UpsertVM(...)`. If the persist failed the user saw the original start error but operators had no way to find out the store had also drifted out of sync. - vm_lifecycle.go deleteVMLocked killed the firecracker process with `_ = s.net.killVMProcess(...)`. cleanupRuntime tears it down regardless, so the explicit kill is best-effort, but a permission-denied / EPERM was still worth logging. - capabilities.go cleanupPreparedCapabilities collected per-cap errors with errors.Join. Callers get the aggregated value but couldn't tell which capability failed when more than one did. All three now log Warn before the original behaviour continues. The aggregate return value, control flow, and user-visible error strings are unchanged — this is purely a "less silence in the journal" pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 22:30:51 -03:00
Thales Maciel	71a332a6a1	cli: maturity polish — color, error translation, tabwriter consistency Adds three small but high-leverage presentation tweaks for v0.1: 1. internal/cli/style is a new ~70 LOC package with Pass/Fail/Warn/Dim/Bold helpers. Each is TTY-gated and obeys NO_COLOR. No external dep. Wired into the doctor PASS/FAIL/WARN status, the "banger:" error prefix on stderr, and the dim 'ready in <elapsed>' line. 2. internal/cli/errors translates rpc.ErrorResponse into user-facing text. operation_failed becomes invisible (the message wins); not_found, already_exists, bad_request, bad_version, unauthorized, unknown_method get short labels; unknown codes pass through. The daemon-attached op_id lands in dim parens — paste into journalctl --grep to find the daemon log line that produced the failure. 3. Tabwriter config converges on (0, 8, 2, ' ', 0) across every list/table command. The vm prune confirmation table picked up the right config; system install + system status switched from bare "key: value\n" lines to tabular form. printVMSpecLine drops its Unicode middle dot for an ASCII '|' so terminals without UTF-8 render cleanly. Tests cover translateRPCError for every code, style helpers no-op on non-TTY and under NO_COLOR. Smoke status greps switch from "key: value" to "key value" to match the new format. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 22:27:07 -03:00
Thales Maciel	e47b8146dc	daemon: thread per-RPC op_id end-to-end Today there's no way to correlate a CLI failure with a daemon log line. operationLog records relative timing but no id, two concurrent vm.start calls log indistinguishably, and the async vmCreateOperationState.ID is user-facing yet never reaches the journal. The root helper logs plain text to stderr while bangerd logs JSON, so a merged journalctl is hard to grep across the trust-boundary split. Mint a per-RPC op id at dispatch entry, store it on context, and include it as an "op_id" attr on every operationLog record. The id is stamped onto every error response (including the early short-circuit paths bad_version and unknown_method). rpc.Call forwards the context op id on requests so a daemon RPC and the helper RPCs it triggers all share one id. The helper now logs JSON to match bangerd, adopts the inbound id, and emits a single "helper rpc completed" / "helper rpc failed" line per call so operators can see at a glance how long each privileged op took. vmCreateOperationState.ID is now the same id dispatch generated for vm.create.begin — one identifier between client status polls, daemon logs, and helper logs. The wire format gains two optional fields: rpc.Request.OpID and rpc.ErrorResponse.OpID, both omitempty so older peers (and the opposite direction) ignore them. ErrorResponse.Error() now appends "(op-XXXXXX)" to its string form when set; existing callers that just print err.Error() get the id for free. Tests cover: dispatch stamps op_id on unknown_method, bad_version, and handler-returned errors; rpc.Call exposes the typed *ErrorResponse via errors.As so the CLI can read code/op_id; ctx op_id is forwarded to the server in the request envelope. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 22:13:44 -03:00
Thales Maciel	b8c48765fb	daemon: skip fsck_snapshot on freshly-created system overlays The fsck_snapshot lifecycle step exists to repair stale bitmaps in a COW file reused from a prior aborted start — without it, the later e2cp/e2rm calls in patch_root_overlay refuse to touch the snapshot. On a freshly-created COW there are no stale bitmaps to repair, so e2fsck -fy is pure overhead. system_overlay already tracks whether it created the file this run (sc.systemOverlayCreated, used to drive the rollback path). Reuse that flag to skip e2fsck entirely on the create-fresh path. The reused-COW path keeps the fsck for safety. Saves a few hundred ms per VM create — small absolute win on top of the lazy-mkfs change, but free. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 21:37:14 -03:00
Thales Maciel	74a2d064fd	system: mkfs work disks with lazy_itable_init + lazy_journal_init mkfs.ext4 zeroes the entire inode table and journal at format time unless told otherwise. On an 8 GiB work disk that's roughly 500-700ms of host CPU/IO per 'banger vm create', for a one-time small per-write penalty inside the guest the first time it touches an unwritten inode that nobody can perceive. Centralise the canonical mkfs -E option list as system.MkfsExtraOptions and use it everywhere banger calls mkfs.ext4 on a VM-internal image: the no-seed work disk, MaterializeWorkDisk, BuildWorkSeedImage, and the imagepull rootfs builder. The work-disk paths feed vm create directly; the others are one-off but still benefit from the faster format. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 21:32:57 -03:00
Thales Maciel	74e5a7cedb	cli: wait for the daemon socket to answer ping after install/restart systemd's Type=simple reports a unit "active" the moment its ExecStart binary is exec()'d, which for bangerd happens well before the daemon has read its config and bound /run/banger/bangerd.sock. 'banger system install' and 'banger system restart' both returned inside that window, so the very next 'banger ...' command would hit ensureDaemon, miss on a single ping, and exit with "service not reachable; run sudo banger system restart" — the same restart that had just succeeded. Smoke tripped over this on every run. Add waitForDaemonReady: poll daemonPing for up to 15s after the restart returns. Both the system install and restart paths now block until the daemon is genuinely accepting RPCs, so the next CLI invocation can talk to it without retrying. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 21:22:31 -03:00
Thales Maciel	679cf87cfd	cli: log elapsed time after vm create reaches ready Print '[vm create] ready in <elapsed>' to stderr once the create operation completes successfully. Surfaces how long the full create-to-ready cycle took (image resolve + work disk + boot + guest agents + capability post-start), which the per-stage progress lines don't add up to in any visible way. Format adapts to scale: sub-second prints as 'NNNms', sub-minute keeps one decimal ('4.7s'), longer prints as 'MmSSs'. Always emitted (not gated on TTY) so logged and CI output carry the number too. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 21:17:47 -03:00
Thales Maciel	a3a51e06c4	daemon: build the work disk fresh instead of cloning the seed file Old flow on every 'banger vm run' that hit the seeded path: CopyFilePreferClone the seed file (FICLONE attempt + io.Copy + fsync fallback), then e2fsck -fp + resize2fs to grow the FS to the spec size. On filesystems without reflink support that meant pushing 512+ MiB through the kernel followed by a full filesystem check and resize, even though the seed only carries a few KB of dotfiles — minWorkSeedBytes is 512 MiB but the actual payload is tiny. That is the minute-long stall on the 'cloning work seed' stage users see today. Replace the copy with a sized fresh ext4: truncate to WorkDiskSizeBytes, mkfs.ext4 -F -E root_owner=0:0, debugfs rdump to extract the seed's contents, then ingest each file via the sudoless ext4 toolkit (MkdirExt4 / WriteExt4FileOwned, root:root, mode preserved). Sub-second regardless of seed size or requested work-disk size; no fsck or resize needed because the FS is created at its final size from the start. Also drop the now-implementation-pinned TestEnsureWorkDiskClonesSeedImageAndResizes — its premise (a scripted e2fsck/resize2fs sequence) no longer reflects the code, and smoke covers the new flow end to end. Stage label changed from 'cloning work seed' to 'applying work seed' to match what actually happens. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 20:42:10 -03:00
Thales Maciel	6c37fec17b	images: remove the docker field The 'docker' bit on model.Image was unused at runtime — every code path that branched on it had been removed earlier, leaving only the field, the SQL column, the --docker flag, and the #feature:docker sentinel that BuildMetadataPackages emitted into a hash file. None of those have callers anymore. Strip the field from the model, the API params, the SQLite column, the CLI flag, and BuildMetadataPackages's signature. Add migration 2 (drop_images_docker) so existing installs lose the column on next daemon start. ALTER TABLE ... DROP COLUMN is fine: SQLite has supported it since 3.35 (2021). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 20:28:40 -03:00
Thales Maciel	408ad6756c	system: build work-seed without sudo BuildWorkSeedImage used to mount the source rootfs and the new seed image — both via sudo. After the privilege split (`59e48e8`) the owner daemon runs without sudo and those mounts fail silently inside the image-pull pipeline (runBuildWorkSeed swallows errors), so every freshly pulled image landed in the store with an empty WorkSeedPath and 'banger doctor' kept warning that /root would be empty. Rewrite the builder around the existing sudoless toolkit: 1. RdumpExt4Dir extracts /root from the source rootfs into a host tempdir (debugfs, no mount). 2. truncate + mkfs.ext4 -F -E root_owner=0:0 produces an empty user-owned ext4 file. 3. A Go walk over the staged tree calls MkdirExt4 / WriteExt4FileOwned for every dir + regular file, forcing root:root and preserving mode bits. Symlinks and special files in /root are skipped — extremely rare on a stock distro and not part of what makes a useful seed. Fix won't retroactively populate already-pulled images: re-pull the default image (e.g. 'banger image delete debian-bookworm && banger image pull debian-bookworm') to get a seeded work-seed.ext4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 20:18:23 -03:00
Thales Maciel	3ec357090a	daemon: doctor passes vm dns when banger itself owns the port The previous check tried to bind 127.0.0.1:42069 and warned on 'address already in use' — which is exactly the state when the banger daemon is running, the case the user ran 'doctor' to confirm. The warning was actively misleading. Now, on 'address already in use', probe the listener with a *.vm DNS query that only banger's vmdns server answers authoritatively (NXDOMAIN with Authoritative=true). If the shape matches we pass; if the port is held by something else we still warn. Tests cover both branches: a real vmdns server is accepted, and a silent UDP listener on the same port is rejected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 18:57:27 -03:00
Thales Maciel	35bfac3f13	cli: rewrite help text for AI-driven discovery Frontier models tend to discover a CLI by running --help, scanning the Long description, and inferring the dominant workflow from the examples. Today's banger help reads like a man page index — every verb has a one-line Short and nothing else. This rewrites the groups (banger, vm, vm workspace, image, kernel, system, ssh-config) so each landing page answers "what is this for, what's the 80% command, what comes next" in three to ten lines, with runnable examples. Also disambiguates the near-twin lifecycle commands so a model reading the subcommand index can tell stop/kill/delete apart at a glance: start Start a stopped VM stop Stop a running VM gracefully restart Stop then start a VM kill Force-kill a VM (use when 'vm stop' hangs) delete Stop a VM and remove its disks (irreversible) vm create / vm ssh / vm logs / vm show pick up Long descriptions and examples for the same reason. No behaviour changes; help text only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 15:02:08 -03:00
Thales Maciel	41ced66a54	mise: pin go and shellcheck go 1.25.0 matches go.mod's toolchain. shellcheck is the only non-go tool make lint hard-requires. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 13:11:51 -03:00
Thales Maciel	b0b1300314	docs: add the privilege model document Explain what runs as the owner user vs root, every helper RPC method and its validation gate, the on-disk paths banger writes, network mutations, and how install/uninstall work end to end. The aim is to give a reader enough information to grant or refuse the privileges banger asks for during system install with their eyes open. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 12:55:18 -03:00
Thales Maciel	47d83ce4d7	gitignore: exclude the entire build directory Replace the per-subdir entries with a single /build/ to cover any new outputs Make or scripts add later (build/manual exists today; future docs/coverage variants would otherwise need new lines). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 12:55:11 -03:00
Thales Maciel	59e48e830b	daemon: split owner daemon from root helper Move the supported systemd path to two services: an owner-user bangerd for orchestration and a narrow root helper for bridge/tap, NAT/resolver, dm/loop, and Firecracker ownership. This removes repeated sudo from daily vm and image flows without leaving the general daemon running as root. Add install metadata, system install/status/restart/uninstall commands, and a system-owned runtime layout. Keep user SSH/config material in the owner home, lock file_sync to the owner home, and move daemon known_hosts handling out of the old root-owned control path. Route privileged lifecycle steps through typed privilegedOps calls, harden the two systemd units, and rewrite smoke plus docs around the supported service model. Verified with make build, make test, make lint, and make smoke on the supported systemd host path.	2026-04-26 12:43:17 -03:00
Thales Maciel	3edd7c6de7	daemon: build a work-seed during image pull, refresh doctor check Before this change `banger image pull` (both OCI-direct and bundle paths) shipped images with an empty WorkSeedPath — the BuildWorkSeedImage helper existed only behind the hidden `banger internal work-seed` CLI. Every pulled image hit ensureWorkDisk's no-seed branch, and the guest booted with a bare /root (no .bashrc, no .profile, none of the distro defaults). Pull now calls BuildWorkSeedImage after the rootfs is finalised (OCI) or fetched (bundle). The builder is behind a new `workSeedBuilder` test seam so existing pull tests don't accidentally demand sudo mount. The build failure is non-fatal: any error logs a warning and leaves WorkSeedPath empty — images stay publishable even if the pulled rootfs has no /root to extract. Verified end-to-end by wiping the cached smoke image and re-pulling: work-seed.ext4 lands in the artifact dir next to rootfs.ext4, and all 21 smoke scenarios pass. Also refreshes the "feature /root work disk" fallback tooling check — the no-seed path no longer touches mount/umount/cp after commit `0e28504`, so the doctor check now only requires truncate + mkfs.ext4. The warn copy updates from "new VM creates will be slower" to "guest /root will be empty", which matches the actual tradeoff post-refactor. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 20:24:10 -03:00
Thales Maciel	02773c1cf5	daemon: delete flattenNestedWorkHome and normaliseHomeDirPerms Both helpers are stranded: commit `f068536` dropped their last callers from ensureAuthorizedKeyOnWorkDisk and seedAuthorizedKeyOnExt4Image, and commit `6ab1a2b` dropped the ensureGitIdentity / runFileSync calls that still held them up. Every on-disk-patch code path now drives the ext4 image directly via MkdirExt4 / WriteExt4FileOwned / EnsureExt4RootPerms. Also drops TestFlattenNestedWorkHomeCopiesEntriesIndividually — premise gone with the function. The sshd_config_test comment referencing normaliseHomeDirPerms now points at EnsureExt4RootPerms. Net sudo reduction across the five-commit series: work-disk creation, authsync, image seeding, git identity sync, and file_sync all drop sudo entirely against user-owned ext4 files. Remaining sudo in internal/daemon is confined to firecracker process launch, tap/dm device setup, iptables/NAT, and dmsnap/fcproc — things that legitimately need CAP_SYS_ADMIN or CAP_NET_ADMIN. MountTempDir stays on exclusively as an image-build helper. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 18:33:06 -03:00
Thales Maciel	6ab1a2b844	daemon: rewrite git identity sync + file_sync on ext4 toolkit ensureGitIdentityOnWorkDisk, writeGitIdentity, runFileSync, and copyHostDir all dropped their mount + sudo install/mkdir/chmod/chown scaffolding. Every write now goes through MkdirExt4, WriteExt4FileOwned, ReadExt4File, and the new MkdirAllExt4 helper — all sudoless against user-owned ext4 images. Net effect with the prior two commits: ensureWorkDisk, authsync, image seeding, git identity sync, and file_sync no longer mount the work disk or spawn sudo mkdir/chmod/chown/cat/install. Only the image-build path (which legitimately produces root-owned artifacts) still touches MountTempDir. The filesystemRunner test harness grew a small debugfs/e2cp/e2rm emulator so the WorkspaceService tests keep exercising their real code paths without a live ext4 image. The mock is deliberately dumb — it only implements the subset runFileSync and writeGitIdentity drive. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 18:29:30 -03:00
Thales Maciel	f0685366ec	daemon: rewrite authsync + image seeding on ext4 toolkit ensureAuthorizedKeyOnWorkDisk and seedAuthorizedKeyOnExt4Image both drove mount + sudo mkdir/chmod/chown/cat/install to patch /.ssh/authorized_keys into a work disk or work-seed. Both now delegate to a shared provisionAuthorizedKey helper that uses the ext4 toolkit introduced in `7704396` — EnsureExt4RootPerms + MkdirExt4 + Ext4PathExists/ReadExt4File + WriteExt4FileOwned. No mount, no sudo, no host-path staging. Drops ~10 sudo call sites from the VM create and image pull flows and deletes the TestEnsureAuthorizedKeyOnWorkDiskRepairsNestedRootLayout premise (flattenNestedWorkHome will disappear entirely in the next commit — the no-seed path no longer copies /root, and the work-seed path produces flat seeds). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 18:21:50 -03:00
Thales Maciel	0e28504892	daemon: rewrite ensureWorkDisk no-seed path to skip the mount + cp The no-seed branch used to mount the base rootfs read-only, mount the freshly mkfs'd work disk read-write, sudo-cp /root from one to the other, then flatten any accidental /root/root/ nesting. Five sudo call sites packed into a fallback that the common image path doesn't even exercise. Replace with: `mkfs.ext4 -F -E root_owner=0:0` and nothing else. mkfs already stamps inode 2 as root:root:0755 — sshd's StrictModes walks that dir's ownership when the work disk mounts at /root in the guest, so getting it right from mkfs means authsync can just write authorized_keys without any repair pass. Tradeoff: no-seed VMs lose the base rootfs's default /root dotfiles (.bashrc, .profile). The no-seed path is explicitly the degraded fallback — `banger doctor` already warns about it — and users who want those back have two documented knobs: rebuild the image with a work-seed, or land them via [[file_sync]]. Sudo call sites removed: 5 (MountTempDir × 2, sudo cp -a, flattenNestedWorkHome's chmod/cp/rm). flattenNestedWorkHome itself stays alive for now — authsync + image_seed still call it — and gets deleted in commit 5 once its last caller goes away. While here: fix the freshly-added EnsureExt4RootPerms helper. `set_inode_field <2> mode N` overwrites the full i_mode word instead of preserving the type nibble, so the initial implementation that passed just the permission bits (0755) would reset the fs root to regular-file shape and break the next kernel mount with "Structure needs cleaning." The corrected call OR's in S_IFDIR (0o040000) explicitly. Test updated to match. Smoke: 21/21 scenarios green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 18:09:32 -03:00
Thales Maciel	77043966d4	system: add ext4 toolkit for non-sudo work-disk writes The daemon mounts every VM's work disk on the host via sudo, copies files in as root, chmods+chowns them, and unmounts. That's ~18 of banger's runtime RunSudo calls. The ext4 image is a regular file the daemon user owns; e2cp / debugfs can write to it directly and bake uid/gid/mode into the filesystem metadata without the caller being root. `imagepull.ApplyOwnership` already proves this works in production (OCI layer flattening writes 0/0/root-owned inodes from an unprivileged daemon). This commit adds the toolkit layer. Callers land in the next four commits: - MkdirExt4 — idempotent directory create + metadata reset, single debugfs batch - WriteExt4FileOwned — e2cp + debugfs-driven uid/gid/mode, auto-cleans the host tempfile - SetExt4Ownership — sif + set_inode_field batch for existing inodes (no mkdir implied) - EnsureExt4RootPerms — fixes inode <2> (the fs root, which is `/root` once the work disk is mounted inside the guest), the thing sshd's StrictModes walks - Ext4PathExists — yes/no probe via `debugfs -R "stat ..."` with "File not found" detection - ReadExt4File — bytes-returning wrapper around the existing ReadDebugFSText with the same path rejection Design notes: - extfsRun auto-switches Run ↔ RunSudo on imagePath's type: regular files get the unprivileged path, block devices (dm-snapshot, loops) get sudo. The same helper works for both patchRootOverlay (dm device) and work-disk writes (user-owned file). No caller flag needed — os.Stat tells us. - debugfsScript batches set_inode_field + sif + mkdir lines into one `debugfs -w -f -` stdin invocation on any Runner that implements StdinRunner (production's system.Runner does). Matches imagepull.ApplyOwnership's existing pattern; dramatically cheaper than per-call subprocesses. 
- Paths are escaped for debugfs on the way in: spaces get double-quoted, double-quote/backslash/newline are rejected outright (debugfs's hand-rolled parser doesn't reliably escape those and we'd rather fail fast than silently scribble over the wrong inode). Tests: seven behaviour assertions via scripted + stdin-scripted runners — existence probe (found + missing + rejection), read passthrough, mkdir batch contents (new vs. pre-existing path), write tempfile cleanup + mode line shape, root-inode addressing, and the full rejectDebugfsUnsafePath matrix. No production wiring change in this commit — the helpers land unused. `make smoke` stays green (21/21) because nothing else shifted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 16:31:50 -03:00
Thales Maciel	d743a8ba4b	daemon: persist teardown fallbacks and reject unsafe import paths Preserve cleanup after daemon restarts and harden OCI and tar imports against filenames that debugfs cannot encode safely. Mirror tap, loop, and dm teardown identity onto VM.Runtime, teach cleanup and reconcile to fall back to those persisted fields when handles.json is missing or corrupt, and clear the recovery state on stop, error, and delete paths. Reject debugfs-hostile entry names during flattening and in ApplyOwnership itself, then add regression coverage for corrupt handles.json recovery and unsafe import paths. Verified with targeted go tests, make lint-go, make lint-shell, and make build.	2026-04-23 16:21:59 -03:00
Thales Maciel	86a56fedb3	daemon: extract StatsService sibling; shrink VMService's surface Closes commit 3 of the god-service decomposition. VMService still owned 45+ methods after the startVMLocked extraction and RPC table landed in commits 1 and 2. Stats / ports / health / vsock-ping sit in a corner of that surface that doesn't share any state with lifecycle orchestration — nothing about "what's this VM's CPU doing" belongs in the same service as Create/Start/Stop/Delete/Set. New StatsService owns: - GetVMStats / getVMStatsLocked / collectStats (stats collection) - HealthVM / PingVM (vsock-agent health probe) - PortsVM + buildVMPorts + probeWebListener + probeHTTPScheme + dedupeVMPorts (listening-port enumeration) - pollStats (background ticker refresh) - stopStaleVMs (auto-stop sweep past config.AutoStopStaleAfter) The three VMService touch-points stats genuinely needs — vmAlive, vmHandles, the per-VM lock helpers, plus cleanupRuntime for the stale-sweep tear-down — come in as function-typed closures, not a *VMService pointer. StatsService has no back-reference to its sibling. Mirrors the dependency-struct pattern WorkspaceService already uses. Wiring: d.stats is populated in wireServices AFTER d.vm (closures must see a non-nil d.vm at call time). Dispatch table's four entries (vm.stats / vm.health / vm.ping / vm.ports) now resolve through d.stats. Background loop's pollStats / stopStaleVMs tickers do the same. Dispatch surface from the RPC client's perspective is byte-identical. After this commit: - vm_stats.go and ports.go are deleted; their content (plus the stats-specific fields) lives in stats_service.go. - VMService loses 12 methods. It's still the biggest service (~30 methods, all lifecycle-supporting: handle cache, disk provisioning, preflight, create-ops registry, lock helpers, the lifecycle verbs themselves) but it's finally one coherent concern instead of five. 
Tests: - TestWireServicesInstantiatesStatsService — pins that the wiring order puts d.stats non-nil + its five closures all populated. Prevents a silent background-loop regression. - All existing tests that called d.vm.HealthVM / d.vm.PingVM / d.vm.PortsVM / d.vm.collectStats were re-pointed at d.stats. Smoke: all 21 scenarios green, including vm ports (exercises the new PortsVM entry end-to-end) and the long-running workspace scenarios (exercise the background stats poller implicitly). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:46:59 -03:00
Thales Maciel	366e1560c9	daemon: replace RPC switch with generic method-to-handler table The dispatch method was a single ~240-line switch of 34 cases, each following the same pattern: decode params into some type P, call a service method returning (R, error), wrap R in a result struct and either marshalResultOrError-encode or return a raw rpc.NewError. Adding a method was a 4-line ceremony per site, and grepping for "methods banger speaks" meant reading the full switch. New shape, in internal/daemon/dispatch.go: - handler is the uniform `func(ctx, d, req) rpc.Response` type every method dispatches through. - paramHandler[P, R] is the generic wrapper that absorbs 28 of the 34 cases (decode, call, marshal). No reflection — P and R are deduced from the service-call literal, so each map entry is a one-liner referencing a small adapter func. - noParamHandler[R] is the decode-free variant for 6 methods that don't carry params. - rpcHandlers is the single source of truth for which methods exist and which adapter they dispatch to. - Four specials (ping, shutdown, vm.logs, vm.ssh) stay as named `handler`-typed functions: ping/shutdown encode with raw rpc.NewResult, vm.logs/vm.ssh need pre-service validation to emit distinct error codes (not_found, not_running) that the generic wrapper maps uniformly to operation_failed. Daemon.dispatch shrinks from a 240-line switch to 11 lines: version check, test-only handler short-circuit, table lookup, invoke-or-unknown. Tests: - TestRPCHandlersMatchDocumentedMethods — keyset guard. Adding or removing a method without updating the expected slice is a red flag the test surfaces. - TestRPCHandlersAllNonNil — catches nil-function registrations. All pre-existing dispatch tests (param decode, error codes, etc.) keep passing unchanged — the handler contract for any given method is byte-identical from the RPC client's perspective. Smoke (all 21 scenarios) exercises every code path end-to-end. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:40:08 -03:00
Thales Maciel	11a33604c0	daemon: extract startVMLocked into step runner with per-step rollback startVMLocked was a ~260-line method running 18 sequential phases with one lumped error path: on any failure, cleanupOnErr called cleanupRuntime — a catch-all teardown that didn't distinguish "this phase acquired resources we should undo" from "this phase is idempotent." The blast radius was the entire VM lifecycle. Every tweak to boot, NAT, disk, or auth-sync orchestration had to reason about a closure that could fire at any of 18 points. This commit extracts the phases into a data-driven pipeline: - startContext threads the mutable state (vm, live, apiSock, dmName, tapName, etc.) through every step by pointer so step bodies mutate in place without returning copies. - startStep carries the op.stage name, optional vmCreateStage progress ping, optional log attrs, a run closure, and an optional undo closure. - runStartSteps walks steps in order, appends the failing step to the rollback set (so partial-acquire failures like machine.Start's post-spawn HTTP config get their undo fired), then iterates the rollback set in reverse and joins errors via errors.Join. Each phase that acquires a resource now owns its own undo: system_overlay removes a file it created, dm_snapshot cleans up the loop + DM handles it set, prepare_host_features delegates to capHooks.cleanupState, tap releases via releaseTap, metrics_file removes the file, firecracker_launch kills the spawned PID and drops the sockets, post_start_features calls capHooks.cleanupState again (capability Cleanup hooks are idempotent — safe to call whether PostStart reached every cap or not). The 11 phases with no teardown obligation leave `undo` nil and the driver silently skips them on rollback. cleanupRuntime is retired from the start-failure path. It stays intact for reconcile, stopVMLocked, killVMLocked, deleteVMLocked, stopStaleVMs — the crash-recovery / lifecycle-teardown contract those paths rely on is unchanged. 
startVMLocked shrinks from ~225 lines of sequential-phase code plus a cleanupOnErr closure to ~45 lines: compute derived paths, build the step list, drive it, persist ERROR state on failure. Stage names preserved 1:1 so existing log grep + the async-create progress stream stay compatible. Tests: - TestRunStartSteps_RollsBackInReverseOnFailure — the contract is pinned: succeeded-before-failing run, all their undos in reverse, failing step's undo also fires, original err still visible via errors.Is. - TestRunStartSteps_SkipsNilUndos — optional-undo contract. - TestRunStartSteps_JoinsRollbackErrors — undo failures don't hide the root cause. - TestRunStartSteps_HappyPathNoRollback — success path never fires any undo. Smoke: all 21 scenarios pass, including the start-path ones (bare vm run, workspace vm run, vm restart, vm lifecycle, vm set reconfig) that exercise real firecracker boots end-to-end. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:34:34 -03:00
Thales Maciel	2ebd2b64bb	imagepull: update stale package + BuildExt4 docs The package doc in internal/imagepull/imagepull.go still described a two-step Pull + Flatten + BuildExt4 pipeline and warned that the resulting image was "suitable as input to `image build` but not directly bootable" because ownership preservation was deferred. That's been wrong for a while: ApplyOwnership (internal/imagepull/ownership.go) restores tar-header uid/gid/mode via a debugfs set_inode_field batch, and InjectGuestAgents (internal/imagepull/inject.go) writes banger's guest-side assets into the image. `image pull` now produces a directly bootable rootfs end-to-end. Updated: - imagepull.go package doc — describes the full Pull → Flatten → BuildExt4 → ApplyOwnership → InjectGuestAgents pipeline and drops the "Phase A limitations" list that spoke of deferred ownership. - ext4.go BuildExt4 doc — notes that the filesystem is root-owned via `-E root_owner=0:0` and points at ApplyOwnership as the step that handles per-file ownership, instead of the previous "see the package doc for the implications" handwave. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 14:34:25 -03:00
Thales Maciel	5eceebe49f	daemon: persist tap device on VM.Runtime so NAT teardown survives handle-cache loss Cleanup identity for kernel objects was split across two sources of truth: vm.Runtime (DB-backed, durable) held paths and the guest IP, but the TAP name lived only in the in-process handle cache + the best-effort handles.json scratch file next to the VM dir. Every other cleanup-identifying datum has a fallback — firecracker PID can be rediscovered via `pgrep -f <apiSock>`, loops via losetup, dm name from the deterministic ShortID(vm.ID). The tap is the one truly cache-only datum (allocated from a pool, not derivable). That made NAT teardown fragile: - daemon crash between `acquireTap` and the handles.json write - handles.json corrupt on the next daemon start - partial cleanup that already zeroed the cache In any of those cases natCapability.Cleanup short-circuited ("skipping nat cleanup without runtime network handles") and the per-VM POSTROUTING MASQUERADE + the two FORWARD rules keyed off the tap would leak. The VM row in the DB still existed, so a retry couldn't close the loop — the tap name was simply gone. Fix: mirror TapDevice onto model.VMRuntime (serialised via the existing runtime_json column, omitempty so existing rows upgrade cleanly). Set it in startVMLocked right next to the s.setVMHandles call that seeds the in-memory cache; clear it at every post-cleanup reset site (stop normal path + stop stale branch, kill normal path + kill stale branch, cleanupOnErr in start, reconcile's stale-vm branch, the stats poller's auto-stop path). Fallbacks now cascade: - natCapability.Cleanup: handles cache → Runtime.TapDevice - cleanupRuntime (releaseTap): handles cache → Runtime.TapDevice Both surfaces refuse gracefully (old behaviour) only when neither source has a value, which really does mean "no tap was ever allocated for this VM" rather than "we lost track of it." 
Test: TestNATCapabilityCleanup_FallsBackToRuntimeTapDevice clears the handle cache, sets vm.Runtime.TapDevice, and asserts Cleanup reaches the runner — the exact scenario the review flagged as a plausible leak and the exact code path that now guarantees it doesn't. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 14:21:13 -03:00
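The cascade in 5eceebe49f (handle cache first, then the durable runtime record, refuse only when both are empty) and the `omitempty` upgrade path can be sketched as follows. The `vmRuntime` struct, its JSON field names, and the `resolveTap` / `decodeRuntime` helpers are assumptions made for illustration; the real `model.VMRuntime` layout is not shown in the log.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// vmRuntime is a hypothetical slice of the persisted runtime record.
// The JSON key names are assumptions; omitempty keeps the field out of
// rows where no tap was ever recorded.
type vmRuntime struct {
	GuestIP   string `json:"guest_ip,omitempty"`
	TapDevice string `json:"tap_device,omitempty"`
}

// resolveTap prefers the in-memory handle cache and falls back to the
// persisted runtime value. ok is false only when neither source has a name.
func resolveTap(cached string, rt vmRuntime) (tap string, ok bool) {
	if cached != "" {
		return cached, true
	}
	if rt.TapDevice != "" {
		return rt.TapDevice, true
	}
	return "", false
}

// decodeRuntime parses a stored runtime JSON blob. Blobs written before the
// field existed decode with an empty TapDevice.
func decodeRuntime(raw string) (vmRuntime, error) {
	var rt vmRuntime
	err := json.Unmarshal([]byte(raw), &rt)
	return rt, err
}

func main() {
	rt, err := decodeRuntime(`{"guest_ip":"10.0.0.2","tap_device":"tap7"}`)
	if err != nil {
		panic(err)
	}
	// Simulate a lost handle cache by passing an empty cached value.
	tap, ok := resolveTap("", rt)
	fmt.Println(tap, ok)
}
```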
Thales Maciel	1850904d9c	file_sync: skip nested symlinks during recursive copy A user who sets `[[file_sync]] host = "~/.aws"` (per the README's own example) can unintentionally copy files from outside that directory if .aws contains symlinks. copyHostDir used os.Stat during recursion, which transparently follows: a symlink to a credential dir elsewhere would be recursed into, materialising unrelated secrets inside the guest. For credential trees that's an avoidable sprawl vector. Switched copyHostDir's per-entry probe from os.Stat to os.Lstat and added a default skip-with-warning branch for ModeSymlink. Files and dirs at the SAME level copy as before; symlinks (both file and directory flavours) surface a "file_sync skipped symlink (would escape the requested tree)" warn log and are otherwise omitted. Top-level entry paths still follow — the Stat in runFileSync is unchanged. The user explicitly named that path, so resolving "~/.aws" through a symlink out of $HOME is on them. Tests: - TestRunFileSyncSkipsNestedSymlinks — builds a synced dir with both a file symlink and a directory symlink pointing outside the tree; asserts real files copy, symlinks do not materialise anywhere in the guest mount, and each skipped symlink surfaces a warn log entry. README updated with a one-line note about the skip behaviour so users know to expect it rather than chasing "why didn't my file show up." Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 14:11:58 -03:00
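The `os.Stat` to `os.Lstat` switch in 1850904d9c can be illustrated with a small host-to-host copy. This is a sketch only: `copyTree` and `demo` are hypothetical names, and the real `copyHostDir` writes into a guest mount and logs a warning rather than returning a list of skipped paths.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// copyTree recursively copies src into dst. Each entry is probed with
// os.Lstat so symlinks are seen as symlinks and skipped, never followed.
// It returns the paths of the symlinks it skipped.
func copyTree(src, dst string) ([]string, error) {
	var skipped []string
	if err := os.MkdirAll(dst, 0o755); err != nil {
		return nil, err
	}
	entries, err := os.ReadDir(src)
	if err != nil {
		return nil, err
	}
	for _, e := range entries {
		from := filepath.Join(src, e.Name())
		to := filepath.Join(dst, e.Name())
		info, err := os.Lstat(from) // os.Stat here would follow the link
		if err != nil {
			return skipped, err
		}
		switch {
		case info.Mode()&os.ModeSymlink != 0:
			skipped = append(skipped, from)
		case info.IsDir():
			sub, err := copyTree(from, to)
			skipped = append(skipped, sub...)
			if err != nil {
				return skipped, err
			}
		case info.Mode().IsRegular():
			data, err := os.ReadFile(from)
			if err != nil {
				return skipped, err
			}
			if err := os.WriteFile(to, data, info.Mode().Perm()); err != nil {
				return skipped, err
			}
		}
	}
	return skipped, nil
}

// demo builds a tree with one file symlink and one directory symlink that
// point outside it, copies the tree, and reports: whether a real nested
// file was copied, whether any symlink materialised in the destination,
// and how many symlinks were skipped.
func demo() (bool, bool, int, error) {
	root, err := os.MkdirTemp("", "filesync-demo")
	if err != nil {
		return false, false, 0, err
	}
	defer os.RemoveAll(root)
	outside := filepath.Join(root, "outside")
	src := filepath.Join(root, "aws")
	dst := filepath.Join(root, "guest")
	for _, dir := range []string{outside, filepath.Join(src, "sub")} {
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return false, false, 0, err
		}
	}
	files := []string{
		filepath.Join(outside, "secret"),
		filepath.Join(src, "config"),
		filepath.Join(src, "sub", "credentials"),
	}
	for _, f := range files {
		if err := os.WriteFile(f, []byte("x"), 0o600); err != nil {
			return false, false, 0, err
		}
	}
	if err := os.Symlink(filepath.Join(outside, "secret"), filepath.Join(src, "file-link")); err != nil {
		return false, false, 0, err
	}
	if err := os.Symlink(outside, filepath.Join(src, "dir-link")); err != nil {
		return false, false, 0, err
	}
	skipped, err := copyTree(src, dst)
	if err != nil {
		return false, false, 0, err
	}
	_, copyErr := os.Stat(filepath.Join(dst, "sub", "credentials"))
	_, fileLinkErr := os.Lstat(filepath.Join(dst, "file-link"))
	_, dirLinkErr := os.Lstat(filepath.Join(dst, "dir-link"))
	return copyErr == nil, fileLinkErr == nil || dirLinkErr == nil, len(skipped), nil
}

func main() {
	copied, leaked, skipped, err := demo()
	fmt.Println(copied, leaked, skipped, err)
}
```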
Thales Maciel	caa6a2b996	model: validate VM names as DNS labels at CLI + daemon A VM name flows into five places that all have narrower grammars than "arbitrary string": - the guest's /etc/hostname (vm_disk.patchRootOverlay) - the guest's /etc/hosts (same) - the <name>.vm DNS record (vmdns.RecordName) - the kernel command line (system.BuildBootArgs*) - VM-dir file-path fragments (layout.VMsDir/<id>, etc.) Nothing in the chain was validating the input. A name with whitespace, newline, dot, slash, colon, or = would produce broken hostnames, weird DNS labels, smuggled kernel cmdline tokens, or (in the worst case) surprising traversal through the on-disk layout. Not host shell injection — we already avoid shelling out with the raw name — but a real correctness and supportability bug. New: model.ValidateVMName. Rules: - 1..63 chars (DNS label max per RFC 1123; also a comfortable /etc/hostname cap) - lowercase ASCII letters, digits, '-' only - no leading or trailing '-' - no normalization — the name is the user-visible identifier (store key, `ssh <name>.vm`, `vm show`); silently rewriting "MyVM" → "myvm" would hand the user back something different than they typed Called from two places: - internal/cli/commands_vm.go vmCreateParamsFromFlags — rejects bad `--name` values before any RPC. Empty name still passes through so the daemon can generate one. - internal/daemon/vm_create.go reserveVM — defense in depth for any non-CLI RPC caller (SDK, direct JSON over the socket). Tests: - internal/model/vm_name_test.go — exhaustive character-class matrix (space, newline, tab, dot, slash, colon, equals, quote, control chars, unicode letters, uppercase, leading/trailing hyphen, over-length, max-length-exact, digits-only). - internal/cli TestVMCreateParamsFromFlagsRejectsInvalidName — CLI wire-through + empty-name passthrough. - internal/daemon TestReserveVMRejectsInvalidName — daemon defense-in-depth (including `box/../evil` path-traversal). 
- scripts/smoke.sh — end-to-end rejection + no-leaked-row assertion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 14:06:40 -03:00
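The rules listed for `model.ValidateVMName` in caa6a2b996 (1..63 characters, lowercase ASCII letters, digits and '-', no leading or trailing '-', no normalization) can be sketched as below. The function name `validateVMName` and the error messages are illustrative assumptions, not the project's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// validateVMName checks a name against the DNS-label rules from the commit
// message. It never rewrites the input: a bad name is rejected, not fixed.
func validateVMName(name string) error {
	if len(name) < 1 || len(name) > 63 {
		return errors.New("vm name must be 1..63 characters")
	}
	if name[0] == '-' || name[len(name)-1] == '-' {
		return errors.New("vm name must not start or end with '-'")
	}
	// Iterating bytes rejects every non-ASCII rune as well, because each
	// byte of a multi-byte UTF-8 sequence is >= 0x80.
	for i := 0; i < len(name); i++ {
		c := name[i]
		ok := (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') || c == '-'
		if !ok {
			return fmt.Errorf("vm name has invalid character %q at index %d", c, i)
		}
	}
	return nil
}

func main() {
	for _, name := range []string{"dev-box-1", "MyVM", "box/../evil", "-lead", "a b"} {
		fmt.Printf("%q -> %v\n", name, validateVMName(name))
	}
}
```

Per the commit message, the CLI lets an empty name pass through so the daemon can generate one; in this sketch that case would have to be handled by the caller before invoking the validator.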
Thales Maciel	700a1e6e60	cleanup: drop pre-v0.1 migration scaffolding + legacy-behavior refs banger hasn't shipped a public release — every "legacy", "pre-opt-in", "previously", "migration note", "no longer" reference in the tree is pinning against a state no real user's install has ever been in. That scaffolding has weight: it's a coordinate system future readers have to decode, and it keeps dead code alive. Removed (code): - internal/daemon/ssh_client_config.go - vmSSHConfigIncludeBegin / vmSSHConfigIncludeEnd constants and every `removeManagedBlock(existing, vm...)` call they enabled (legacy inline `Host .vm` block scrub) - cleanupLegacySSHConfigDir (+ its caller in syncVMSSHClientConfig) — wiped a pre-opt-in sibling file under $ConfigDir/ssh - sameDirOrParent + resolvePathForComparison — only ever used by cleanupLegacySSHConfigDir - the "also check legacy marker" fallback in UserSSHIncludeInstalled / UninstallUserSSHInclude - internal/store/migrations.go - migrateDropDeadImageColumns (migration 2) + its slice entry - dropColumnIfExists (orphaned after the above) - addColumnIfMissing + the whole "columns added across the pre- versioning lifetime" block at the end of migrateBaseline — subsumed into the baseline CREATE TABLE - `packages_path TEXT` column on the images table (the throwaway migration 2 dropped it, but there was never any reader) - internal/daemon/vm.go - vmDNSRecordName local wrapper — was justified as "avoid pulling vmdns into every file"; three of four callers already imported vmdns directly, so inline the one stray call - internal/cli/cli_test.go - TestLegacyRemovedCommandIsRejected (`tui` subcommand never shipped) Removed / simplified (tests): - ssh_client_config_test.go: dropped TestSameDirOrParentHandlesSymlinks, TestSyncVMSSHClientConfigPreservesUserKeyInLegacyDir, TestSyncVMSSHClientConfigNarrowsCleanupToLegacyFile, TestSyncVMSSHClientConfigLeavesUnexpectedLegacyContents, TestInstallUserSSHIncludeMigratesLegacyInlineBlock, plus the "legacy 
posture" regression strings in the remaining happy-path test; TestUninstallUserSSHIncludeRemovesBothMarkerBlocks collapsed to a single-block test - migrations_test.go: dropped TestMigrateDropDeadImageColumns_AcrossInstallPaths, TestDropColumnIfExistsIsIdempotent; TestOpenReadOnlyDoesNotRunMigrations simplified to test against the baseline marker Removed (docs): - README.md "Migration note.*" blockquote about the SSH-key path move - docs/advanced.md parenthetical "(the old behaviour)" Reworded (comments): - Dropped "Previously this file also contained LogLevel DEBUG3..." history from vm_disk.go's sshdGuestConfig doc - Dropped "Call sites that previously read vm.Runtime.{PID,...}" from vm_handles.go; now documents the current contract - Dropped "Pre-v0.1 the defaults are" scaffolding in doctor_test.go - Dropped "no longer does its own git inspection" phrasing in vm_run.go - Dropped the "(also cleans up legacy inline block from pre-opt-in builds)" aside on the `ssh-config` CLI docstring - Renamed test var `legacyKey` → `existingKey` in vm_test.go; its purpose was "pre-existing authorized_keys line," not banger-legacy Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 13:56:32 -03:00


269 commits