banger

Author	SHA1	Message	Date
Thales Maciel	003b0488ce	cli,docs: trivial polish for v0.1.0 A pre-release audit collected ~12 trivial-effort UX and code-hygiene items. Rolling them up here so the v0.1.0 commit log isn't littered with one-line tweaks. CLI help / completion: * commands_image.go: drop dangling reference to a `banger image catalog` subcommand that doesn't exist; replace with a pointer to `banger image list`. * commands_image.go: --size flag example was "4GiB" but the parser rejects that suffix. Change example to "4G". (Parser-side fix is in a separate concern.) * commands_image.go + completion.go: image pull now wires a catalog completer (falls back to local image names since there's no image-catalog RPC yet); image show / delete / promote already completed local names. * commands_kernel.go + completion.go: kernel pull now wires a new completeKernelCatalogNameOnlyAtPos0 backed by the kernel.catalog RPC, so tab-complete suggests pullable kernels. * commands_vm.go: vm stats and vm set now have Long + Example blocks (peers all do); --from flag description updated to spell out the relationship to --branch. README: * Define "golden image" inline at first use. * Add a one-line Requirements block above Quick Start so users hit the firecracker / KVM dependency before `make build`. Code hygiene: * dashIfEmpty / emptyDash were the same function. Deleted emptyDash, retargeted three call sites. * formatBytes (introduced today in image cache prune) duplicated humanSize. Consolidated to humanSize, now with a space ("1.2 GiB" not "1.2GiB"). formatters_test.go expectations updated. Logging chattiness: * "operation started" (logger.go), "daemon request canceled" (daemon.go), and "helper rpc completed" (roothelper.go) all fired at INFO per RPC. Downgraded to DEBUG so routine shell completions don't spam syslog. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 17:31:54 -03:00
Thales Maciel	182bccf8af	roothelper: pin bridge name + IP + CIDR to a banger-managed shape priv.ensure_bridge / priv.create_tap accepted the daemon's network config triple (BridgeName, BridgeIP, CIDR) and forwarded it straight to `ip link` / `ip addr` / `ip link set master`. Argv-style exec ruled out shell injection, but the kernel happily honours those commands against any iface a compromised owner-uid daemon names, including eth0/docker0/lo. Concretely: * priv.ensure_bridge could `ip link set <iface> up` against any host interface and `ip addr add` arbitrary IP/CIDR to it. * priv.create_tap could `ip link set <new-tap> master <iface>`, bridging the per-VM tap into the host's primary LAN so the guest sees host-local broadcast traffic. * priv.sync_resolver_routing / priv.clear_resolver_routing only enforced "name shaped like a Linux iface", with no banger constraint. New validators (single chokepoint via validateNetworkConfig): * validateBangerBridgeName: name must equal "br-fc" or start with "br-fc-". Stops a compromised daemon from naming any host iface in these RPCs. Users with a custom bridge keep the prefix. * validateCIDRPrefix: numeric in [8, 32]. Wider prefixes would silently widen the bridge subnet beyond what the daemon intends. * validateNetworkConfig bundles bridge-name + validateIPv4 + validateCIDRPrefix so every helper RPC that takes the triple stays in lockstep. Wired into methodEnsureBridge, methodCreateTap, and the resolver-routing pair (replacing the older validateLinuxIfaceName-only check with the stricter banger-bridge check). docs/privileges.md updated: the helper-RPC table rows now spell out the banger-managed bridge constraint, and the trust list includes the new validators. Tests: TestValidateBangerBridgeName (default + suffixed accepted, host ifaces / wrong prefix / oversized rejected), TestValidateCIDRPrefix (boundary + non-numeric + IPv6-style 64 rejected), TestValidateNetworkConfig (happy path + each-field-bad cases).
Smoke at JOBS=4 still green — banger's defaults sail through the new gate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 16:19:28 -03:00
Thales Maciel	3805b093b4	roothelper: tie kill/signal authorization to banger-launched firecracker validateFirecrackerPID was a substring check on /proc/<pid>/cmdline: "contains 'firecracker'". Good enough to refuse init/sshd/the test binary, but on a shared host where multiple users run firecracker the helper would happily SIGKILL someone else's VM. The owner-UID daemon could weaponise the helper as an arbitrary "kill any firecracker on this box" primitive. Replace the substring gate with two stronger acceptance modes: * Cgroup match (the supported path): /proc/<pid>/cgroup contains bangerd-root.service. systemd assigns every direct child of the helper unit into that cgroup at fork; the kernel keeps it there for the process's lifetime, so no daemon-UID code can forge it. Other users' firecracker processes live in different cgroups (user@<uid>.service, foreign service slices) and fail this check. Also robust across helper restarts: KillMode=control-group on the unit kills children when the service goes down, so an "orphan banger firecracker in some other cgroup" is rare by construction. * --api-sock fallback: cmdline carries `--api-sock <path>` with the path under banger's RuntimeDir. Covers the legacy direct (no-jailer) launch path, and gives daemon reconcile a way to clean up the rare orphan that lands outside the service cgroup after a hard helper crash. Tried /proc/<pid>/root first — pivot_root semantics make jailer'd firecracker read its root as "/" from any namespace, so the symlink is useless as a banger-managed fingerprint. Cgroup is the right signal. Also added a signal allowlist: priv.signal_process now rejects anything outside {TERM, KILL, INT, HUP, QUIT, USR1, USR2, ABRT} (case-insensitive, with or without SIG prefix). STOP/CONT, real-time signals, and numeric forms are refused — the helper running as root must not be a generic "send arbitrary signal to my pid" primitive. priv.kill_process is unaffected (it always sends KILL). 
Tests: validateSignalName covers allowlist + numeric/STOP/RTMIN rejection; extractFirecrackerAPISock pins the three flag forms (--api-sock VAL, --api-sock=VAL, -a VAL); pathIsUnder gets a small table; existing TestValidateFirecrackerPID still rejects PID 0, PID 1, and the test process itself. Doctor's non-system-mode test gained a t.TempDir-backed install path so it stops being environment-dependent on machines that happen to have /etc/banger/install.toml. Smoke at JOBS=4 still green — every banger-launched firecracker sails through the cgroup match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 16:00:41 -03:00
Thales Maciel	4a56e6c7d6	roothelper: walk validateManagedPath components, reject symlinks validateManagedPath was textual-only: filepath.Clean + dest-prefix match. That stopped `..` escapes but not the symlink-bypass attack that motivated this fix — a daemon-UID attacker can write into StateDir/RuntimeDir (it's their UID), so they can plant `<StateDir>/redirect -> /etc` and any helper RPC that then operates on `<StateDir>/redirect/...` resolves through the symlink at the kernel and lands at /etc/... on the host. Concretely the leaks this closed: * priv.create_dm_snapshot: rootfs/cow paths fed to losetup — losetup follows the symlink and attaches a host block device. * priv.launch_firecracker: kernel/initrd paths hard-linked into the chroot via `ln -f` — link(2) on Linux follows source symlinks, hard-linking host files into the jail. * priv.read_ext4_file / priv.write_ext4_files: image paths fed to debugfs / e2cp as root. * validateLaunchDrivePath: drive paths mknod'd or hard-linked. * validateJailerOpts: chroot base. Fix: after the existing prefix match, walk every component below the matched root with Lstat. Any existing symlink — leaf or intermediate — fails the validator. ENOENT is tolerated because several callers pass paths firecracker/the helper materialise later (sockets, log files, kernel hard-link targets); whoever materialises them goes through the same validation when the helper-side primitive runs. Subsumes most of validateNotSymlink's coverage but the explicit call sites (methodEnsureSocketAccess, methodCleanupJailerChroot) keep their belt-and-braces check — those paths must EXIST and not be symlinks, which validateNotSymlink enforces strictly while the broadened validateManagedPath tolerates ENOENT. Race-free in practice: helper RPCs are short and the validator fires on the same kernel state the next syscall sees. 
The helper loop processes RPCs serially per-connection, and the validator plus the syscall both run as root within microseconds of each other. Four new tests cover symlink leaf, symlink intermediate, missing leaf (must pass), and the plain happy path. Smoke at JOBS=4 still green — every legitimate daemon-supplied path passes the walk. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 15:26:56 -03:00
Thales Maciel	853249dec2	roothelper: tighten input validation across privileged RPCs Defence-in-depth pass over every helper method that touches the host as root. Each fix narrows what a compromised owner-uid daemon could ask the helper to do; many close concrete file-ownership and DoS primitives that the previous validators didn't reach. Path / identifier validation: * priv.fsck_snapshot now requires /dev/mapper/fc-rootfs-* (was "is the string non-empty"). e2fsck -fy on /dev/sda1 was the motivating exploit. * priv.kill_process and priv.signal_process now read /proc/<pid>/cmdline and require a "firecracker" substring before sending the signal. Killing arbitrary host PIDs (sshd, init, …) is no longer a one-RPC primitive. * priv.read_ext4_file and priv.write_ext4_files now require the image path to live under StateDir or be /dev/mapper/fc-rootfs-. * priv.cleanup_dm_snapshot validates every non-empty Handles field: DM name fc-rootfs-, DM device /dev/mapper/fc-rootfs-, loops /dev/loopN. * priv.remove_dm_snapshot accepts only fc-rootfs-* names or /dev/mapper/fc-rootfs-* paths. * priv.ensure_nat now requires a parsable IPv4 address and a banger-prefixed tap. * priv.sync_resolver_routing and priv.clear_resolver_routing now require a Linux iface-name-shaped bridge name (1–15 chars, no whitespace/'/'/':') and, for sync, a parsable resolver address. Symlink defence: * priv.ensure_socket_access now validates the socket path is under RuntimeDir and not a symlink. The fcproc layer's chown/chmod moves to unix.Open(O_PATH|O_NOFOLLOW) + Fchownat(AT_EMPTY_PATH) + Fchmodat via /proc/self/fd, so even a swap of the leaf into a symlink between validation and the syscall is refused. The local-priv (non-root) fallback uses `chown -h`. * priv.cleanup_jailer_chroot rejects symlinks at both the leaf (os.Lstat) and intermediate path components (filepath.EvalSymlinks + clean-equality).
The umount sweep was rewritten from shell `umount --recursive --lazy` to direct unix.Unmount(MNT_DETACH | UMOUNT_NOFOLLOW) per child mount, deepest-first; the findmnt guard remains as the rm-rf safety net. Local-priv mode falls back to `sudo umount --lazy`. Binary validation: * validateRootExecutable now opens with O_PATH|O_NOFOLLOW and Fstats through the resulting fd. Rejects path-level symlinks and narrows the TOCTOU window between validation and the SDK's exec to fork+exec time on a healthy host. Daemon socket: * The owner daemon now reads SO_PEERCRED on every accepted connection and refuses any UID that isn't 0 or the registered owner. Filesystem perms (0600 + ownerUID) already enforced this; the check is belt-and-braces in case the socket FD is ever leaked to a non-owner process. Docs: * docs/privileges.md walked end-to-end. Each helper RPC's Validation gate row reflects what the code actually enforces. New section "Running outside the system install" calls out the looser dev-mode trust model (NOPASSWD sudoers, helper hardening bypassed) so users don't deploy that path on shared hosts. Trust list updated to include every new validator. Tests added: validators (DM-loop, DM-remove-target, DM-handles, ext4-image-path, iface-name, IPv4, resolver-addr, not-symlink, firecracker-PID, root-executable variants), the daemon's authorize path (non-unix conn rejection + unix conn happy path), the umount2 ordering contract (deepest-first + --lazy on the sudo branch), and positive/negative cases for the chown-no-follow fallback. Verified end-to-end via `make smoke JOBS=4` on a KVM host. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 14:39:41 -03:00
Thales Maciel	6b543cb17f	firecracker: adopt firecracker-jailer for VM launch (Phase B) Each VM's firecracker now runs inside a per-VM chroot dropped to the registered owner UID via firecracker-jailer. Closes the broad ambient-sudo escalation surface that survived Phase A: the helper still needs caps for tap/bridge/dm/loop/iptables, but the VMM itself no longer runs as root in the host root filesystem. The host helper stages each chroot up front: hard-links the kernel and (optional) initrd, mknods block-device drives + /dev/vhost-vsock, copies in the firecracker binary (jailer opens it O_RDWR so a ro bind fails with EROFS), and bind-mounts /usr/lib + /lib trees read-only so the dynamic linker can resolve. Self-binds the chroot first so the findmnt-guarded cleanup can recurse safely. AF_UNIX sun_path is 108 bytes; the chroot path easily blows past that. Daemon-side launch pre-symlinks the short request socket path to the long chroot socket before Machine.Start so the SDK's poll/connect sees the short path while the kernel resolves to the chroot socket. --new-pid-ns is intentionally disabled: jailer's PID-namespace fork makes the SDK see the parent exit and tear the API socket down too early. CapabilityBoundingSet for the helper expands to add CAP_FOWNER, CAP_KILL, CAP_MKNOD, CAP_SETGID, CAP_SETUID, CAP_SYS_CHROOT alongside the existing CAP_CHOWN/CAP_DAC_OVERRIDE/CAP_NET_ADMIN/CAP_NET_RAW/CAP_SYS_ADMIN. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 14:38:07 -03:00
Thales Maciel	e47b8146dc	daemon: thread per-RPC op_id end-to-end Today there's no way to correlate a CLI failure with a daemon log line. operationLog records relative timing but no id, two concurrent vm.start calls log indistinguishably, and the async vmCreateOperationState.ID is user-facing yet never reaches the journal. The root helper logs plain text to stderr while bangerd logs JSON, so a merged journalctl is hard to grep across the trust-boundary split. Mint a per-RPC op id at dispatch entry, store it on context, and include it as an "op_id" attr on every operationLog record. The id is stamped onto every error response (including the early short-circuit paths bad_version and unknown_method). rpc.Call forwards the context op id on requests so a daemon RPC and the helper RPCs it triggers all share one id. The helper now logs JSON to match bangerd, adopts the inbound id, and emits a single "helper rpc completed" / "helper rpc failed" line per call so operators can see at a glance how long each privileged op took. vmCreateOperationState.ID is now the same id dispatch generated for vm.create.begin — one identifier between client status polls, daemon logs, and helper logs. The wire format gains two optional fields: rpc.Request.OpID and rpc.ErrorResponse.OpID, both omitempty so older peers (and the opposite direction) ignore them. ErrorResponse.Error() now appends "(op-XXXXXX)" to its string form when set; existing callers that just print err.Error() get the id for free. Tests cover: dispatch stamps op_id on unknown_method, bad_version, and handler-returned errors; rpc.Call exposes the typed *ErrorResponse via errors.As so the CLI can read code/op_id; ctx op_id is forwarded to the server in the request envelope. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 22:13:44 -03:00
Thales Maciel	59e48e830b	daemon: split owner daemon from root helper Move the supported systemd path to two services: an owner-user bangerd for orchestration and a narrow root helper for bridge/tap, NAT/resolver, dm/loop, and Firecracker ownership. This removes repeated sudo from daily vm and image flows without leaving the general daemon running as root. Add install metadata, system install/status/restart/uninstall commands, and a system-owned runtime layout. Keep user SSH/config material in the owner home, lock file_sync to the owner home, and move daemon known_hosts handling out of the old root-owned control path. Route privileged lifecycle steps through typed privilegedOps calls, harden the two systemd units, and rewrite smoke plus docs around the supported service model. Verified with make build, make test, make lint, and make smoke on the supported systemd host path.	2026-04-26 12:43:17 -03:00
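
The bridge-name and CIDR-prefix gates described in commit 182bccf8af can be illustrated with a minimal Go sketch. This is not the repository's code: only the function names and the stated rules ("br-fc" or a "br-fc-" prefix; numeric prefix in [8, 32]) come from the commit message. The signatures, the error text, and the 15-character cap (borrowed from the iface-name rule in commit 853249dec2) are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// validateBangerBridgeName accepts "br-fc" or any name starting with "br-fc-".
// Assumption: names longer than 15 characters are rejected, matching the
// 1-15 character iface-name rule mentioned in commit 853249dec2.
func validateBangerBridgeName(name string) error {
	if len(name) > 15 {
		return fmt.Errorf("bridge name %q is longer than 15 characters", name)
	}
	if name == "br-fc" || strings.HasPrefix(name, "br-fc-") {
		return nil
	}
	return fmt.Errorf("bridge name %q is not banger-managed", name)
}

// validateCIDRPrefix accepts a decimal prefix length between 8 and 32 inclusive.
// Digits are checked by hand so that signed forms such as "+16" are refused.
func validateCIDRPrefix(prefix string) error {
	if prefix == "" {
		return fmt.Errorf("CIDR prefix is empty")
	}
	for _, r := range prefix {
		if r < '0' || r > '9' {
			return fmt.Errorf("CIDR prefix %q is not numeric", prefix)
		}
	}
	n, err := strconv.Atoi(prefix)
	if err != nil {
		return fmt.Errorf("CIDR prefix %q: %w", prefix, err)
	}
	if n < 8 || n > 32 {
		return fmt.Errorf("CIDR prefix %d is outside [8, 32]", n)
	}
	return nil
}

func main() {
	for _, name := range []string{"br-fc", "br-fc-lab", "eth0", "docker0"} {
		fmt.Printf("bridge %-10s -> %v\n", name, validateBangerBridgeName(name))
	}
	for _, p := range []string{"8", "24", "32", "64", "abc"} {
		fmt.Printf("prefix %-4s -> %v\n", p, validateCIDRPrefix(p))
	}
}
```

Keeping both checks as small pure functions makes it straightforward to bundle them behind one entry point, which is the role the commit assigns to validateNetworkConfig.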

8 commits
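
The signal allowlist from commit 3805b093b4 (TERM, KILL, INT, HUP, QUIT, USR1, USR2, ABRT; case-insensitive; optional SIG prefix; STOP/CONT, real-time, and numeric forms refused) could look roughly like the sketch below. The function name validateSignalName appears in the commit message, but its signature, its return value (a canonical "SIG"-prefixed name), and the error text are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedSignals lists the signal names the commit message says are accepted.
var allowedSignals = map[string]bool{
	"TERM": true, "KILL": true, "INT": true, "HUP": true,
	"QUIT": true, "USR1": true, "USR2": true, "ABRT": true,
}

// validateSignalName upper-cases the input, strips one optional "SIG" prefix,
// and checks the remainder against the allowlist. Numeric forms ("9") and
// names outside the list ("STOP", "RTMIN") fall through to the error.
// Returning the canonical "SIG..." spelling is a hypothetical convenience.
func validateSignalName(name string) (string, error) {
	short := strings.TrimPrefix(strings.ToUpper(name), "SIG")
	if !allowedSignals[short] {
		return "", fmt.Errorf("signal %q is not in the allowlist", name)
	}
	return "SIG" + short, nil
}

func main() {
	for _, name := range []string{"term", "SIGKILL", "usr1", "STOP", "9", "SIGRTMIN"} {
		canonical, err := validateSignalName(name)
		if err != nil {
			fmt.Printf("%-9s rejected: %v\n", name, err)
			continue
		}
		fmt.Printf("%-9s accepted as %s\n", name, canonical)
	}
}
```

Matching on names rather than numbers keeps the accepted set explicit, so a new signal can only become sendable through a deliberate edit to the list.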