5 commits

Author SHA1 Message Date
182bccf8af
roothelper: pin bridge name + IP + CIDR to a banger-managed shape
priv.ensure_bridge / priv.create_tap accepted the daemon's network
config triple (BridgeName, BridgeIP, CIDR) and forwarded it straight
to `ip link` / `ip addr` / `ip link set master`. Argv-style exec
ruled out shell injection, but the kernel happily honours those
commands against any iface a compromised owner-uid daemon names —
including eth0/docker0/lo. Concretely:

  * priv.ensure_bridge could `ip link set <iface> up` against any
    host interface and `ip addr add` arbitrary IP/CIDR to it.
  * priv.create_tap could `ip link set <new-tap> master <iface>`,
    bridging the per-VM tap into the host's primary LAN so the
    guest sees host-local broadcast traffic.
  * priv.sync_resolver_routing / priv.clear_resolver_routing only
    enforced "name shaped like a Linux iface" — no banger constraint.

New validators (single chokepoint via validateNetworkConfig):
  * validateBangerBridgeName: name must equal "br-fc" or start with
    "br-fc-". Stops a compromised daemon from naming any host iface
    in these RPCs. Users with a custom bridge keep the prefix.
  * validateCIDRPrefix: numeric in [8, 32]. Shorter prefixes (below
    /8) would silently widen the bridge subnet beyond what the
    daemon intends.
  * validateNetworkConfig bundles bridge-name + validateIPv4 +
    validateCIDRPrefix so every helper RPC that takes the triple
    stays in lockstep.
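
A minimal sketch of how the three validators above could fit together. The
function names come from this message; the signatures, the string-typed
prefix argument, the 15-byte length cap, and the error wording are
assumptions made for illustration and may differ from the helper's code.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// validateBangerBridgeName accepts "br-fc" or any name starting with "br-fc-".
// The 15-byte cap is the usual Linux interface-name limit (assumed here).
func validateBangerBridgeName(name string) error {
	if len(name) > 15 {
		return fmt.Errorf("bridge name %q is too long", name)
	}
	if name == "br-fc" || strings.HasPrefix(name, "br-fc-") {
		return nil
	}
	return fmt.Errorf("bridge name %q is not banger-managed", name)
}

// validateIPv4 accepts dotted-quad IPv4 addresses only.
func validateIPv4(s string) error {
	ip := net.ParseIP(s)
	if ip == nil || ip.To4() == nil || strings.Contains(s, ":") {
		return fmt.Errorf("%q is not an IPv4 address", s)
	}
	return nil
}

// validateCIDRPrefix accepts a decimal prefix length in [8, 32].
func validateCIDRPrefix(s string) error {
	n, err := strconv.Atoi(s)
	if err != nil {
		return fmt.Errorf("prefix %q is not numeric", s)
	}
	if n < 8 || n > 32 {
		return fmt.Errorf("prefix %d is outside [8, 32]", n)
	}
	return nil
}

// validateNetworkConfig is the single chokepoint for the triple.
func validateNetworkConfig(bridge, ip, prefix string) error {
	if err := validateBangerBridgeName(bridge); err != nil {
		return err
	}
	if err := validateIPv4(ip); err != nil {
		return err
	}
	return validateCIDRPrefix(prefix)
}

func main() {
	fmt.Println(validateNetworkConfig("br-fc", "172.16.0.1", "24"))
	fmt.Println(validateNetworkConfig("eth0", "172.16.0.1", "24"))
}
```

Bundling the checks means a new RPC that takes the triple only has to call
one function to inherit all three constraints.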

Wired into methodEnsureBridge, methodCreateTap, and the resolver-
routing pair (replacing the older validateLinuxIfaceName-only check
with the stricter banger-bridge check).

docs/privileges.md updated: the helper-RPC table rows now spell out
the banger-managed bridge constraint, and the trust list includes
the new validators.

Tests: TestValidateBangerBridgeName (default + suffixed accepted,
host ifaces / wrong prefix / oversized rejected),
TestValidateCIDRPrefix (boundary + non-numeric + IPv6-style 64 rejected),
TestValidateNetworkConfig (happy path + each-field-bad cases).
Smoke at JOBS=4 still green — banger's defaults sail through the
new gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 16:19:28 -03:00
3805b093b4
roothelper: tie kill/signal authorization to banger-launched firecracker
validateFirecrackerPID was a substring check on /proc/<pid>/cmdline:
"contains 'firecracker'". Good enough to refuse init/sshd/the test
binary, but on a shared host where multiple users run firecracker
the helper would happily SIGKILL someone else's VM. The owner-UID
daemon could weaponise the helper as an arbitrary "kill any
firecracker on this box" primitive.

Replace the substring gate with two stronger acceptance modes:

  * Cgroup match (the supported path): /proc/<pid>/cgroup contains
    bangerd-root.service. systemd assigns every direct child of the
    helper unit into that cgroup at fork; the kernel keeps it there
    for the process's lifetime, so no daemon-UID code can forge it.
    Other users' firecracker processes live in different cgroups
    (user@<uid>.service, foreign service slices) and fail this
    check. Also robust across helper restarts: KillMode=control-group
    on the unit kills children when the service goes down, so an
    "orphan banger firecracker in some other cgroup" is rare by
    construction.

  * --api-sock fallback: cmdline carries `--api-sock <path>` with
    the path under banger's RuntimeDir. Covers the legacy direct
    (no-jailer) launch path, and gives daemon reconcile a way to
    clean up the rare orphan that lands outside the service cgroup
    after a hard helper crash.
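
The cgroup match described above can be sketched as a pure function over
the contents of /proc/<pid>/cgroup. The name cgroupHasUnit is hypothetical,
and matching on a whole path component rather than a plain substring is a
choice made for this sketch; the message only says the file "contains" the
unit name.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// helperUnit is the unit name quoted in the commit message.
const helperUnit = "bangerd-root.service"

// cgroupHasUnit reports whether any line of a /proc/<pid>/cgroup file
// names a cgroup path with a component equal to unit.
// Each line has the form "hierarchy-ID:controller-list:cgroup-path".
func cgroupHasUnit(cgroupFile, unit string) bool {
	sc := bufio.NewScanner(strings.NewReader(cgroupFile))
	for sc.Scan() {
		parts := strings.SplitN(sc.Text(), ":", 3)
		if len(parts) != 3 {
			continue // malformed line, ignore
		}
		for _, seg := range strings.Split(parts[2], "/") {
			if seg == unit {
				return true
			}
		}
	}
	return false
}

func main() {
	ours := "0::/system.slice/bangerd-root.service\n"
	theirs := "0::/user.slice/user-1000.slice/user@1000.service/app.slice\n"
	fmt.Println(cgroupHasUnit(ours, helperUnit))
	fmt.Println(cgroupHasUnit(theirs, helperUnit))
}
```

A caller would read /proc/<pid>/cgroup, pass the contents in, and try the
--api-sock fallback only when this check fails.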

Tried /proc/<pid>/root first — pivot_root semantics make jailer'd
firecracker read its root as "/" from any namespace, so the symlink
is useless as a banger-managed fingerprint. Cgroup is the right
signal.

Also added a signal allowlist: priv.signal_process now rejects
anything outside {TERM, KILL, INT, HUP, QUIT, USR1, USR2, ABRT}
(case-insensitive, with or without SIG prefix). STOP/CONT, real-time
signals, and numeric forms are refused — the helper running as root
must not be a generic "send arbitrary signal to my pid" primitive.
priv.kill_process is unaffected (it always sends KILL).
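
A sketch of the allowlist, assuming a validateSignalName that maps a name
to a syscall.Signal. The allowed set and the case/prefix handling follow
the paragraph above; the signature and error text are assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"syscall"
)

// allowedSignals is the set listed in the commit message.
var allowedSignals = map[string]syscall.Signal{
	"TERM": syscall.SIGTERM,
	"KILL": syscall.SIGKILL,
	"INT":  syscall.SIGINT,
	"HUP":  syscall.SIGHUP,
	"QUIT": syscall.SIGQUIT,
	"USR1": syscall.SIGUSR1,
	"USR2": syscall.SIGUSR2,
	"ABRT": syscall.SIGABRT,
}

// validateSignalName accepts allowlisted names, case-insensitively and
// with or without a "SIG" prefix. Numeric forms, STOP/CONT, and
// real-time signals are not in the map and are therefore refused.
func validateSignalName(name string) (syscall.Signal, error) {
	key := strings.TrimPrefix(strings.ToUpper(name), "SIG")
	sig, ok := allowedSignals[key]
	if !ok {
		return 0, fmt.Errorf("signal %q is not allowed", name)
	}
	return sig, nil
}

func main() {
	for _, n := range []string{"TERM", "sigusr1", "STOP", "9", "RTMIN"} {
		sig, err := validateSignalName(n)
		fmt.Println(n, int(sig), err)
	}
}
```

Using a closed map instead of a parser keeps the default answer "no": a
signal is refused unless someone adds it explicitly.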

Tests: validateSignalName covers allowlist + numeric/STOP/RTMIN
rejection; extractFirecrackerAPISock pins the three flag forms
(--api-sock VAL, --api-sock=VAL, -a VAL); pathIsUnder gets a small
table; existing TestValidateFirecrackerPID still rejects PID 0,
PID 1, and the test process itself. Doctor's non-system-mode test
gained a t.TempDir-backed install path so it stops being
environment-dependent on machines that happen to have
/etc/banger/install.toml.
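
For illustration, the two small helpers named in the tests above might
look like the sketch below. It assumes the input is the NUL-separated
content of /proc/<pid>/cmdline and that a root directory does not count as
being "under" itself; both are assumptions, as are the signatures.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// extractFirecrackerAPISock returns the API socket path from a
// NUL-separated cmdline, handling the three flag forms:
// "--api-sock VAL", "--api-sock=VAL", and "-a VAL".
func extractFirecrackerAPISock(cmdline string) (string, bool) {
	args := strings.Split(strings.TrimRight(cmdline, "\x00"), "\x00")
	for i, a := range args {
		switch {
		case a == "--api-sock" || a == "-a":
			if i+1 < len(args) && args[i+1] != "" {
				return args[i+1], true
			}
			return "", false
		case strings.HasPrefix(a, "--api-sock="):
			v := strings.TrimPrefix(a, "--api-sock=")
			return v, v != ""
		}
	}
	return "", false
}

// pathIsUnder reports whether path is strictly below root after cleaning.
// It is purely textual and does not resolve symlinks.
func pathIsUnder(path, root string) bool {
	rel, err := filepath.Rel(filepath.Clean(root), filepath.Clean(path))
	if err != nil || rel == "." || rel == ".." {
		return false
	}
	return !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

func main() {
	cmd := "firecracker\x00--api-sock\x00/run/banger/vm1/fc.sock\x00"
	sock, ok := extractFirecrackerAPISock(cmd)
	fmt.Println(sock, ok, pathIsUnder(sock, "/run/banger"))
}
```

The /run/banger path in main is a placeholder for banger's RuntimeDir, not
a value taken from the project.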

Smoke at JOBS=4 still green — every banger-launched firecracker
sails through the cgroup match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 16:00:41 -03:00
4a56e6c7d6
roothelper: walk validateManagedPath components, reject symlinks
validateManagedPath was textual-only: filepath.Clean + dest-prefix
match. That stopped `..` escapes but not the symlink-bypass attack
that motivated this fix — a daemon-UID attacker can write into
StateDir/RuntimeDir (it's their UID), so they can plant
`<StateDir>/redirect -> /etc` and any helper RPC that then operates
on `<StateDir>/redirect/...` resolves through the symlink at the
kernel and lands at /etc/... on the host.

Concretely the leaks this closed:
  * priv.create_dm_snapshot: rootfs/cow paths fed to losetup —
    losetup follows the symlink and attaches a host block device.
  * priv.launch_firecracker: kernel/initrd paths hard-linked into
    the chroot via `ln -f`. link(2) resolves symlinked intermediate
    components of the source path, hard-linking host files into
    the jail.
  * priv.read_ext4_file / priv.write_ext4_files: image paths fed
    to debugfs / e2cp as root.
  * validateLaunchDrivePath: drive paths mknod'd or hard-linked.
  * validateJailerOpts: chroot base.

Fix: after the existing prefix match, walk every component below
the matched root with Lstat. Any existing symlink — leaf or
intermediate — fails the validator. ENOENT is tolerated because
several callers pass paths firecracker/the helper materialise
later (sockets, log files, kernel hard-link targets); whoever
materialises them goes through the same validation when the
helper-side primitive runs.
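
The walk described above can be sketched as follows. This is an
illustration under assumptions: it takes a single root rather than the
helper's set of managed roots, uses filepath.Rel for the prefix match, and
stops at the first missing component. demoTree is a hypothetical fixture
that exists only to exercise the sketch.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// validateManagedPath checks that path lies under root and that no
// existing component below root is a symlink. ENOENT is tolerated.
func validateManagedPath(root, path string) error {
	sep := string(filepath.Separator)
	root = filepath.Clean(root)
	rel, err := filepath.Rel(root, filepath.Clean(path))
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+sep) {
		return fmt.Errorf("%q is outside %q", path, root)
	}
	if rel == "." {
		return nil
	}
	cur := root
	for _, comp := range strings.Split(rel, sep) {
		cur = filepath.Join(cur, comp)
		fi, err := os.Lstat(cur) // Lstat does not follow the final component
		if errors.Is(err, fs.ErrNotExist) {
			return nil // nothing deeper can exist yet
		}
		if err != nil {
			return err
		}
		if fi.Mode()&os.ModeSymlink != 0 {
			return fmt.Errorf("%q is a symlink", cur)
		}
	}
	return nil
}

// demoTree builds a throwaway tree with one intermediate and one leaf symlink.
func demoTree() (string, func(), error) {
	tmp, err := os.MkdirTemp("", "managed-path-demo")
	if err != nil {
		return "", func() {}, err
	}
	cleanup := func() { os.RemoveAll(tmp) }
	root := filepath.Join(tmp, "state")
	outside := filepath.Join(tmp, "outside")
	img := filepath.Join(root, "images", "rootfs.ext4")
	steps := []func() error{
		func() error { return os.MkdirAll(filepath.Dir(img), 0o755) },
		func() error { return os.MkdirAll(outside, 0o755) },
		func() error { return os.WriteFile(img, nil, 0o644) },
		func() error { return os.Symlink(outside, filepath.Join(root, "redirect")) },
		func() error { return os.Symlink(img, filepath.Join(root, "link.ext4")) },
	}
	for _, step := range steps {
		if err := step(); err != nil {
			return "", cleanup, err
		}
	}
	return root, cleanup, nil
}

func main() {
	root, cleanup, err := demoTree()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer cleanup()
	for _, p := range []string{"images/rootfs.ext4", "images/missing.sock", "link.ext4", "redirect/secret"} {
		fmt.Println(p, "->", validateManagedPath(root, filepath.Join(root, p)))
	}
}
```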

Subsumes most of validateNotSymlink's coverage but the explicit
call sites (methodEnsureSocketAccess, methodCleanupJailerChroot)
keep their belt-and-braces check — those paths must EXIST and
not be symlinks, which validateNotSymlink enforces strictly while
the broadened validateManagedPath tolerates ENOENT.

Race-free in practice: helper RPCs are short and the validator
fires on the same kernel state the next syscall sees. The helper
loop processes RPCs serially per-connection, and the validator
plus the syscall both run as root within microseconds of each
other.

Four new tests cover symlink leaf, symlink intermediate, missing
leaf (must pass), and the plain happy path. Smoke at JOBS=4 still
green — every legitimate daemon-supplied path passes the walk.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 15:26:56 -03:00
853249dec2
roothelper: tighten input validation across privileged RPCs
Defence-in-depth pass over every helper method that touches the host
as root. Each fix narrows what a compromised owner-uid daemon could
ask the helper to do; many close concrete file-ownership and DoS
primitives that the previous validators didn't reach.

Path / identifier validation:
  * priv.fsck_snapshot now requires /dev/mapper/fc-rootfs-* (was
    "is the string non-empty"). e2fsck -fy on /dev/sda1 was the
    motivating exploit.
  * priv.kill_process and priv.signal_process now read
    /proc/<pid>/cmdline and require a "firecracker" substring before
    sending the signal. Killing arbitrary host PIDs (sshd, init, …)
    is no longer a one-RPC primitive.
  * priv.read_ext4_file and priv.write_ext4_files now require the
    image path to live under StateDir or be /dev/mapper/fc-rootfs-*.
  * priv.cleanup_dm_snapshot validates every non-empty Handles field:
    DM name fc-rootfs-*, DM device /dev/mapper/fc-rootfs-*, loops
    /dev/loopN.
  * priv.remove_dm_snapshot accepts only fc-rootfs-* names or
    /dev/mapper/fc-rootfs-* paths.
  * priv.ensure_nat now requires a parsable IPv4 address and a
    banger-prefixed tap.
  * priv.sync_resolver_routing and priv.clear_resolver_routing now
    require a Linux iface-name-shaped bridge name (1–15 chars, no
    whitespace/'/'/':') and, for sync, a parsable resolver address.

Symlink defence:
  * priv.ensure_socket_access now validates the socket path is under
    RuntimeDir and not a symlink. The fcproc layer's chown/chmod
    moves to unix.Open(O_PATH|O_NOFOLLOW) + Fchownat(AT_EMPTY_PATH)
    + Fchmodat via /proc/self/fd, so even a swap of the leaf into a
    symlink between validation and the syscall is refused. The
    local-priv (non-root) fallback uses `chown -h`.
  * priv.cleanup_jailer_chroot rejects symlinks at both the leaf
    (os.Lstat) and intermediate path components (filepath.EvalSymlinks
    + clean-equality). The umount sweep was rewritten from shell
    `umount --recursive --lazy` to direct unix.Unmount(MNT_DETACH |
    UMOUNT_NOFOLLOW) per child mount, deepest-first; the findmnt
    guard remains as the rm-rf safety net. Local-priv mode falls
    back to `sudo umount --lazy`.
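
The deepest-first ordering of the umount sweep can be illustrated with a
small sort. This sketch covers only the ordering; the actual unmount calls
are left out, and the function names and example mount points are
hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
	"strings"
)

// depth counts path separators in the cleaned path.
func depth(p string) int {
	return strings.Count(filepath.Clean(p), "/")
}

// deepestFirst returns a copy of mounts ordered so that every child
// mount comes before its parent. Unmounting in this order means a
// parent is never detached while a child is still mounted beneath it.
func deepestFirst(mounts []string) []string {
	out := append([]string(nil), mounts...)
	sort.SliceStable(out, func(i, j int) bool {
		return depth(out[i]) > depth(out[j])
	})
	return out
}

func main() {
	mounts := []string{
		"/jail/vm1/root",
		"/jail/vm1/root/dev/pts",
		"/jail/vm1/root/dev",
	}
	for _, m := range deepestFirst(mounts) {
		fmt.Println(m)
	}
}
```

A stable sort keeps the original order among mounts of equal depth, which
makes the sweep order easy to assert in a test.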

Binary validation:
  * validateRootExecutable now opens with O_PATH|O_NOFOLLOW and
    Fstats through the resulting fd. Rejects path-level symlinks and
    narrows the TOCTOU window between validation and the SDK's exec
    to fork+exec time on a healthy host.

Daemon socket:
  * The owner daemon now reads SO_PEERCRED on every accepted
    connection and refuses any UID that isn't 0 or the registered
    owner. Filesystem perms (0600 + ownerUID) already enforced this;
    the check is belt-and-braces in case the socket FD is ever
    leaked to a non-owner process.
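
A sketch of the peer-credential check, assuming Linux and Go's standard
syscall package. The names uidAllowed, peerUID, and authorize are
illustrative and are not taken from the daemon's code.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"syscall"
)

// uidAllowed implements the policy: root or the registered owner only.
func uidAllowed(uid, ownerUID uint32) bool {
	return uid == 0 || uid == ownerUID
}

// peerUID reads SO_PEERCRED from a unix-socket connection (Linux only).
func peerUID(conn net.Conn) (uint32, error) {
	uc, ok := conn.(*net.UnixConn)
	if !ok {
		return 0, errors.New("not a unix socket connection")
	}
	raw, err := uc.SyscallConn()
	if err != nil {
		return 0, err
	}
	var cred *syscall.Ucred
	var credErr error
	if err := raw.Control(func(fd uintptr) {
		cred, credErr = syscall.GetsockoptUcred(int(fd), syscall.SOL_SOCKET, syscall.SO_PEERCRED)
	}); err != nil {
		return 0, err
	}
	if credErr != nil {
		return 0, credErr
	}
	return cred.Uid, nil
}

// authorize refuses non-unix connections and any peer UID that is
// neither 0 nor ownerUID.
func authorize(conn net.Conn, ownerUID uint32) error {
	uid, err := peerUID(conn)
	if err != nil {
		return err
	}
	if !uidAllowed(uid, ownerUID) {
		return fmt.Errorf("peer uid %d is not authorized", uid)
	}
	return nil
}

func main() {
	// net.Pipe is an in-memory connection, so it must be refused.
	c1, c2 := net.Pipe()
	defer c1.Close()
	defer c2.Close()
	fmt.Println(authorize(c1, 1000))
}
```

The kernel fills in the credentials when the connection is made, so the
client cannot choose the UID the server sees.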

Docs:
  * docs/privileges.md walked end-to-end. Each helper RPC's
    Validation gate row reflects what the code actually enforces.
    New section "Running outside the system install" calls out the
    looser dev-mode trust model (NOPASSWD sudoers, helper hardening
    bypassed) so users don't deploy that path on shared hosts.
    Trust list updated to include every new validator.

Tests added: validators (DM-loop, DM-remove-target, DM-handles,
ext4-image-path, iface-name, IPv4, resolver-addr, not-symlink,
firecracker-PID, root-executable variants), the daemon's authorize
path (non-unix conn rejection + unix conn happy path), the umount2
ordering contract (deepest-first + --lazy on the sudo branch), and
positive/negative cases for the chown-no-follow fallback.

Verified end-to-end via `make smoke JOBS=4` on a KVM host.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 14:39:41 -03:00
59e48e830b
daemon: split owner daemon from root helper
Move the supported systemd path to two services: an owner-user bangerd for
orchestration and a narrow root helper for bridge/tap, NAT/resolver, dm/loop,
and Firecracker ownership. This removes repeated sudo from daily vm and image
flows without leaving the general daemon running as root.

Add install metadata, system install/status/restart/uninstall commands, and a
system-owned runtime layout. Keep user SSH/config material in the owner home,
lock file_sync to the owner home, and move daemon known_hosts handling out of
the old root-owned control path.

Route privileged lifecycle steps through typed privilegedOps calls, harden the
two systemd units, and rewrite smoke plus docs around the supported service
model.

Verified with make build, make test, make lint, and make smoke on the
supported systemd host path.
2026-04-26 12:43:17 -03:00