# Privileges

This document describes exactly what banger does with the privileges it
asks for, what runs where, and how to undo it. The aim is to give a
reader enough information to grant — or refuse — the privileges with
their eyes open.

## Two services, two trust boundaries

`banger system install` lays down two systemd units:

| Unit | User | Socket | Purpose |
|---|---|---|---|
| `bangerd.service` | owner user (chosen at install) | `/run/banger/bangerd.sock` (0600, owner) | Orchestration: VM/image lifecycle, store, RPC to the CLI. |
| `bangerd-root.service` | `root` | `/run/banger-root/bangerd-root.sock` (0600, owner; root-owned dir at 0711) | Narrow root helper: bridge/tap, DM snapshots, NAT, Firecracker launch. |

The owner daemon does all the business logic. It never runs as root.
The root helper runs as root but only accepts a fixed list of operations
and rejects every input that isn't a banger-managed path or name.

The CLI (`banger ...`) talks to the owner daemon. The owner daemon
talks to the root helper for the handful of things only root can do.
Users and CI scripts never call the root helper directly.

### Why two daemons

Before this split the owner daemon shelled out to `sudo` for every device or
network operation. That meant the user's `sudo` config gated daily
work, and an attacker who compromised the owner daemon inherited
arbitrary `sudo` reach. After the split, the owner daemon has no
ambient root. The only way for it to make a privileged change is to
ask the helper, and the helper only honours requests that fit a
specific shape.

## Authentication

The root helper:

- Listens on a Unix socket at `/run/banger-root/bangerd-root.sock`,
  mode 0600, owned by the registered owner UID, in a root-owned
  runtime dir at 0711.
- Reads `SO_PEERCRED` on every accepted connection and rejects any
  caller whose UID is not 0 or the owner UID recorded in
  `/etc/banger/install.toml`. The match is by UID, not username.
- Decodes one JSON request per connection and dispatches it through a
  named-method switch. Unknown methods return `unknown_method`.

The owner daemon:

- Listens on `/run/banger/bangerd.sock`, mode 0600, owned by the
  install-time owner user. Other host users cannot connect.
- Reads `SO_PEERCRED` on every accepted connection and rejects any
  caller whose UID is not 0 or the install-time owner UID. The
  filesystem perms already gate access; the peer-cred read is
  belt-and-braces in case the socket FD is ever leaked to a
  non-owner process.
- Resolves the helper socket path from the install metadata and
  retries with backoff if the helper hasn't started yet.

There is no network listener. Every banger control surface is a Unix
socket on the local host.
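
The peer-credential check both daemons perform can be sketched roughly as follows. This is a minimal illustration, not banger's actual code: the function names `peerUID`, `uidAllowed`, and `authorize` are hypothetical, and the sketch assumes Linux, where Go's standard `syscall` package exposes `SO_PEERCRED` through `GetsockoptUcred`.

```go
package main

import (
	"fmt"
	"net"
	"syscall"
)

// peerUID returns the UID of the process on the other end of a Unix
// socket, as reported by the kernel through SO_PEERCRED (Linux only).
func peerUID(conn *net.UnixConn) (uint32, error) {
	raw, err := conn.SyscallConn()
	if err != nil {
		return 0, err
	}
	var cred *syscall.Ucred
	var credErr error
	// Control runs the callback with the connection's file descriptor.
	if err := raw.Control(func(fd uintptr) {
		cred, credErr = syscall.GetsockoptUcred(int(fd), syscall.SOL_SOCKET, syscall.SO_PEERCRED)
	}); err != nil {
		return 0, err
	}
	if credErr != nil {
		return 0, credErr
	}
	return cred.Uid, nil
}

// uidAllowed mirrors the documented rule: only root (UID 0) or the
// recorded owner UID may call. The comparison is numeric, never by name.
func uidAllowed(uid, ownerUID uint32) bool {
	return uid == 0 || uid == ownerUID
}

// authorize (hypothetical name) combines both steps for one accepted
// connection and returns an error when the caller must be rejected.
func authorize(conn *net.UnixConn, ownerUID uint32) error {
	uid, err := peerUID(conn)
	if err != nil {
		return fmt.Errorf("read peer credentials: %w", err)
	}
	if !uidAllowed(uid, ownerUID) {
		return fmt.Errorf("uid %d is not permitted", uid)
	}
	return nil
}
```

A server loop would call `authorize` immediately after `AcceptUnix` and close the connection on error, before decoding any request bytes.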

## What the root helper will do, exactly

The helper exposes a fixed list of RPC methods (see
`internal/roothelper/roothelper.go` for the canonical set). Each is
shaped so the owner daemon can name a banger-managed object but
cannot pass an arbitrary host path or interface name. Every input
that names a path, device, PID, or interface is checked against a
validator before the helper touches the host.

| Method | Effect | Validation gate |
|---|---|---|
| `priv.ensure_bridge` | Create the configured Linux bridge if missing; assign the bridge IP. | Bridge name must equal `br-fc` or start with `br-fc-` (so a compromised daemon can't drive `ip link` against `eth0` / `docker0` / `lo`). Bridge IP must parse as IPv4. CIDR prefix must be a number in `[8, 32]`. |
| `priv.create_tap` | `ip link add tap NAME tuntap` and add to bridge, owned by the owner user. | Tap name must match `tap-fc-*` or `tap-pool-*`. Bridge config (name + IP + CIDR) passes the same banger-managed check as `priv.ensure_bridge`, otherwise the new tap could be `master`-attached to an arbitrary host iface. |
| `priv.delete_tap` | `ip link del NAME`. | Same prefix check on the tap name. |
| `priv.sync_resolver_routing` | `resolvectl dns/domain/default-route` on the configured bridge. | Bridge name must equal `br-fc` or start with `br-fc-` (same banger-managed check). Resolver address must parse via `net.ParseIP`. |
| `priv.clear_resolver_routing` | `resolvectl revert` on the bridge. | Same banger-managed bridge-name check. |
| `priv.ensure_nat` | `iptables -t nat MASQUERADE` for `(guest_ip, tap)` plus matching FORWARD rules; `enable=false` removes them. | Tap must be banger-prefixed. Guest IP must parse as IPv4. |
| `priv.create_dm_snapshot` | Create a `dmsetup` device-mapper snapshot from `rootfs.ext4` with COW backing file. | Both paths must be inside `/var/lib/banger`; DM name must start with `fc-rootfs-`. |
| `priv.cleanup_dm_snapshot` | `dmsetup remove` and `losetup -d` for a snapshot the helper itself just created. | Every non-empty `dmsnap.Handles` field is checked: DM name `fc-rootfs-*`, DM device `/dev/mapper/fc-rootfs-*`, loops `/dev/loopN`. |
| `priv.remove_dm_snapshot` | `dmsetup remove` by target. | Target must be either a `fc-rootfs-*` name or a `/dev/mapper/fc-rootfs-*` path. |
| `priv.fsck_snapshot` | `e2fsck -fy` against the DM device. | DM device path must match `/dev/mapper/fc-rootfs-*`. Exit 1 (filesystem cleaned) is tolerated. |
| `priv.read_ext4_file` | Read a file from inside an ext4 image via `debugfs cat`. | Image path must be inside `/var/lib/banger` or a managed DM device. Guest path is rejected if it contains debugfs-hostile chars (`"`/`\`/newline). |
| `priv.write_ext4_files` | Batch write files into an ext4 image, root:root, mode-controlled. | Same image-path validator. |
| `priv.resolve_firecracker_binary` | Stat and return the firecracker binary path. | Path is opened with `O_PATH \| O_NOFOLLOW` (refusing symlinks) and Fstat'd through the resulting fd: must be a regular file, executable, root-owned, not group/world-writable. |
| `priv.launch_firecracker` | Start the firecracker process for a VM (jailer-wrapped). | Socket and vsock paths must be inside `/run/banger`. Log/metrics/kernel/initrd paths must be inside `/var/lib/banger`. Tap name must be banger-prefixed. Drives must be inside the state dir or be a `/dev/mapper/fc-rootfs-*` device. Jailer chroot base must be inside the system state/runtime dirs; jailer UID/GID must equal the registered owner. Binary must pass the same root-owned-executable check. |
| `priv.ensure_socket_access` | `chown` and `chmod 0600` on a firecracker API or vsock socket so the owner user can talk to it. | Path must be inside `/run/banger` and not a symlink. The helper opens it with `O_PATH \| O_NOFOLLOW`, refuses anything that isn't a unix socket, and chmod/chown via the resulting fd (no symlink-follow). The local-priv fallback uses `chown -h`. |
| `priv.cleanup_jailer_chroot` | Detach every mount under the per-VM jailer chroot via direct `umount2(MNT_DETACH \| UMOUNT_NOFOLLOW)` syscalls (deepest-first), then `rm -rf` the tree. | Path must be inside the system state/runtime dirs and not a symlink — including no symlinks at intermediate components (resolved with `EvalSymlinks` and re-checked). `UMOUNT_NOFOLLOW` makes the unmounts symlink-safe even if a path is swapped after validation. A `findmnt` guard refuses to `rm -rf` if any mount remains underneath. |
| `priv.find_firecracker_pid` | Resolve a firecracker PID by API socket path. | Filters to processes whose cmdline mentions the requested API socket. |
| `priv.kill_process` / `priv.signal_process` | Send SIGKILL or a named signal to a PID. | PID must refer to a running process whose `/proc/<pid>/cmdline` mentions `firecracker`. |
| `priv.process_running` | Check whether a PID is alive (no host mutation). | Read-only; same cmdline filter. |
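
A few of the gates in the table can be sketched as plain string and path checks. This is an illustrative approximation only: the function names below are hypothetical and do not correspond to the validators in `internal/roothelper/roothelper.go`, and two details are assumptions of this sketch (a tap name needs a non-empty suffix after its prefix, and a managed path must sit strictly below the managed directory).

```go
package main

import (
	"path/filepath"
	"strings"
)

// managedBridgeName accepts "br-fc" or any name starting with "br-fc-".
func managedBridgeName(name string) bool {
	return name == "br-fc" || strings.HasPrefix(name, "br-fc-")
}

// managedTapName accepts tap-fc-* and tap-pool-* names. Requiring a
// non-empty suffix is an assumption of this sketch.
func managedTapName(name string) bool {
	for _, prefix := range []string{"tap-fc-", "tap-pool-"} {
		if strings.HasPrefix(name, prefix) && len(name) > len(prefix) {
			return true
		}
	}
	return false
}

// managedDMName accepts device-mapper names starting with "fc-rootfs-".
func managedDMName(name string) bool {
	return strings.HasPrefix(name, "fc-rootfs-")
}

// validCIDRPrefix accepts prefix lengths in the closed range [8, 32].
func validCIDRPrefix(n int) bool {
	return n >= 8 && n <= 32
}

// insideDir reports whether the absolute path p, after lexical cleaning,
// lies strictly below root. It does not resolve symlinks; a real
// validator would need a separate symlink check.
func insideDir(root, p string) bool {
	if !filepath.IsAbs(p) {
		return false
	}
	rel, err := filepath.Rel(filepath.Clean(root), filepath.Clean(p))
	if err != nil {
		return false
	}
	if rel == "." || rel == ".." {
		return false
	}
	return !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}
```

The containment check compares cleaned paths rather than raw string prefixes, so `/var/lib/bangerx/disk` and `/var/lib/banger/../../etc/passwd` are both rejected for the root `/var/lib/banger`.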

Anything outside this list returns `unknown_method` and is logged. The
helper does not run a shell, does not exec helper scripts, and does
not accept commands as strings.
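
The closed-list behaviour can be illustrated with a small dispatcher. The wire format here is an assumption made for the sketch: the JSON field names (`method`, `params`, `ok`, `error`) and the `bad_request` code are hypothetical, and only two of the documented methods are stubbed. The one detail taken from this document is that an unrecognised method yields `unknown_method`.

```go
package main

import (
	"encoding/json"
	"io"
)

// request and response use hypothetical field names; banger's real
// wire format may differ.
type request struct {
	Method string          `json:"method"`
	Params json.RawMessage `json:"params"`
}

type response struct {
	OK    bool   `json:"ok"`
	Error string `json:"error,omitempty"`
}

// dispatch routes a request through a fixed switch. There is no
// fallback that interprets the method name as a command or script.
func dispatch(req request) response {
	switch req.Method {
	case "priv.ensure_bridge":
		// Parameter validation and the real work are elided here.
		return response{OK: true}
	case "priv.delete_tap":
		// Parameter validation and the real work are elided here.
		return response{OK: true}
	default:
		return response{Error: "unknown_method"}
	}
}

// handleOne decodes exactly one JSON request from r and writes exactly
// one JSON response to w, matching the one-request-per-connection model.
func handleOne(r io.Reader, w io.Writer) error {
	var req request
	if err := json.NewDecoder(r).Decode(&req); err != nil {
		return json.NewEncoder(w).Encode(response{Error: "bad_request"})
	}
	return json.NewEncoder(w).Encode(dispatch(req))
}
```

Because the switch is exhaustive over known names, adding a privileged operation requires a code change and a new validator rather than a configuration tweak.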

## Filesystem mutations

| Path used | Owner, mode | What is created or changed |
|---|---|---|
| `/etc/banger/install.toml` | root, 0644 | Written once by `banger system install`. Holds owner UID/GID/home, install timestamp, version. Read by both daemons at startup. |
| `/etc/systemd/system/bangerd.service` | root, 0644 | Owner-daemon unit. Contents are deterministic; see below. |
| `/etc/systemd/system/bangerd-root.service` | root, 0644 | Root-helper unit. |
| `/usr/local/bin/banger` | root, 0755 | Copy of the build output. |
| `/usr/local/bin/bangerd` | root, 0755 | Same binary, second name. |
| `/usr/local/lib/banger/banger-vsock-agent` | root, 0755 | Companion agent injected into guests at image-pull time. |
| `/var/lib/banger/...` | owner (via systemd `StateDirectory=banger`), 0700 | Image artifacts, VM dirs, work disks, kernels, OCI cache, SSH key + known_hosts. |
| `/var/cache/banger/...` | owner, 0700 | Bundle and OCI download cache. |
| `/run/banger/...` | owner, 0700 | Owner daemon socket and per-VM firecracker API + vsock sockets. |
| `/run/banger-root/...` | root, 0711 | Root-helper socket dir; the socket itself is 0600. |
| `~/.config/banger/config.toml` | owner | Optional user config. Read by the owner daemon at startup. |

Outside these directories, banger does not write to the host filesystem
during normal operation. The two exceptions are file-sync (the user
explicitly opts in to copying paths from their home into a guest, which
the owner daemon validates is inside the owner home before reading)
and the install/uninstall actions above.

### Why the owner home is locked down

The `[[file_sync]]` config lets users mirror host files into guests.
banger refuses to follow paths that escape the owner home, including
through symlinks:

- `ResolveFileSyncHostPath` (`internal/config/config.go`) expands a
  leading `~/` and rejects any candidate that resolves outside the
  configured `OwnerHomeDir`.
- `ResolveExistingFileSyncHostPath` re-checks after `EvalSymlinks` so
  a symlink inside `~/.aws` that points at `/etc/shadow` cannot leak
  out.

This means an installed banger never reads outside the owner home in
the file-sync path, even if the owner edits config to try.
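
The two-stage check described above can be sketched as follows. This is a rough approximation under stated assumptions: `clampToHome` and `clampExisting` are hypothetical stand-ins for `ResolveFileSyncHostPath` and `ResolveExistingFileSyncHostPath`, whose real signatures are not shown in this document, and rejecting paths that are neither absolute nor `~/`-prefixed is a choice made for the sketch.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// within reports whether p equals root or sits below it (lexical only).
func within(root, p string) bool {
	rel, err := filepath.Rel(root, p)
	if err != nil {
		return false
	}
	if rel == "." {
		return true
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// clampToHome expands a leading "~/" and rejects any candidate whose
// cleaned form lands outside home.
func clampToHome(home, candidate string) (string, error) {
	if candidate == "~" {
		candidate = home
	} else if strings.HasPrefix(candidate, "~/") {
		candidate = filepath.Join(home, candidate[2:])
	}
	if !filepath.IsAbs(candidate) {
		return "", fmt.Errorf("%q is neither absolute nor ~/-relative", candidate)
	}
	clean := filepath.Clean(candidate)
	if !within(filepath.Clean(home), clean) {
		return "", fmt.Errorf("%q escapes the owner home", candidate)
	}
	return clean, nil
}

// clampExisting repeats the containment check after resolving symlinks,
// so a link inside home that points elsewhere is rejected. The path
// must exist, because EvalSymlinks walks the real filesystem.
func clampExisting(home, candidate string) (string, error) {
	p, err := clampToHome(home, candidate)
	if err != nil {
		return "", err
	}
	realHome, err := filepath.EvalSymlinks(home)
	if err != nil {
		return "", err
	}
	realPath, err := filepath.EvalSymlinks(p)
	if err != nil {
		return "", err
	}
	if !within(realHome, realPath) {
		return "", fmt.Errorf("%q resolves outside the owner home", candidate)
	}
	return realPath, nil
}
```

The home directory itself is also passed through `EvalSymlinks`, so the comparison stays valid on hosts where the home path contains a symlink.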

## Network mutations

When VMs run, banger creates:

- One shared bridge (default `br-fc`, configurable). Created on first
  VM start, never deleted automatically.
- One tap interface per VM, named `tap-fc-<vm_id>`. Created on VM
  start, deleted on VM stop or crash recovery.
- One iptables MASQUERADE rule per VM plus matching FORWARD rules,
  only when `--nat` was passed. Removed by the symmetric
  `EnsureNAT(enable=false)` call at stop.
- Optionally, `resolvectl` routing entries that send `*.vm` lookups to
  banger's in-process DNS server on the bridge. Reverted at stop.

Banger does not touch UFW, firewalld, or other rule managers. It only
adds and removes its own rules in the iptables tables it uses.

## Cleanup and uninstall

Per-VM cleanup happens at:

- `banger vm stop <name>` — stops firecracker, removes the per-VM tap,
  drops the NAT rule, removes the DM snapshot, removes per-VM
  sockets, leaves the work disk.
- `banger vm delete <name>` — same as stop, plus deletes the per-VM
  state directory under `/var/lib/banger/vms/<id>` (work disk,
  metadata).
- `banger vm prune` — bulk version.
- Crash recovery: on daemon start, `reconcile` runs the same teardown
  for any VM whose firecracker process is no longer alive.
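
Crash recovery depends on deciding whether a recorded PID still belongs to a firecracker process, which is the same `/proc/<pid>/cmdline` filter the helper applies to its PID methods. A minimal sketch of that test, assuming Linux procfs and using hypothetical function names:

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// cmdlineMentions reports whether any NUL-separated argument in a raw
// /proc/<pid>/cmdline buffer contains needle.
func cmdlineMentions(raw []byte, needle string) bool {
	for _, arg := range bytes.Split(raw, []byte{0}) {
		if bytes.Contains(arg, []byte(needle)) {
			return true
		}
	}
	return false
}

// firecrackerAlive returns true only when pid names a live process
// whose command line mentions "firecracker". A missing or unreadable
// /proc entry counts as "not alive".
func firecrackerAlive(pid int) bool {
	if pid <= 0 {
		return false
	}
	raw, err := os.ReadFile(fmt.Sprintf("/proc/%d/cmdline", pid))
	if err != nil {
		return false
	}
	return cmdlineMentions(raw, "firecracker")
}
```

Checking the command line as well as liveness guards against PID reuse: a recycled PID that now belongs to an unrelated process fails the filter and is never signalled.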

System-level uninstall:

```
sudo banger system uninstall          # remove services, units, binaries
sudo banger system uninstall --purge  # also remove /var/lib/banger,
                                      # /var/cache/banger, /run/banger
```

Without `--purge`, the state dirs survive so a reinstall can pick up
where the previous one left off. With `--purge`, banger leaves no
files behind under `/var/lib`, `/var/cache`, or `/run`.

What `uninstall` does, in order:

1. `systemctl disable --now bangerd.service bangerd-root.service`.
2. Remove `/etc/systemd/system/bangerd.service` and `bangerd-root.service`.
3. Remove `/etc/banger/install.toml` and `/etc/banger/`.
4. `systemctl daemon-reload`.
5. Remove `/usr/local/bin/banger`, `/usr/local/bin/bangerd`,
   `/usr/local/lib/banger/`.
6. With `--purge` only: remove the system state, cache, and runtime
   dirs.

What `uninstall` does NOT do automatically:

- It does not delete the bridge or any iptables rules. Stop your VMs
  first (`banger vm prune` or `banger vm stop <name>` for each VM) so
  the per-VM teardown drops them. The bridge itself is intentionally
  persistent — a future reinstall reuses it. To remove it manually:
  `sudo ip link del br-fc`.
- It does not undo `resolvectl` routing on a bridge that no longer
  exists; the entries are harmless if the bridge is gone.
- It does not remove the owner user, the owner's home, or anything
  the user wrote into a guest from inside the guest.

## Updating banger

`banger update` runs only when an operator invokes it. It
never runs in the background and never auto-checks for new releases.

The flow:

1. **Discover.** GET `https://releases.thaloco.com/banger/manifest.json`
   over HTTPS. The URL is hardcoded in the binary at compile time —
   a compromised daemon config can't redirect the updater. Manifest
   schema_version gates forward compat: a CLI that doesn't recognise
   the server's schema_version refuses to update.
2. **In-flight gate.** `daemon.operations.list` RPC. If any operation
   is not Done, refuse with the operation list. `--force` overrides.
3. **Download.** Capped GET on the tarball + `SHA256SUMS` (≤ 256 MiB
   and ≤ 16 KiB respectively). Tarball is sha256-verified on the fly
   against the digest published in `SHA256SUMS`; partial files are
   removed on any verification failure.
4. **Cosign signature.** `SHA256SUMS.sig` is fetched (≤ 1 KiB) and
   verified against the `BangerReleasePublicKey` embedded in the
   running banger binary. The signature is an ECDSA P-256 / SHA-256
   blob signature produced by `cosign sign-blob` — verified by Go's
   stdlib `crypto/ecdsa.VerifyASN1`, no third-party crypto deps. A
   missing signature URL or a verification failure aborts the update
   before any binary is touched.
5. **Sanity-run.** Staged `banger --version` must mention the
   expected version; staged `bangerd --check-migrations --system`
   must exit 0 (compatible) or 1 (will auto-migrate). Exit 2
   (incompatible — DB has migrations the new binary doesn't know)
   aborts the swap; the running install is untouched.
6. **Swap.** Atomic `os.Rename` for each of the three binaries
   (banger-vsock-agent → bangerd → banger), with `.previous` backups.
7. **Restart.** `systemctl restart bangerd-root.service` then
   `bangerd.service`. Wait for the new daemon socket to answer
   `ping`. Running VMs survive the daemon restart: each is its own
   firecracker process in `bangerd-root.service`'s cgroup, and with
   `KillMode=process` a restart stops only the helper's main process.
   The new daemon's `reconcile` step re-attaches by reading the
   per-VM `handles.json` scratch file and verifying the firecracker
   process is still alive.
8. **Verify.** Run `banger doctor` against the just-installed CLI.
   FAIL triggers auto-rollback: restore `.previous` backups, restart
   services again so the OLD binaries take over. The original error
   bubbles to the operator; `--force` skips this step.
9. **Finalise.** Update `/etc/banger/install.toml`'s Version /
   Commit / BuiltAt. Remove `.previous` backups. Wipe the staging
   directory under `/var/cache/banger/updates/`.

What you're trusting in this flow:

- The cosign **public key** baked into the binary you're updating
  FROM. The maintainer rotates it by cutting a new release with a
  new key embedded; from then on, only signatures made with the
  new private key are accepted. v0.1.x predates a clean rotation
  story.
- TLS to `releases.thaloco.com` for transport. The cosign signature
  is the actual integrity check; TLS only protects the bytes in transit.
- The systemd unit owners (root for the helper, owner for the
  daemon). `banger update` requires root because it writes
  `/usr/local/bin` and talks to systemctl; it does NOT run via the
  helper RPC interface.
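
The signature check that anchors this trust can be sketched with the standard library alone. Treat this as an approximation: `verifyBlob` is a hypothetical name, and the sketch assumes the `.sig` file holds a base64-encoded ASN.1 signature, which is the default output of `cosign sign-blob`. The `signBlob` and `newThrowawayKey` helpers exist only so the verifier can be exercised without a real release key.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"strings"
)

// verifyBlob checks a base64-encoded ASN.1 ECDSA signature over
// SHA-256(blob). Any decoding problem counts as a failed verification.
func verifyBlob(pub *ecdsa.PublicKey, blob []byte, sigB64 string) bool {
	sig, err := base64.StdEncoding.DecodeString(strings.TrimSpace(sigB64))
	if err != nil {
		return false
	}
	digest := sha256.Sum256(blob)
	return ecdsa.VerifyASN1(pub, digest[:], sig)
}

// signBlob stands in for `cosign sign-blob` so this sketch is testable.
// It is not part of an updater, which never holds a private key.
func signBlob(priv *ecdsa.PrivateKey, blob []byte) (string, error) {
	digest := sha256.Sum256(blob)
	sig, err := ecdsa.SignASN1(rand.Reader, priv, digest[:])
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(sig), nil
}

// newThrowawayKey generates a P-256 key pair for demonstration only.
func newThrowawayKey() (*ecdsa.PrivateKey, error) {
	return ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
}
```

In the updater the public key would come from the constant embedded in the running binary, and `blob` would be the raw bytes of `SHA256SUMS`, so a valid signature transitively covers the tarball digest.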

What `banger update` deliberately does NOT do:

- No background check timers. Operators run `banger update --check`
  on a schedule themselves if they want.
- No update across MINOR boundaries without an explicit `--to`
  flag. v0.x is pre-stable; we don't promise that v0.1.5 → v0.2.0
  is automatic.
- No state-DB downgrade. Schema migrations are forward-only;
  `--check-migrations` refuses to swap a binary that's older than
  the running schema.
- No agent re-injection into existing VMs. The vsock agent inside
  each VM is the version banger had at image-pull time, not the
  current install. v0.1.x doesn't enforce or detect skew here; the
  agent's HTTP API is small enough that compat across MINORs is
  expected.

## Running outside the system install

Everything above describes the supported deployment: `banger system
install` lays down both systemd units and the helper takes over every
privileged operation.

It is also possible to run `bangerd` directly without installing the
helper. The binary still works as a per-user daemon and shells out to
`sudo -n` for each privileged operation it would otherwise hand off
(`iptables`, `ip`, `mount`, `mknod`, `dmsetup`, `e2fsck`, `kill`,
`chown`, `chown -h`, `chmod`, `losetup`, `firecracker`).
This mode is intended for ad-hoc developer machines while iterating on
banger itself.

It carries a different trust model:

- It needs `NOPASSWD` sudoers entries for the developer (otherwise
  every VM action prompts for a password).
- Once those entries exist, **any** process running as the developer
  can invoke those commands with arbitrary arguments — banger's input
  validators only constrain what banger itself sends. They are no
  defence against a different program on the same account.
- The helper's `SO_PEERCRED` boundary, the systemd hardening
  (`NoNewPrivileges`, `ProtectSystem=strict`, the narrow
  `CapabilityBoundingSet`), and the helper's own input validators are
  all bypassed.

If you care about isolating banger's blast radius from anything else
running as your user, use the system install. If you only need
banger to work on your own dev box, the non-system mode is fine —
just don't run it on a shared or production host.

## Hardening of the systemd units

The two units ship with restrictive defaults; they are written by
banger at install time and the contents are deterministic.

Owner daemon (`bangerd.service`):

- `User=` is the install-time owner; never `root`.
- `NoNewPrivileges=yes`.
- `ProtectSystem=strict` — system directories are read-only.
- `ProtectHome=read-only` — owner home is read-only to the daemon
  unit. The daemon writes only to `StateDirectory`, `CacheDirectory`,
  `RuntimeDirectory`, plus owner config that the user edits.
- `ProtectControlGroups`, `ProtectKernelLogs`, `ProtectKernelModules`,
  `ProtectClock`, `ProtectHostname`, `RestrictSUIDSGID`,
  `LockPersonality`.
- `RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_NETLINK AF_VSOCK`.
- No `AmbientCapabilities`.

Root helper (`bangerd-root.service`):

- Same hardening as above, except that it runs as `root` and sets
  `ProtectHome=yes` instead of `read-only` (no host-home visibility at
  all from the helper).
- `CapabilityBoundingSet=CAP_CHOWN CAP_DAC_OVERRIDE CAP_FOWNER CAP_KILL CAP_MKNOD CAP_NET_ADMIN CAP_NET_RAW CAP_SETGID CAP_SETUID CAP_SYS_ADMIN CAP_SYS_CHROOT`.
  Only the capabilities required for tap/bridge, iptables, dmsetup,
  loop devices, ownership fixups, device node creation, and Firecracker
  process management. No `CAP_SYS_BOOT`, no `CAP_SYS_PTRACE`,
  no `CAP_SYS_MODULE`, no `CAP_NET_BIND_SERVICE`.
- `ReadWritePaths=/var/lib/banger`.

## What this leaves you trusting

If you install banger as root, you are trusting:

1. The two binaries banger drops under `/usr/local/bin` and the
   companion agent under `/usr/local/lib/banger`. These should match
   the build artifacts you reviewed.
2. The path/identifier validators in
   `internal/roothelper/roothelper.go` to be tight: `validateManagedPath`,
   `validateTapName`, `validateDMName`, `validateDMDevicePath`,
   `validateLoopDevicePath`, `validateDMRemoveTarget`,
   `validateDMSnapshotHandles`, `validateRootExecutable`,
   `validateNotSymlink`, `validateExt4ImagePath`,
   `validateLinuxIfaceName`, `validateBangerBridgeName`,
   `validateNetworkConfig`, `validateCIDRPrefix`, `validateIPv4`,
   `validateResolverAddr`, `validateSignalName`, and
   `validateFirecrackerPID`. If any of these are bypassed, the helper
   would carry out a privileged op against an unmanaged target. They
   are unit-tested in `internal/roothelper/roothelper_test.go`.
3. The Firecracker binary banger executes. The helper refuses to launch
   anything that isn't a regular, executable, root-owned file that is
   neither group- nor world-writable, but the binary's own behaviour is
   your responsibility.
4. Your own owner-user account. The owner can ask the helper to
   create taps, run firecracker, and edit ext4 images under
   `/var/lib/banger`. Anyone with the owner's UID can do those
   things; treat that account as semi-privileged.

What you do **not** have to trust:

- The CLI process. It only talks Unix-socket RPC.
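- Other host users. Both sockets are mode 0600 and owned by the owner
  UID; the helper socket also sits in a root-owned 0711 directory, and
  both daemons reject peers whose UID is neither 0 nor the owner.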
- The contents of the user's home, except the file paths that
  `[[file_sync]]` explicitly names — and even those are clamped to
  the owner home.
- The guest. Guests cannot reach the helper or the owner daemon; the
  only host endpoints a guest sees are the in-process DNS server on the
  bridge IP and the bridge itself for outbound NAT.