roothelper: tie kill/signal authorization to banger-launched firecracker

validateFirecrackerPID was a substring check on /proc/<pid>/cmdline:
"contains 'firecracker'". Good enough to refuse init/sshd/the test
binary, but on a shared host where multiple users run firecracker
the helper would happily SIGKILL someone else's VM. The owner-UID
daemon could weaponise the helper as an arbitrary "kill any
firecracker on this box" primitive.

Replace the substring gate with two stronger acceptance modes:

  * Cgroup match (the supported path): /proc/<pid>/cgroup contains
    bangerd-root.service. Every child the helper unit forks inherits
    that cgroup from its parent at fork, and the kernel keeps it there
    unless a privileged process migrates it, so no daemon-UID code can
    forge it.
    Other users' firecracker processes live in different cgroups
    (user@<uid>.service, foreign service slices) and fail this
    check. Also robust across helper restarts: KillMode=control-group
    on the unit kills children when the service goes down, so an
    "orphan banger firecracker in some other cgroup" is rare by
    construction.

  * --api-sock fallback: cmdline carries `--api-sock <path>` with
    the path under banger's RuntimeDir. Covers the legacy direct
    (no-jailer) launch path, and gives daemon reconcile a way to
    clean up the rare orphan that lands outside the service cgroup
    after a hard helper crash.
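
A minimal sketch of the cgroup match described above, not the code from
this commit. The names cgroupHasUnit and pidInHelperCgroup are
hypothetical, and the sketch matches the unit name as a whole path
component, which is slightly stricter than the "contains" wording:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// helperUnit is the unit name cited in the commit message.
const helperUnit = "bangerd-root.service"

// cgroupHasUnit reports whether any line of a /proc/<pid>/cgroup
// document names unit as a path component. Each line has the form
// "hierarchy-ID:controllers:path"; cgroup v2 uses "0::<path>".
// (Hypothetical helper, for illustration only.)
func cgroupHasUnit(cgroupFile, unit string) bool {
	sc := bufio.NewScanner(strings.NewReader(cgroupFile))
	for sc.Scan() {
		parts := strings.SplitN(sc.Text(), ":", 3)
		if len(parts) != 3 {
			continue // malformed line: ignore rather than accept
		}
		for _, seg := range strings.Split(parts[2], "/") {
			if seg == unit {
				return true
			}
		}
	}
	return false
}

// pidInHelperCgroup reads /proc/<pid>/cgroup and applies the match.
// A read error (process gone, no procfs) is returned to the caller,
// which should treat it as a refusal.
func pidInHelperCgroup(pid int) (bool, error) {
	data, err := os.ReadFile(fmt.Sprintf("/proc/%d/cgroup", pid))
	if err != nil {
		return false, err
	}
	return cgroupHasUnit(string(data), helperUnit), nil
}

func main() {
	// Checks the current process; outside the helper unit this is
	// expected to report no match.
	ok, err := pidInHelperCgroup(os.Getpid())
	fmt.Println(ok, err)
}
```

Splitting the parsing from the file read keeps the match testable
without a live process in the helper cgroup.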

Tried /proc/<pid>/root first — pivot_root semantics make jailer'd
firecracker read its root as "/" from any namespace, so the symlink
is useless as a banger-managed fingerprint. Cgroup is the right
signal.

Also added a signal allowlist: priv.signal_process now rejects
anything outside {TERM, KILL, INT, HUP, QUIT, USR1, USR2, ABRT}
(case-insensitive, with or without SIG prefix). STOP/CONT, real-time
signals, and numeric forms are refused — the helper running as root
must not be a generic "send arbitrary signal to my pid" primitive.
priv.kill_process is unaffected (it always sends KILL).
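
The allowlist rule can be sketched as follows. This is an illustration
under assumptions, not the commit's validateSignalName: the function
name parseAllowedSignal and the map layout are hypothetical, and only
the accepted set and the case/prefix handling come from the text above.

```go
package main

import (
	"fmt"
	"strings"
	"syscall"
)

// allowedSignals mirrors the set listed in the commit message.
// Anything absent from this map (STOP, CONT, real-time signals,
// numeric forms) is refused.
var allowedSignals = map[string]syscall.Signal{
	"TERM": syscall.SIGTERM,
	"KILL": syscall.SIGKILL,
	"INT":  syscall.SIGINT,
	"HUP":  syscall.SIGHUP,
	"QUIT": syscall.SIGQUIT,
	"USR1": syscall.SIGUSR1,
	"USR2": syscall.SIGUSR2,
	"ABRT": syscall.SIGABRT,
}

// parseAllowedSignal upper-cases the name, strips one optional "SIG"
// prefix, and looks the result up in the allowlist.
// (Hypothetical helper, for illustration only.)
func parseAllowedSignal(name string) (syscall.Signal, error) {
	key := strings.TrimPrefix(strings.ToUpper(name), "SIG")
	sig, ok := allowedSignals[key]
	if !ok {
		return 0, fmt.Errorf("signal %q is not in the allowlist", name)
	}
	return sig, nil
}

func main() {
	for _, name := range []string{"TERM", "sigkill", "STOP", "9", "RTMIN+1"} {
		sig, err := parseAllowedSignal(name)
		fmt.Println(name, sig, err)
	}
}
```

A lookup table rejects by default, so a signal added to the platform
later stays refused until someone adds it to the map on purpose.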

Tests: validateSignalName covers allowlist + numeric/STOP/RTMIN
rejection; extractFirecrackerAPISock pins the three flag forms
(--api-sock VAL, --api-sock=VAL, -a VAL); pathIsUnder gets a small
table; existing TestValidateFirecrackerPID still rejects PID 0,
PID 1, and the test process itself. Doctor's non-system-mode test
gained a t.TempDir-backed install path so it stops being
environment-dependent on machines that happen to have
/etc/banger/install.toml.
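
For orientation, the three flag forms and the path containment check
named above might look roughly like this. It is a sketch, not the
tested extractFirecrackerAPISock or pathIsUnder: the names
splitCmdline, apiSockFromArgv, and isUnder are hypothetical, and
/run/banger stands in for banger's RuntimeDir, whose real value is not
stated here.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// splitCmdline turns raw /proc/<pid>/cmdline bytes (NUL-separated
// argv, usually with a trailing NUL) into a string slice.
func splitCmdline(raw []byte) []string {
	s := strings.TrimRight(string(raw), "\x00")
	if s == "" {
		return nil
	}
	return strings.Split(s, "\x00")
}

// apiSockFromArgv returns the socket path given as "--api-sock VAL",
// "--api-sock=VAL", or "-a VAL". argv[0] (the binary) is skipped.
func apiSockFromArgv(argv []string) (string, bool) {
	for i := 1; i < len(argv); i++ {
		arg := argv[i]
		switch {
		case arg == "--api-sock" || arg == "-a":
			if i+1 < len(argv) {
				return argv[i+1], true
			}
			return "", false // flag present but value missing
		case strings.HasPrefix(arg, "--api-sock="):
			return strings.TrimPrefix(arg, "--api-sock="), true
		}
	}
	return "", false
}

// isUnder reports whether path is strictly inside base after lexical
// cleaning. It does not resolve symlinks, and it treats path == base
// as "not under"; both are choices made for this sketch.
func isUnder(path, base string) bool {
	if !filepath.IsAbs(path) || !filepath.IsAbs(base) {
		return false
	}
	rel, err := filepath.Rel(filepath.Clean(base), filepath.Clean(path))
	if err != nil {
		return false
	}
	sep := string(filepath.Separator)
	return rel != "." && rel != ".." && !strings.HasPrefix(rel, ".."+sep)
}

func main() {
	const runtimeDir = "/run/banger" // assumed placeholder path
	raw := []byte("firecracker\x00--api-sock=/run/banger/vm1/fc.sock\x00")
	sock, ok := apiSockFromArgv(splitCmdline(raw))
	fmt.Println(sock, ok, ok && isUnder(sock, runtimeDir))
}
```

Using filepath.Rel rather than a string prefix keeps a sibling such as
/run/banger-other from passing as "under" /run/banger.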

Smoke at JOBS=4 still green — every banger-launched firecracker
sails through the cgroup match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Thales Maciel 2026-04-28 16:00:41 -03:00
parent 4a56e6c7d6
commit 3805b093b4
3 changed files with 202 additions and 13 deletions

@@ -108,13 +108,17 @@ func findCheck(report system.Report, name string) *system.CheckResult {
 }
 // TestDoctorReport_NonSystemModeEmitsSecurityWarn pins the non-
-// system-mode branch: when /etc/banger/install.toml is absent the
-// security-posture check must surface a warn that points at the
-// dev-mode caveat in docs/privileges.md. A pass row in this mode
-// would imply guarantees the install isn't actually providing.
+// system-mode branch: when install.toml is absent the security
+// posture check must surface a warn that points at the dev-mode
+// caveat in docs/privileges.md. A pass row in this mode would
+// imply guarantees the install isn't actually providing. Drives
+// the seam variant so the test is independent of whether the host
+// happens to have /etc/banger/install.toml.
 func TestDoctorReport_NonSystemModeEmitsSecurityWarn(t *testing.T) {
 	d := buildDoctorDaemon(t)
-	report := d.doctorReport(context.Background(), nil, false)
+	report := system.Report{}
+	missingInstall := filepath.Join(t.TempDir(), "install.toml")
+	d.addSecurityPostureChecksAt(context.Background(), &report, missingInstall, t.TempDir())
 	check := findCheck(report, "security posture")
 	if check == nil {