diff --git a/.githooks/pre-commit b/.githooks/pre-commit
new file mode 100755
index 0000000..d5a034a
--- /dev/null
+++ b/.githooks/pre-commit
@@ -0,0 +1,23 @@
+#!/usr/bin/env bash
+# pre-commit gate. Runs lint (gofmt -l + go vet + shellcheck), unit
+# tests, and a build before any commit lands. Activate once via
+# `make install-hooks`, which points core.hooksPath at this directory.
+#
+# Bypass for in-flight WIP commits with `git commit --no-verify`.
+set -euo pipefail
+
+# Resolve repo root so the hook works from any subdirectory.
+repo_root="$(git rev-parse --show-toplevel)"
+cd "$repo_root"
+
+# `make lint` already wraps `gofmt -l`, `go vet`, and shellcheck.
+echo '[pre-commit] lint'
+make --no-print-directory lint
+
+echo '[pre-commit] test'
+make --no-print-directory test
+
+echo '[pre-commit] build'
+make --no-print-directory build
+
+echo '[pre-commit] ok'
diff --git a/.gitignore b/.gitignore
index 4aad341..8f696f7 100644
--- a/.gitignore
+++ b/.gitignore
@@ -12,3 +12,11 @@ state/
 squashfs-root/
 rootfs*
 wtf/*.deb
+*.pem
+*.key
+id_rsa
+.env
+/todos
+/coverage.out
+/coverage.html
+/.codex
diff --git a/AGENTS.md b/AGENTS.md
index 60d086d..8050f32 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,10 +1,15 @@
 # Repository Guidelines
 
+Always run `make build` before committing.
+
 ## Project Structure
 
-- `cmd/banger` and `cmd/bangerd` are the main user entrypoints.
-- `internal/` contains the daemon, CLI, RPC, storage, Firecracker integration, guest helpers, and web UI.
-- `scripts/` contains explicit manual helper workflows for rootfs and kernel preparation.
+- `cmd/banger`, `cmd/bangerd`, and `cmd/banger-vsock-agent` are the three binaries. The first two are user-facing; the third is a companion that ships inside each guest VM.
+- `internal/` contains the daemon, CLI, RPC, storage, Firecracker integration, and guest helpers.
+- `internal/daemon/` is the composition root; pure helpers live in its subpackages (`opstate`, `dmsnap`, `fcproc`, `imagemgr`, `workspace`). See `internal/daemon/ARCHITECTURE.md`.
+- `internal/imagecat/` and `internal/kernelcat/` embed the image + kernel catalogs.
+- `images/golden/` is the Dockerfile for the `debian-bookworm` catalog entry.
+- `scripts/` contains manual helper workflows for rootfs, kernel, and bundle preparation.
 - `build/bin/` is the canonical source-checkout build output.
 - `build/manual/` is the canonical source-checkout location for manual rootfs/kernel artifacts.
 
@@ -12,34 +17,44 @@
 
 - `make build` builds `./build/bin/banger`, `./build/bin/bangerd`, and `./build/bin/banger-vsock-agent`.
 - `make test` runs `go test ./...`.
+- `make lint` runs `gofmt -l`, `go vet ./...`, and `shellcheck --severity=error` on `scripts/*.sh`. Run before commits.
 - `./build/bin/banger doctor` checks host readiness.
-- `./build/bin/banger image build --from-image <name>` builds a managed image from an existing registered image.
+- `./build/bin/banger vm run` is the primary user-facing entry point — auto-pulls the default image + kernel from the catalogs if missing.
+- `./build/bin/banger image pull <ref>` uses the bundle catalog (fast) when `<ref>` is a catalog entry, or falls through to the OCI path for arbitrary registry refs. See `docs/image-catalog.md` and `docs/oci-import.md`.
 - `./build/bin/banger image register ...` registers an unmanaged host-side image stack.
 - `./build/bin/banger image promote <name>` copies an unmanaged image into daemon-owned managed artifacts.
-- `make void-kernel`, `make rootfs-void`, and `make void-register` drive the experimental Void flow under `./build/manual`.
+- `scripts/make-generic-kernel.sh` builds a Firecracker-optimized vmlinux from upstream sources. `scripts/publish-kernel.sh <version>` publishes it to the kernel catalog.
+- `scripts/publish-golden-image.sh` rebuilds + publishes the golden image bundle and patches the image catalog.
+- `scripts/publish-banger-release.sh <version>` cuts a banger release. Full runbook in `docs/release-process.md`.
 
 ## Image Model
 
 - Managed images own the full boot set: rootfs, optional work-seed, kernel, optional initrd, and optional modules.
-- There is no runtime bundle and no auto-registered default image from disk paths.
-- `default_image_name` selects a registered image only.
+- The image catalog ships pre-built bundles. `vm run` auto-pulls the default catalog entry; `image pull <name>` can be invoked explicitly.
+- `default_image_name` defaults to `debian-bookworm`. On a miss, the daemon auto-pulls from `imagecat` before surfacing "not found".
+- Kernel references follow the same auto-pull pattern against `kernelcat`.
 
 ## Config
 
 - Config lives at `~/.config/banger/config.toml`.
 - Firecracker comes from `PATH` by default, or `firecracker_bin`.
-- SSH uses `ssh_key_path` or an auto-managed default key at `~/.config/banger/ssh/id_ed25519`.
+- SSH uses `ssh_key_path` or an auto-managed default key at `~/.local/state/banger/ssh/id_ed25519`.
 
 ## Coding Style
 
 - Prefer small, direct Go code and standard library solutions.
 - Keep shell scripts strict with `set -euo pipefail`.
 - Use `gofmt` for Go formatting.
+- When a CLI accepts either an inline string or a file input, always prefer the file-based form.
+- For shell commands and AI/LLM tooling, prefer passing files as input whenever the CLI allows it.
+- Create temporary files as needed to follow the file-first rule.
+- Examples: use `git commit -F <file>` instead of `git commit -m <message>`, and use prompt files instead of inline prompt strings when invoking LLM CLIs.
 
 ## Testing Guidance
 
-- Primary automated coverage is `go test ./...`.
-- For lifecycle changes, smoke-test with `vm create`, `vm ssh`, `vm stop`, and `vm delete`.
+- Primary automated coverage is `go test ./...` (wired through `make test`).
+- `make coverage` runs the suite with `-coverpkg=./...` and prints per-package averages plus a total; `make coverage-html` writes a browsable report to `coverage.html`; `make coverage-total` prints just the total (for scripts/CI).
+- For lifecycle changes, smoke-test with `vm run` end-to-end (covers create + start + boot + ssh).
 - If guest provisioning changes, document whether existing images must be rebuilt or recreated.
 
 ## Security
diff --git a/CHANGELOG.md b/CHANGELOG.md
new file mode 100644
index 0000000..e706114
--- /dev/null
+++ b/CHANGELOG.md
@@ -0,0 +1,326 @@
+# Changelog
+
+All notable changes to banger are documented here. The format is based
+on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this
+project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+The version line printed by `banger version` is the canonical reference
+for what's installed; this file is the canonical reference for what
+changed between versions.
+
+## [Unreleased]
+
+## [v0.1.10] - 2026-05-03
+
+### Added
+
+- README now includes an animated demo GIF showing the typical
+  sandbox lifecycle (`vm run`, host-side `ssh demo.vm`, stop/start
+  with file persistence, `vm exec`, `curl http://demo.vm`). The
+  recording script lives at `assets/demo.tape` and is rendered with
+  [VHS](https://github.com/charmbracelet/vhs).
+
+## [v0.1.9] - 2026-05-01
+
+### Fixed
+
+- `vm exec` no longer falls back to `cd /root/repo` on VMs that have
+  no recorded workspace. Previously, running `vm exec` against a plain
+  VM (one that never had `vm workspace prepare` / `vm run ./repo`)
+  blew up with `cd: /root/repo: No such file or directory` — surfaced
+  via the login shell's mise activate hook, because `bash -lc` sources
+  profile.d before the explicit cd. Now the auto-cd only fires when
+  the user passes `--guest-path` or the VM actually has a workspace
+  recorded; otherwise the command runs from root's home. Mise wrapping
+  is unchanged — without a `.mise.toml` it's a no-op.
+
+### Changed
+
+- `vm exec --guest-path` default in `--help` now reads "from last
+  workspace prepare; otherwise root's home" (was "or /root/repo").
+  Anyone who relied on the implicit `/root/repo` default for a VM that
+  has a repo there but no workspace record must now pass
+  `--guest-path /root/repo` explicitly.
+
+### Notes
+
+- Internal: smoke-test harness ported from `scripts/smoke.sh` to a
+  Go test suite under `internal/smoketest`. `make smoke` is unchanged
+  for maintainers; no user-visible effect.
+
+## [v0.1.8] - 2026-05-01
+
+### Fixed
+
+- `.vm` resolution from the host (NSS path: curl, ssh hostname,
+  etc.) now works on systemd-resolved hosts. The root helper's
+  `validateResolverAddr` was rejecting the `host:port` form
+  (`127.0.0.1:42069`) that banger constructs to point resolved at the
+  in-process DNS server, so the auto-wire silently failed at every
+  daemon startup. `dig @127.0.0.1` worked because that bypasses NSS;
+  any tool going through glibc's resolver chain didn't.
+- The validator now accepts both bare IPs and `IP:port` (matching what
+  `resolvectl dns` itself accepts), with new test coverage for the
+  port'd form.
+
+### Notes
+
+- Existing v0.1.x installs that already booted with the broken
+  validator have stale per-link resolved state. After updating to
+  v0.1.8, run `sudo banger system restart` once to re-trigger the
+  auto-wire, or restart the host. systemd-resolved restarts also
+  wipe per-link state — banger restores it on its own daemon
+  startup but won't re-run for an already-running daemon.
+
+## [v0.1.7] - 2026-05-01
+
+### Added
+
+- `vm run -d` / `--detach` creates the VM, runs workspace prep + tooling
+  bootstrap, then exits without attaching to ssh. Reconnect later with
+  `banger vm ssh <name>`. The combos `-d --rm` and `-d -- <cmd>` are
+  rejected before VM creation.
+- `vm run --no-bootstrap` skips the mise tooling install entirely; useful
+  when a workspace has a `.mise.toml` you don't want banger to act on.
+- `banger doctor --verbose` / `-v` prints every check with details.
+  Without it, doctor's default output now collapses (see Changed).
+
+### Changed
+
+- **`vm run` refuses early when bootstrap can't succeed.** Previously, a
+  workspace containing `.mise.toml` or `.tool-versions` without `--nat`
+  set silently failed the bootstrap into a log file and dropped you into
+  ssh with tools missing. It now refuses before VM creation with
+  `tooling bootstrap requires --nat (or pass --no-bootstrap to skip)`.
+  Existing scripts that relied on the silent-failure path will need to
+  add `--nat` or `--no-bootstrap`.
+- **`banger doctor` default output is now compact.** A healthy host
+  collapses to a single line (`all N checks passed`); failing or warning
+  checks print only the affected entries plus a summary footer
+  (`N passed, M warnings, K failures`). Pass `--verbose` for the full
+  per-check output. Anything parsing the previous always-verbose output
+  needs to switch to `doctor --verbose`.
+
+### Fixed
+
+- The detached bootstrap path runs synchronously (foreground, tee'd to
+  the existing log file) so the CLI only returns once installs finish.
+  Interactive mode keeps today's nohup'd background behaviour so the ssh
+  session starts promptly.
+
+## [v0.1.6] - 2026-04-29
+
+### Fixed
+
+- v0.1.4's "running VMs survive daemon restart" fix was incomplete:
+  the binary-level reconcile path was correct, but `/run/banger` (the
+  daemon's runtime dir) was being wiped on every daemon stop because
+  systemd defaults to `RuntimeDirectoryPreserve=no`. The api-sock
+  symlinks the helper had created for live VMs vanished with it,
+  and `findByJailerPidfile` couldn't resolve them to find the chroot
+  + pidfile. v0.1.6 sets `RuntimeDirectoryPreserve=yes` on both
+  unit templates so the symlinks (and helper RPC sock) survive
+  the restart window. Live-verified: FC PID and guest boot_id both
+  unchanged across a full helper+daemon restart cycle with a VM
+  running.
+- v0.1.4's CHANGELOG correction stands: existing v0.1.x installs
+  (where x < 6) need a one-time `sudo banger system install` after
+  updating to v0.1.6 to pick up both the new `KillMode=process` and
+  the new `RuntimeDirectoryPreserve=yes` directives. `banger update`
+  swaps binaries, not unit files.
+
+## [v0.1.5] - 2026-04-29
+
+No functional changes. Verification release for v0.1.4: the previous
+release shipped the running-VMs-survive-update fix, but updating
+*to* v0.1.4 from v0.1.3 used v0.1.3's buggy driver, so the fix
+couldn't be verified live in that direction. v0.1.5 exists so a
+host on v0.1.4 can update to it and observe a running VM survive
+end-to-end with v0.1.4 in the driver seat.
+
+## [v0.1.4] - 2026-04-29
+
+### Fixed
+
+- Daemon restarts no longer kill running VMs. Two changes together:
+  - The `bangerd-root.service` and `bangerd.service` unit templates
+    now set `KillMode=process`. The default (`control-group`) sent
+    SIGKILL to every process in the unit's cgroup on stop/restart,
+    including the jailer-spawned firecracker children — fork/exec
+    doesn't escape a systemd cgroup. With `KillMode=process` only
+    the unit's main PID is signalled; firecracker children survive.
+  - `fcproc.FindPID` now also looks up jailer'd firecracker
+    processes via the pidfile jailer writes at
+    `<chroot>/firecracker.pid` (sibling of the api-sock target).
+    Previously the only lookup path was `pgrep -n -f <pattern>`,
+    which can't see jailer'd processes because their cmdline only
+    carries the chroot-relative `--api-sock /firecracker.socket`.
+    Reconcile after a daemon restart now correctly re-attaches to
+    surviving guests instead of mistaking them for stale and tearing
+    down their dm-snapshot.
+
+### Notes
+
+- v0.1.0's CHANGELOG line "daemon restarts do not interrupt running
+  guests" was wrong: it was true at the systemd cgroup layer in
+  theory, but the default `KillMode` defeated it, and even with
+  `KillMode=process` the daemon's reconcile would mistake
+  surviving FCs for stale and tear them down. v0.1.4 is the version
+  where this actually works end-to-end.
+- Updating from v0.1.0–v0.1.3 to v0.1.4 still kills running VMs
+  because the *driver* of the update is the buggy older binary.
+  Updates from v0.1.4 onward preserve running VMs across the
+  helper+daemon restart that `banger update` performs.
+- Existing v0.1.0–v0.1.3 installs that update to v0.1.4 do NOT
+  automatically pick up the new unit files — `banger update` swaps
+  binaries, not systemd units. Run `sudo banger system install` once
+  on those hosts after updating to refresh the units. New v0.1.4+
+  installs get the correct units from the start.
+
+## [v0.1.3] - 2026-04-29
+
+No functional changes. Verification release: v0.1.2 fixed
+`banger update`'s install.toml handling, but the fix only takes
+effect when v0.1.2 (or later) is the driver of an update. v0.1.3
+exists so a host running v0.1.2 can update to it and confirm the
+fix works end-to-end with the new code in the driver seat.
+
+## [v0.1.2] - 2026-04-29
+
+### Fixed
+
+- `banger update` now writes the freshly-installed binary's commit
+  and built_at fields to `/etc/banger/install.toml`, not the running
+  CLI's. Previously install.toml's `version` was correct after an
+  update but `commit` + `built_at` still pointed at the pre-update
+  binary's identity, which made `banger doctor` raise a false-positive
+  "CLI/install drift" warning on every update. Caught by the v0.1.0
+  → v0.1.1 live update smoke-test.
+
+## [v0.1.1] - 2026-04-29
+
+### Added
+
+- `install.sh` — one-command installer published at
+  `https://releases.thaloco.com/banger/install.sh`.
+  Runs as the
+  invoking user, downloads + verifies the latest signed release with
+  the embedded cosign public key, and re-execs `sudo` only for the
+  actual system-install step. A pre-sudo summary explains in plain
+  language why elevation is needed.
+- `BANGER_INSTALL_NONINTERACTIVE=1` env var on `install.sh` for
+  non-interactive use through `curl | bash` (CI, automated provisioning).
+
+## [v0.1.0] - 2026-04-29
+
+First public release. banger runs disposable development sandboxes as
+Firecracker microVMs: each sandbox boots in a few seconds, gets its own
+root filesystem and network, and exits on demand.
+
+### Added
+
+**Sandbox VMs**
+- `banger vm run` boots a microVM, drops you into ssh, and tears it down
+  on exit. An optional positional path ships a host repo into the guest;
+  `-- cmd args` runs a command non-interactively and exits with its
+  status.
+- Long-lived VMs via `vm create` / `vm start` / `vm stop` /
+  `vm restart` / `vm ssh` / `vm exec` / `vm logs` / `vm stats` /
+  `vm ports` / `vm kill`. `vm list` and `ps` enumerate state;
+  `vm prune` deletes every non-running VM.
+- `vm workspace` ships a host repo into a guest and pulls diffs back.
+- Per-VM cgroup-isolated firecracker process under a jailer chroot;
+  daemon restarts do not interrupt running guests.
+
+**Images**
+- `banger image pull <name>` pulls a curated rootfs+kernel bundle from
+  the banger image catalog. `image pull <ref>` pulls any OCI image.
+- `image list` / `image show` / `image delete` / `image promote` /
+  `image register` round out the lifecycle.
+- `image cache` manages the OCI layer-blob cache.
+- Concurrent pulls of the same image are coalesced; the first pull
+  wins, the rest wait.
+
+**Kernels**
+- `banger kernel pull <name>` pulls a Firecracker-compatible kernel
+  from the banger kernel catalog. `kernel list` / `kernel show` /
+  `kernel rm` manage the local store.
+
+**Host networking**
+- Per-host bridge with NAT; per-VM tap device; deterministic IPv4
+  assignment; iptables rules installed/removed with the VM lifecycle.
+- DNS routing: a local resolver on `127.0.0.1:42069` answers queries
+  for `.vm` so plain `ssh <name>.vm` reaches the guest.
+- `banger ssh-config` writes a one-time `~/.ssh/config` include so
+  ssh, scp, and rsync resolve `.vm` hostnames from any terminal.
+
+**System install**
+- `sudo banger system install` installs an owner-mode daemon
+  (`bangerd.service`) and a root helper (`bangerd-root.service`) as
+  systemd units. The owner daemon runs as the invoking user; only the
+  root helper holds privilege, and only for a vetted set of operations.
+- `system status` / `system restart` / `system uninstall` round out
+  the lifecycle. `daemon` is a thin alias.
+- `banger doctor` audits host readiness: architecture, CLI/install
+  version drift, state store, host runtime, vm lifecycle prerequisites,
+  vsock guest agent, vm defaults, ssh shortcut, /root work disk, DNS,
+  NAT, firecracker binary version, systemd units, socket permissions,
+  and helper unit hardening directives.
+
+**Self-update**
+- `banger update` downloads, verifies, and installs newer releases
+  from the public manifest. Flow: fetch the manifest, refuse if any VM
+  operation is in flight, download the tarball + `SHA256SUMS` +
+  `SHA256SUMS.sig`, verify the cosign signature against the embedded
+  public key, verify the tarball hash, stage to a scratch dir, run
+  `bangerd --check-migrations` against the staged binary, atomically
+  swap the three banger binaries, restart the systemd units, run
+  `banger doctor`, and finalise the install record.
+- Pre-restart abort and post-restart auto-rollback both restore the
+  previous install on failure.
+- `banger update --check` reports whether a newer release is
+  available without applying it; `--to vX.Y.Z` pins a specific
+  version; `--dry-run` prints the plan; `--force` skips the
+  in-flight-op refusal.
+
+**Trust model**
+- Every release is cosign-signed. The public key is embedded in the
+  banger binary at build time; the signed payload is `SHA256SUMS`,
+  which in turn covers the release tarball. Verification uses the
+  Go standard library (`crypto/ecdsa.VerifyASN1`); cosign is needed
+  only for *signing*, not for verification.
+- The release manifest URL is hardcoded into the binary so a
+  compromised daemon config cannot redirect the updater to a different
+  bucket.
+
+**CLI surface**
+- Top-level: `vm`, `ps`, `image`, `kernel`, `ssh-config`, `system`,
+  `daemon`, `doctor`, `update`, `version`, `completion`.
+- `banger version` reports the version, commit SHA, and build
+  timestamp baked in via ldflags at release-build time.
+
+### Compatibility
+
+- The host-side and guest-side vsock agent protocol is informally
+  stable across **patch** versions (v0.1.x). Minor-version bumps
+  (v0.2.x) may change it; existing VMs created against an older
+  minor will need to be re-pulled. `banger doctor` warns when a
+  running VM's agent is older than the daemon expects but does not
+  block lifecycle operations.
+- The on-disk store schema is forward-only. Downgrading the binary
+  against a database written by a newer binary is unsupported; the
+  updater detects this via `bangerd --check-migrations` and refuses
+  the swap rather than starting up against an incompatible store.
+- Linux only. amd64 only. KVM required.
+
+[Unreleased]: https://git.thaloco.com/thaloco/banger/compare/v0.1.10...HEAD
+[v0.1.10]: https://git.thaloco.com/thaloco/banger/releases/tag/v0.1.10
+[v0.1.9]: https://git.thaloco.com/thaloco/banger/releases/tag/v0.1.9
+[v0.1.8]: https://git.thaloco.com/thaloco/banger/releases/tag/v0.1.8
+[v0.1.7]: https://git.thaloco.com/thaloco/banger/releases/tag/v0.1.7
+[v0.1.6]: https://git.thaloco.com/thaloco/banger/releases/tag/v0.1.6
+[v0.1.5]: https://git.thaloco.com/thaloco/banger/releases/tag/v0.1.5
+[v0.1.4]: https://git.thaloco.com/thaloco/banger/releases/tag/v0.1.4
+[v0.1.3]: https://git.thaloco.com/thaloco/banger/releases/tag/v0.1.3
+[v0.1.2]: https://git.thaloco.com/thaloco/banger/releases/tag/v0.1.2
+[v0.1.1]: https://git.thaloco.com/thaloco/banger/releases/tag/v0.1.1
+[v0.1.0]: https://git.thaloco.com/thaloco/banger/releases/tag/v0.1.0
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..ec83255
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,62 @@
+# Contributing
+
+## Build from source
+
+```bash
+make build
+sudo ./build/bin/banger system install --owner "$USER"
+```
+
+`make build` produces three binaries under `./build/bin/`:
+
+- `banger` — the user-facing CLI
+- `bangerd` — the owner-user daemon (exposes `/run/banger/bangerd.sock`)
+- `banger-vsock-agent` — the in-guest companion
+
+`system install` copies them into `/usr/local`, writes install
+metadata under `/etc/banger`, lays down `bangerd.service` and
+`bangerd-root.service`, and starts both. After that, daily commands
+like `banger vm run` are unprivileged.
+
+To inspect or refresh the services:
+
+```bash
+banger system status
+sudo banger system restart
+```
+
+The two-service split (owner daemon + privileged root helper) is
+explained in [`docs/privileges.md`](docs/privileges.md), including
+the exact capability set the root helper holds.
+
+## Tests
+
+```bash
+make test      # go test ./...
+make coverage  # per-package + total statement coverage
+make lint      # gofmt + go vet + shellcheck
+```
+
+The smoke suite (`make smoke`) builds coverage-instrumented binaries,
+installs them as a temporary systemd service, and runs end-to-end
+scenarios against real Firecracker. It requires a KVM-capable host and
+`sudo`. The suite lives under `internal/smoketest/` (build-tagged
+`smoke`); `make smoke-list` prints scenario names; `make smoke-one
+SCENARIO=<name>` runs just one (comma-separated for several). See
+the smoke comments in the `Makefile` for details.
+
+## Pre-commit hook
+
+```bash
+make install-hooks
+```
+
+Points `core.hooksPath` at `.githooks/`, which runs lint + test +
+build on every commit. Bypass with `git commit --no-verify`; revert
+with `git config --unset core.hooksPath`.
+
+## Internals
+
+- [`docs/privileges.md`](docs/privileges.md) — daemon split, capability set, trust model.
+- [`docs/release-process.md`](docs/release-process.md) — cutting and signing a release.
+- [`AGENTS.md`](AGENTS.md) — repo-wide notes for code agents.
diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000..63a2f3f
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 Thales Maciel
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/Makefile b/Makefile
index 4dd0db6..640f615 100644
--- a/Makefile
+++ b/Makefile
@@ -15,66 +15,237 @@ BANGERD_BIN ?= $(BUILD_BIN_DIR)/bangerd
 VSOCK_AGENT_BIN ?= $(BUILD_BIN_DIR)/banger-vsock-agent
 BINARIES := $(BANGER_BIN) $(BANGERD_BIN) $(VSOCK_AGENT_BIN)
 GO_SOURCES := $(shell find cmd internal -type f -name '*.go' | sort)
-VOID_IMAGE_NAME ?= void-exp
-VOID_VM_NAME ?= void-dev
-ALPINE_RELEASE ?= 3.23.3
-ALPINE_IMAGE_NAME ?= alpine
-ALPINE_VM_NAME ?= alpine-dev
+# BUILD_INPUTS is everything that can change a binary's bytes: Go sources
+# plus embedded assets (catalog.json, future static files). Listing
+# everything is cheaper than missing a rebuild — go's own cache absorbs
+# any redundant invocations.
+BUILD_INPUTS := $(shell find cmd internal -type f | sort)
+SHELL_SOURCES := $(shell find scripts -type f -name '*.sh' | sort)
+SMOKE_DIR := $(BUILD_DIR)/smoke
+SMOKE_BIN_DIR := $(SMOKE_DIR)/bin
+SMOKE_COVER_DIR := $(SMOKE_DIR)/covdata
+SMOKE_XDG_DIR := $(SMOKE_DIR)/xdg
+VERSION ?= $(shell git describe --tags --exact-match 2>/dev/null || echo dev)
+COMMIT ?= $(shell git rev-parse --verify HEAD 2>/dev/null || echo unknown)
+BUILT_AT ?= $(shell date -u +%Y-%m-%dT%H:%M:%SZ)
+GO_LDFLAGS := -X banger/internal/buildinfo.Version=$(VERSION) -X banger/internal/buildinfo.Commit=$(COMMIT) -X banger/internal/buildinfo.BuiltAt=$(BUILT_AT)
 
 .DEFAULT_GOAL := help
 
-.PHONY: help build banger bangerd test fmt tidy clean rootfs rootfs-void void-kernel void-register void-vm verify-void alpine-kernel rootfs-alpine alpine-register alpine-vm verify-alpine install bench-create
+# `make smoke-one` requires SCENARIO=<name>. Validate before any prerequisite
+# (notably smoke-build) so a typo'd invocation doesn't pay for a Go
+# rebuild before learning it's wrong.
+ifneq (,$(filter smoke-one,$(MAKECMDGOALS)))
+ifndef SCENARIO
+$(error smoke-one needs SCENARIO=name (see `make smoke-list` for names))
+endif
+endif
+
+.PHONY: help build banger bangerd test fmt tidy clean install uninstall lint lint-go lint-shell coverage coverage-html coverage-total coverage-combined coverage-combined-html smoke smoke-build smoke-list smoke-one smoke-coverage-html smoke-clean smoke-fresh install-hooks
 
 help:
	@printf '%s\n' \
		'Targets:' \
-		'  make build             Build ./build/bin/banger, ./build/bin/bangerd, and ./build/bin/banger-vsock-agent' \
-		'  make bench-create      Benchmark vm create and SSH readiness with scripts/bench-create.sh' \
-		'  make install           Build and install banger, bangerd, and the companion vsock helper' \
-		'  make test              Run go test ./...' \
-		'  make fmt               Format Go sources under cmd/ and internal/' \
-		'  make tidy              Run go mod tidy' \
-		'  make clean             Remove built Go binaries' \
-		'  make rootfs            Rebuild the manual Debian rootfs image in ./build/manual' \
-		'  make void-kernel       Download and stage a Void kernel, initramfs, and modules under ./build/manual/void-kernel' \
-		'  make rootfs-void       Build an experimental Void Linux rootfs and work-seed in ./build/manual' \
-		'  make void-register     Register or update the experimental Void image as $(VOID_IMAGE_NAME)' \
-		'  make void-vm           Register the experimental Void image and create a VM named $(VOID_VM_NAME)' \
-		'  make verify-void       Register the experimental Void image and run scripts/verify.sh against it' \
-		'  make alpine-kernel     Download and stage an Alpine virt kernel, initramfs, and modules under ./build/manual/alpine-kernel' \
-		'  make rootfs-alpine     Build an experimental Alpine Linux rootfs and work-seed in ./build/manual' \
-		'  make alpine-register   Register or update the experimental Alpine image as $(ALPINE_IMAGE_NAME)' \
-		'  make alpine-vm         Register the experimental Alpine image and create a VM named $(ALPINE_VM_NAME)' \
-		'  make verify-alpine     Register the experimental Alpine image and run scripts/verify.sh against it'
+		'  make build                    Build ./build/bin/banger, ./build/bin/bangerd, and ./build/bin/banger-vsock-agent' \
+		'  make install                  Build and install banger, bangerd, and the companion vsock helper' \
+		'  make uninstall                Stop the daemon and remove installed binaries (leaves user state by default)' \
+		'  make test                     Run go test ./...' \
+		'  make coverage                 Run tests with coverage; print per-package + total' \
+		'  make coverage-html            Open a browsable per-line HTML report (writes coverage.html)' \
+		'  make coverage-total           Print just the total statement coverage (for scripts/CI)' \
+		'  make coverage-combined        Merge unit-test + smoke covdata; print per-package + total' \
+		'  make coverage-combined-html   HTML report of the merged unit+smoke coverage' \
+		'  make lint                     Run gofmt + go vet + shellcheck (errors)' \
+		'  make fmt                      Format Go sources under cmd/ and internal/' \
+		'  make tidy                     Run go mod tidy' \
+		'  make clean                    Remove built Go binaries and coverage artefacts' \
+		'  make smoke                    Build instrumented binaries, run the supported systemd smoke suite, report coverage (needs KVM + sudo)' \
+		'  make smoke JOBS=N             Override parallelism (default: nproc, capped at 8). JOBS=1 forces serial.' \
+		'  make smoke-list               Print the list of smoke scenarios (no build, no install)' \
+		'  make smoke-one SCENARIO=NAME  Run a single smoke scenario (still does the install preamble; comma-separated for several)' \
+		'  make smoke-fresh              smoke-clean + smoke — purges stale smoke-owned installs before a clean supported-path run' \
+		'  make smoke-coverage-html      HTML coverage report from the last smoke run' \
+		'  make smoke-clean              Remove the smoke build tree and purge any stale smoke-owned system install' \
+		'  make install-hooks            Point core.hooksPath at .githooks (lint + test + build run on every commit)'
 
 build: $(BINARIES)
 
-$(BANGER_BIN): $(GO_SOURCES) go.mod go.sum
+$(BANGER_BIN): $(BUILD_INPUTS) go.mod go.sum
	mkdir -p "$(BUILD_BIN_DIR)"
-	$(GO) build -o "$(BANGER_BIN)" ./cmd/banger
+	$(GO) build -ldflags '$(GO_LDFLAGS)' -o "$(BANGER_BIN)" ./cmd/banger
 
-$(BANGERD_BIN): $(GO_SOURCES) go.mod go.sum
+$(BANGERD_BIN): $(BUILD_INPUTS) go.mod go.sum
	mkdir -p "$(BUILD_BIN_DIR)"
-	$(GO) build -o "$(BANGERD_BIN)" ./cmd/bangerd
+	$(GO) build -ldflags '$(GO_LDFLAGS)' -o "$(BANGERD_BIN)" ./cmd/bangerd
 
-$(VSOCK_AGENT_BIN): $(GO_SOURCES) go.mod go.sum
+$(VSOCK_AGENT_BIN): $(BUILD_INPUTS) go.mod go.sum
	mkdir -p "$(BUILD_BIN_DIR)"
-	CGO_ENABLED=0 GOOS=linux GOARCH=amd64 $(GO) build -o "$(VSOCK_AGENT_BIN)" ./cmd/banger-vsock-agent
+	CGO_ENABLED=0 GOOS=linux GOARCH=amd64 $(GO) build -ldflags '$(GO_LDFLAGS)' -o "$(VSOCK_AGENT_BIN)" ./cmd/banger-vsock-agent
 
 test:
	$(GO) test ./...
 
+# Coverage targets use -coverpkg=./... so packages without their own
+# tests still get counted when another package exercises them (common
+# for daemon/* subpackages). coverage.out is gitignored.
+coverage:
+	$(GO) test -coverpkg=./... -coverprofile=coverage.out ./...
+	@echo ''
+	@echo 'Per-package:'
+	@$(GO) tool cover -func=coverage.out | awk -F'\t+' '/^total:/ {total=$$NF; next} {pkg=$$1; sub("banger/", "", pkg); sub("/[^/]+:[0-9]+:$$", "", pkg); pkgs[pkg]+=1; covered[pkg]+=$$NF+0} END {for (p in pkgs) printf "  %-40s %.1f%% (avg of %d funcs)\n", p, covered[p]/pkgs[p], pkgs[p] | "sort"; print ""; print "Total statement coverage:", total}'
+
+coverage-html: coverage
+	$(GO) tool cover -html=coverage.out -o coverage.html
+	@echo 'wrote coverage.html'
+
+coverage-total:
+	@$(GO) test -coverpkg=./... -coverprofile=coverage.out ./... >/dev/null 2>&1 && $(GO) tool cover -func=coverage.out | awk '/^total:/ {print $$NF}'
+
+# coverage-combined unions unit-test coverage and smoke coverage into
+# one report. Unit tests cover pure-Go logic (error branches, parsing,
+# handler wiring); smoke covers the real sudo / firecracker / dm-snap
+# paths that unit tests physically can't reach. Separately each tells
+# half the story. Merged, this is the single "what's not being
+# exercised at all" view.
+#
+# Requires an up-to-date smoke run (the target depends on smoke-build
+# to rebuild instrumented binaries; re-run `make smoke` yourself if
+# scenarios changed). Modes must match; smoke uses the default 'set',
+# so the unit run below drops the default 'atomic' for alignment.
+COMBINED_COVER_DIR := $(BUILD_DIR)/combined +UNIT_COVER_DIR := $(BUILD_DIR)/unit/covdata +coverage-combined: + @test -d "$(SMOKE_COVER_DIR)" && test "$$(ls -A $(SMOKE_COVER_DIR) 2>/dev/null)" || { \ + echo 'no smoke covdata at $(SMOKE_COVER_DIR); run `make smoke` first' >&2; exit 1; \ + } + rm -rf "$(UNIT_COVER_DIR)" "$(COMBINED_COVER_DIR)" + mkdir -p "$(UNIT_COVER_DIR)" "$(COMBINED_COVER_DIR)" + $(GO) test -cover -coverpkg=./... ./... -args -test.gocoverdir="$(abspath $(UNIT_COVER_DIR))" >/dev/null + $(GO) tool covdata merge -i="$(UNIT_COVER_DIR),$(SMOKE_COVER_DIR)" -o="$(COMBINED_COVER_DIR)" + $(GO) tool covdata textfmt -i="$(COMBINED_COVER_DIR)" -o="$(BUILD_DIR)/combined.cover.out" + @echo '' + @echo 'Per-package (merged unit + smoke):' + @$(GO) tool cover -func="$(BUILD_DIR)/combined.cover.out" | awk -F'\t+' '/^total:/ {total=$$NF; next} {pkg=$$1; sub("banger/", "", pkg); sub("/[^/]+:[0-9]+:$$", "", pkg); pkgs[pkg]+=1; covered[pkg]+=$$NF+0} END {for (p in pkgs) printf " %-40s %.1f%% (avg of %d funcs)\n", p, covered[p]/pkgs[p], pkgs[p] | "sort"; print ""; print "Total statement coverage:", total}' + +coverage-combined-html: coverage-combined + $(GO) tool cover -html="$(BUILD_DIR)/combined.cover.out" -o "$(BUILD_DIR)/combined.cover.html" + @echo 'wrote $(BUILD_DIR)/combined.cover.html' + +lint: lint-go lint-shell + +lint-go: + @unformatted="$$($(GOFMT) -l $(GO_SOURCES))"; \ + if [ -n "$$unformatted" ]; then \ + printf 'gofmt: the following files are not formatted:\n%s\n' "$$unformatted" >&2; \ + exit 1; \ + fi + $(GO) vet ./... + +lint-shell: + @command -v shellcheck >/dev/null 2>&1 || { echo 'shellcheck is required for make lint-shell' >&2; exit 1; } + shellcheck --severity=error $(SHELL_SOURCES) + fmt: $(GOFMT) -w $(GO_SOURCES) tidy: $(GO) mod tidy -clean: - rm -rf "$(BUILD_BIN_DIR)" +# Local-only: redirect git's hook lookup at .githooks/ so .githooks/pre-commit +# fires on every `git commit`. Idempotent. 
Bypass an individual commit with +# `git commit --no-verify`. +install-hooks: + git config core.hooksPath .githooks + @echo 'core.hooksPath -> .githooks (run `git config --unset core.hooksPath` to revert)' -bench-create: build - BANGER_BIN="$(abspath $(BANGER_BIN))" bash ./scripts/bench-create.sh $(ARGS) +clean: + rm -rf "$(BUILD_BIN_DIR)" coverage.out coverage.html + +# Smoke test suite. Builds the three banger binaries with -cover +# instrumentation under $(SMOKE_BIN_DIR), installs them as temporary +# bangerd.service + bangerd-root.service, runs the Go scenarios under +# internal/smoketest (built with -tags=smoke), copies service covdata +# out of /var/lib/banger, then purges the smoke-owned install on exit. +# +# This touches global systemd state. The harness refuses to overwrite a +# pre-existing non-smoke install and drops a marker file under +# /etc/banger so `make smoke-clean` can recover a stale smoke-owned +# install after an interrupted run. +# +# Requires a KVM-capable Linux host with sudo. This is a pre-release +# gate, not CI — the Go unit suite (`make test`) is what runs everywhere. +smoke-build: $(SMOKE_BIN_DIR)/.built + +$(SMOKE_BIN_DIR)/.built: $(BUILD_INPUTS) go.mod go.sum + mkdir -p "$(SMOKE_BIN_DIR)" + $(GO) build -cover -ldflags '$(GO_LDFLAGS)' -o "$(SMOKE_BIN_DIR)/banger" ./cmd/banger + $(GO) build -cover -ldflags '$(GO_LDFLAGS)' -o "$(SMOKE_BIN_DIR)/bangerd" ./cmd/bangerd + CGO_ENABLED=0 GOOS=linux GOARCH=amd64 $(GO) build -ldflags '$(GO_LDFLAGS)' -o "$(SMOKE_BIN_DIR)/banger-vsock-agent" ./cmd/banger-vsock-agent + touch "$@" + +# JOBS defaults to nproc; SMOKE_JOBS clamps it at 8. Each parallel slot +# runs a smoke-tuned VM, and over-subscribing the host pushes +# waitForSSH past its 60s deadline. Floored at 1 so JOBS=1 still works. 
+JOBS ?= $(shell nproc 2>/dev/null || echo 1) +SMOKE_JOBS := $(shell n=$(JOBS); [ $$n -lt 1 ] && n=1; [ $$n -gt 8 ] && n=8; echo $$n) + +smoke: smoke-build + rm -rf "$(SMOKE_COVER_DIR)" + mkdir -p "$(SMOKE_COVER_DIR)" "$(SMOKE_XDG_DIR)" + BANGER_SMOKE_BIN_DIR="$(abspath $(SMOKE_BIN_DIR))" \ + BANGER_SMOKE_COVER_DIR="$(abspath $(SMOKE_COVER_DIR))" \ + BANGER_SMOKE_XDG_DIR="$(abspath $(SMOKE_XDG_DIR))" \ + $(GO) test -tags=smoke -count=1 -v -parallel $(SMOKE_JOBS) -timeout 30m ./internal/smoketest + @echo '' + @echo 'Smoke coverage:' + @$(GO) tool covdata percent -i="$(SMOKE_COVER_DIR)" + +# smoke-list parses the test scaffold for scenario names. Cheap: no +# smoke-build dep, no env vars, no test binary spawned. +smoke-list: + @grep -oE 't\.Run\("[a-z_]+", *test[A-Za-z]+\)' internal/smoketest/smoke_test.go \ + | sed -E 's/t\.Run\("([a-z_]+)".*/ \1/' + +# smoke-one runs one scenario (or a comma-separated list) with the +# install preamble. Comma list becomes a regex alternation so multiple +# scenarios can be selected without invoking go test by hand. 
+SCENARIO_PATTERN := $(shell echo '$(SCENARIO)' | tr ',' '|') + +smoke-one: smoke-build + rm -rf "$(SMOKE_COVER_DIR)" + mkdir -p "$(SMOKE_COVER_DIR)" "$(SMOKE_XDG_DIR)" + BANGER_SMOKE_BIN_DIR="$(abspath $(SMOKE_BIN_DIR))" \ + BANGER_SMOKE_COVER_DIR="$(abspath $(SMOKE_COVER_DIR))" \ + BANGER_SMOKE_XDG_DIR="$(abspath $(SMOKE_XDG_DIR))" \ + $(GO) test -tags=smoke -count=1 -v -timeout 30m \ + -run "TestSmoke/.*/($(SCENARIO_PATTERN))$$" \ + ./internal/smoketest + +smoke-coverage-html: smoke + $(GO) tool covdata textfmt -i="$(SMOKE_COVER_DIR)" -o="$(SMOKE_DIR)/cover.out" + $(GO) tool cover -html="$(SMOKE_DIR)/cover.out" -o "$(SMOKE_DIR)/cover.html" + @echo 'wrote $(SMOKE_DIR)/cover.html' + +smoke-clean: + @if sudo test -f /etc/banger/.smoke-owned; then \ + bin=''; \ + if [ -x "$(SMOKE_BIN_DIR)/banger" ]; then \ + bin="$(abspath $(SMOKE_BIN_DIR))/banger"; \ + elif [ -x "$(BANGER_BIN)" ]; then \ + bin="$(abspath $(BANGER_BIN))"; \ + elif [ -x /usr/local/bin/banger ]; then \ + bin=/usr/local/bin/banger; \ + fi; \ + if [ -n "$$bin" ]; then \ + sudo "$$bin" system uninstall --purge >/dev/null 2>&1 || true; \ + fi; \ + fi + rm -rf "$(SMOKE_DIR)" + +# smoke-fresh wipes the instrumented build tree, purges any stale +# smoke-owned install, and then runs the supported-path smoke suite +# from scratch. 
+smoke-fresh: smoke-clean smoke install: build mkdir -p "$(DESTDIR)$(BINDIR)" @@ -83,35 +254,18 @@ install: build $(INSTALL) -m 0755 "$(BANGERD_BIN)" "$(DESTDIR)$(BINDIR)/bangerd" $(INSTALL) -m 0755 "$(VSOCK_AGENT_BIN)" "$(DESTDIR)$(LIBDIR)/banger/banger-vsock-agent" -rootfs: - BANGER_MANUAL_DIR="$(abspath $(BUILD_MANUAL_DIR))" BANGER_BIN="$(abspath $(BANGER_BIN))" ./scripts/make-rootfs.sh $(ARGS) - -void-kernel: - BANGER_MANUAL_DIR="$(abspath $(BUILD_MANUAL_DIR))" ./scripts/make-void-kernel.sh $(ARGS) - -rootfs-void: - BANGER_MANUAL_DIR="$(abspath $(BUILD_MANUAL_DIR))" BANGER_BIN="$(abspath $(BANGER_BIN))" ./scripts/make-rootfs-void.sh $(ARGS) - -void-register: build - BANGER_MANUAL_DIR="$(abspath $(BUILD_MANUAL_DIR))" VOID_IMAGE_NAME="$(VOID_IMAGE_NAME)" BANGER_BIN="$(abspath $(BANGER_BIN))" ./scripts/register-void-image.sh - -void-vm: void-register - "$(abspath $(BANGER_BIN))" vm create --image "$(VOID_IMAGE_NAME)" --name "$(VOID_VM_NAME)" - -verify-void: void-register - BANGER_BIN="$(abspath $(BANGER_BIN))" ./scripts/verify.sh --image "$(VOID_IMAGE_NAME)" - -alpine-kernel: - BANGER_MANUAL_DIR="$(abspath $(BUILD_MANUAL_DIR))" ALPINE_RELEASE="$(ALPINE_RELEASE)" ./scripts/make-alpine-kernel.sh $(ARGS) - -rootfs-alpine: - BANGER_MANUAL_DIR="$(abspath $(BUILD_MANUAL_DIR))" ALPINE_RELEASE="$(ALPINE_RELEASE)" BANGER_BIN="$(abspath $(BANGER_BIN))" ./scripts/make-rootfs-alpine.sh $(ARGS) - -alpine-register: build - BANGER_MANUAL_DIR="$(abspath $(BUILD_MANUAL_DIR))" ALPINE_IMAGE_NAME="$(ALPINE_IMAGE_NAME)" BANGER_BIN="$(abspath $(BANGER_BIN))" ./scripts/register-alpine-image.sh - -alpine-vm: alpine-register - "$(abspath $(BANGER_BIN))" vm create --image "$(ALPINE_IMAGE_NAME)" --name "$(ALPINE_VM_NAME)" - -verify-alpine: alpine-register - BANGER_BIN="$(abspath $(BANGER_BIN))" ./scripts/verify.sh --image "$(ALPINE_IMAGE_NAME)" +# uninstall stops a running daemon (if any) and removes the installed +# binaries. 
It does NOT touch user data (config, SSH keys, VM state,
+# image/kernel caches) — rm -rf those paths manually if desired; they
+# are printed for convenience.
+uninstall:
+	@if [ -x "$(DESTDIR)$(BINDIR)/banger" ]; then \
+		"$(DESTDIR)$(BINDIR)/banger" daemon stop >/dev/null 2>&1 || true; \
+	fi
+	rm -f "$(DESTDIR)$(BINDIR)/banger" "$(DESTDIR)$(BINDIR)/bangerd"
+	rm -rf "$(DESTDIR)$(LIBDIR)/banger"
+	@printf '\nRemoved binaries. User data is preserved at:\n'
+	@printf '  ~/.config/banger/       (config, ssh keys)\n'
+	@printf '  ~/.local/state/banger/  (VMs, images, kernels, db, logs)\n'
+	@printf '  ~/.cache/banger/        (OCI layer cache)\n'
+	@printf '\nDelete those paths manually if you want a full purge.\n'
diff --git a/README.md b/README.md
index 9868e27..ab2a8e6 100644
--- a/README.md
+++ b/README.md
@@ -1,247 +1,172 @@
 # banger
 
-`banger` manages Firecracker development VMs with a local daemon, managed image artifacts, and a localhost web UI.
+One-command development sandboxes on Firecracker microVMs.
 
-## Requirements
+![banger demo](assets/banger.gif)
 
-- Linux with `/dev/kvm`
-- `sudo`
-- Firecracker installed on `PATH`, or `firecracker_bin` set in config
-- The usual host tools checked by `./build/bin/banger doctor`
+Spin up a clean Linux VM with your repo and tooling preloaded, drop
+into ssh, and tear it down — all from one command. banger is built
+for the dev loop, not the server use case: guests are short-lived,
+single-user, reachable at `.vm` from your host, and disposable.
 
-`banger` now owns complete managed image sets. A managed image includes:
+## Quick start
 
-- `rootfs`
-- optional `work-seed`
-- `kernel`
-- optional `initrd`
-- optional `modules`
+**Requirements**:
+- Linux x86_64 with KVM
+- systemd
+- [Firecracker >= v1.5](https://github.com/firecracker-microvm/firecracker)
 
-There is no runtime bundle anymore.
-
-## Build
+Install:
 
 ```bash
-make build
+curl -fsSL https://releases.thaloco.com/banger/install.sh | bash
 ```
 
-This writes:
+The installer downloads the signed release, then prompts for sudo to install.
+[Read more about how banger uses sudo](#security)
 
-- `./build/bin/banger`
-- `./build/bin/bangerd`
-- `./build/bin/banger-vsock-agent`
+Verify host configuration:
+```bash
+banger doctor
+```
 
-## Install
+First VM:
+>The first run may take a couple of minutes for the bundle download.
+>Subsequent `vm run`s typically take 1 to 3 seconds.
 
 ```bash
-make install
+banger vm run --name my-vm
 ```
 
-That installs:
+This auto-pulls the default image and drops you into an interactive ssh session.
+Disconnecting an interactive session leaves the VM running;
+`--rm` auto-deletes the VM when the session or command exits.
 
-- `banger`
-- `bangerd`
-- the `banger-vsock-agent` companion helper under `../lib/banger/`
+## `vm run`
+
+```bash
+banger vm run ./my-repo              # copy ./my-repo into /root/repo — drops into ssh
+banger vm run ./repo -- make test    # workspace + run command, exits with its status
+banger vm run --rm -- script.sh      # ephemeral: VM is deleted on exit
+banger vm run -d ./repo --nat        # detached: prep + bootstrap, exit (no ssh attach)
+```
+
+If a repository is passed, banger copies your repo's git-tracked files
+into `/root/repo` and runs a `mise` bootstrap from `.mise.toml` /
+`.tool-versions` if either is present. The bootstrap reaches the
+public internet, so workspaces with mise manifests require `--nat`;
+pass `--no-bootstrap` to skip the install entirely. Untracked files
+are skipped by default — pass `--include-untracked` to ship them
+too, or `--dry-run` to preview the file list.
+
+In **command mode** (`-- <cmd>`), the exit code propagates through
+`banger`. In **detached mode** (`-d`), banger creates the VM, runs
+workspace prep + bootstrap synchronously, then exits — no ssh
+attach. Reconnect later with `banger vm ssh <name>`.
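The tracked-versus-untracked split described above can be previewed with plain `git`. This is a sketch for intuition, not banger's implementation (use `--dry-run` for the authoritative list); the throwaway repo below keeps it self-contained, and whether banger applies exactly git's standard ignore rules is an assumption here:

```shell
#!/usr/bin/env bash
# Build a throwaway repo, then list the two file sets `vm run` cares
# about: tracked files (shipped by default) and untracked non-ignored
# files (shipped only with --include-untracked).
set -euo pipefail

repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.email demo@example.invalid
git -C "$repo" config user.name demo

echo hello > "$repo/tracked.txt"
git -C "$repo" add tracked.txt
git -C "$repo" commit -qm init
echo scratch > "$repo/untracked.txt"

echo "default copy set:"
git -C "$repo" ls-files
echo "added by --include-untracked:"
git -C "$repo" ls-files --others --exclude-standard
```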
+
+### Other VM verbs
+
+The CLI tries to feel familiar — every command and subcommand has
+`--help`. Beyond `vm run`: `vm list` shows running VMs (`--all` for
+every state), `vm ssh <name>` reconnects to one, `vm exec <name> --
+<cmd>` runs a command without a shell, `vm stop` / `vm kill` shut a
+VM down (graceful / hard), `vm delete` removes a stopped one, and
+`vm prune` sweeps every non-running VM.
+
+### `--nat`: outbound internet
+
+By default, a guest can't reach the internet.
+Pass `--nat` to enable it (host-side MASQUERADE):
+
+```bash
+banger vm run --nat ./repo -- npm install
+```
+
+`--nat` works on `vm run` and `vm create`. To toggle on an existing
+VM: `banger vm set --nat <name>` (or `--no-nat` to remove it).
+
+## Hostnames: `.vm`
+
+banger's daemon runs a DNS server for the `.vm` zone. With host-side
+DNS routing, `curl http://sandbox.vm:3000` works from anywhere on
+the host — no IP juggling. On systemd-resolved hosts, banger wires
+this up automatically; everywhere else there's a manual recipe in
+[`docs/dns-routing.md`](docs/dns-routing.md).
+
+For `ssh sandbox.vm` (instead of `banger vm ssh sandbox`):
+
+```bash
+banger ssh-config --install
+```
+
+That adds a marker-fenced `Include` line to `~/.ssh/config`.
+`banger ssh-config --uninstall` reverses it.
 
 ## Config
 
-Config lives at `~/.config/banger/config.toml`.
-
-Supported keys:
-
-- `log_level`
-- `web_listen_addr`
-- `firecracker_bin`
-- `ssh_key_path`
-- `default_image_name`
-- `auto_stop_stale_after`
-- `stats_poll_interval`
-- `metrics_poll_interval`
-- `bridge_name`
-- `bridge_ip`
-- `cidr`
-- `tap_pool_size`
-- `default_dns`
-
-If `ssh_key_path` is unset, banger creates and uses:
-
-- `~/.config/banger/ssh/id_ed25519`
-
-`default_image_name` now only means “use this registered image when `vm create` omits `--image`”. The daemon does not auto-register images from host paths.
- -## Core Workflow - -Check the host: - -```bash -./build/bin/banger doctor -``` - -Register an existing host-side image stack: - -```bash -./build/bin/banger image register \ - --name base \ - --rootfs /abs/path/rootfs.ext4 \ - --kernel /abs/path/vmlinux \ - --initrd /abs/path/initrd.img \ - --modules /abs/path/modules -``` - -Build a managed image from an existing registered image: - -```bash -./build/bin/banger image build \ - --name devbox \ - --from-image base \ - --docker -``` - -Promote an unmanaged image into daemon-owned managed artifacts: - -```bash -./build/bin/banger image promote base -``` - -Create and use a VM: - -```bash -./build/bin/banger vm create --image devbox --name testbox -./build/bin/banger vm ssh testbox -./build/bin/banger vm stop testbox -``` - -`vm create` stays synchronous by default, but on a TTY it now shows live progress until the VM is fully ready. - -Start a repo-backed VM session and attach `opencode` automatically: - -```bash -./build/bin/banger vm run -./build/bin/banger vm run ../some-repo --branch feature/alpine --from HEAD -``` - -`vm run` resolves the enclosing git repository, creates a VM, copies a git checkout plus current tracked and untracked non-ignored files into `/root/`, and then runs `opencode attach` from the host against the guest. - -## Web UI - -`bangerd` serves a local web UI by default at: - -- `http://127.0.0.1:7777` - -See the effective URL with: - -```bash -./build/bin/banger daemon status -``` - -Disable it with: +`~/.config/banger/config.toml`. All keys are optional: ```toml -web_listen_addr = "" +[vm_defaults] +vcpu = 4 +memory_mib = 4096 +disk_size = "16G" + +[[file_sync]] +host = "~/.config/git/config" +guest = "~/.config/git/config" + +[[file_sync]] +host = "~/.aws" +guest = "~/.aws" ``` -## Guest Services +`vm_defaults` overrides banger's host-derived sizing. `file_sync` +copies host files into the VM's work disk at create time — handy +for credentials and dotfiles you want in every sandbox. 
Full +reference: [`docs/config.md`](docs/config.md). -Provisioned images include: - -- `banger-vsock-agent` -- guest networking bootstrap -- `mise` -- `opencode` -- a default guest `opencode` service on `0.0.0.0:4096` - -If host `~/.local/share/opencode/auth.json` exists, `banger` syncs it into the guest at `/root/.local/share/opencode/auth.json` on VM start. Changes on the host take effect after the VM is restarted. - -From the host: +## Updating ```bash -./build/bin/banger vm ports testbox -opencode attach http://:4096 +banger update --check # is a newer release available? +sudo banger update # download, verify, swap, restart, run doctor ``` -## Manual Helpers +The release tarball is cosign-verified against a public key embedded +in the running binary. On any post-swap failure, banger auto-restores +the previous install. See [`docs/privileges.md`](docs/privileges.md) +for the trust model. -The shell helpers are now explicit manual workflows under `./build/manual`. - -Rebuild a Debian-style manual rootfs: +## Uninstalling ```bash -make rootfs ARGS='--base-rootfs /abs/path/rootfs.ext4 --kernel /abs/path/vmlinux --initrd /abs/path/initrd.img --modules /abs/path/modules' +sudo banger system uninstall # remove services + binaries; keep state +sudo banger system uninstall --purge # also wipe VMs, images, caches under /var/lib/banger ``` -The output lands in: +User config (`~/.config/banger/`) and SSH key +(`~/.local/state/banger/ssh/`) stay put either way — delete them by +hand if you want a full clean slate. -- `./build/manual/rootfs-docker.ext4` -- `./build/manual/rootfs-docker.work-seed.ext4` +## Security -## Experimental Void Flow +Guest VMs are single-user dev sandboxes, not multi-tenant servers. +sshd accepts only the host SSH key (no passwords, no +kbd-interactive), and guests are reachable only through the host +bridge (`172.16.0.0/24`). Don't expose the bridge or guest IPs to +an untrusted network. 
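The bridge scoping above is easy to sanity-check in scripts. A minimal sketch, assuming the `172.16.0.0/24` subnet quoted above (the helper name is made up; nothing here talks to a live banger install):

```shell
#!/usr/bin/env bash
# Classify an IPv4 address against 172.16.0.0/24, the host-only bridge
# subnet guests are reachable on. Hypothetical helper for scripting.
in_bridge_subnet() {
    case "$1" in
        172.16.0.*) return 0 ;;
        *)          return 1 ;;
    esac
}

in_bridge_subnet 172.16.0.7 && echo "guest-range address"
in_bridge_subnet 8.8.8.8 || echo "outside the bridge"
```

A `case` glob suffices here because a /24 on a `.0` network is exactly the `172.16.0.*` dotted-quad range.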
-Stage a Void kernel: +The privileged surface lives entirely in `bangerd-root.service` and +is documented in [`docs/privileges.md`](docs/privileges.md). -```bash -make void-kernel -``` +## Further reading -Build the experimental Void rootfs: - -```bash -make rootfs-void -``` - -Register it: - -```bash -make void-register -``` - -That flow uses: - -- `./build/manual/void-kernel/` -- `./build/manual/rootfs-void.ext4` -- `./build/manual/rootfs-void.work-seed.ext4` - -## Experimental Alpine Flow - -Stage an Alpine virt kernel: - -```bash -make alpine-kernel -``` - -Build the experimental Alpine rootfs: - -```bash -make rootfs-alpine -``` - -Register it: - -```bash -make alpine-register -``` - -Create a VM from it: - -```bash -./build/bin/banger vm create --image alpine --name alpine-dev -``` - -That flow uses: - -- `./build/manual/alpine-kernel/` -- `./build/manual/rootfs-alpine.ext4` -- `./build/manual/rootfs-alpine.work-seed.ext4` - -The experimental Alpine flow stages a pinned Alpine release by default. Override -that pin with `ALPINE_RELEASE=...` when running the `make alpine-kernel` and -`make rootfs-alpine` helpers if you need a different patch release. - -Alpine support currently applies to the explicit register-and-run flow above. -The generic `banger image build --from-image ...` path remains Debian/systemd- -oriented and should not be treated as an Alpine image builder. - -## Notes - -- Firecracker is resolved from `PATH` by default. -- Managed image delete removes the daemon-owned artifact dir. -- The companion vsock helper is internal to the install/build layout, not a user-configured runtime path. +- [`docs/config.md`](docs/config.md) — full config reference. +- [`docs/dns-routing.md`](docs/dns-routing.md) — `.vm` host-side resolution. +- [`docs/image-catalog.md`](docs/image-catalog.md) — image bundles and how to publish. +- [`docs/kernel-catalog.md`](docs/kernel-catalog.md) — kernel bundles. 
+- [`docs/oci-import.md`](docs/oci-import.md) — pulling arbitrary OCI images. +- [`docs/advanced.md`](docs/advanced.md) — `vm create`, scripting, custom rootfs. +- [`docs/privileges.md`](docs/privileges.md) — trust model, capability set, daemon split. +- [`CONTRIBUTING.md`](CONTRIBUTING.md) — building from source, running tests. diff --git a/assets/banger.gif b/assets/banger.gif new file mode 100644 index 0000000..2f88c5a Binary files /dev/null and b/assets/banger.gif differ diff --git a/assets/demo.tape b/assets/demo.tape new file mode 100644 index 0000000..d68741a --- /dev/null +++ b/assets/demo.tape @@ -0,0 +1,112 @@ +# banger hero demo — VHS tape +# Render with: vhs assets/demo.tape + +Output assets/banger.gif + +Require banger +Require ssh +Require curl + +Set Shell "bash" +Set FontSize 14 +Set LineHeight 1.4 +Set Width 1200 +Set Height 720 +Set Padding 20 +Set Theme "Catppuccin Frappe" +Set TypingSpeed 66ms + +# Off-camera reset: enable bash syntax highlighting via ble.sh, prompt +# styling, drop any prior demo VM, and clear the screen. 
+Hide +Type "source ~/.local/share/blesh/ble.sh --noattach" +Enter +Sleep 200ms +Type "bleopt complete_auto_complete= complete_auto_history=" +Enter +Sleep 100ms +Type `export PS1="\n$PS1"` +Enter +Sleep 200ms +Type "[[ ${BLE_VERSION-} ]] && ble-attach" +Enter +Sleep 400ms +Type "ble-face -s syntax_error fg=red" +Enter +Sleep 100ms +Type "banger vm kill demo 2>/dev/null; banger vm delete demo 2>/dev/null; clear" +Enter +Sleep 500ms +Show + +Type "banger vm run --nat --name demo" +Enter +Wait+Line /demo:~#/ +Sleep 1.4s + +Type "uname -a" +Enter +Sleep 1.4s + +Type "exit" +Enter +Wait +Sleep 700ms + +Type "banger vm list" +Enter +Wait +Sleep 1.8s + +Type "ssh demo.vm" +Enter +Wait+Line /demo:~#/ +Sleep 500ms + +Type "touch foo bar baz" +Enter +Sleep 700ms + +Type "ls" +Enter +Sleep 1.4s + +Type "exit" +Enter +Sleep 700ms + +Type "banger vm stop demo" +Enter +Wait +Sleep 1s + +Type "banger vm start demo" +Enter +Wait +Sleep 1s + +Type "banger vm exec demo -- ls" +Enter +Wait +Sleep 1.4s + +Type "banger vm exec demo -- docker run -d -p 80:80 nginx" +Enter +Wait +Sleep 1.6s + +Type "banger vm ports demo" +Enter +Wait +Sleep 2s + +Type "curl http://demo.vm" +Sleep 1.2s +Enter +Wait +Sleep 4s + +Type "banger vm kill demo && banger vm delete demo" +Enter +Wait +Sleep 3s diff --git a/cmd/banger-vsock-agent/main.go b/cmd/banger-vsock-agent/main.go index 54cf31a..a45a8c0 100644 --- a/cmd/banger-vsock-agent/main.go +++ b/cmd/banger-vsock-agent/main.go @@ -11,12 +11,15 @@ import ( "syscall" "time" + "banger/internal/buildinfo" sdkvsock "github.com/firecracker-microvm/firecracker-go-sdk/vsock" "github.com/sirupsen/logrus" "banger/internal/vsockagent" ) +var _, _, _ = buildinfo.Version, buildinfo.Commit, buildinfo.BuiltAt + func main() { ctx, cancel := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM) defer cancel() diff --git a/cmd/banger/main.go b/cmd/banger/main.go index f7a616f..ca2bd69 100644 --- a/cmd/banger/main.go +++ b/cmd/banger/main.go @@ -2,12 
+2,14 @@ package main import ( "context" + "errors" "fmt" "os" "os/signal" "syscall" "banger/internal/cli" + "banger/internal/cli/style" ) func main() { @@ -16,7 +18,16 @@ func main() { cmd := cli.NewBangerCommand() if err := cmd.ExecuteContext(ctx); err != nil { - fmt.Fprintf(os.Stderr, "banger: %v\n", err) + var exitErr cli.ExitCodeError + if errors.As(err, &exitErr) { + os.Exit(exitErr.Code) + } + // Render the failure through the CLI's translator so RPC + // codes become friendly text, op_ids land in parens for + // journalctl grepping, and the "banger:" prefix turns red + // on a TTY. + prefix := style.Fail(os.Stderr, "banger:") + fmt.Fprintf(os.Stderr, "%s %s\n", prefix, cli.TranslateError(os.Stderr, err)) os.Exit(1) } } diff --git a/cmd/bangerd/main.go b/cmd/bangerd/main.go index 0cf8ab1..ee4826b 100644 --- a/cmd/bangerd/main.go +++ b/cmd/bangerd/main.go @@ -11,6 +11,12 @@ import ( ) func main() { + // 0o077 ensures the firecracker API/vsock sockets (and any other files + // the daemon or its children create) are user-private by default. The + // previous shell wrapper around firecracker exec did this inline; with + // the wrapper gone, the daemon process owns the umask. + syscall.Umask(0o077) + ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM) defer stop() diff --git a/configs/firecracker-x86_64-6.1.config b/configs/firecracker-x86_64-6.1.config new file mode 100644 index 0000000..f56e15c --- /dev/null +++ b/configs/firecracker-x86_64-6.1.config @@ -0,0 +1,3556 @@ +# +# Automatically generated file; DO NOT EDIT. 
+# Linux/x86_64 6.1.167-27.319.amzn2023.x86_64 Kernel Configuration +# +CONFIG_CC_VERSION_TEXT="gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5)" +CONFIG_CC_IS_GCC=y +CONFIG_GCC_VERSION=110500 +CONFIG_CLANG_VERSION=0 +CONFIG_AS_IS_GNU=y +CONFIG_AS_VERSION=24100 +CONFIG_LD_IS_BFD=y +CONFIG_LD_VERSION=24100 +CONFIG_LLD_VERSION=0 +CONFIG_CC_CAN_LINK=y +CONFIG_CC_CAN_LINK_STATIC=y +CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y +CONFIG_CC_HAS_ASM_GOTO_TIED_OUTPUT=y +CONFIG_CC_HAS_ASM_INLINE=y +CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y +CONFIG_PAHOLE_VERSION=129 +CONFIG_IRQ_WORK=y +CONFIG_BUILDTIME_TABLE_SORT=y +CONFIG_THREAD_INFO_IN_TASK=y + +# +# General setup +# +CONFIG_INIT_ENV_ARG_LIMIT=32 +# CONFIG_COMPILE_TEST is not set +# CONFIG_WERROR is not set +CONFIG_LOCALVERSION="" +# CONFIG_LOCALVERSION_AUTO is not set +CONFIG_BUILD_SALT="6.1.167-27.319.amzn2023.x86_64" +CONFIG_HAVE_KERNEL_GZIP=y +CONFIG_HAVE_KERNEL_BZIP2=y +CONFIG_HAVE_KERNEL_LZMA=y +CONFIG_HAVE_KERNEL_XZ=y +CONFIG_HAVE_KERNEL_LZO=y +CONFIG_HAVE_KERNEL_LZ4=y +CONFIG_HAVE_KERNEL_ZSTD=y +CONFIG_KERNEL_GZIP=y +# CONFIG_KERNEL_BZIP2 is not set +# CONFIG_KERNEL_LZMA is not set +# CONFIG_KERNEL_XZ is not set +# CONFIG_KERNEL_LZO is not set +# CONFIG_KERNEL_LZ4 is not set +# CONFIG_KERNEL_ZSTD is not set +CONFIG_DEFAULT_INIT="" +CONFIG_DEFAULT_HOSTNAME="(none)" +CONFIG_SYSVIPC=y +CONFIG_SYSVIPC_SYSCTL=y +CONFIG_SYSVIPC_COMPAT=y +CONFIG_POSIX_MQUEUE=y +CONFIG_POSIX_MQUEUE_SYSCTL=y +# CONFIG_WATCH_QUEUE is not set +CONFIG_CROSS_MEMORY_ATTACH=y +# CONFIG_USELIB is not set +CONFIG_AUDIT=y +CONFIG_HAVE_ARCH_AUDITSYSCALL=y +CONFIG_AUDITSYSCALL=y + +# +# IRQ subsystem +# +CONFIG_GENERIC_IRQ_PROBE=y +CONFIG_GENERIC_IRQ_SHOW=y +CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y +CONFIG_GENERIC_PENDING_IRQ=y +CONFIG_GENERIC_IRQ_MIGRATION=y +CONFIG_HARDIRQS_SW_RESEND=y +CONFIG_IRQ_DOMAIN=y +CONFIG_IRQ_DOMAIN_HIERARCHY=y +CONFIG_GENERIC_MSI_IRQ=y +CONFIG_GENERIC_MSI_IRQ_DOMAIN=y +CONFIG_IRQ_MSI_IOMMU=y +CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y 
+CONFIG_GENERIC_IRQ_RESERVATION_MODE=y +CONFIG_IRQ_FORCED_THREADING=y +CONFIG_SPARSE_IRQ=y +# CONFIG_GENERIC_IRQ_DEBUGFS is not set +# end of IRQ subsystem + +CONFIG_CLOCKSOURCE_WATCHDOG=y +CONFIG_ARCH_CLOCKSOURCE_INIT=y +CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y +CONFIG_GENERIC_TIME_VSYSCALL=y +CONFIG_GENERIC_CLOCKEVENTS=y +CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y +CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y +CONFIG_GENERIC_CMOS_UPDATE=y +CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK=y +CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y +CONFIG_CONTEXT_TRACKING=y +CONFIG_CONTEXT_TRACKING_IDLE=y + +# +# Timers subsystem +# +CONFIG_TICK_ONESHOT=y +CONFIG_NO_HZ_COMMON=y +# CONFIG_HZ_PERIODIC is not set +CONFIG_NO_HZ_IDLE=y +# CONFIG_NO_HZ_FULL is not set +CONFIG_NO_HZ=y +CONFIG_HIGH_RES_TIMERS=y +CONFIG_CLOCKSOURCE_WATCHDOG_MAX_SKEW_US=100 +# end of Timers subsystem + +CONFIG_BPF=y +CONFIG_HAVE_EBPF_JIT=y +CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y + +# +# BPF subsystem +# +CONFIG_BPF_SYSCALL=y +CONFIG_BPF_UNPRIV_DEFAULT_OFF=y +CONFIG_USERMODE_DRIVER=y +CONFIG_BPF_PRELOAD=y +CONFIG_BPF_PRELOAD_UMD=y +# end of BPF subsystem + +CONFIG_PREEMPT_BUILD=y +CONFIG_PREEMPT_NONE=y +# CONFIG_PREEMPT_VOLUNTARY is not set +# CONFIG_PREEMPT is not set +CONFIG_PREEMPT_COUNT=y +CONFIG_PREEMPTION=y +CONFIG_PREEMPT_DYNAMIC=y +# CONFIG_SCHED_CORE is not set + +# +# CPU/Task time and stats accounting +# +CONFIG_TICK_CPU_ACCOUNTING=y +# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set +# CONFIG_IRQ_TIME_ACCOUNTING is not set +CONFIG_HAVE_SCHED_AVG_IRQ=y +CONFIG_BSD_PROCESS_ACCT=y +CONFIG_BSD_PROCESS_ACCT_V3=y +CONFIG_TASKSTATS=y +CONFIG_TASK_DELAY_ACCT=y +CONFIG_TASK_XACCT=y +CONFIG_TASK_IO_ACCOUNTING=y +CONFIG_PSI=y +CONFIG_PSI_DEFAULT_DISABLED=y +# end of CPU/Task time and stats accounting + +CONFIG_CPU_ISOLATION=y + +# +# RCU Subsystem +# +CONFIG_TREE_RCU=y +CONFIG_PREEMPT_RCU=y +# CONFIG_RCU_EXPERT is not set +CONFIG_SRCU=y +CONFIG_TREE_SRCU=y +CONFIG_TASKS_RCU_GENERIC=y +CONFIG_TASKS_RCU=y +CONFIG_TASKS_TRACE_RCU=y 
+CONFIG_RCU_STALL_COMMON=y +CONFIG_RCU_NEED_SEGCBLIST=y +# end of RCU Subsystem + +# CONFIG_IKCONFIG is not set +# CONFIG_IKHEADERS is not set +CONFIG_LOG_BUF_SHIFT=17 +CONFIG_LOG_CPU_MAX_BUF_SHIFT=12 +CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13 +# CONFIG_PRINTK_INDEX is not set +CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y + +# +# Scheduler features +# +# CONFIG_UCLAMP_TASK is not set +# end of Scheduler features + +CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y +CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y +CONFIG_CC_HAS_INT128=y +CONFIG_CC_IMPLICIT_FALLTHROUGH="-Wimplicit-fallthrough=5" +CONFIG_GCC10_NO_ARRAY_BOUNDS=y +CONFIG_CC_NO_ARRAY_BOUNDS=y +CONFIG_ARCH_SUPPORTS_INT128=y +CONFIG_NUMA_BALANCING=y +# CONFIG_NUMA_BALANCING_DEFAULT_ENABLED is not set +CONFIG_CGROUPS=y +CONFIG_PAGE_COUNTER=y +# CONFIG_CGROUP_FAVOR_DYNMODS is not set +CONFIG_MEMCG=y +CONFIG_MEMCG_KMEM=y +CONFIG_BLK_CGROUP=y +CONFIG_CGROUP_WRITEBACK=y +CONFIG_CGROUP_SCHED=y +CONFIG_FAIR_GROUP_SCHED=y +CONFIG_CFS_BANDWIDTH=y +CONFIG_RT_GROUP_SCHED=y +CONFIG_CGROUP_PIDS=y +# CONFIG_CGROUP_RDMA is not set +CONFIG_CGROUP_FREEZER=y +CONFIG_CGROUP_HUGETLB=y +CONFIG_CPUSETS=y +CONFIG_PROC_PID_CPUSET=y +CONFIG_CGROUP_DEVICE=y +CONFIG_CGROUP_CPUACCT=y +CONFIG_CGROUP_PERF=y +CONFIG_CGROUP_BPF=y +# CONFIG_CGROUP_MISC is not set +# CONFIG_CGROUP_DEBUG is not set +CONFIG_SOCK_CGROUP_DATA=y +CONFIG_NAMESPACES=y +CONFIG_UTS_NS=y +CONFIG_TIME_NS=y +CONFIG_IPC_NS=y +CONFIG_USER_NS=y +CONFIG_PID_NS=y +CONFIG_NET_NS=y +# CONFIG_CHECKPOINT_RESTORE is not set +CONFIG_SCHED_AUTOGROUP=y +# CONFIG_SYSFS_DEPRECATED is not set +CONFIG_RELAY=y +CONFIG_BLK_DEV_INITRD=y +CONFIG_INITRAMFS_SOURCE="" +CONFIG_RD_GZIP=y +CONFIG_RD_BZIP2=y +CONFIG_RD_LZMA=y +CONFIG_RD_XZ=y +CONFIG_RD_LZO=y +CONFIG_RD_LZ4=y +CONFIG_RD_ZSTD=y +# CONFIG_BOOT_CONFIG is not set +CONFIG_INITRAMFS_PRESERVE_MTIME=y +CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y +# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set +CONFIG_LD_ORPHAN_WARN=y +CONFIG_SYSCTL=y +CONFIG_HAVE_UID16=y 
+CONFIG_SYSCTL_EXCEPTION_TRACE=y +CONFIG_HAVE_PCSPKR_PLATFORM=y +# CONFIG_EXPERT is not set +CONFIG_UID16=y +CONFIG_MULTIUSER=y +CONFIG_SGETMASK_SYSCALL=y +CONFIG_SYSFS_SYSCALL=y +CONFIG_FHANDLE=y +CONFIG_POSIX_TIMERS=y +CONFIG_PRINTK=y +CONFIG_BUG=y +CONFIG_ELF_CORE=y +CONFIG_PCSPKR_PLATFORM=y +CONFIG_BASE_FULL=y +CONFIG_FUTEX=y +CONFIG_FUTEX_PI=y +CONFIG_EPOLL=y +CONFIG_SIGNALFD=y +CONFIG_TIMERFD=y +CONFIG_EVENTFD=y +CONFIG_SHMEM=y +CONFIG_AIO=y +CONFIG_IO_URING=y +CONFIG_ADVISE_SYSCALLS=y +CONFIG_MEMBARRIER=y +CONFIG_KALLSYMS=y +# CONFIG_KALLSYMS_ALL is not set +CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y +CONFIG_KALLSYMS_BASE_RELATIVE=y +CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y +CONFIG_RSEQ=y +# CONFIG_EMBEDDED is not set +CONFIG_HAVE_PERF_EVENTS=y + +# +# Kernel Performance Events And Counters +# +CONFIG_PERF_EVENTS=y +# CONFIG_DEBUG_PERF_USE_VMALLOC is not set +# end of Kernel Performance Events And Counters + +CONFIG_PROFILING=y +# end of General setup + +CONFIG_64BIT=y +CONFIG_X86_64=y +CONFIG_X86=y +CONFIG_INSTRUCTION_DECODER=y +CONFIG_OUTPUT_FORMAT="elf64-x86-64" +CONFIG_LOCKDEP_SUPPORT=y +CONFIG_STACKTRACE_SUPPORT=y +CONFIG_MMU=y +CONFIG_ARCH_MMAP_RND_BITS_MIN=28 +CONFIG_ARCH_MMAP_RND_BITS_MAX=32 +CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8 +CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16 +CONFIG_GENERIC_ISA_DMA=y +CONFIG_GENERIC_BUG=y +CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y +CONFIG_ARCH_MAY_HAVE_PC_FDC=y +CONFIG_GENERIC_CALIBRATE_DELAY=y +CONFIG_ARCH_HAS_CPU_RELAX=y +CONFIG_ARCH_HIBERNATION_POSSIBLE=y +CONFIG_ARCH_NR_GPIO=1024 +CONFIG_ARCH_SUSPEND_POSSIBLE=y +CONFIG_AUDIT_ARCH=y +CONFIG_X86_64_SMP=y +CONFIG_ARCH_SUPPORTS_UPROBES=y +CONFIG_FIX_EARLYCON_MEM=y +CONFIG_PGTABLE_LEVELS=4 +CONFIG_CC_HAS_SANE_STACKPROTECTOR=y + +# +# Processor type and features +# +CONFIG_SMP=y +CONFIG_X86_FEATURE_NAMES=y +CONFIG_X86_X2APIC=y +# CONFIG_X86_MPPARSE is not set +# CONFIG_GOLDFISH is not set +# CONFIG_X86_CPU_RESCTRL is not set +# CONFIG_X86_EXTENDED_PLATFORM is not set +# 
CONFIG_X86_INTEL_LPSS is not set +# CONFIG_X86_AMD_PLATFORM_DEVICE is not set +# CONFIG_IOSF_MBI is not set +CONFIG_SCHED_OMIT_FRAME_POINTER=y +CONFIG_HYPERVISOR_GUEST=y +CONFIG_PARAVIRT=y +# CONFIG_PARAVIRT_DEBUG is not set +CONFIG_PARAVIRT_SPINLOCKS=y +CONFIG_X86_HV_CALLBACK_VECTOR=y +# CONFIG_XEN is not set +CONFIG_KVM_GUEST=y +CONFIG_ARCH_CPUIDLE_HALTPOLL=y +CONFIG_PVH=y +CONFIG_PARAVIRT_TIME_ACCOUNTING=y +CONFIG_PARAVIRT_CLOCK=y +# CONFIG_JAILHOUSE_GUEST is not set +# CONFIG_ACRN_GUEST is not set +# CONFIG_INTEL_TDX_GUEST is not set +# CONFIG_MK8 is not set +# CONFIG_MPSC is not set +# CONFIG_MCORE2 is not set +# CONFIG_MATOM is not set +CONFIG_GENERIC_CPU=y +CONFIG_X86_INTERNODE_CACHE_SHIFT=6 +CONFIG_X86_L1_CACHE_SHIFT=6 +CONFIG_X86_TSC=y +CONFIG_X86_CMPXCHG64=y +CONFIG_X86_CMOV=y +CONFIG_X86_MINIMUM_CPU_FAMILY=64 +CONFIG_X86_DEBUGCTLMSR=y +CONFIG_IA32_FEAT_CTL=y +CONFIG_X86_VMX_FEATURE_NAMES=y +CONFIG_CPU_SUP_INTEL=y +CONFIG_CPU_SUP_AMD=y +CONFIG_CPU_SUP_HYGON=y +CONFIG_CPU_SUP_CENTAUR=y +CONFIG_CPU_SUP_ZHAOXIN=y +CONFIG_HPET_TIMER=y +CONFIG_DMI=y +# CONFIG_GART_IOMMU is not set +# CONFIG_MAXSMP is not set +CONFIG_NR_CPUS_RANGE_BEGIN=2 +CONFIG_NR_CPUS_RANGE_END=512 +CONFIG_NR_CPUS_DEFAULT=64 +CONFIG_NR_CPUS=64 +CONFIG_SCHED_CLUSTER=y +CONFIG_SCHED_SMT=y +CONFIG_SCHED_MC=y +CONFIG_SCHED_MC_PRIO=y +CONFIG_X86_LOCAL_APIC=y +CONFIG_X86_IO_APIC=y +CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y +# CONFIG_X86_MCE is not set + +# +# Performance monitoring +# +CONFIG_PERF_EVENTS_INTEL_UNCORE=y +CONFIG_PERF_EVENTS_INTEL_RAPL=y +CONFIG_PERF_EVENTS_INTEL_CSTATE=y +# CONFIG_PERF_EVENTS_AMD_POWER is not set +CONFIG_PERF_EVENTS_AMD_UNCORE=y +# CONFIG_PERF_EVENTS_AMD_BRS is not set +# end of Performance monitoring + +CONFIG_X86_16BIT=y +CONFIG_X86_ESPFIX64=y +CONFIG_X86_VSYSCALL_EMULATION=y +CONFIG_X86_IOPL_IOPERM=y +# CONFIG_MICROCODE is not set +CONFIG_X86_MSR=y +CONFIG_X86_CPUID=y +# CONFIG_X86_5LEVEL is not set +CONFIG_X86_DIRECT_GBPAGES=y +# CONFIG_X86_CPA_STATISTICS is not 
set +# CONFIG_AMD_MEM_ENCRYPT is not set +CONFIG_NUMA=y +CONFIG_AMD_NUMA=y +CONFIG_X86_64_ACPI_NUMA=y +# CONFIG_NUMA_EMU is not set +CONFIG_NODES_SHIFT=10 +CONFIG_ARCH_SPARSEMEM_ENABLE=y +CONFIG_ARCH_SPARSEMEM_DEFAULT=y +CONFIG_ARCH_MEMORY_PROBE=y +CONFIG_ARCH_PROC_KCORE_TEXT=y +CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000 +# CONFIG_X86_PMEM_LEGACY is not set +CONFIG_X86_CHECK_BIOS_CORRUPTION=y +CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK=y +CONFIG_MTRR=y +CONFIG_MTRR_SANITIZER=y +CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=0 +CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1 +CONFIG_X86_PAT=y +CONFIG_ARCH_USES_PG_UNCACHED=y +CONFIG_X86_UMIP=y +CONFIG_CC_HAS_IBT=y +# CONFIG_X86_KERNEL_IBT is not set +CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=y +CONFIG_X86_INTEL_TSX_MODE_OFF=y +# CONFIG_X86_INTEL_TSX_MODE_ON is not set +# CONFIG_X86_INTEL_TSX_MODE_AUTO is not set +# CONFIG_X86_SGX is not set +# CONFIG_EFI is not set +# CONFIG_HZ_100 is not set +CONFIG_HZ_250=y +# CONFIG_HZ_300 is not set +# CONFIG_HZ_1000 is not set +CONFIG_HZ=250 +CONFIG_SCHED_HRTICK=y +# CONFIG_KEXEC is not set +CONFIG_KEXEC_FILE=y +CONFIG_ARCH_HAS_KEXEC_PURGATORY=y +# CONFIG_KEXEC_SIG is not set +# CONFIG_CRASH_DUMP is not set +CONFIG_PHYSICAL_START=0x1000000 +CONFIG_RELOCATABLE=y +CONFIG_RANDOMIZE_BASE=y +CONFIG_X86_NEED_RELOCS=y +CONFIG_PHYSICAL_ALIGN=0x1000000 +CONFIG_DYNAMIC_MEMORY_LAYOUT=y +CONFIG_RANDOMIZE_MEMORY=y +CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0xa +CONFIG_HOTPLUG_CPU=y +# CONFIG_BOOTPARAM_HOTPLUG_CPU0 is not set +# CONFIG_DEBUG_HOTPLUG_CPU0 is not set +# CONFIG_COMPAT_VDSO is not set +CONFIG_LEGACY_VSYSCALL_XONLY=y +# CONFIG_LEGACY_VSYSCALL_NONE is not set +# CONFIG_CMDLINE_BOOL is not set +CONFIG_MODIFY_LDT_SYSCALL=y +# CONFIG_STRICT_SIGALTSTACK_SIZE is not set +CONFIG_HAVE_LIVEPATCH=y +# end of Processor type and features + +CONFIG_CC_HAS_SLS=y +CONFIG_CC_HAS_RETURN_THUNK=y +CONFIG_CPU_MITIGATIONS=y +CONFIG_PAGE_TABLE_ISOLATION=y +CONFIG_RETPOLINE=y +CONFIG_RETHUNK=y 
+CONFIG_CPU_UNRET_ENTRY=y +CONFIG_CPU_IBPB_ENTRY=y +CONFIG_CPU_IBRS_ENTRY=y +CONFIG_CPU_SRSO=y +# CONFIG_SLS is not set +# CONFIG_GDS_FORCE_MITIGATION is not set +CONFIG_MITIGATION_RFDS=y +CONFIG_MITIGATION_SPECTRE_BHI=y +CONFIG_MITIGATION_ITS=y +CONFIG_MITIGATION_TSA=y +CONFIG_ARCH_HAS_ADD_PAGES=y +CONFIG_ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE=y + +# +# Power management and ACPI options +# +CONFIG_ARCH_HIBERNATION_HEADER=y +# CONFIG_SUSPEND is not set +CONFIG_HIBERNATE_CALLBACKS=y +CONFIG_HIBERNATION=y +CONFIG_HIBERNATION_SNAPSHOT_DEV=y +CONFIG_PM_STD_PARTITION="" +CONFIG_PM_SLEEP=y +CONFIG_PM_SLEEP_SMP=y +# CONFIG_PM_AUTOSLEEP is not set +# CONFIG_PM_USERSPACE_AUTOSLEEP is not set +# CONFIG_PM_WAKELOCKS is not set +CONFIG_PM=y +# CONFIG_PM_DEBUG is not set +CONFIG_PM_CLK=y +# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set +# CONFIG_ENERGY_MODEL is not set +CONFIG_ARCH_SUPPORTS_ACPI=y +CONFIG_ACPI=y +CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y +CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y +CONFIG_ACPI_SYSTEM_POWER_STATES_SUPPORT=y +# CONFIG_ACPI_DEBUGGER is not set +# CONFIG_ACPI_SPCR_TABLE is not set +# CONFIG_ACPI_FPDT is not set +CONFIG_ACPI_LPIT=y +CONFIG_ACPI_SLEEP=y +# CONFIG_ACPI_REV_OVERRIDE_POSSIBLE is not set +# CONFIG_ACPI_EC_DEBUGFS is not set +# CONFIG_ACPI_AC is not set +# CONFIG_ACPI_BATTERY is not set +# CONFIG_ACPI_BUTTON is not set +# CONFIG_ACPI_TINY_POWER_BUTTON is not set +# CONFIG_ACPI_FAN is not set +# CONFIG_ACPI_TAD is not set +# CONFIG_ACPI_DOCK is not set +CONFIG_ACPI_CPU_FREQ_PSS=y +CONFIG_ACPI_PROCESSOR_CSTATE=y +CONFIG_ACPI_PROCESSOR_IDLE=y +CONFIG_ACPI_CPPC_LIB=y +CONFIG_ACPI_PROCESSOR=y +CONFIG_ACPI_HOTPLUG_CPU=y +# CONFIG_ACPI_PROCESSOR_AGGREGATOR is not set +# CONFIG_ACPI_THERMAL is not set +CONFIG_ARCH_HAS_ACPI_TABLE_UPGRADE=y +# CONFIG_ACPI_TABLE_UPGRADE is not set +# CONFIG_ACPI_DEBUG is not set +# CONFIG_ACPI_PCI_SLOT is not set +CONFIG_ACPI_CONTAINER=y +# CONFIG_ACPI_HOTPLUG_MEMORY is not set +CONFIG_ACPI_HOTPLUG_IOAPIC=y +# CONFIG_ACPI_SBS is not set 
+# CONFIG_ACPI_HED is not set +# CONFIG_ACPI_CUSTOM_METHOD is not set +# CONFIG_ACPI_NFIT is not set +CONFIG_ACPI_NUMA=y +# CONFIG_ACPI_HMAT is not set +CONFIG_HAVE_ACPI_APEI=y +CONFIG_HAVE_ACPI_APEI_NMI=y +# CONFIG_ACPI_APEI is not set +# CONFIG_ACPI_DPTF is not set +# CONFIG_ACPI_CONFIGFS is not set +# CONFIG_ACPI_PFRUT is not set +CONFIG_ACPI_PCC=y +# CONFIG_PMIC_OPREGION is not set +CONFIG_X86_PM_TIMER=y + +# +# CPU Frequency scaling +# +CONFIG_CPU_FREQ=y +CONFIG_CPU_FREQ_GOV_ATTR_SET=y +CONFIG_CPU_FREQ_STAT=y +CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y +# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set +# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set +# CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set +CONFIG_CPU_FREQ_GOV_PERFORMANCE=y +# CONFIG_CPU_FREQ_GOV_POWERSAVE is not set +# CONFIG_CPU_FREQ_GOV_USERSPACE is not set +# CONFIG_CPU_FREQ_GOV_ONDEMAND is not set +# CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set +CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y + +# +# CPU frequency scaling drivers +# +CONFIG_X86_INTEL_PSTATE=y +# CONFIG_X86_PCC_CPUFREQ is not set +# CONFIG_X86_AMD_PSTATE is not set +# CONFIG_X86_AMD_PSTATE_UT is not set +# CONFIG_X86_ACPI_CPUFREQ is not set +# CONFIG_X86_SPEEDSTEP_CENTRINO is not set +# CONFIG_X86_P4_CLOCKMOD is not set + +# +# shared options +# +# end of CPU Frequency scaling + +# +# CPU Idle +# +CONFIG_CPU_IDLE=y +CONFIG_CPU_IDLE_GOV_LADDER=y +CONFIG_CPU_IDLE_GOV_MENU=y +# CONFIG_CPU_IDLE_GOV_TEO is not set +CONFIG_CPU_IDLE_GOV_HALTPOLL=y +CONFIG_HALTPOLL_CPUIDLE=y +# end of CPU Idle + +CONFIG_INTEL_IDLE=y +# end of Power management and ACPI options + +# +# Bus options (PCI etc.) +# +CONFIG_PCI_DIRECT=y +CONFIG_PCI_MMCONFIG=y +CONFIG_MMCONF_FAM10H=y +CONFIG_ISA_DMA_API=y +CONFIG_AMD_NB=y +# end of Bus options (PCI etc.) 
+ +# +# Binary Emulations +# +CONFIG_IA32_EMULATION=y +# CONFIG_X86_X32_ABI is not set +CONFIG_COMPAT_32=y +CONFIG_COMPAT=y +CONFIG_COMPAT_FOR_U64_ALIGNMENT=y +# end of Binary Emulations + +CONFIG_HAVE_KVM=y +# CONFIG_VIRTUALIZATION is not set +CONFIG_AS_AVX512=y +CONFIG_AS_SHA1_NI=y +CONFIG_AS_SHA256_NI=y +CONFIG_AS_TPAUSE=y +CONFIG_ARCH_CONFIGURES_CPU_MITIGATIONS=y + +# +# General architecture-dependent options +# +CONFIG_CRASH_CORE=y +CONFIG_KEXEC_CORE=y +CONFIG_HOTPLUG_SMT=y +CONFIG_GENERIC_ENTRY=y +CONFIG_JUMP_LABEL=y +# CONFIG_STATIC_KEYS_SELFTEST is not set +# CONFIG_STATIC_CALL_SELFTEST is not set +CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y +CONFIG_ARCH_USE_BUILTIN_BSWAP=y +CONFIG_HAVE_IOREMAP_PROT=y +CONFIG_HAVE_KPROBES=y +CONFIG_HAVE_KRETPROBES=y +CONFIG_HAVE_OPTPROBES=y +CONFIG_HAVE_KPROBES_ON_FTRACE=y +CONFIG_ARCH_CORRECT_STACKTRACE_ON_KRETPROBE=y +CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y +CONFIG_HAVE_NMI=y +CONFIG_TRACE_IRQFLAGS_SUPPORT=y +CONFIG_TRACE_IRQFLAGS_NMI_SUPPORT=y +CONFIG_HAVE_ARCH_TRACEHOOK=y +CONFIG_HAVE_DMA_CONTIGUOUS=y +CONFIG_GENERIC_SMP_IDLE_THREAD=y +CONFIG_ARCH_HAS_FORTIFY_SOURCE=y +CONFIG_ARCH_HAS_SET_MEMORY=y +CONFIG_ARCH_HAS_SET_DIRECT_MAP=y +CONFIG_ARCH_HAS_CPU_FINALIZE_INIT=y +CONFIG_HAVE_ARCH_THREAD_STRUCT_WHITELIST=y +CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT=y +CONFIG_ARCH_WANTS_NO_INSTR=y +CONFIG_HAVE_ASM_MODVERSIONS=y +CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y +CONFIG_HAVE_RSEQ=y +CONFIG_HAVE_RUST=y +CONFIG_HAVE_FUNCTION_ARG_ACCESS_API=y +CONFIG_HAVE_HW_BREAKPOINT=y +CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y +CONFIG_HAVE_USER_RETURN_NOTIFIER=y +CONFIG_HAVE_PERF_EVENTS_NMI=y +CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y +CONFIG_HAVE_PERF_REGS=y +CONFIG_HAVE_PERF_USER_STACK_DUMP=y +CONFIG_HAVE_ARCH_JUMP_LABEL=y +CONFIG_HAVE_ARCH_JUMP_LABEL_RELATIVE=y +CONFIG_MMU_GATHER_TABLE_FREE=y +CONFIG_MMU_GATHER_RCU_TABLE_FREE=y +CONFIG_MMU_GATHER_MERGE_VMAS=y +CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y +CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y 
+CONFIG_HAVE_CMPXCHG_LOCAL=y +CONFIG_HAVE_CMPXCHG_DOUBLE=y +CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y +CONFIG_ARCH_WANT_OLD_COMPAT_IPC=y +CONFIG_HAVE_ARCH_SECCOMP=y +CONFIG_HAVE_ARCH_SECCOMP_FILTER=y +CONFIG_SECCOMP=y +CONFIG_SECCOMP_FILTER=y +# CONFIG_SECCOMP_CACHE_DEBUG is not set +CONFIG_HAVE_ARCH_STACKLEAK=y +CONFIG_HAVE_STACKPROTECTOR=y +CONFIG_STACKPROTECTOR=y +CONFIG_STACKPROTECTOR_STRONG=y +CONFIG_ARCH_SUPPORTS_LTO_CLANG=y +CONFIG_ARCH_SUPPORTS_LTO_CLANG_THIN=y +CONFIG_LTO_NONE=y +CONFIG_ARCH_SUPPORTS_CFI_CLANG=y +CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES=y +CONFIG_HAVE_CONTEXT_TRACKING_USER=y +CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK=y +CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y +CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y +CONFIG_HAVE_MOVE_PUD=y +CONFIG_HAVE_MOVE_PMD=y +CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y +CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y +CONFIG_HAVE_ARCH_HUGE_VMAP=y +CONFIG_HAVE_ARCH_HUGE_VMALLOC=y +CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y +CONFIG_HAVE_ARCH_SOFT_DIRTY=y +CONFIG_HAVE_MOD_ARCH_SPECIFIC=y +CONFIG_MODULES_USE_ELF_RELA=y +CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y +CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK=y +CONFIG_SOFTIRQ_ON_OWN_STACK=y +CONFIG_ARCH_HAS_ELF_RANDOMIZE=y +CONFIG_HAVE_ARCH_MMAP_RND_BITS=y +CONFIG_HAVE_EXIT_THREAD=y +CONFIG_ARCH_MMAP_RND_BITS=28 +CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y +CONFIG_ARCH_MMAP_RND_COMPAT_BITS=8 +CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES=y +CONFIG_PAGE_SIZE_LESS_THAN_64KB=y +CONFIG_PAGE_SIZE_LESS_THAN_256KB=y +CONFIG_HAVE_OBJTOOL=y +CONFIG_HAVE_JUMP_LABEL_HACK=y +CONFIG_HAVE_NOINSTR_HACK=y +CONFIG_HAVE_NOINSTR_VALIDATION=y +CONFIG_HAVE_UACCESS_VALIDATION=y +CONFIG_HAVE_STACK_VALIDATION=y +CONFIG_HAVE_RELIABLE_STACKTRACE=y +CONFIG_OLD_SIGSUSPEND3=y +CONFIG_COMPAT_OLD_SIGACTION=y +CONFIG_COMPAT_32BIT_TIME=y +CONFIG_HAVE_ARCH_VMAP_STACK=y +CONFIG_VMAP_STACK=y +CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET=y +CONFIG_RANDOMIZE_KSTACK_OFFSET=y +# CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT is not set +CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y 
+CONFIG_STRICT_KERNEL_RWX=y +CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y +CONFIG_HAVE_ARCH_PREL32_RELOCATIONS=y +# CONFIG_LOCK_EVENT_COUNTS is not set +CONFIG_ARCH_HAS_MEM_ENCRYPT=y +CONFIG_HAVE_STATIC_CALL=y +CONFIG_HAVE_STATIC_CALL_INLINE=y +CONFIG_HAVE_PREEMPT_DYNAMIC=y +CONFIG_HAVE_PREEMPT_DYNAMIC_CALL=y +CONFIG_ARCH_WANT_LD_ORPHAN_WARN=y +CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y +CONFIG_ARCH_SUPPORTS_PAGE_TABLE_CHECK=y +CONFIG_ARCH_HAS_ELFCORE_COMPAT=y +CONFIG_ARCH_HAS_PARANOID_L1D_FLUSH=y +CONFIG_DYNAMIC_SIGFRAME=y +CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG=y + +# +# GCOV-based kernel profiling +# +# CONFIG_GCOV_KERNEL is not set +CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y +# end of GCOV-based kernel profiling + +CONFIG_HAVE_GCC_PLUGINS=y +# end of General architecture-dependent options + +CONFIG_RT_MUTEXES=y +CONFIG_BASE_SMALL=0 +# CONFIG_MODULES is not set +CONFIG_BLOCK=y +CONFIG_BLOCK_LEGACY_AUTOLOAD=y +CONFIG_BLK_RQ_ALLOC_TIME=y +CONFIG_BLK_CGROUP_RWSTAT=y +CONFIG_BLK_DEV_BSG_COMMON=y +CONFIG_BLK_ICQ=y +CONFIG_BLK_DEV_BSGLIB=y +CONFIG_BLK_DEV_INTEGRITY=y +# CONFIG_BLK_DEV_ZONED is not set +CONFIG_BLK_DEV_THROTTLING=y +# CONFIG_BLK_DEV_THROTTLING_LOW is not set +CONFIG_BLK_WBT=y +CONFIG_BLK_WBT_MQ=y +# CONFIG_BLK_CGROUP_IOLATENCY is not set +CONFIG_BLK_CGROUP_IOCOST=y +# CONFIG_BLK_CGROUP_IOPRIO is not set +CONFIG_BLK_DEBUG_FS=y +# CONFIG_BLK_SED_OPAL is not set +# CONFIG_BLK_INLINE_ENCRYPTION is not set + +# +# Partition Types +# +CONFIG_PARTITION_ADVANCED=y +# CONFIG_ACORN_PARTITION is not set +# CONFIG_AIX_PARTITION is not set +# CONFIG_OSF_PARTITION is not set +# CONFIG_AMIGA_PARTITION is not set +# CONFIG_ATARI_PARTITION is not set +# CONFIG_MAC_PARTITION is not set +# CONFIG_MSDOS_PARTITION is not set +# CONFIG_LDM_PARTITION is not set +# CONFIG_SGI_PARTITION is not set +# CONFIG_ULTRIX_PARTITION is not set +# CONFIG_SUN_PARTITION is not set +# CONFIG_KARMA_PARTITION is not set +# CONFIG_EFI_PARTITION is not set +# CONFIG_SYSV68_PARTITION is not set +# 
CONFIG_CMDLINE_PARTITION is not set +# end of Partition Types + +CONFIG_BLOCK_COMPAT=y +CONFIG_BLK_MQ_PCI=y +CONFIG_BLK_MQ_VIRTIO=y +CONFIG_BLK_PM=y + +# +# IO Schedulers +# +CONFIG_MQ_IOSCHED_DEADLINE=y +CONFIG_MQ_IOSCHED_KYBER=y +CONFIG_IOSCHED_BFQ=y +CONFIG_BFQ_GROUP_IOSCHED=y +# CONFIG_BFQ_CGROUP_DEBUG is not set +# end of IO Schedulers + +CONFIG_PADATA=y +CONFIG_ASN1=y +CONFIG_UNINLINE_SPIN_UNLOCK=y +CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y +CONFIG_MUTEX_SPIN_ON_OWNER=y +CONFIG_RWSEM_SPIN_ON_OWNER=y +CONFIG_LOCK_SPIN_ON_OWNER=y +CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y +CONFIG_QUEUED_SPINLOCKS=y +CONFIG_ARCH_USE_QUEUED_RWLOCKS=y +CONFIG_QUEUED_RWLOCKS=y +CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE=y +CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE=y +CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y +CONFIG_FREEZER=y + +# +# Executable file formats +# +CONFIG_BINFMT_ELF=y +CONFIG_COMPAT_BINFMT_ELF=y +CONFIG_ELFCORE=y +CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y +CONFIG_BINFMT_SCRIPT=y +CONFIG_BINFMT_MISC=y +CONFIG_COREDUMP=y +# end of Executable file formats + +# +# Memory Management options +# +CONFIG_ZPOOL=y +CONFIG_SWAP=y +CONFIG_ZSWAP=y +# CONFIG_ZSWAP_DEFAULT_ON is not set +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_DEFLATE is not set +CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZO=y +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_842 is not set +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZ4 is not set +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_LZ4HC is not set +# CONFIG_ZSWAP_COMPRESSOR_DEFAULT_ZSTD is not set +CONFIG_ZSWAP_COMPRESSOR_DEFAULT="lzo" +CONFIG_ZSWAP_ZPOOL_DEFAULT_ZBUD=y +# CONFIG_ZSWAP_ZPOOL_DEFAULT_Z3FOLD_DEPRECATED is not set +# CONFIG_ZSWAP_ZPOOL_DEFAULT_ZSMALLOC is not set +CONFIG_ZSWAP_ZPOOL_DEFAULT="zbud" +CONFIG_ZBUD=y +# CONFIG_Z3FOLD_DEPRECATED is not set +# CONFIG_ZSMALLOC is not set + +# +# SLAB allocator options +# +# CONFIG_SLAB is not set +CONFIG_SLUB=y +CONFIG_SLAB_MERGE_DEFAULT=y +CONFIG_SLAB_FREELIST_RANDOM=y +CONFIG_SLAB_FREELIST_HARDENED=y +# CONFIG_SLUB_STATS is not set +CONFIG_SLUB_CPU_PARTIAL=y +# end of 
SLAB allocator options + +CONFIG_SHUFFLE_PAGE_ALLOCATOR=y +# CONFIG_COMPAT_BRK is not set +CONFIG_SPARSEMEM=y +CONFIG_SPARSEMEM_EXTREME=y +CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y +CONFIG_SPARSEMEM_VMEMMAP=y +CONFIG_HAVE_FAST_GUP=y +CONFIG_NUMA_KEEP_MEMINFO=y +CONFIG_MEMORY_ISOLATION=y +CONFIG_EXCLUSIVE_SYSTEM_RAM=y +CONFIG_HAVE_BOOTMEM_INFO_NODE=y +CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y +CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y +CONFIG_MEMORY_HOTPLUG=y +# CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE is not set +CONFIG_MEMORY_HOTREMOVE=y +CONFIG_MHP_MEMMAP_ON_MEMORY=y +CONFIG_SPLIT_PTLOCK_CPUS=4 +CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y +CONFIG_MEMORY_BALLOON=y +CONFIG_BALLOON_COMPACTION=y +CONFIG_COMPACTION=y +CONFIG_COMPACT_UNEVICTABLE_DEFAULT=1 +CONFIG_PAGE_REPORTING=y +CONFIG_MIGRATION=y +CONFIG_DEVICE_MIGRATION=y +CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y +CONFIG_ARCH_ENABLE_THP_MIGRATION=y +CONFIG_CONTIG_ALLOC=y +CONFIG_PCP_BATCH_SCALE_MAX=5 +CONFIG_PHYS_ADDR_T_64BIT=y +CONFIG_KSM=y +CONFIG_DEFAULT_MMAP_MIN_ADDR=4096 +CONFIG_ARCH_WANT_GENERAL_HUGETLB=y +CONFIG_ARCH_WANTS_THP_SWAP=y +CONFIG_TRANSPARENT_HUGEPAGE=y +# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set +CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y +CONFIG_THP_SWAP=y +# CONFIG_READ_ONLY_THP_FOR_FS is not set +CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y +CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y +CONFIG_USE_PERCPU_NUMA_NODE_ID=y +CONFIG_HAVE_SETUP_PER_CPU_AREA=y +CONFIG_FRONTSWAP=y +# CONFIG_CMA is not set +CONFIG_GENERIC_EARLY_IOREMAP=y +CONFIG_DEFERRED_STRUCT_PAGE_INIT=y +CONFIG_PAGE_IDLE_FLAG=y +# CONFIG_IDLE_PAGE_TRACKING is not set +CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y +CONFIG_ARCH_HAS_CURRENT_STACK_POINTER=y +CONFIG_ARCH_HAS_PTE_DEVMAP=y +CONFIG_ZONE_DMA=y +CONFIG_ZONE_DMA32=y +CONFIG_ZONE_DEVICE=y +# CONFIG_DEVICE_PRIVATE is not set +CONFIG_ARCH_USES_HIGH_VMA_FLAGS=y +CONFIG_ARCH_HAS_PKEYS=y +CONFIG_VM_EVENT_COUNTERS=y +CONFIG_PERCPU_STATS=y +# CONFIG_GUP_TEST is not set +CONFIG_ARCH_HAS_PTE_SPECIAL=y +CONFIG_SECRETMEM=y +# 
CONFIG_ANON_VMA_NAME is not set +CONFIG_USERFAULTFD=y +CONFIG_HAVE_ARCH_USERFAULTFD_WP=y +CONFIG_HAVE_ARCH_USERFAULTFD_MINOR=y +CONFIG_PTE_MARKER=y +CONFIG_PTE_MARKER_UFFD_WP=y +CONFIG_LRU_GEN=y +# CONFIG_LRU_GEN_ENABLED is not set +# CONFIG_LRU_GEN_STATS is not set +CONFIG_LOCK_MM_AND_FIND_VMA=y + +# +# Data Access Monitoring +# +CONFIG_DAMON=y +CONFIG_DAMON_VADDR=y +CONFIG_DAMON_PADDR=y +CONFIG_DAMON_SYSFS=y +CONFIG_DAMON_DBGFS=y +CONFIG_DAMON_RECLAIM=y +CONFIG_DAMON_LRU_SORT=y +# end of Data Access Monitoring +# end of Memory Management options + +CONFIG_NET=y +CONFIG_NET_INGRESS=y +CONFIG_SKB_EXTENSIONS=y + +# +# Networking options +# +CONFIG_PACKET=y +# CONFIG_PACKET_DIAG is not set +CONFIG_UNIX=y +CONFIG_AF_UNIX_OOB=y +# CONFIG_UNIX_DIAG is not set +# CONFIG_TLS is not set +CONFIG_XFRM=y +CONFIG_XFRM_ALGO=y +CONFIG_XFRM_USER=y +# CONFIG_XFRM_USER_COMPAT is not set +# CONFIG_XFRM_INTERFACE is not set +CONFIG_XFRM_SUB_POLICY=y +CONFIG_XFRM_MIGRATE=y +CONFIG_XFRM_STATISTICS=y +# CONFIG_NET_KEY is not set +CONFIG_XDP_SOCKETS=y +# CONFIG_XDP_SOCKETS_DIAG is not set +CONFIG_INET=y +CONFIG_IP_MULTICAST=y +CONFIG_IP_ADVANCED_ROUTER=y +# CONFIG_IP_FIB_TRIE_STATS is not set +CONFIG_IP_MULTIPLE_TABLES=y +CONFIG_IP_ROUTE_MULTIPATH=y +CONFIG_IP_ROUTE_VERBOSE=y +CONFIG_IP_PNP=y +CONFIG_IP_PNP_DHCP=y +CONFIG_IP_PNP_BOOTP=y +CONFIG_IP_PNP_RARP=y +# CONFIG_NET_IPIP is not set +# CONFIG_NET_IPGRE_DEMUX is not set +CONFIG_IP_MROUTE_COMMON=y +CONFIG_IP_MROUTE=y +CONFIG_IP_MROUTE_MULTIPLE_TABLES=y +CONFIG_IP_PIMSM_V1=y +CONFIG_IP_PIMSM_V2=y +CONFIG_SYN_COOKIES=y +# CONFIG_NET_IPVTI is not set +# CONFIG_NET_FOU is not set +# CONFIG_INET_AH is not set +# CONFIG_INET_ESP is not set +# CONFIG_INET_IPCOMP is not set +CONFIG_INET_TABLE_PERTURB_ORDER=16 +CONFIG_INET_DIAG=y +CONFIG_INET_TCP_DIAG=y +# CONFIG_INET_UDP_DIAG is not set +# CONFIG_INET_RAW_DIAG is not set +CONFIG_INET_DIAG_DESTROY=y +CONFIG_TCP_CONG_ADVANCED=y +# CONFIG_TCP_CONG_BIC is not set +CONFIG_TCP_CONG_CUBIC=y +# 
CONFIG_TCP_CONG_WESTWOOD is not set +# CONFIG_TCP_CONG_HTCP is not set +# CONFIG_TCP_CONG_HSTCP is not set +# CONFIG_TCP_CONG_HYBLA is not set +# CONFIG_TCP_CONG_VEGAS is not set +# CONFIG_TCP_CONG_NV is not set +# CONFIG_TCP_CONG_SCALABLE is not set +# CONFIG_TCP_CONG_LP is not set +# CONFIG_TCP_CONG_VENO is not set +# CONFIG_TCP_CONG_YEAH is not set +# CONFIG_TCP_CONG_ILLINOIS is not set +# CONFIG_TCP_CONG_DCTCP is not set +# CONFIG_TCP_CONG_CDG is not set +# CONFIG_TCP_CONG_BBR is not set +CONFIG_DEFAULT_CUBIC=y +# CONFIG_DEFAULT_RENO is not set +CONFIG_DEFAULT_TCP_CONG="cubic" +CONFIG_TCP_MD5SIG=y +CONFIG_IPV6=y +CONFIG_IPV6_ROUTER_PREF=y +CONFIG_IPV6_ROUTE_INFO=y +CONFIG_IPV6_OPTIMISTIC_DAD=y +# CONFIG_INET6_AH is not set +# CONFIG_INET6_ESP is not set +# CONFIG_INET6_IPCOMP is not set +# CONFIG_IPV6_MIP6 is not set +# CONFIG_IPV6_ILA is not set +# CONFIG_IPV6_VTI is not set +# CONFIG_IPV6_SIT is not set +# CONFIG_IPV6_TUNNEL is not set +CONFIG_IPV6_MULTIPLE_TABLES=y +CONFIG_IPV6_SUBTREES=y +CONFIG_IPV6_MROUTE=y +CONFIG_IPV6_MROUTE_MULTIPLE_TABLES=y +CONFIG_IPV6_PIMSM_V2=y +CONFIG_IPV6_SEG6_LWTUNNEL=y +CONFIG_IPV6_SEG6_HMAC=y +CONFIG_IPV6_SEG6_BPF=y +# CONFIG_IPV6_RPL_LWTUNNEL is not set +# CONFIG_IPV6_IOAM6_LWTUNNEL is not set +CONFIG_NETLABEL=y +CONFIG_MPTCP=y +CONFIG_INET_MPTCP_DIAG=y +CONFIG_MPTCP_IPV6=y +CONFIG_NETWORK_SECMARK=y +CONFIG_NET_PTP_CLASSIFY=y +CONFIG_NETWORK_PHY_TIMESTAMPING=y +CONFIG_NETFILTER=y +CONFIG_NETFILTER_ADVANCED=y +CONFIG_BRIDGE_NETFILTER=y + +# +# Core Netfilter Configuration +# +CONFIG_NETFILTER_INGRESS=y +# CONFIG_NETFILTER_EGRESS is not set +CONFIG_NETFILTER_NETLINK=y +CONFIG_NETFILTER_FAMILY_BRIDGE=y +# CONFIG_NETFILTER_NETLINK_HOOK is not set +# CONFIG_NETFILTER_NETLINK_ACCT is not set +# CONFIG_NETFILTER_NETLINK_QUEUE is not set +# CONFIG_NETFILTER_NETLINK_LOG is not set +# CONFIG_NETFILTER_NETLINK_OSF is not set +CONFIG_NF_CONNTRACK=y +CONFIG_NF_LOG_SYSLOG=y +CONFIG_NF_CONNTRACK_MARK=y +CONFIG_NF_CONNTRACK_SECMARK=y 
+CONFIG_NF_CONNTRACK_ZONES=y +CONFIG_NF_CONNTRACK_PROCFS=y +CONFIG_NF_CONNTRACK_EVENTS=y +CONFIG_NF_CONNTRACK_TIMEOUT=y +CONFIG_NF_CONNTRACK_TIMESTAMP=y +CONFIG_NF_CONNTRACK_LABELS=y +CONFIG_NF_CT_PROTO_DCCP=y +CONFIG_NF_CT_PROTO_SCTP=y +CONFIG_NF_CT_PROTO_UDPLITE=y +# CONFIG_NF_CONNTRACK_AMANDA is not set +# CONFIG_NF_CONNTRACK_FTP is not set +# CONFIG_NF_CONNTRACK_H323 is not set +# CONFIG_NF_CONNTRACK_IRC is not set +# CONFIG_NF_CONNTRACK_NETBIOS_NS is not set +# CONFIG_NF_CONNTRACK_SNMP is not set +# CONFIG_NF_CONNTRACK_PPTP is not set +# CONFIG_NF_CONNTRACK_SANE is not set +# CONFIG_NF_CONNTRACK_SIP is not set +# CONFIG_NF_CONNTRACK_TFTP is not set +# CONFIG_NF_CT_NETLINK is not set +# CONFIG_NF_CT_NETLINK_TIMEOUT is not set +CONFIG_NF_NAT=y +CONFIG_NF_NAT_REDIRECT=y +CONFIG_NF_NAT_MASQUERADE=y +CONFIG_NETFILTER_SYNPROXY=y +CONFIG_NF_TABLES=y +# CONFIG_NF_TABLES_INET is not set +# CONFIG_NF_TABLES_NETDEV is not set +# CONFIG_NFT_NUMGEN is not set +CONFIG_NFT_CT=y +# CONFIG_NFT_CONNLIMIT is not set +# CONFIG_NFT_LOG is not set +# CONFIG_NFT_LIMIT is not set +# CONFIG_NFT_MASQ is not set +# CONFIG_NFT_REDIR is not set +CONFIG_NFT_NAT=y +# CONFIG_NFT_TUNNEL is not set +# CONFIG_NFT_OBJREF is not set +# CONFIG_NFT_QUOTA is not set +# CONFIG_NFT_REJECT is not set +CONFIG_NFT_COMPAT=y +# CONFIG_NFT_HASH is not set +# CONFIG_NFT_XFRM is not set +# CONFIG_NFT_SOCKET is not set +# CONFIG_NFT_OSF is not set +# CONFIG_NFT_TPROXY is not set +# CONFIG_NFT_SYNPROXY is not set +# CONFIG_NF_FLOW_TABLE is not set +CONFIG_NETFILTER_XTABLES=y +CONFIG_NETFILTER_XTABLES_COMPAT=y + +# +# Xtables combined modules +# +# CONFIG_NETFILTER_XT_MARK is not set +# CONFIG_NETFILTER_XT_CONNMARK is not set + +# +# Xtables targets +# +# CONFIG_NETFILTER_XT_TARGET_AUDIT is not set +# CONFIG_NETFILTER_XT_TARGET_CHECKSUM is not set +# CONFIG_NETFILTER_XT_TARGET_CLASSIFY is not set +# CONFIG_NETFILTER_XT_TARGET_CONNMARK is not set +# CONFIG_NETFILTER_XT_TARGET_CONNSECMARK is not set +# 
CONFIG_NETFILTER_XT_TARGET_DSCP is not set +# CONFIG_NETFILTER_XT_TARGET_HL is not set +# CONFIG_NETFILTER_XT_TARGET_HMARK is not set +# CONFIG_NETFILTER_XT_TARGET_IDLETIMER is not set +# CONFIG_NETFILTER_XT_TARGET_LOG is not set +# CONFIG_NETFILTER_XT_TARGET_MARK is not set +CONFIG_NETFILTER_XT_NAT=y +CONFIG_NETFILTER_XT_TARGET_NETMAP=y +# CONFIG_NETFILTER_XT_TARGET_NFLOG is not set +# CONFIG_NETFILTER_XT_TARGET_NFQUEUE is not set +# CONFIG_NETFILTER_XT_TARGET_RATEEST is not set +CONFIG_NETFILTER_XT_TARGET_REDIRECT=y +CONFIG_NETFILTER_XT_TARGET_MASQUERADE=y +# CONFIG_NETFILTER_XT_TARGET_TEE is not set +# CONFIG_NETFILTER_XT_TARGET_TPROXY is not set +# CONFIG_NETFILTER_XT_TARGET_SECMARK is not set +# CONFIG_NETFILTER_XT_TARGET_TCPMSS is not set +# CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP is not set + +# +# Xtables matches +# +CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=y +# CONFIG_NETFILTER_XT_MATCH_BPF is not set +# CONFIG_NETFILTER_XT_MATCH_CGROUP is not set +# CONFIG_NETFILTER_XT_MATCH_CLUSTER is not set +# CONFIG_NETFILTER_XT_MATCH_COMMENT is not set +# CONFIG_NETFILTER_XT_MATCH_CONNBYTES is not set +# CONFIG_NETFILTER_XT_MATCH_CONNLABEL is not set +# CONFIG_NETFILTER_XT_MATCH_CONNLIMIT is not set +# CONFIG_NETFILTER_XT_MATCH_CONNMARK is not set +CONFIG_NETFILTER_XT_MATCH_CONNTRACK=y +# CONFIG_NETFILTER_XT_MATCH_CPU is not set +# CONFIG_NETFILTER_XT_MATCH_DCCP is not set +# CONFIG_NETFILTER_XT_MATCH_DEVGROUP is not set +# CONFIG_NETFILTER_XT_MATCH_DSCP is not set +# CONFIG_NETFILTER_XT_MATCH_ECN is not set +# CONFIG_NETFILTER_XT_MATCH_ESP is not set +# CONFIG_NETFILTER_XT_MATCH_HASHLIMIT is not set +# CONFIG_NETFILTER_XT_MATCH_HELPER is not set +# CONFIG_NETFILTER_XT_MATCH_HL is not set +# CONFIG_NETFILTER_XT_MATCH_IPCOMP is not set +# CONFIG_NETFILTER_XT_MATCH_IPRANGE is not set +# CONFIG_NETFILTER_XT_MATCH_L2TP is not set +# CONFIG_NETFILTER_XT_MATCH_LENGTH is not set +# CONFIG_NETFILTER_XT_MATCH_LIMIT is not set +# CONFIG_NETFILTER_XT_MATCH_MAC is not set +# 
CONFIG_NETFILTER_XT_MATCH_MARK is not set +# CONFIG_NETFILTER_XT_MATCH_MULTIPORT is not set +# CONFIG_NETFILTER_XT_MATCH_NFACCT is not set +# CONFIG_NETFILTER_XT_MATCH_OSF is not set +# CONFIG_NETFILTER_XT_MATCH_OWNER is not set +# CONFIG_NETFILTER_XT_MATCH_POLICY is not set +# CONFIG_NETFILTER_XT_MATCH_PHYSDEV is not set +# CONFIG_NETFILTER_XT_MATCH_PKTTYPE is not set +# CONFIG_NETFILTER_XT_MATCH_QUOTA is not set +# CONFIG_NETFILTER_XT_MATCH_RATEEST is not set +# CONFIG_NETFILTER_XT_MATCH_REALM is not set +# CONFIG_NETFILTER_XT_MATCH_RECENT is not set +# CONFIG_NETFILTER_XT_MATCH_SCTP is not set +# CONFIG_NETFILTER_XT_MATCH_SOCKET is not set +# CONFIG_NETFILTER_XT_MATCH_STATE is not set +# CONFIG_NETFILTER_XT_MATCH_STATISTIC is not set +# CONFIG_NETFILTER_XT_MATCH_STRING is not set +# CONFIG_NETFILTER_XT_MATCH_TCPMSS is not set +# CONFIG_NETFILTER_XT_MATCH_TIME is not set +# CONFIG_NETFILTER_XT_MATCH_U32 is not set +# end of Core Netfilter Configuration + +# CONFIG_IP_SET is not set +# CONFIG_IP_VS is not set + +# +# IP: Netfilter Configuration +# +CONFIG_NF_DEFRAG_IPV4=y +# CONFIG_NF_SOCKET_IPV4 is not set +# CONFIG_NF_TPROXY_IPV4 is not set +CONFIG_NF_TABLES_IPV4=y +CONFIG_NFT_DUP_IPV4=y +# CONFIG_NFT_FIB_IPV4 is not set +# CONFIG_NF_TABLES_ARP is not set +CONFIG_NF_DUP_IPV4=y +# CONFIG_NF_LOG_ARP is not set +# CONFIG_NF_LOG_IPV4 is not set +CONFIG_NF_REJECT_IPV4=y +CONFIG_IP_NF_IPTABLES=y +# CONFIG_IP_NF_MATCH_AH is not set +# CONFIG_IP_NF_MATCH_ECN is not set +# CONFIG_IP_NF_MATCH_RPFILTER is not set +# CONFIG_IP_NF_MATCH_TTL is not set +CONFIG_IP_NF_FILTER=y +CONFIG_IP_NF_TARGET_REJECT=y +CONFIG_IP_NF_TARGET_SYNPROXY=y +CONFIG_IP_NF_NAT=y +CONFIG_IP_NF_TARGET_MASQUERADE=y +CONFIG_IP_NF_TARGET_NETMAP=y +CONFIG_IP_NF_TARGET_REDIRECT=y +CONFIG_IP_NF_MANGLE=y +# CONFIG_IP_NF_TARGET_CLUSTERIP is not set +# CONFIG_IP_NF_TARGET_ECN is not set +# CONFIG_IP_NF_TARGET_TTL is not set +# CONFIG_IP_NF_RAW is not set +# CONFIG_IP_NF_SECURITY is not set +# 
CONFIG_IP_NF_ARPTABLES is not set +# end of IP: Netfilter Configuration + +# +# IPv6: Netfilter Configuration +# +# CONFIG_NF_SOCKET_IPV6 is not set +# CONFIG_NF_TPROXY_IPV6 is not set +# CONFIG_NF_TABLES_IPV6 is not set +# CONFIG_NF_DUP_IPV6 is not set +CONFIG_NF_REJECT_IPV6=y +CONFIG_NF_LOG_IPV6=y +CONFIG_IP6_NF_IPTABLES=y +# CONFIG_IP6_NF_MATCH_AH is not set +# CONFIG_IP6_NF_MATCH_EUI64 is not set +# CONFIG_IP6_NF_MATCH_FRAG is not set +# CONFIG_IP6_NF_MATCH_OPTS is not set +# CONFIG_IP6_NF_MATCH_HL is not set +# CONFIG_IP6_NF_MATCH_IPV6HEADER is not set +# CONFIG_IP6_NF_MATCH_MH is not set +# CONFIG_IP6_NF_MATCH_RPFILTER is not set +# CONFIG_IP6_NF_MATCH_RT is not set +# CONFIG_IP6_NF_MATCH_SRH is not set +# CONFIG_IP6_NF_TARGET_HL is not set +CONFIG_IP6_NF_FILTER=y +CONFIG_IP6_NF_TARGET_REJECT=y +CONFIG_IP6_NF_TARGET_SYNPROXY=y +CONFIG_IP6_NF_MANGLE=y +# CONFIG_IP6_NF_RAW is not set +# CONFIG_IP6_NF_SECURITY is not set +CONFIG_IP6_NF_NAT=y +CONFIG_IP6_NF_TARGET_MASQUERADE=y +# CONFIG_IP6_NF_TARGET_NPT is not set +# end of IPv6: Netfilter Configuration + +CONFIG_NF_DEFRAG_IPV6=y +# CONFIG_NF_TABLES_BRIDGE is not set +# CONFIG_NF_CONNTRACK_BRIDGE is not set +# CONFIG_BRIDGE_NF_EBTABLES is not set +CONFIG_BPFILTER=y +CONFIG_BPFILTER_UMH=y +# CONFIG_IP_DCCP is not set +# CONFIG_IP_SCTP is not set +# CONFIG_RDS is not set +# CONFIG_TIPC is not set +# CONFIG_ATM is not set +# CONFIG_L2TP is not set +CONFIG_STP=y +CONFIG_BRIDGE=y +CONFIG_BRIDGE_IGMP_SNOOPING=y +# CONFIG_BRIDGE_MRP is not set +# CONFIG_BRIDGE_CFM is not set +# CONFIG_NET_DSA is not set +# CONFIG_VLAN_8021Q is not set +CONFIG_LLC=y +# CONFIG_LLC2 is not set +# CONFIG_ATALK is not set +# CONFIG_X25 is not set +# CONFIG_LAPB is not set +# CONFIG_PHONET is not set +# CONFIG_6LOWPAN is not set +# CONFIG_IEEE802154 is not set +CONFIG_NET_SCHED=y + +# +# Queueing/Scheduling +# +# CONFIG_NET_SCH_HTB is not set +# CONFIG_NET_SCH_HFSC is not set +# CONFIG_NET_SCH_PRIO is not set +# CONFIG_NET_SCH_MULTIQ is not 
set +# CONFIG_NET_SCH_RED is not set +# CONFIG_NET_SCH_SFB is not set +# CONFIG_NET_SCH_SFQ is not set +# CONFIG_NET_SCH_TEQL is not set +# CONFIG_NET_SCH_TBF is not set +# CONFIG_NET_SCH_CBS is not set +# CONFIG_NET_SCH_ETF is not set +# CONFIG_NET_SCH_TAPRIO is not set +# CONFIG_NET_SCH_GRED is not set +# CONFIG_NET_SCH_NETEM is not set +# CONFIG_NET_SCH_DRR is not set +# CONFIG_NET_SCH_MQPRIO is not set +# CONFIG_NET_SCH_SKBPRIO is not set +# CONFIG_NET_SCH_CHOKE is not set +# CONFIG_NET_SCH_QFQ is not set +# CONFIG_NET_SCH_CODEL is not set +# CONFIG_NET_SCH_FQ_CODEL is not set +# CONFIG_NET_SCH_CAKE is not set +# CONFIG_NET_SCH_FQ is not set +# CONFIG_NET_SCH_HHF is not set +# CONFIG_NET_SCH_PIE is not set +# CONFIG_NET_SCH_INGRESS is not set +# CONFIG_NET_SCH_PLUG is not set +# CONFIG_NET_SCH_ETS is not set +# CONFIG_NET_SCH_DEFAULT is not set + +# +# Classification +# +CONFIG_NET_CLS=y +# CONFIG_NET_CLS_BASIC is not set +# CONFIG_NET_CLS_ROUTE4 is not set +# CONFIG_NET_CLS_FW is not set +# CONFIG_NET_CLS_U32 is not set +# CONFIG_NET_CLS_FLOW is not set +# CONFIG_NET_CLS_CGROUP is not set +# CONFIG_NET_CLS_BPF is not set +# CONFIG_NET_CLS_FLOWER is not set +# CONFIG_NET_CLS_MATCHALL is not set +CONFIG_NET_EMATCH=y +CONFIG_NET_EMATCH_STACK=32 +# CONFIG_NET_EMATCH_CMP is not set +# CONFIG_NET_EMATCH_NBYTE is not set +# CONFIG_NET_EMATCH_U32 is not set +# CONFIG_NET_EMATCH_META is not set +# CONFIG_NET_EMATCH_TEXT is not set +# CONFIG_NET_EMATCH_IPT is not set +CONFIG_NET_CLS_ACT=y +# CONFIG_NET_ACT_POLICE is not set +# CONFIG_NET_ACT_GACT is not set +# CONFIG_NET_ACT_MIRRED is not set +# CONFIG_NET_ACT_SAMPLE is not set +# CONFIG_NET_ACT_IPT is not set +# CONFIG_NET_ACT_NAT is not set +# CONFIG_NET_ACT_PEDIT is not set +# CONFIG_NET_ACT_SIMP is not set +# CONFIG_NET_ACT_SKBEDIT is not set +# CONFIG_NET_ACT_CSUM is not set +# CONFIG_NET_ACT_MPLS is not set +# CONFIG_NET_ACT_VLAN is not set +# CONFIG_NET_ACT_BPF is not set +# CONFIG_NET_ACT_CONNMARK is not set +# 
CONFIG_NET_ACT_CTINFO is not set +# CONFIG_NET_ACT_SKBMOD is not set +# CONFIG_NET_ACT_IFE is not set +# CONFIG_NET_ACT_TUNNEL_KEY is not set +# CONFIG_NET_ACT_GATE is not set +# CONFIG_NET_TC_SKB_EXT is not set +CONFIG_NET_SCH_FIFO=y +CONFIG_DCB=y +CONFIG_DNS_RESOLVER=y +# CONFIG_BATMAN_ADV is not set +# CONFIG_OPENVSWITCH is not set +CONFIG_VSOCKETS=y +# CONFIG_VSOCKETS_DIAG is not set +# CONFIG_VSOCKETS_LOOPBACK is not set +CONFIG_VIRTIO_VSOCKETS=y +CONFIG_VIRTIO_VSOCKETS_COMMON=y +# CONFIG_NETLINK_DIAG is not set +CONFIG_MPLS=y +# CONFIG_NET_MPLS_GSO is not set +# CONFIG_MPLS_ROUTING is not set +# CONFIG_NET_NSH is not set +# CONFIG_HSR is not set +# CONFIG_NET_SWITCHDEV is not set +CONFIG_NET_L3_MASTER_DEV=y +# CONFIG_QRTR is not set +# CONFIG_NET_NCSI is not set +CONFIG_PCPU_DEV_REFCNT=y +CONFIG_RPS=y +CONFIG_RFS_ACCEL=y +CONFIG_SOCK_RX_QUEUE_MAPPING=y +CONFIG_XPS=y +CONFIG_CGROUP_NET_PRIO=y +CONFIG_CGROUP_NET_CLASSID=y +CONFIG_NET_RX_BUSY_POLL=y +CONFIG_BQL=y +CONFIG_BPF_STREAM_PARSER=y +CONFIG_NET_FLOW_LIMIT=y + +# +# Network testing +# +# CONFIG_NET_PKTGEN is not set +# end of Network testing +# end of Networking options + +# CONFIG_HAMRADIO is not set +# CONFIG_CAN is not set +# CONFIG_BT is not set +# CONFIG_AF_RXRPC is not set +# CONFIG_AF_KCM is not set +CONFIG_STREAM_PARSER=y +# CONFIG_MCTP is not set +CONFIG_FIB_RULES=y +# CONFIG_WIRELESS is not set +# CONFIG_RFKILL is not set +# CONFIG_NET_9P is not set +# CONFIG_CAIF is not set +# CONFIG_CEPH_LIB is not set +# CONFIG_NFC is not set +# CONFIG_PSAMPLE is not set +# CONFIG_NET_IFE is not set +CONFIG_LWTUNNEL=y +CONFIG_LWTUNNEL_BPF=y +CONFIG_DST_CACHE=y +CONFIG_GRO_CELLS=y +CONFIG_NET_SOCK_MSG=y +CONFIG_PAGE_POOL=y +# CONFIG_PAGE_POOL_STATS is not set +CONFIG_FAILOVER=y +CONFIG_ETHTOOL_NETLINK=y + +# +# Device Drivers +# +CONFIG_HAVE_PCI=y +CONFIG_PCI=y +CONFIG_PCI_DOMAINS=y +CONFIG_PCIEPORTBUS=y +# CONFIG_PCIEAER is not set +CONFIG_PCIEASPM=y +CONFIG_PCIEASPM_DEFAULT=y +# CONFIG_PCIEASPM_POWERSAVE is 
not set +# CONFIG_PCIEASPM_POWER_SUPERSAVE is not set +# CONFIG_PCIEASPM_PERFORMANCE is not set +CONFIG_PCIE_PME=y +# CONFIG_PCIE_PTM is not set +CONFIG_PCI_MSI=y +CONFIG_PCI_MSI_IRQ_DOMAIN=y +CONFIG_PCI_QUIRKS=y +# CONFIG_PCI_DEBUG is not set +# CONFIG_PCI_STUB is not set +CONFIG_PCI_LOCKLESS_CONFIG=y +# CONFIG_PCI_IOV is not set +# CONFIG_PCI_PRI is not set +# CONFIG_PCI_PASID is not set +# CONFIG_PCI_P2PDMA is not set +CONFIG_PCI_LABEL=y +CONFIG_VGA_ARB=y +CONFIG_VGA_ARB_MAX_GPUS=16 +# CONFIG_HOTPLUG_PCI is not set + +# +# PCI controller drivers +# +# CONFIG_VMD is not set + +# +# DesignWare PCI Core Support +# +# CONFIG_PCIE_DW_PLAT_HOST is not set +# CONFIG_PCI_MESON is not set +# end of DesignWare PCI Core Support + +# +# Mobiveil PCIe Core Support +# +# end of Mobiveil PCIe Core Support + +# +# Cadence PCIe controllers support +# +# end of Cadence PCIe controllers support +# end of PCI controller drivers + +# +# PCI Endpoint +# +# CONFIG_PCI_ENDPOINT is not set +# end of PCI Endpoint + +# +# PCI switch controller drivers +# +# CONFIG_PCI_SW_SWITCHTEC is not set +# end of PCI switch controller drivers + +# CONFIG_CXL_BUS is not set +# CONFIG_PCCARD is not set +# CONFIG_RAPIDIO is not set + +# +# Generic Driver Options +# +CONFIG_UEVENT_HELPER=y +CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug" +CONFIG_DEVTMPFS=y +CONFIG_DEVTMPFS_MOUNT=y +# CONFIG_DEVTMPFS_SAFE is not set +CONFIG_STANDALONE=y +CONFIG_PREVENT_FIRMWARE_BUILD=y + +# +# Firmware loader +# +CONFIG_FW_LOADER=y +CONFIG_FW_LOADER_PAGED_BUF=y +CONFIG_FW_LOADER_SYSFS=y +CONFIG_EXTRA_FIRMWARE="" +CONFIG_FW_LOADER_USER_HELPER=y +# CONFIG_FW_LOADER_USER_HELPER_FALLBACK is not set +# CONFIG_FW_LOADER_COMPRESS is not set +CONFIG_FW_CACHE=y +# CONFIG_FW_UPLOAD is not set +# end of Firmware loader + +CONFIG_ALLOW_DEV_COREDUMP=y +# CONFIG_DEBUG_DRIVER is not set +# CONFIG_DEBUG_DEVRES is not set +# CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set +CONFIG_GENERIC_CPU_AUTOPROBE=y +CONFIG_GENERIC_CPU_VULNERABILITIES=y 
+CONFIG_DMA_SHARED_BUFFER=y +# CONFIG_DMA_FENCE_TRACE is not set +# end of Generic Driver Options + +# +# Bus devices +# +# CONFIG_MHI_BUS is not set +# CONFIG_MHI_BUS_EP is not set +# end of Bus devices + +CONFIG_CONNECTOR=y +CONFIG_PROC_EVENTS=y + +# +# Firmware Drivers +# + +# +# ARM System Control and Management Interface Protocol +# +# end of ARM System Control and Management Interface Protocol + +# CONFIG_EDD is not set +CONFIG_FIRMWARE_MEMMAP=y +CONFIG_DMIID=y +# CONFIG_DMI_SYSFS is not set +CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y +# CONFIG_ISCSI_IBFT is not set +# CONFIG_FW_CFG_SYSFS is not set +# CONFIG_SYSFB_SIMPLEFB is not set +# CONFIG_GOOGLE_FIRMWARE is not set + +# +# Tegra firmware driver +# +# end of Tegra firmware driver +# end of Firmware Drivers + +# CONFIG_GNSS is not set +# CONFIG_MTD is not set +# CONFIG_OF is not set +CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y +# CONFIG_PARPORT is not set +CONFIG_PNP=y +CONFIG_PNP_DEBUG_MESSAGES=y + +# +# Protocols +# +CONFIG_PNPACPI=y +CONFIG_BLK_DEV=y +# CONFIG_BLK_DEV_NULL_BLK is not set +# CONFIG_BLK_DEV_FD is not set +# CONFIG_BLK_DEV_PCIESSD_MTIP32XX is not set +# CONFIG_ZRAM is not set +CONFIG_BLK_DEV_LOOP=y +CONFIG_BLK_DEV_LOOP_MIN_COUNT=8 +# CONFIG_BLK_DEV_DRBD is not set +# CONFIG_BLK_DEV_NBD is not set +# CONFIG_BLK_DEV_RAM is not set +# CONFIG_CDROM_PKTCDVD is not set +# CONFIG_ATA_OVER_ETH is not set +CONFIG_VIRTIO_BLK=y +# CONFIG_BLK_DEV_RBD is not set +# CONFIG_BLK_DEV_UBLK is not set + +# +# NVME Support +# +# CONFIG_BLK_DEV_NVME is not set +# CONFIG_NVME_FC is not set +# CONFIG_NVME_TCP is not set +# end of NVME Support + +# +# Misc devices +# +# CONFIG_DUMMY_IRQ is not set +# CONFIG_IBM_ASM is not set +# CONFIG_PHANTOM is not set +# CONFIG_TIFM_CORE is not set +# CONFIG_ENCLOSURE_SERVICES is not set +# CONFIG_HP_ILO is not set +# CONFIG_SRAM is not set +# CONFIG_DW_XDATA_PCIE is not set +# CONFIG_PCI_ENDPOINT_TEST is not set +# CONFIG_XILINX_SDFEC is not set +CONFIG_SYSGENID=y +# CONFIG_C2PORT 
is not set + +# +# EEPROM support +# +# CONFIG_EEPROM_93CX6 is not set +# end of EEPROM support + +# CONFIG_CB710_CORE is not set + +# +# Texas Instruments shared transport line discipline +# +# end of Texas Instruments shared transport line discipline + +# +# Altera FPGA firmware download module (requires I2C) +# +# CONFIG_INTEL_MEI is not set +# CONFIG_INTEL_MEI_ME is not set +# CONFIG_INTEL_MEI_TXE is not set +# CONFIG_VMWARE_VMCI is not set +# CONFIG_GENWQE is not set +# CONFIG_ECHO is not set +# CONFIG_BCM_VK is not set +# CONFIG_MISC_ALCOR_PCI is not set +# CONFIG_MISC_RTSX_PCI is not set +# CONFIG_HABANA_AI is not set +# CONFIG_UACCE is not set +# CONFIG_PVPANIC is not set +# end of Misc devices + +# +# SCSI device support +# +CONFIG_SCSI_MOD=y +# CONFIG_RAID_ATTRS is not set +CONFIG_SCSI_COMMON=y +CONFIG_SCSI=y +CONFIG_SCSI_DMA=y +CONFIG_SCSI_PROC_FS=y + +# +# SCSI support type (disk, tape, CD-ROM) +# +# CONFIG_BLK_DEV_SD is not set +# CONFIG_CHR_DEV_ST is not set +# CONFIG_BLK_DEV_SR is not set +# CONFIG_CHR_DEV_SG is not set +CONFIG_BLK_DEV_BSG=y +# CONFIG_CHR_DEV_SCH is not set +# CONFIG_SCSI_CONSTANTS is not set +# CONFIG_SCSI_LOGGING is not set +# CONFIG_SCSI_SCAN_ASYNC is not set + +# +# SCSI Transports +# +# CONFIG_SCSI_SPI_ATTRS is not set +# CONFIG_SCSI_FC_ATTRS is not set +CONFIG_SCSI_ISCSI_ATTRS=y +# CONFIG_SCSI_SAS_ATTRS is not set +# CONFIG_SCSI_SAS_LIBSAS is not set +# CONFIG_SCSI_SRP_ATTRS is not set +# end of SCSI Transports + +CONFIG_SCSI_LOWLEVEL=y +CONFIG_ISCSI_TCP=y +# CONFIG_ISCSI_BOOT_SYSFS is not set +# CONFIG_SCSI_CXGB3_ISCSI is not set +# CONFIG_SCSI_BNX2_ISCSI is not set +# CONFIG_BE2ISCSI is not set +# CONFIG_BLK_DEV_3W_XXXX_RAID is not set +# CONFIG_SCSI_HPSA is not set +# CONFIG_SCSI_3W_9XXX is not set +# CONFIG_SCSI_3W_SAS is not set +# CONFIG_SCSI_ACARD is not set +# CONFIG_SCSI_AACRAID is not set +# CONFIG_SCSI_AIC7XXX is not set +# CONFIG_SCSI_AIC79XX is not set +# CONFIG_SCSI_AIC94XX is not set +# CONFIG_SCSI_MVSAS is not 
set +# CONFIG_SCSI_MVUMI is not set +# CONFIG_SCSI_ADVANSYS is not set +# CONFIG_SCSI_ARCMSR is not set +# CONFIG_SCSI_ESAS2R is not set +# CONFIG_MEGARAID_NEWGEN is not set +# CONFIG_MEGARAID_LEGACY is not set +# CONFIG_MEGARAID_SAS is not set +# CONFIG_SCSI_MPT3SAS is not set +# CONFIG_SCSI_MPT2SAS is not set +# CONFIG_SCSI_MPI3MR is not set +# CONFIG_SCSI_SMARTPQI is not set +# CONFIG_SCSI_HPTIOP is not set +# CONFIG_SCSI_BUSLOGIC is not set +# CONFIG_SCSI_MYRB is not set +# CONFIG_SCSI_MYRS is not set +# CONFIG_VMWARE_PVSCSI is not set +# CONFIG_SCSI_SNIC is not set +# CONFIG_SCSI_DMX3191D is not set +# CONFIG_SCSI_FDOMAIN_PCI is not set +# CONFIG_SCSI_ISCI is not set +# CONFIG_SCSI_IPS is not set +# CONFIG_SCSI_INITIO is not set +# CONFIG_SCSI_INIA100 is not set +# CONFIG_SCSI_STEX is not set +# CONFIG_SCSI_SYM53C8XX_2 is not set +# CONFIG_SCSI_QLOGIC_1280 is not set +# CONFIG_SCSI_QLA_ISCSI is not set +# CONFIG_SCSI_DC395x is not set +# CONFIG_SCSI_AM53C974 is not set +# CONFIG_SCSI_WD719X is not set +# CONFIG_SCSI_DEBUG is not set +# CONFIG_SCSI_PMCRAID is not set +# CONFIG_SCSI_PM8001 is not set +# CONFIG_SCSI_VIRTIO is not set +# CONFIG_SCSI_DH is not set +# end of SCSI device support + +# CONFIG_ATA is not set +# CONFIG_MD is not set +# CONFIG_TARGET_CORE is not set +# CONFIG_FUSION is not set + +# +# IEEE 1394 (FireWire) support +# +# CONFIG_FIREWIRE is not set +# CONFIG_FIREWIRE_NOSY is not set +# end of IEEE 1394 (FireWire) support + +# CONFIG_MACINTOSH_DRIVERS is not set +CONFIG_NETDEVICES=y +CONFIG_NET_CORE=y +# CONFIG_BONDING is not set +# CONFIG_DUMMY is not set +# CONFIG_WIREGUARD is not set +# CONFIG_EQUALIZER is not set +# CONFIG_NET_FC is not set +# CONFIG_NET_TEAM is not set +# CONFIG_MACVLAN is not set +# CONFIG_IPVLAN is not set +# CONFIG_VXLAN is not set +# CONFIG_GENEVE is not set +# CONFIG_BAREUDP is not set +# CONFIG_GTP is not set +# CONFIG_AMT is not set +# CONFIG_MACSEC is not set +# CONFIG_NETCONSOLE is not set +# CONFIG_TUN is not 
set +# CONFIG_TUN_VNET_CROSS_LE is not set +CONFIG_VETH=y +CONFIG_VIRTIO_NET=y +# CONFIG_NLMON is not set +# CONFIG_NET_VRF is not set +# CONFIG_ARCNET is not set +# CONFIG_ETHERNET is not set +# CONFIG_FDDI is not set +# CONFIG_HIPPI is not set +# CONFIG_NET_SB1000 is not set +# CONFIG_PHYLIB is not set +# CONFIG_PSE_CONTROLLER is not set +# CONFIG_MDIO_DEVICE is not set + +# +# PCS device drivers +# +# end of PCS device drivers + +# CONFIG_PPP is not set +# CONFIG_SLIP is not set + +# +# Host-side USB support is needed for USB Network Adapter support +# +# CONFIG_WLAN is not set +# CONFIG_WAN is not set + +# +# Wireless WAN +# +# CONFIG_WWAN is not set +# end of Wireless WAN + +# CONFIG_VMXNET3 is not set +# CONFIG_FUJITSU_ES is not set +# CONFIG_NETDEVSIM is not set +CONFIG_NET_FAILOVER=y +# CONFIG_ISDN is not set + +# +# Input device support +# +CONFIG_INPUT=y +CONFIG_INPUT_FF_MEMLESS=y +# CONFIG_INPUT_SPARSEKMAP is not set +# CONFIG_INPUT_MATRIXKMAP is not set + +# +# Userland interfaces +# +# CONFIG_INPUT_MOUSEDEV is not set +# CONFIG_INPUT_JOYDEV is not set +CONFIG_INPUT_EVDEV=y +# CONFIG_INPUT_EVBUG is not set + +# +# Input Device Drivers +# +# CONFIG_INPUT_KEYBOARD is not set +# CONFIG_INPUT_MOUSE is not set +# CONFIG_INPUT_JOYSTICK is not set +# CONFIG_INPUT_TABLET is not set +# CONFIG_INPUT_TOUCHSCREEN is not set +CONFIG_INPUT_MISC=y +# CONFIG_INPUT_AD714X is not set +# CONFIG_INPUT_E3X0_BUTTON is not set +# CONFIG_INPUT_PCSPKR is not set +# CONFIG_INPUT_ATLAS_BTNS is not set +# CONFIG_INPUT_ATI_REMOTE2 is not set +# CONFIG_INPUT_KEYSPAN_REMOTE is not set +# CONFIG_INPUT_POWERMATE is not set +# CONFIG_INPUT_YEALINK is not set +# CONFIG_INPUT_CM109 is not set +# CONFIG_INPUT_UINPUT is not set +# CONFIG_INPUT_ADXL34X is not set +# CONFIG_INPUT_CMA3000 is not set +# CONFIG_RMI4_CORE is not set + +# +# Hardware I/O ports +# +# CONFIG_SERIO is not set +CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y +# CONFIG_GAMEPORT is not set +# end of Hardware I/O ports +# end of Input 
device support + +# +# Character devices +# +CONFIG_TTY=y +CONFIG_VT=y +CONFIG_CONSOLE_TRANSLATIONS=y +CONFIG_VT_CONSOLE=y +CONFIG_VT_CONSOLE_SLEEP=y +CONFIG_HW_CONSOLE=y +CONFIG_VT_HW_CONSOLE_BINDING=y +CONFIG_UNIX98_PTYS=y +# CONFIG_LEGACY_PTYS is not set +CONFIG_LDISC_AUTOLOAD=y + +# +# Serial drivers +# +CONFIG_SERIAL_EARLYCON=y +CONFIG_SERIAL_8250=y +# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set +CONFIG_SERIAL_8250_PNP=y +# CONFIG_SERIAL_8250_16550A_VARIANTS is not set +# CONFIG_SERIAL_8250_FINTEK is not set +CONFIG_SERIAL_8250_CONSOLE=y +CONFIG_SERIAL_8250_DMA=y +CONFIG_SERIAL_8250_PCI=y +CONFIG_SERIAL_8250_EXAR=y +CONFIG_SERIAL_8250_NR_UARTS=1 +CONFIG_SERIAL_8250_RUNTIME_UARTS=1 +# CONFIG_SERIAL_8250_EXTENDED is not set +CONFIG_SERIAL_8250_DWLIB=y +# CONFIG_SERIAL_8250_DW is not set +# CONFIG_SERIAL_8250_RT288X is not set +CONFIG_SERIAL_8250_LPSS=y +CONFIG_SERIAL_8250_MID=y +CONFIG_SERIAL_8250_PERICOM=y + +# +# Non-8250 serial port support +# +# CONFIG_SERIAL_UARTLITE is not set +CONFIG_SERIAL_CORE=y +CONFIG_SERIAL_CORE_CONSOLE=y +# CONFIG_SERIAL_JSM is not set +# CONFIG_SERIAL_LANTIQ is not set +# CONFIG_SERIAL_SCCNXP is not set +# CONFIG_SERIAL_ALTERA_JTAGUART is not set +# CONFIG_SERIAL_ALTERA_UART is not set +# CONFIG_SERIAL_ARC is not set +# CONFIG_SERIAL_RP2 is not set +# CONFIG_SERIAL_FSL_LPUART is not set +# CONFIG_SERIAL_FSL_LINFLEXUART is not set +# CONFIG_SERIAL_SPRD is not set +# end of Serial drivers + +# CONFIG_SERIAL_NONSTANDARD is not set +# CONFIG_N_GSM is not set +# CONFIG_NOZOMI is not set +# CONFIG_NULL_TTY is not set +CONFIG_HVC_DRIVER=y +CONFIG_SERIAL_DEV_BUS=y +CONFIG_SERIAL_DEV_CTRL_TTYPORT=y +CONFIG_VIRTIO_CONSOLE=y +# CONFIG_IPMI_HANDLER is not set +CONFIG_HW_RANDOM=y +# CONFIG_HW_RANDOM_TIMERIOMEM is not set +CONFIG_HW_RANDOM_INTEL=y +CONFIG_HW_RANDOM_AMD=y +# CONFIG_HW_RANDOM_BA431 is not set +# CONFIG_HW_RANDOM_VIA is not set +CONFIG_HW_RANDOM_VIRTIO=y +# CONFIG_HW_RANDOM_XIPHERA is not set +# CONFIG_APPLICOM is not set +# 
CONFIG_MWAVE is not set +CONFIG_DEVMEM=y +# CONFIG_NVRAM is not set +CONFIG_DEVPORT=y +# CONFIG_HPET is not set +# CONFIG_HANGCHECK_TIMER is not set +# CONFIG_TCG_TPM is not set +# CONFIG_TELCLOCK is not set +# CONFIG_XILLYBUS is not set +CONFIG_RANDOM_TRUST_CPU=y +CONFIG_RANDOM_TRUST_BOOTLOADER=y +# end of Character devices + +# +# I2C support +# +# CONFIG_I2C is not set +# end of I2C support + +# CONFIG_I3C is not set +# CONFIG_SPI is not set +# CONFIG_SPMI is not set +# CONFIG_HSI is not set +CONFIG_PPS=y +# CONFIG_PPS_DEBUG is not set + +# +# PPS clients support +# +# CONFIG_PPS_CLIENT_KTIMER is not set +# CONFIG_PPS_CLIENT_LDISC is not set +# CONFIG_PPS_CLIENT_GPIO is not set + +# +# PPS generators support +# + +# +# PTP clock support +# +CONFIG_PTP_1588_CLOCK=y +CONFIG_PTP_1588_CLOCK_OPTIONAL=y + +# +# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks. +# +CONFIG_PTP_1588_CLOCK_KVM=y +CONFIG_PTP_1588_CLOCK_VMCLOCK=y +# CONFIG_PTP_1588_CLOCK_VMW is not set +# end of PTP clock support + +# CONFIG_PINCTRL is not set +# CONFIG_GPIOLIB is not set +# CONFIG_W1 is not set +CONFIG_POWER_RESET=y +# CONFIG_POWER_RESET_RESTART is not set +CONFIG_POWER_SUPPLY=y +# CONFIG_POWER_SUPPLY_DEBUG is not set +# CONFIG_PDA_POWER is not set +# CONFIG_TEST_POWER is not set +# CONFIG_BATTERY_DS2780 is not set +# CONFIG_BATTERY_DS2781 is not set +# CONFIG_BATTERY_SAMSUNG_SDI is not set +# CONFIG_BATTERY_BQ27XXX is not set +# CONFIG_CHARGER_MAX8903 is not set +# CONFIG_BATTERY_GOLDFISH is not set +# CONFIG_HWMON is not set +CONFIG_THERMAL=y +# CONFIG_THERMAL_NETLINK is not set +# CONFIG_THERMAL_STATISTICS is not set +CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS=0 +CONFIG_THERMAL_WRITABLE_TRIPS=y +CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y +# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set +# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set +CONFIG_THERMAL_GOV_FAIR_SHARE=y +CONFIG_THERMAL_GOV_STEP_WISE=y +# CONFIG_THERMAL_GOV_BANG_BANG is not set 
+CONFIG_THERMAL_GOV_USER_SPACE=y +# CONFIG_THERMAL_EMULATION is not set + +# +# Intel thermal drivers +# +# CONFIG_INTEL_POWERCLAMP is not set +CONFIG_X86_THERMAL_VECTOR=y +CONFIG_X86_PKG_TEMP_THERMAL=y +# CONFIG_INTEL_SOC_DTS_THERMAL is not set + +# +# ACPI INT340X thermal drivers +# +# CONFIG_INT340X_THERMAL is not set +# end of ACPI INT340X thermal drivers + +# CONFIG_INTEL_PCH_THERMAL is not set +# CONFIG_INTEL_TCC_COOLING is not set +# CONFIG_INTEL_HFI_THERMAL is not set +# end of Intel thermal drivers + +CONFIG_WATCHDOG=y +CONFIG_WATCHDOG_CORE=y +# CONFIG_WATCHDOG_NOWAYOUT is not set +CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED=y +CONFIG_WATCHDOG_OPEN_TIMEOUT=0 +CONFIG_WATCHDOG_SYSFS=y +# CONFIG_WATCHDOG_HRTIMER_PRETIMEOUT is not set + +# +# Watchdog Pretimeout Governors +# +# CONFIG_WATCHDOG_PRETIMEOUT_GOV is not set + +# +# Watchdog Device Drivers +# +# CONFIG_SOFT_WATCHDOG is not set +# CONFIG_WDAT_WDT is not set +# CONFIG_XILINX_WATCHDOG is not set +# CONFIG_CADENCE_WATCHDOG is not set +# CONFIG_DW_WATCHDOG is not set +# CONFIG_MAX63XX_WATCHDOG is not set +# CONFIG_ACQUIRE_WDT is not set +# CONFIG_ADVANTECH_WDT is not set +# CONFIG_ALIM1535_WDT is not set +# CONFIG_ALIM7101_WDT is not set +# CONFIG_EBC_C384_WDT is not set +# CONFIG_EXAR_WDT is not set +# CONFIG_F71808E_WDT is not set +# CONFIG_SP5100_TCO is not set +# CONFIG_SBC_FITPC2_WATCHDOG is not set +# CONFIG_EUROTECH_WDT is not set +# CONFIG_IB700_WDT is not set +# CONFIG_IBMASR is not set +# CONFIG_WAFER_WDT is not set +# CONFIG_I6300ESB_WDT is not set +# CONFIG_IE6XX_WDT is not set +# CONFIG_ITCO_WDT is not set +# CONFIG_IT8712F_WDT is not set +# CONFIG_IT87_WDT is not set +# CONFIG_HP_WATCHDOG is not set +# CONFIG_SC1200_WDT is not set +# CONFIG_PC87413_WDT is not set +# CONFIG_NV_TCO is not set +# CONFIG_60XX_WDT is not set +# CONFIG_CPU5_WDT is not set +# CONFIG_SMSC_SCH311X_WDT is not set +# CONFIG_SMSC37B787_WDT is not set +# CONFIG_TQMX86_WDT is not set +# CONFIG_VIA_WDT is not set +# 
CONFIG_W83627HF_WDT is not set +# CONFIG_W83877F_WDT is not set +# CONFIG_W83977F_WDT is not set +# CONFIG_MACHZ_WDT is not set +# CONFIG_SBC_EPX_C3_WATCHDOG is not set +# CONFIG_NI903X_WDT is not set +# CONFIG_NIC7018_WDT is not set + +# +# PCI-based Watchdog Cards +# +# CONFIG_PCIPCWATCHDOG is not set +# CONFIG_WDTPCI is not set +CONFIG_SSB_POSSIBLE=y +# CONFIG_SSB is not set +CONFIG_BCMA_POSSIBLE=y +# CONFIG_BCMA is not set + +# +# Multifunction device drivers +# +# CONFIG_MFD_MADERA is not set +# CONFIG_HTC_PASIC3 is not set +# CONFIG_MFD_INTEL_QUARK_I2C_GPIO is not set +# CONFIG_LPC_ICH is not set +# CONFIG_LPC_SCH is not set +# CONFIG_MFD_INTEL_LPSS_ACPI is not set +# CONFIG_MFD_INTEL_LPSS_PCI is not set +# CONFIG_MFD_INTEL_PMC_BXT is not set +# CONFIG_MFD_JANZ_CMODIO is not set +# CONFIG_MFD_KEMPLD is not set +# CONFIG_MFD_MT6397 is not set +# CONFIG_MFD_RDC321X is not set +# CONFIG_MFD_SM501 is not set +# CONFIG_MFD_SYSCON is not set +# CONFIG_MFD_TQMX86 is not set +# CONFIG_MFD_VX855 is not set +# CONFIG_RAVE_SP_CORE is not set +# end of Multifunction device drivers + +# CONFIG_REGULATOR is not set +# CONFIG_RC_CORE is not set + +# +# CEC support +# +# CONFIG_MEDIA_CEC_SUPPORT is not set +# end of CEC support + +# CONFIG_MEDIA_SUPPORT is not set + +# +# Graphics support +# +# CONFIG_AGP is not set +# CONFIG_VGA_SWITCHEROO is not set +# CONFIG_DRM is not set + +# +# ARM devices +# +# end of ARM devices + +# +# Frame buffer Devices +# +# CONFIG_FB is not set +# end of Frame buffer Devices + +# +# Backlight & LCD device support +# +# CONFIG_LCD_CLASS_DEVICE is not set +# CONFIG_BACKLIGHT_CLASS_DEVICE is not set +# end of Backlight & LCD device support + +# +# Console display driver support +# +CONFIG_VGA_CONSOLE=y +CONFIG_DUMMY_CONSOLE=y +CONFIG_DUMMY_CONSOLE_COLUMNS=80 +CONFIG_DUMMY_CONSOLE_ROWS=25 +# end of Console display driver support +# end of Graphics support + +# CONFIG_SOUND is not set + +# +# HID support +# +CONFIG_HID=y +# 
CONFIG_HID_BATTERY_STRENGTH is not set +CONFIG_HIDRAW=y +# CONFIG_UHID is not set +# CONFIG_HID_GENERIC is not set + +# +# Special HID drivers +# +# CONFIG_HID_A4TECH is not set +# CONFIG_HID_ACRUX is not set +# CONFIG_HID_AUREAL is not set +# CONFIG_HID_BELKIN is not set +# CONFIG_HID_CHERRY is not set +# CONFIG_HID_COUGAR is not set +# CONFIG_HID_MACALLY is not set +# CONFIG_HID_CMEDIA is not set +# CONFIG_HID_CYPRESS is not set +# CONFIG_HID_DRAGONRISE is not set +# CONFIG_HID_EMS_FF is not set +# CONFIG_HID_ELECOM is not set +# CONFIG_HID_EZKEY is not set +# CONFIG_HID_GEMBIRD is not set +# CONFIG_HID_GFRM is not set +# CONFIG_HID_GLORIOUS is not set +# CONFIG_HID_VIVALDI is not set +# CONFIG_HID_KEYTOUCH is not set +# CONFIG_HID_KYE is not set +# CONFIG_HID_WALTOP is not set +# CONFIG_HID_VIEWSONIC is not set +# CONFIG_HID_VRC2 is not set +# CONFIG_HID_XIAOMI is not set +# CONFIG_HID_GYRATION is not set +# CONFIG_HID_ICADE is not set +# CONFIG_HID_ITE is not set +# CONFIG_HID_JABRA is not set +# CONFIG_HID_TWINHAN is not set +# CONFIG_HID_KENSINGTON is not set +# CONFIG_HID_LCPOWER is not set +# CONFIG_HID_LENOVO is not set +# CONFIG_HID_MAGICMOUSE is not set +# CONFIG_HID_MALTRON is not set +# CONFIG_HID_MAYFLASH is not set +# CONFIG_HID_REDRAGON is not set +# CONFIG_HID_MICROSOFT is not set +# CONFIG_HID_MONTEREY is not set +# CONFIG_HID_MULTITOUCH is not set +# CONFIG_HID_NTI is not set +# CONFIG_HID_ORTEK is not set +# CONFIG_HID_PANTHERLORD is not set +# CONFIG_HID_PETALYNX is not set +# CONFIG_HID_PICOLCD is not set +# CONFIG_HID_PLANTRONICS is not set +# CONFIG_HID_PXRC is not set +# CONFIG_HID_RAZER is not set +# CONFIG_HID_PRIMAX is not set +# CONFIG_HID_SAITEK is not set +# CONFIG_HID_SEMITEK is not set +# CONFIG_HID_SPEEDLINK is not set +# CONFIG_HID_STEAM is not set +# CONFIG_HID_STEELSERIES is not set +# CONFIG_HID_SUNPLUS is not set +# CONFIG_HID_RMI is not set +# CONFIG_HID_GREENASIA is not set +# CONFIG_HID_SMARTJOYPLUS is not set +# 
CONFIG_HID_TIVO is not set +# CONFIG_HID_TOPSEED is not set +# CONFIG_HID_TOPRE is not set +# CONFIG_HID_UDRAW_PS3 is not set +# CONFIG_HID_XINMO is not set +# CONFIG_HID_ZEROPLUS is not set +# CONFIG_HID_ZYDACRON is not set +# CONFIG_HID_SENSOR_HUB is not set +# CONFIG_HID_ALPS is not set +# end of Special HID drivers + +# +# Intel ISH HID support +# +# CONFIG_INTEL_ISH_HID is not set +# end of Intel ISH HID support + +# +# AMD SFH HID Support +# +# CONFIG_AMD_SFH_HID is not set +# end of AMD SFH HID Support +# end of HID support + +CONFIG_USB_OHCI_LITTLE_ENDIAN=y +CONFIG_USB_SUPPORT=y +# CONFIG_USB_ULPI_BUS is not set +CONFIG_USB_ARCH_HAS_HCD=y +# CONFIG_USB is not set +CONFIG_USB_PCI=y + +# +# USB port drivers +# + +# +# USB Physical Layer drivers +# +# CONFIG_NOP_USB_XCEIV is not set +# end of USB Physical Layer drivers + +# CONFIG_USB_GADGET is not set +# CONFIG_TYPEC is not set +# CONFIG_USB_ROLE_SWITCH is not set +# CONFIG_MMC is not set +# CONFIG_SCSI_UFSHCD is not set +# CONFIG_MEMSTICK is not set +# CONFIG_NEW_LEDS is not set +# CONFIG_ACCESSIBILITY is not set +# CONFIG_INFINIBAND is not set +CONFIG_EDAC_ATOMIC_SCRUB=y +CONFIG_EDAC_SUPPORT=y +# CONFIG_EDAC is not set +CONFIG_RTC_LIB=y +CONFIG_RTC_MC146818_LIB=y +# CONFIG_RTC_CLASS is not set +CONFIG_DMADEVICES=y +# CONFIG_DMADEVICES_DEBUG is not set + +# +# DMA Devices +# +CONFIG_DMA_ENGINE=y +CONFIG_DMA_VIRTUAL_CHANNELS=y +CONFIG_DMA_ACPI=y +# CONFIG_ALTERA_MSGDMA is not set +# CONFIG_INTEL_IDMA64 is not set +# CONFIG_INTEL_IDXD_COMPAT is not set +# CONFIG_INTEL_IOATDMA is not set +# CONFIG_PLX_DMA is not set +# CONFIG_AMD_PTDMA is not set +# CONFIG_QCOM_HIDMA_MGMT is not set +# CONFIG_QCOM_HIDMA is not set +CONFIG_DW_DMAC_CORE=y +# CONFIG_DW_DMAC is not set +# CONFIG_DW_DMAC_PCI is not set +# CONFIG_DW_EDMA is not set +# CONFIG_DW_EDMA_PCIE is not set +CONFIG_HSU_DMA=y +# CONFIG_SF_PDMA is not set +# CONFIG_INTEL_LDMA is not set + +# +# DMA Clients +# +# CONFIG_ASYNC_TX_DMA is not set +# CONFIG_DMATEST 
is not set + +# +# DMABUF options +# +CONFIG_SYNC_FILE=y +# CONFIG_SW_SYNC is not set +# CONFIG_UDMABUF is not set +# CONFIG_DMABUF_MOVE_NOTIFY is not set +# CONFIG_DMABUF_DEBUG is not set +# CONFIG_DMABUF_SELFTESTS is not set +# CONFIG_DMABUF_HEAPS is not set +# CONFIG_DMABUF_SYSFS_STATS is not set +# end of DMABUF options + +CONFIG_AUXDISPLAY=y +# CONFIG_IMG_ASCII_LCD is not set +CONFIG_CHARLCD_BL_OFF=y +# CONFIG_CHARLCD_BL_ON is not set +# CONFIG_CHARLCD_BL_FLASH is not set +# CONFIG_UIO is not set +# CONFIG_VFIO is not set +CONFIG_VIRT_DRIVERS=y +CONFIG_VMGENID=y +# CONFIG_VBOXGUEST is not set +# CONFIG_NITRO_ENCLAVES is not set +CONFIG_VIRTIO_ANCHOR=y +CONFIG_VIRTIO=y +CONFIG_VIRTIO_PCI_LIB=y +CONFIG_VIRTIO_PCI_LIB_LEGACY=y +CONFIG_VIRTIO_MENU=y +CONFIG_VIRTIO_PCI=y +CONFIG_VIRTIO_PCI_LEGACY=y +CONFIG_VIRTIO_PMEM=y +CONFIG_VIRTIO_BALLOON=y +CONFIG_VIRTIO_MEM=y +# CONFIG_VIRTIO_INPUT is not set +CONFIG_VIRTIO_MMIO=y +# CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES is not set +# CONFIG_VDPA is not set +CONFIG_VHOST_MENU=y +# CONFIG_VHOST_NET is not set +# CONFIG_VHOST_VSOCK is not set +# CONFIG_VHOST_CROSS_ENDIAN_LEGACY is not set + +# +# Microsoft Hyper-V guest support +# +# CONFIG_HYPERV is not set +# end of Microsoft Hyper-V guest support + +# CONFIG_GREYBUS is not set +# CONFIG_COMEDI is not set +CONFIG_STAGING=y +# CONFIG_RTS5208 is not set +# CONFIG_STAGING_MEDIA is not set +# CONFIG_FIELDBUS_DEV is not set +# CONFIG_VME_BUS is not set +# CONFIG_CHROME_PLATFORMS is not set +# CONFIG_MELLANOX_PLATFORM is not set +CONFIG_SURFACE_PLATFORMS=y +# CONFIG_SURFACE_GPE is not set +# CONFIG_SURFACE_PRO3_BUTTON is not set +# CONFIG_SURFACE_AGGREGATOR is not set +CONFIG_X86_PLATFORM_DEVICES=y +# CONFIG_ACPI_WMI is not set +# CONFIG_ACERHDF is not set +# CONFIG_ACER_WIRELESS is not set +# CONFIG_AMD_PMF is not set +# CONFIG_AMD_HSMP is not set +# CONFIG_ADV_SWBUTTON is not set +# CONFIG_ASUS_WIRELESS is not set +# CONFIG_X86_PLATFORM_DRIVERS_DELL is not set +# 
CONFIG_FUJITSU_TABLET is not set +# CONFIG_GPD_POCKET_FAN is not set +# CONFIG_X86_PLATFORM_DRIVERS_HP is not set +# CONFIG_WIRELESS_HOTKEY is not set +# CONFIG_IBM_RTL is not set +# CONFIG_SENSORS_HDAPS is not set +# CONFIG_INTEL_SAR_INT1092 is not set +# CONFIG_INTEL_PMC_CORE is not set + +# +# Intel Speed Select Technology interface support +# +# CONFIG_INTEL_SPEED_SELECT_INTERFACE is not set +# end of Intel Speed Select Technology interface support + +# +# Intel Uncore Frequency Control +# +# CONFIG_INTEL_UNCORE_FREQ_CONTROL is not set +# end of Intel Uncore Frequency Control + +# CONFIG_INTEL_PUNIT_IPC is not set +# CONFIG_INTEL_RST is not set +# CONFIG_INTEL_SMARTCONNECT is not set +CONFIG_INTEL_TURBO_MAX_3=y +# CONFIG_INTEL_VSEC is not set +# CONFIG_SAMSUNG_Q10 is not set +# CONFIG_TOSHIBA_BT_RFKILL is not set +# CONFIG_TOSHIBA_HAPS is not set +# CONFIG_ACPI_CMPC is not set +# CONFIG_TOPSTAR_LAPTOP is not set +# CONFIG_INTEL_IPS is not set +# CONFIG_INTEL_SCU_PCI is not set +# CONFIG_INTEL_SCU_PLATFORM is not set +# CONFIG_SIEMENS_SIMATIC_IPC is not set +# CONFIG_WINMATE_FM07_KEYS is not set +# CONFIG_P2SB is not set +CONFIG_HAVE_CLK=y +CONFIG_HAVE_CLK_PREPARE=y +CONFIG_COMMON_CLK=y +# CONFIG_XILINX_VCU is not set +# CONFIG_HWSPINLOCK is not set + +# +# Clock Source drivers +# +CONFIG_CLKEVT_I8253=y +CONFIG_I8253_LOCK=y +CONFIG_CLKBLD_I8253=y +# end of Clock Source drivers + +CONFIG_MAILBOX=y +CONFIG_PCC=y +# CONFIG_ALTERA_MBOX is not set +CONFIG_IOMMU_IOVA=y +CONFIG_IOMMU_API=y +CONFIG_IOMMU_SUPPORT=y + +# +# Generic IOMMU Pagetable Support +# +# end of Generic IOMMU Pagetable Support + +# CONFIG_IOMMU_DEBUGFS is not set +# CONFIG_IOMMU_DEFAULT_DMA_STRICT is not set +CONFIG_IOMMU_DEFAULT_DMA_LAZY=y +# CONFIG_IOMMU_DEFAULT_PASSTHROUGH is not set +CONFIG_IOMMU_DMA=y +# CONFIG_AMD_IOMMU is not set +# CONFIG_INTEL_IOMMU is not set +# CONFIG_IRQ_REMAP is not set +# CONFIG_VIRTIO_IOMMU is not set + +# +# Remoteproc drivers +# +# CONFIG_REMOTEPROC is not set +# 
end of Remoteproc drivers + +# +# Rpmsg drivers +# +# CONFIG_RPMSG_QCOM_GLINK_RPM is not set +# CONFIG_RPMSG_VIRTIO is not set +# end of Rpmsg drivers + +# CONFIG_SOUNDWIRE is not set + +# +# SOC (System On Chip) specific Drivers +# + +# +# Amlogic SoC drivers +# +# end of Amlogic SoC drivers + +# +# Broadcom SoC drivers +# +# end of Broadcom SoC drivers + +# +# NXP/Freescale QorIQ SoC drivers +# +# end of NXP/Freescale QorIQ SoC drivers + +# +# fujitsu SoC drivers +# +# end of fujitsu SoC drivers + +# +# i.MX SoC drivers +# +# end of i.MX SoC drivers + +# +# Enable LiteX SoC Builder specific drivers +# +# end of Enable LiteX SoC Builder specific drivers + +# +# Qualcomm SoC drivers +# +# end of Qualcomm SoC drivers + +# CONFIG_SOC_TI is not set + +# +# Xilinx SoC drivers +# +# end of Xilinx SoC drivers +# end of SOC (System On Chip) specific Drivers + +# CONFIG_PM_DEVFREQ is not set +# CONFIG_EXTCON is not set +# CONFIG_MEMORY is not set +# CONFIG_IIO is not set +# CONFIG_NTB is not set +# CONFIG_PWM is not set + +# +# IRQ chip support +# +# end of IRQ chip support + +# CONFIG_IPACK_BUS is not set +# CONFIG_RESET_CONTROLLER is not set + +# +# PHY Subsystem +# +# CONFIG_GENERIC_PHY is not set +# CONFIG_USB_LGM_PHY is not set +# CONFIG_PHY_CAN_TRANSCEIVER is not set + +# +# PHY drivers for Broadcom platforms +# +# CONFIG_BCM_KONA_USB2_PHY is not set +# end of PHY drivers for Broadcom platforms + +# CONFIG_PHY_PXA_28NM_HSIC is not set +# CONFIG_PHY_PXA_28NM_USB2 is not set +# CONFIG_PHY_INTEL_LGM_EMMC is not set +# end of PHY Subsystem + +# CONFIG_POWERCAP is not set +# CONFIG_MCB is not set + +# +# Performance monitor support +# +# end of Performance monitor support + +CONFIG_RAS=y +# CONFIG_USB4 is not set + +# +# Android +# +# CONFIG_ANDROID_BINDER_IPC is not set +# end of Android + +CONFIG_LIBNVDIMM=y +CONFIG_BLK_DEV_PMEM=y +CONFIG_ND_CLAIM=y +CONFIG_ND_BTT=y +CONFIG_BTT=y +CONFIG_ND_PFN=y +CONFIG_NVDIMM_PFN=y +CONFIG_NVDIMM_DAX=y +CONFIG_NVDIMM_KEYS=y 
+CONFIG_DAX=y +CONFIG_DEV_DAX=y +CONFIG_DEV_DAX_PMEM=y +CONFIG_DEV_DAX_KMEM=y +# CONFIG_NVMEM is not set + +# +# HW tracing support +# +# CONFIG_STM is not set +# CONFIG_INTEL_TH is not set +# end of HW tracing support + +# CONFIG_FPGA is not set +# CONFIG_TEE is not set +# CONFIG_SIOX is not set +# CONFIG_SLIMBUS is not set +# CONFIG_INTERCONNECT is not set +# CONFIG_COUNTER is not set +# CONFIG_PECI is not set +# CONFIG_HTE is not set +# CONFIG_AMAZON_DRIVER_UPDATES is not set +# end of Device Drivers + +# +# File systems +# +CONFIG_DCACHE_WORD_ACCESS=y +CONFIG_VALIDATE_FS_PARSER=y +CONFIG_FS_IOMAP=y +# CONFIG_EXT2_FS is not set +# CONFIG_EXT3_FS is not set +CONFIG_EXT4_FS=y +CONFIG_EXT4_USE_FOR_EXT2=y +CONFIG_EXT4_FS_POSIX_ACL=y +CONFIG_EXT4_FS_SECURITY=y +CONFIG_EXT4_DEBUG=y +CONFIG_JBD2=y +CONFIG_JBD2_DEBUG=y +CONFIG_FS_MBCACHE=y +# CONFIG_REISERFS_FS is not set +# CONFIG_JFS_FS is not set +CONFIG_XFS_FS=y +CONFIG_XFS_SUPPORT_V4=y +CONFIG_XFS_QUOTA=y +CONFIG_XFS_POSIX_ACL=y +# CONFIG_XFS_RT is not set +# CONFIG_XFS_ONLINE_SCRUB is not set +# CONFIG_XFS_WARN is not set +# CONFIG_XFS_DEBUG is not set +# CONFIG_GFS2_FS is not set +# CONFIG_BTRFS_FS is not set +# CONFIG_NILFS2_FS is not set +# CONFIG_F2FS_FS is not set +CONFIG_FS_DAX=y +CONFIG_FS_DAX_PMD=y +CONFIG_FS_POSIX_ACL=y +CONFIG_EXPORTFS=y +# CONFIG_EXPORTFS_BLOCK_OPS is not set +CONFIG_FILE_LOCKING=y +CONFIG_FS_ENCRYPTION=y +CONFIG_FS_ENCRYPTION_ALGS=y +# CONFIG_FS_VERITY is not set +CONFIG_FSNOTIFY=y +CONFIG_DNOTIFY=y +CONFIG_INOTIFY_USER=y +CONFIG_FANOTIFY=y +CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y +CONFIG_QUOTA=y +CONFIG_QUOTA_NETLINK_INTERFACE=y +# CONFIG_PRINT_QUOTA_WARNING is not set +# CONFIG_QUOTA_DEBUG is not set +# CONFIG_QFMT_V1 is not set +# CONFIG_QFMT_V2 is not set +CONFIG_QUOTACTL=y +CONFIG_AUTOFS4_FS=y +CONFIG_AUTOFS_FS=y +CONFIG_FUSE_FS=y +# CONFIG_CUSE is not set +# CONFIG_VIRTIO_FS is not set +CONFIG_OVERLAY_FS=y +# CONFIG_OVERLAY_FS_REDIRECT_DIR is not set 
+CONFIG_OVERLAY_FS_REDIRECT_ALWAYS_FOLLOW=y +# CONFIG_OVERLAY_FS_INDEX is not set +# CONFIG_OVERLAY_FS_XINO_AUTO is not set +# CONFIG_OVERLAY_FS_METACOPY is not set + +# +# Caches +# +# CONFIG_FSCACHE is not set +# end of Caches + +# +# CD-ROM/DVD Filesystems +# +# CONFIG_ISO9660_FS is not set +# CONFIG_UDF_FS is not set +# end of CD-ROM/DVD Filesystems + +# +# DOS/FAT/EXFAT/NT Filesystems +# +# CONFIG_MSDOS_FS is not set +# CONFIG_VFAT_FS is not set +# CONFIG_EXFAT_FS is not set +# CONFIG_NTFS_FS is not set +# CONFIG_NTFS3_FS is not set +# end of DOS/FAT/EXFAT/NT Filesystems + +# +# Pseudo filesystems +# +CONFIG_PROC_FS=y +CONFIG_PROC_KCORE=y +CONFIG_PROC_SYSCTL=y +CONFIG_PROC_PAGE_MONITOR=y +CONFIG_PROC_CHILDREN=y +CONFIG_PROC_PID_ARCH_STATUS=y +CONFIG_KERNFS=y +CONFIG_SYSFS=y +CONFIG_TMPFS=y +CONFIG_TMPFS_POSIX_ACL=y +CONFIG_TMPFS_XATTR=y +# CONFIG_TMPFS_INODE64 is not set +CONFIG_HUGETLBFS=y +CONFIG_HUGETLB_PAGE=y +CONFIG_ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP=y +CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP=y +# CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON is not set +CONFIG_MEMFD_CREATE=y +CONFIG_ARCH_HAS_GIGANTIC_PAGE=y +# CONFIG_CONFIGFS_FS is not set +# end of Pseudo filesystems + +CONFIG_MISC_FILESYSTEMS=y +# CONFIG_ORANGEFS_FS is not set +# CONFIG_ADFS_FS is not set +# CONFIG_AFFS_FS is not set +# CONFIG_ECRYPT_FS is not set +# CONFIG_HFS_FS is not set +# CONFIG_HFSPLUS_FS is not set +# CONFIG_BEFS_FS is not set +# CONFIG_BFS_FS is not set +# CONFIG_EFS_FS is not set +# CONFIG_CRAMFS is not set +CONFIG_SQUASHFS=y +CONFIG_SQUASHFS_FILE_CACHE=y +# CONFIG_SQUASHFS_FILE_DIRECT is not set +CONFIG_SQUASHFS_DECOMP_SINGLE=y +# CONFIG_SQUASHFS_DECOMP_MULTI is not set +# CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU is not set +CONFIG_SQUASHFS_XATTR=y +CONFIG_SQUASHFS_ZLIB=y +CONFIG_SQUASHFS_LZ4=y +CONFIG_SQUASHFS_LZO=y +CONFIG_SQUASHFS_XZ=y +CONFIG_SQUASHFS_ZSTD=y +# CONFIG_SQUASHFS_4K_DEVBLK_SIZE is not set +# CONFIG_SQUASHFS_EMBEDDED is not set 
+CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE=3 +# CONFIG_VXFS_FS is not set +# CONFIG_MINIX_FS is not set +# CONFIG_OMFS_FS is not set +# CONFIG_HPFS_FS is not set +# CONFIG_QNX4FS_FS is not set +# CONFIG_QNX6FS_FS is not set +# CONFIG_ROMFS_FS is not set +CONFIG_PSTORE=y +CONFIG_PSTORE_DEFAULT_KMSG_BYTES=10240 +CONFIG_PSTORE_DEFLATE_COMPRESS=y +# CONFIG_PSTORE_LZO_COMPRESS is not set +# CONFIG_PSTORE_LZ4_COMPRESS is not set +# CONFIG_PSTORE_LZ4HC_COMPRESS is not set +# CONFIG_PSTORE_842_COMPRESS is not set +# CONFIG_PSTORE_ZSTD_COMPRESS is not set +CONFIG_PSTORE_COMPRESS=y +CONFIG_PSTORE_DEFLATE_COMPRESS_DEFAULT=y +CONFIG_PSTORE_COMPRESS_DEFAULT="deflate" +# CONFIG_PSTORE_CONSOLE is not set +# CONFIG_PSTORE_PMSG is not set +# CONFIG_PSTORE_RAM is not set +# CONFIG_PSTORE_BLK is not set +# CONFIG_SYSV_FS is not set +# CONFIG_UFS_FS is not set +# CONFIG_EROFS_FS is not set +CONFIG_NETWORK_FILESYSTEMS=y +CONFIG_NFS_FS=y +# CONFIG_NFS_V2 is not set +# CONFIG_NFS_V3 is not set +CONFIG_NFS_V4=y +CONFIG_NFS_SWAP=y +CONFIG_NFS_V4_1=y +CONFIG_NFS_V4_2=y +CONFIG_PNFS_FILE_LAYOUT=y +CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org" +# CONFIG_NFS_V4_1_MIGRATION is not set +CONFIG_NFS_V4_SECURITY_LABEL=y +CONFIG_ROOT_NFS=y +# CONFIG_NFS_USE_LEGACY_DNS is not set +CONFIG_NFS_USE_KERNEL_DNS=y +CONFIG_NFS_DISABLE_UDP_SUPPORT=y +# CONFIG_NFS_V4_2_READ_PLUS is not set +# CONFIG_NFSD is not set +CONFIG_GRACE_PERIOD=y +CONFIG_LOCKD=y +CONFIG_NFS_COMMON=y +CONFIG_NFS_V4_2_SSC_HELPER=y +CONFIG_SUNRPC=y +CONFIG_SUNRPC_GSS=y +CONFIG_SUNRPC_BACKCHANNEL=y +CONFIG_SUNRPC_SWAP=y +# CONFIG_SUNRPC_DEBUG is not set +# CONFIG_CEPH_FS is not set +# CONFIG_CIFS is not set +# CONFIG_SMB_SERVER is not set +# CONFIG_CODA_FS is not set +# CONFIG_AFS_FS is not set +CONFIG_NLS=y +CONFIG_NLS_DEFAULT="utf8" +# CONFIG_NLS_CODEPAGE_437 is not set +# CONFIG_NLS_CODEPAGE_737 is not set +# CONFIG_NLS_CODEPAGE_775 is not set +# CONFIG_NLS_CODEPAGE_850 is not set +# CONFIG_NLS_CODEPAGE_852 is not set +# 
CONFIG_NLS_CODEPAGE_855 is not set +# CONFIG_NLS_CODEPAGE_857 is not set +# CONFIG_NLS_CODEPAGE_860 is not set +# CONFIG_NLS_CODEPAGE_861 is not set +# CONFIG_NLS_CODEPAGE_862 is not set +# CONFIG_NLS_CODEPAGE_863 is not set +# CONFIG_NLS_CODEPAGE_864 is not set +# CONFIG_NLS_CODEPAGE_865 is not set +# CONFIG_NLS_CODEPAGE_866 is not set +# CONFIG_NLS_CODEPAGE_869 is not set +# CONFIG_NLS_CODEPAGE_936 is not set +# CONFIG_NLS_CODEPAGE_950 is not set +# CONFIG_NLS_CODEPAGE_932 is not set +# CONFIG_NLS_CODEPAGE_949 is not set +# CONFIG_NLS_CODEPAGE_874 is not set +# CONFIG_NLS_ISO8859_8 is not set +# CONFIG_NLS_CODEPAGE_1250 is not set +# CONFIG_NLS_CODEPAGE_1251 is not set +# CONFIG_NLS_ASCII is not set +# CONFIG_NLS_ISO8859_1 is not set +# CONFIG_NLS_ISO8859_2 is not set +# CONFIG_NLS_ISO8859_3 is not set +# CONFIG_NLS_ISO8859_4 is not set +# CONFIG_NLS_ISO8859_5 is not set +# CONFIG_NLS_ISO8859_6 is not set +# CONFIG_NLS_ISO8859_7 is not set +# CONFIG_NLS_ISO8859_9 is not set +# CONFIG_NLS_ISO8859_13 is not set +# CONFIG_NLS_ISO8859_14 is not set +# CONFIG_NLS_ISO8859_15 is not set +# CONFIG_NLS_KOI8_R is not set +# CONFIG_NLS_KOI8_U is not set +# CONFIG_NLS_MAC_ROMAN is not set +# CONFIG_NLS_MAC_CELTIC is not set +# CONFIG_NLS_MAC_CENTEURO is not set +# CONFIG_NLS_MAC_CROATIAN is not set +# CONFIG_NLS_MAC_CYRILLIC is not set +# CONFIG_NLS_MAC_GAELIC is not set +# CONFIG_NLS_MAC_GREEK is not set +# CONFIG_NLS_MAC_ICELAND is not set +# CONFIG_NLS_MAC_INUIT is not set +# CONFIG_NLS_MAC_ROMANIAN is not set +# CONFIG_NLS_MAC_TURKISH is not set +# CONFIG_NLS_UTF8 is not set +# CONFIG_UNICODE is not set +CONFIG_IO_WQ=y +# end of File systems + +# +# Security options +# +CONFIG_KEYS=y +# CONFIG_KEYS_REQUEST_CACHE is not set +CONFIG_PERSISTENT_KEYRINGS=y +# CONFIG_TRUSTED_KEYS is not set +CONFIG_ENCRYPTED_KEYS=y +# CONFIG_USER_DECRYPTED_DATA is not set +# CONFIG_KEY_DH_OPERATIONS is not set +# CONFIG_SECURITY_DMESG_RESTRICT is not set +CONFIG_PROC_MEM_ALWAYS_FORCE=y +# 
CONFIG_PROC_MEM_FORCE_PTRACE is not set +# CONFIG_PROC_MEM_NO_FORCE is not set +CONFIG_SECURITY=y +CONFIG_SECURITY_WRITABLE_HOOKS=y +CONFIG_SECURITYFS=y +CONFIG_SECURITY_NETWORK=y +CONFIG_SECURITY_NETWORK_XFRM=y +# CONFIG_SECURITY_PATH is not set +CONFIG_LSM_MMAP_MIN_ADDR=65536 +CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR=y +CONFIG_HARDENED_USERCOPY=y +CONFIG_FORTIFY_SOURCE=y +# CONFIG_STATIC_USERMODEHELPER is not set +CONFIG_SECURITY_SELINUX=y +CONFIG_SECURITY_SELINUX_BOOTPARAM=y +CONFIG_SECURITY_SELINUX_DISABLE=y +CONFIG_SECURITY_SELINUX_DEVELOP=y +CONFIG_SECURITY_SELINUX_AVC_STATS=y +CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE=1 +CONFIG_SECURITY_SELINUX_SIDTAB_HASH_BITS=9 +CONFIG_SECURITY_SELINUX_SID2STR_CACHE_SIZE=256 +# CONFIG_SECURITY_SMACK is not set +# CONFIG_SECURITY_TOMOYO is not set +# CONFIG_SECURITY_APPARMOR is not set +# CONFIG_SECURITY_LOADPIN is not set +# CONFIG_SECURITY_YAMA is not set +# CONFIG_SECURITY_SAFESETID is not set +# CONFIG_SECURITY_LOCKDOWN_LSM is not set +# CONFIG_SECURITY_LANDLOCK is not set +# CONFIG_INTEGRITY is not set +CONFIG_DEFAULT_SECURITY_SELINUX=y +# CONFIG_DEFAULT_SECURITY_DAC is not set +CONFIG_LSM="lockdown,yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor,bpf" + +# +# Kernel hardening options +# + +# +# Memory initialization +# +CONFIG_INIT_STACK_NONE=y +# CONFIG_INIT_ON_ALLOC_DEFAULT_ON is not set +# CONFIG_INIT_ON_FREE_DEFAULT_ON is not set +CONFIG_CC_HAS_ZERO_CALL_USED_REGS=y +# CONFIG_ZERO_CALL_USED_REGS is not set +# end of Memory initialization + +CONFIG_RANDSTRUCT_NONE=y +# end of Kernel hardening options +# end of Security options + +CONFIG_CRYPTO=y + +# +# Crypto core or helper +# +CONFIG_CRYPTO_FIPS=y +CONFIG_CRYPTO_FIPS_NAME="Linux Kernel Cryptographic API" +# CONFIG_CRYPTO_FIPS_CUSTOM_VERSION is not set +CONFIG_CRYPTO_ALGAPI=y +CONFIG_CRYPTO_ALGAPI2=y +CONFIG_CRYPTO_AEAD=y +CONFIG_CRYPTO_AEAD2=y +CONFIG_CRYPTO_SKCIPHER=y +CONFIG_CRYPTO_SKCIPHER2=y +CONFIG_CRYPTO_HASH=y +CONFIG_CRYPTO_HASH2=y 
+CONFIG_CRYPTO_RNG=y +CONFIG_CRYPTO_RNG2=y +CONFIG_CRYPTO_RNG_DEFAULT=y +CONFIG_CRYPTO_AKCIPHER2=y +CONFIG_CRYPTO_AKCIPHER=y +CONFIG_CRYPTO_KPP2=y +CONFIG_CRYPTO_KPP=y +CONFIG_CRYPTO_ACOMP2=y +CONFIG_CRYPTO_MANAGER=y +CONFIG_CRYPTO_MANAGER2=y +# CONFIG_CRYPTO_USER is not set +# CONFIG_CRYPTO_MANAGER_DISABLE_TESTS is not set +# CONFIG_CRYPTO_MANAGER_EXTRA_TESTS is not set +CONFIG_CRYPTO_NULL=y +CONFIG_CRYPTO_NULL2=y +# CONFIG_CRYPTO_PCRYPT is not set +# CONFIG_CRYPTO_CRYPTD is not set +# CONFIG_CRYPTO_AUTHENC is not set +# end of Crypto core or helper + +# +# Public-key cryptography +# +CONFIG_CRYPTO_RSA=y +CONFIG_CRYPTO_DH=y +# CONFIG_CRYPTO_DH_RFC7919_GROUPS is not set +CONFIG_CRYPTO_ECC=y +CONFIG_CRYPTO_ECDH=y +# CONFIG_CRYPTO_ECDSA is not set +# CONFIG_CRYPTO_ECRDSA is not set +# CONFIG_CRYPTO_SM2 is not set +# CONFIG_CRYPTO_CURVE25519 is not set +# end of Public-key cryptography + +# +# Block ciphers +# +CONFIG_CRYPTO_AES=y +CONFIG_CRYPTO_AES_TI=y +# CONFIG_CRYPTO_ARIA is not set +# CONFIG_CRYPTO_BLOWFISH is not set +# CONFIG_CRYPTO_CAMELLIA is not set +# CONFIG_CRYPTO_CAST5 is not set +# CONFIG_CRYPTO_CAST6 is not set +# CONFIG_CRYPTO_DES is not set +# CONFIG_CRYPTO_FCRYPT is not set +# CONFIG_CRYPTO_SERPENT is not set +# CONFIG_CRYPTO_SM4_GENERIC is not set +# CONFIG_CRYPTO_TWOFISH is not set +# end of Block ciphers + +# +# Length-preserving ciphers and modes +# +# CONFIG_CRYPTO_ADIANTUM is not set +# CONFIG_CRYPTO_CHACHA20 is not set +CONFIG_CRYPTO_CBC=y +# CONFIG_CRYPTO_CFB is not set +CONFIG_CRYPTO_CTR=y +CONFIG_CRYPTO_CTS=y +CONFIG_CRYPTO_ECB=y +# CONFIG_CRYPTO_HCTR2 is not set +# CONFIG_CRYPTO_KEYWRAP is not set +# CONFIG_CRYPTO_LRW is not set +# CONFIG_CRYPTO_OFB is not set +# CONFIG_CRYPTO_PCBC is not set +CONFIG_CRYPTO_XTS=y +# end of Length-preserving ciphers and modes + +# +# AEAD (authenticated encryption with associated data) ciphers +# +# CONFIG_CRYPTO_AEGIS128 is not set +# CONFIG_CRYPTO_CHACHA20POLY1305 is not set +# CONFIG_CRYPTO_CCM is not 
set +# CONFIG_CRYPTO_GCM is not set +CONFIG_CRYPTO_SEQIV=y +# CONFIG_CRYPTO_ECHAINIV is not set +# CONFIG_CRYPTO_ESSIV is not set +# end of AEAD (authenticated encryption with associated data) ciphers + +# +# Hashes, digests, and MACs +# +# CONFIG_CRYPTO_BLAKE2B is not set +# CONFIG_CRYPTO_CMAC is not set +# CONFIG_CRYPTO_GHASH is not set +CONFIG_CRYPTO_HMAC=y +# CONFIG_CRYPTO_MD4 is not set +CONFIG_CRYPTO_MD5=y +# CONFIG_CRYPTO_MICHAEL_MIC is not set +# CONFIG_CRYPTO_POLY1305 is not set +# CONFIG_CRYPTO_RMD160 is not set +CONFIG_CRYPTO_SHA1=y +CONFIG_CRYPTO_SHA256=y +CONFIG_CRYPTO_SHA512=y +CONFIG_CRYPTO_SHA3=y +# CONFIG_CRYPTO_SM3_GENERIC is not set +# CONFIG_CRYPTO_STREEBOG is not set +# CONFIG_CRYPTO_VMAC is not set +# CONFIG_CRYPTO_WP512 is not set +# CONFIG_CRYPTO_XCBC is not set +CONFIG_CRYPTO_XXHASH=y +# end of Hashes, digests, and MACs + +# +# CRCs (cyclic redundancy checks) +# +CONFIG_CRYPTO_CRC32C=y +# CONFIG_CRYPTO_CRC32 is not set +CONFIG_CRYPTO_CRCT10DIF=y +# end of CRCs (cyclic redundancy checks) + +# +# Compression +# +CONFIG_CRYPTO_DEFLATE=y +CONFIG_CRYPTO_LZO=y +# CONFIG_CRYPTO_842 is not set +# CONFIG_CRYPTO_LZ4 is not set +# CONFIG_CRYPTO_LZ4HC is not set +# CONFIG_CRYPTO_ZSTD is not set +# end of Compression + +# +# Random number generation +# +# CONFIG_CRYPTO_ANSI_CPRNG is not set +CONFIG_CRYPTO_DRBG_MENU=y +CONFIG_CRYPTO_DRBG_HMAC=y +CONFIG_CRYPTO_DRBG_HASH=y +CONFIG_CRYPTO_DRBG_CTR=y +CONFIG_CRYPTO_DRBG=y +CONFIG_CRYPTO_JITTERENTROPY=y +CONFIG_CRYPTO_JITTERENTROPY_OSR=1 +# end of Random number generation + +# +# Userspace interface +# +# CONFIG_CRYPTO_USER_API_HASH is not set +# CONFIG_CRYPTO_USER_API_SKCIPHER is not set +# CONFIG_CRYPTO_USER_API_RNG is not set +# CONFIG_CRYPTO_USER_API_AEAD is not set +# end of Userspace interface + +CONFIG_CRYPTO_HASH_INFO=y + +# +# Accelerated Cryptographic Algorithms for CPU (x86) +# +# CONFIG_CRYPTO_CURVE25519_X86 is not set +# CONFIG_CRYPTO_AES_NI_INTEL is not set +# CONFIG_CRYPTO_BLOWFISH_X86_64 is 
not set +# CONFIG_CRYPTO_CAMELLIA_X86_64 is not set +# CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64 is not set +# CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64 is not set +# CONFIG_CRYPTO_CAST5_AVX_X86_64 is not set +# CONFIG_CRYPTO_CAST6_AVX_X86_64 is not set +# CONFIG_CRYPTO_DES3_EDE_X86_64 is not set +# CONFIG_CRYPTO_SERPENT_SSE2_X86_64 is not set +# CONFIG_CRYPTO_SERPENT_AVX_X86_64 is not set +# CONFIG_CRYPTO_SERPENT_AVX2_X86_64 is not set +# CONFIG_CRYPTO_SM4_AESNI_AVX_X86_64 is not set +# CONFIG_CRYPTO_SM4_AESNI_AVX2_X86_64 is not set +# CONFIG_CRYPTO_TWOFISH_X86_64 is not set +# CONFIG_CRYPTO_TWOFISH_X86_64_3WAY is not set +# CONFIG_CRYPTO_TWOFISH_AVX_X86_64 is not set +# CONFIG_CRYPTO_ARIA_AESNI_AVX_X86_64 is not set +# CONFIG_CRYPTO_CHACHA20_X86_64 is not set +# CONFIG_CRYPTO_AEGIS128_AESNI_SSE2 is not set +# CONFIG_CRYPTO_NHPOLY1305_SSE2 is not set +# CONFIG_CRYPTO_NHPOLY1305_AVX2 is not set +# CONFIG_CRYPTO_BLAKE2S_X86 is not set +# CONFIG_CRYPTO_POLYVAL_CLMUL_NI is not set +# CONFIG_CRYPTO_POLY1305_X86_64 is not set +CONFIG_CRYPTO_SHA1_SSSE3=y +CONFIG_CRYPTO_SHA256_SSSE3=y +CONFIG_CRYPTO_SHA512_SSSE3=y +# CONFIG_CRYPTO_SM3_AVX_X86_64 is not set +# CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL is not set +# CONFIG_CRYPTO_CRC32C_INTEL is not set +# CONFIG_CRYPTO_CRC32_PCLMUL is not set +CONFIG_CRYPTO_CRCT10DIF_PCLMUL=y +# end of Accelerated Cryptographic Algorithms for CPU (x86) + +# CONFIG_CRYPTO_HW is not set +CONFIG_ASYMMETRIC_KEY_TYPE=y +CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y +CONFIG_X509_CERTIFICATE_PARSER=y +# CONFIG_PKCS8_PRIVATE_KEY_PARSER is not set +CONFIG_PKCS7_MESSAGE_PARSER=y +# CONFIG_FIPS_SIGNATURE_SELFTEST is not set + +# +# Certificates for signature checking +# +CONFIG_SYSTEM_TRUSTED_KEYRING=y +CONFIG_SYSTEM_TRUSTED_KEYS="" +# CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set +# CONFIG_SECONDARY_TRUSTED_KEYRING is not set +CONFIG_SYSTEM_BLACKLIST_KEYRING=y +CONFIG_SYSTEM_BLACKLIST_HASH_LIST="" +# CONFIG_SYSTEM_REVOCATION_LIST is not set +# end of Certificates for 
signature checking + +CONFIG_BINARY_PRINTF=y + +# +# Library routines +# +# CONFIG_PACKING is not set +CONFIG_BITREVERSE=y +CONFIG_GENERIC_STRNCPY_FROM_USER=y +CONFIG_GENERIC_STRNLEN_USER=y +CONFIG_GENERIC_NET_UTILS=y +# CONFIG_CORDIC is not set +# CONFIG_PRIME_NUMBERS is not set +CONFIG_RATIONAL=y +CONFIG_GENERIC_PCI_IOMAP=y +CONFIG_GENERIC_IOMAP=y +CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y +CONFIG_ARCH_HAS_FAST_MULTIPLIER=y +CONFIG_ARCH_USE_SYM_ANNOTATIONS=y + +# +# Crypto library routines +# +CONFIG_CRYPTO_LIB_UTILS=y +CONFIG_CRYPTO_LIB_AES=y +CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC=y +# CONFIG_CRYPTO_LIB_CHACHA is not set +# CONFIG_CRYPTO_LIB_CURVE25519 is not set +CONFIG_CRYPTO_LIB_POLY1305_RSIZE=11 +# CONFIG_CRYPTO_LIB_POLY1305 is not set +# CONFIG_CRYPTO_LIB_CHACHA20POLY1305 is not set +CONFIG_CRYPTO_LIB_SHA1=y +CONFIG_CRYPTO_LIB_SHA256=y +# end of Crypto library routines + +CONFIG_CRC_CCITT=y +CONFIG_CRC16=y +CONFIG_CRC_T10DIF=y +# CONFIG_CRC64_ROCKSOFT is not set +# CONFIG_CRC_ITU_T is not set +CONFIG_CRC32=y +# CONFIG_CRC32_SELFTEST is not set +CONFIG_CRC32_SLICEBY8=y +# CONFIG_CRC32_SLICEBY4 is not set +# CONFIG_CRC32_SARWATE is not set +# CONFIG_CRC32_BIT is not set +# CONFIG_CRC64 is not set +# CONFIG_CRC4 is not set +# CONFIG_CRC7 is not set +CONFIG_LIBCRC32C=y +# CONFIG_CRC8 is not set +CONFIG_XXHASH=y +# CONFIG_RANDOM32_SELFTEST is not set +CONFIG_ZLIB_INFLATE=y +CONFIG_ZLIB_DEFLATE=y +CONFIG_LZO_COMPRESS=y +CONFIG_LZO_DECOMPRESS=y +CONFIG_LZ4_DECOMPRESS=y +CONFIG_ZSTD_COMMON=y +CONFIG_ZSTD_DECOMPRESS=y +CONFIG_XZ_DEC=y +CONFIG_XZ_DEC_X86=y +CONFIG_XZ_DEC_POWERPC=y +CONFIG_XZ_DEC_IA64=y +CONFIG_XZ_DEC_ARM=y +CONFIG_XZ_DEC_ARMTHUMB=y +CONFIG_XZ_DEC_SPARC=y +# CONFIG_XZ_DEC_MICROLZMA is not set +CONFIG_XZ_DEC_BCJ=y +# CONFIG_XZ_DEC_TEST is not set +CONFIG_DECOMPRESS_GZIP=y +CONFIG_DECOMPRESS_BZIP2=y +CONFIG_DECOMPRESS_LZMA=y +CONFIG_DECOMPRESS_XZ=y +CONFIG_DECOMPRESS_LZO=y +CONFIG_DECOMPRESS_LZ4=y +CONFIG_DECOMPRESS_ZSTD=y +CONFIG_XARRAY_MULTI=y 
+CONFIG_ASSOCIATIVE_ARRAY=y +CONFIG_HAS_IOMEM=y +CONFIG_HAS_IOPORT_MAP=y +CONFIG_HAS_DMA=y +CONFIG_DMA_OPS=y +# CONFIG_DMA_PAGE_TOUCHING is not set +CONFIG_NEED_SG_DMA_LENGTH=y +CONFIG_NEED_DMA_MAP_STATE=y +CONFIG_ARCH_DMA_ADDR_T_64BIT=y +CONFIG_SWIOTLB=y +# CONFIG_DMA_API_DEBUG is not set +# CONFIG_DMA_MAP_BENCHMARK is not set +CONFIG_SGL_ALLOC=y +# CONFIG_FORCE_NR_CPUS is not set +CONFIG_CPU_RMAP=y +CONFIG_DQL=y +CONFIG_NLATTR=y +CONFIG_CLZ_TAB=y +CONFIG_IRQ_POLL=y +CONFIG_MPILIB=y +CONFIG_OID_REGISTRY=y +CONFIG_HAVE_GENERIC_VDSO=y +CONFIG_GENERIC_GETTIMEOFDAY=y +CONFIG_GENERIC_VDSO_TIME_NS=y +CONFIG_SG_POOL=y +CONFIG_ARCH_HAS_PMEM_API=y +CONFIG_MEMREGION=y +CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE=y +CONFIG_ARCH_HAS_COPY_MC=y +CONFIG_ARCH_STACKWALK=y +CONFIG_STACKDEPOT=y +CONFIG_SBITMAP=y +# end of Library routines + +# +# Kernel hacking +# + +# +# printk and dmesg options +# +CONFIG_PRINTK_TIME=y +# CONFIG_PRINTK_CALLER is not set +# CONFIG_STACKTRACE_BUILD_ID is not set +CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7 +CONFIG_CONSOLE_LOGLEVEL_QUIET=4 +CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4 +# CONFIG_BOOT_PRINTK_DELAY is not set +# CONFIG_DYNAMIC_DEBUG is not set +# CONFIG_DYNAMIC_DEBUG_CORE is not set +CONFIG_SYMBOLIC_ERRNAME=y +CONFIG_DEBUG_BUGVERBOSE=y +# end of printk and dmesg options + +CONFIG_DEBUG_KERNEL=y +CONFIG_DEBUG_MISC=y + +# +# Compile-time checks and compiler options +# +CONFIG_AS_HAS_NON_CONST_LEB128=y +CONFIG_DEBUG_INFO_NONE=y +# CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT is not set +# CONFIG_DEBUG_INFO_DWARF4 is not set +# CONFIG_DEBUG_INFO_DWARF5 is not set +CONFIG_FRAME_WARN=2048 +CONFIG_STRIP_ASM_SYMS=y +# CONFIG_READABLE_ASM is not set +# CONFIG_HEADERS_INSTALL is not set +CONFIG_DEBUG_SECTION_MISMATCH=y +CONFIG_SECTION_MISMATCH_WARN_ONLY=y +CONFIG_ARCH_WANT_FRAME_POINTERS=y +CONFIG_FRAME_POINTER=y +CONFIG_OBJTOOL=y +CONFIG_STACK_VALIDATION=y +# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set +# end of Compile-time checks and compiler options + +# +# Generic Kernel 
Debugging Instruments +# +CONFIG_MAGIC_SYSRQ=y +CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1 +CONFIG_MAGIC_SYSRQ_SERIAL=y +CONFIG_MAGIC_SYSRQ_SERIAL_SEQUENCE="" +CONFIG_DEBUG_FS=y +CONFIG_DEBUG_FS_ALLOW_ALL=y +# CONFIG_DEBUG_FS_DISALLOW_MOUNT is not set +# CONFIG_DEBUG_FS_ALLOW_NONE is not set +CONFIG_HAVE_ARCH_KGDB=y +# CONFIG_KGDB is not set +CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y +# CONFIG_UBSAN is not set +CONFIG_HAVE_ARCH_KCSAN=y +CONFIG_HAVE_KCSAN_COMPILER=y +# CONFIG_KCSAN is not set +# end of Generic Kernel Debugging Instruments + +# +# Networking Debugging +# +# CONFIG_NET_DEV_REFCNT_TRACKER is not set +# CONFIG_NET_NS_REFCNT_TRACKER is not set +# CONFIG_DEBUG_NET is not set +# end of Networking Debugging + +# +# Memory Debugging +# +# CONFIG_PAGE_EXTENSION is not set +# CONFIG_DEBUG_PAGEALLOC is not set +CONFIG_SLUB_DEBUG=y +# CONFIG_SLUB_DEBUG_ON is not set +# CONFIG_PAGE_OWNER is not set +# CONFIG_PAGE_TABLE_CHECK is not set +# CONFIG_PAGE_POISONING is not set +# CONFIG_DEBUG_RODATA_TEST is not set +CONFIG_ARCH_HAS_DEBUG_WX=y +# CONFIG_DEBUG_WX is not set +CONFIG_GENERIC_PTDUMP=y +# CONFIG_PTDUMP_DEBUGFS is not set +# CONFIG_DEBUG_OBJECTS is not set +# CONFIG_SHRINKER_DEBUG is not set +CONFIG_HAVE_DEBUG_KMEMLEAK=y +# CONFIG_DEBUG_KMEMLEAK is not set +# CONFIG_DEBUG_STACK_USAGE is not set +CONFIG_SCHED_STACK_END_CHECK=y +CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE=y +# CONFIG_DEBUG_VM is not set +# CONFIG_DEBUG_VM_PGTABLE is not set +CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y +# CONFIG_DEBUG_VIRTUAL is not set +CONFIG_DEBUG_MEMORY_INIT=y +# CONFIG_DEBUG_PER_CPU_MAPS is not set +CONFIG_ARCH_SUPPORTS_KMAP_LOCAL_FORCE_MAP=y +# CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP is not set +CONFIG_HAVE_ARCH_KASAN=y +CONFIG_HAVE_ARCH_KASAN_VMALLOC=y +CONFIG_CC_HAS_KASAN_GENERIC=y +CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y +# CONFIG_KASAN is not set +CONFIG_HAVE_ARCH_KFENCE=y +# CONFIG_KFENCE is not set +CONFIG_HAVE_ARCH_KMSAN=y +# end of Memory Debugging + +# CONFIG_DEBUG_SHIRQ is not set + +# +# Debug 
Oops, Lockups and Hangs +# +# CONFIG_PANIC_ON_OOPS is not set +CONFIG_PANIC_ON_OOPS_VALUE=0 +CONFIG_PANIC_TIMEOUT=0 +CONFIG_LOCKUP_DETECTOR=y +CONFIG_SOFTLOCKUP_DETECTOR=y +# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set +CONFIG_HARDLOCKUP_DETECTOR_PERF=y +CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y +CONFIG_HARDLOCKUP_DETECTOR=y +# CONFIG_BOOTPARAM_HARDLOCKUP_PANIC is not set +CONFIG_DETECT_HUNG_TASK=y +CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120 +# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set +CONFIG_WQ_WATCHDOG=y +# end of Debug Oops, Lockups and Hangs + +# +# Scheduler Debugging +# +# CONFIG_SCHED_DEBUG is not set +CONFIG_SCHED_INFO=y +# CONFIG_SCHEDSTATS is not set +# end of Scheduler Debugging + +# CONFIG_DEBUG_TIMEKEEPING is not set +# CONFIG_DEBUG_PREEMPT is not set + +# +# Lock Debugging (spinlocks, mutexes, etc...) +# +CONFIG_LOCK_DEBUGGING_SUPPORT=y +# CONFIG_PROVE_LOCKING is not set +# CONFIG_LOCK_STAT is not set +# CONFIG_DEBUG_RT_MUTEXES is not set +# CONFIG_DEBUG_SPINLOCK is not set +# CONFIG_DEBUG_MUTEXES is not set +# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set +# CONFIG_DEBUG_RWSEMS is not set +# CONFIG_DEBUG_LOCK_ALLOC is not set +# CONFIG_DEBUG_ATOMIC_SLEEP is not set +# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set +# CONFIG_LOCK_TORTURE_TEST is not set +# CONFIG_WW_MUTEX_SELFTEST is not set +# CONFIG_SCF_TORTURE_TEST is not set +# CONFIG_CSD_LOCK_WAIT_DEBUG is not set +# end of Lock Debugging (spinlocks, mutexes, etc...) 
+ +# CONFIG_DEBUG_IRQFLAGS is not set +CONFIG_STACKTRACE=y +# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set +# CONFIG_DEBUG_KOBJECT is not set + +# +# Debug kernel data structures +# +CONFIG_DEBUG_LIST=y +# CONFIG_DEBUG_PLIST is not set +# CONFIG_DEBUG_SG is not set +# CONFIG_DEBUG_NOTIFIERS is not set +CONFIG_BUG_ON_DATA_CORRUPTION=y +# CONFIG_DEBUG_MAPLE_TREE is not set +# end of Debug kernel data structures + +# CONFIG_DEBUG_CREDENTIALS is not set + +# +# RCU Debugging +# +# CONFIG_RCU_SCALE_TEST is not set +# CONFIG_RCU_TORTURE_TEST is not set +# CONFIG_RCU_REF_SCALE_TEST is not set +CONFIG_RCU_CPU_STALL_TIMEOUT=59 +CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0 +# CONFIG_RCU_TRACE is not set +# CONFIG_RCU_EQS_DEBUG is not set +# end of RCU Debugging + +# CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set +# CONFIG_CPU_HOTPLUG_STATE_CONTROL is not set +# CONFIG_LATENCYTOP is not set +CONFIG_USER_STACKTRACE_SUPPORT=y +CONFIG_HAVE_RETHOOK=y +CONFIG_HAVE_FUNCTION_TRACER=y +CONFIG_HAVE_DYNAMIC_FTRACE=y +CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y +CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y +CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y +CONFIG_HAVE_DYNAMIC_FTRACE_NO_PATCHABLE=y +CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y +CONFIG_HAVE_SYSCALL_TRACEPOINTS=y +CONFIG_HAVE_FENTRY=y +CONFIG_HAVE_OBJTOOL_MCOUNT=y +CONFIG_HAVE_C_RECORDMCOUNT=y +CONFIG_HAVE_BUILDTIME_MCOUNT_SORT=y +CONFIG_TRACING_SUPPORT=y +# CONFIG_FTRACE is not set +# CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set +# CONFIG_SAMPLES is not set +CONFIG_HAVE_SAMPLE_FTRACE_DIRECT=y +CONFIG_HAVE_SAMPLE_FTRACE_DIRECT_MULTI=y +CONFIG_ARCH_HAS_DEVMEM_IS_ALLOWED=y +CONFIG_STRICT_DEVMEM=y +# CONFIG_IO_STRICT_DEVMEM is not set + +# +# x86 Debugging +# +CONFIG_X86_VERBOSE_BOOTUP=y +CONFIG_EARLY_PRINTK=y +# CONFIG_EARLY_PRINTK_DBGP is not set +# CONFIG_EARLY_PRINTK_USB_XDBC is not set +# CONFIG_DEBUG_TLBFLUSH is not set +CONFIG_HAVE_MMIOTRACE_SUPPORT=y +# CONFIG_X86_DECODER_SELFTEST is not set +CONFIG_IO_DELAY_0X80=y +# CONFIG_IO_DELAY_0XED is not set +# 
CONFIG_IO_DELAY_UDELAY is not set +# CONFIG_IO_DELAY_NONE is not set +# CONFIG_DEBUG_BOOT_PARAMS is not set +# CONFIG_CPA_DEBUG is not set +# CONFIG_DEBUG_ENTRY is not set +# CONFIG_DEBUG_NMI_SELFTEST is not set +# CONFIG_X86_DEBUG_FPU is not set +# CONFIG_PUNIT_ATOM_DEBUG is not set +# CONFIG_UNWINDER_ORC is not set +CONFIG_UNWINDER_FRAME_POINTER=y +# end of x86 Debugging + +# +# Kernel Testing and Coverage +# +# CONFIG_KUNIT is not set +# CONFIG_NOTIFIER_ERROR_INJECTION is not set +# CONFIG_FAULT_INJECTION is not set +CONFIG_ARCH_HAS_KCOV=y +CONFIG_CC_HAS_SANCOV_TRACE_PC=y +# CONFIG_KCOV is not set +# CONFIG_RUNTIME_TESTING_MENU is not set +CONFIG_ARCH_USE_MEMTEST=y +# CONFIG_MEMTEST is not set +# end of Kernel Testing and Coverage + +# +# Rust hacking +# +# end of Rust hacking +# end of Kernel hacking diff --git a/cosign.pub b/cosign.pub new file mode 100644 index 0000000..daea5ef --- /dev/null +++ b/cosign.pub @@ -0,0 +1,4 @@ +-----BEGIN PUBLIC KEY----- +MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAElWFSLKLosBrdjfuF8ZS6U01Ufky4 +zNeVPCkA6HEJ/oe634fRqwFxkXKGWg03eGFSnlwRxnUxN2+duXQSsR0pzQ== +-----END PUBLIC KEY----- diff --git a/docs/advanced.md b/docs/advanced.md new file mode 100644 index 0000000..c05b8b5 --- /dev/null +++ b/docs/advanced.md @@ -0,0 +1,103 @@ +# Advanced flows + +`banger vm run` covers the common sandbox case. This doc is for the +rest: scripting, arbitrary images, custom rootfs stacks, long-lived +guest processes. + +Host-side assumption for everything below: the supported runtime model +is still the two-service `systemd` install: + +- `bangerd.service` running as the owner user +- `bangerd-root.service` running as the privileged host helper + +These advanced flows widen what you do with banger, not which host +init systems or privilege model are supported. + +## `vm create` — the low-level primitive + +Use when you want to provision without starting, or when you need to +script VM creation piecewise. 
+ +```bash +banger vm create --image debian-bookworm --name testbox --no-start +banger vm start testbox +banger vm ssh testbox +banger vm stop testbox +banger vm delete testbox +``` + +Sweep every non-running VM (stopped, created, error) with: + +```bash +banger vm prune # interactive confirmation +banger vm prune -f # skip the prompt +``` + +`vm create` is synchronous by default, but on a TTY it shows live +progress until the VM is fully ready. + +## `image pull <ref>` — arbitrary container images + +For images outside banger's catalog, pull from any OCI registry: + +```bash +banger image pull docker.io/library/alpine:3.20 --kernel-ref generic-6.12 +``` + +Layers are flattened, ownership is fixed (setuid binaries, root-owned +config preserved), banger's guest agents are injected, and a first-boot +systemd service installs `openssh-server` via the guest's package +manager so the VM is reachable on first boot. + +See [`docs/oci-import.md`](oci-import.md) for supported distros, +caveats, and the `internal/imagepull` design. + +## `image register` — existing host-side stack + +If you already have an ext4 rootfs, a kernel, optional initrd, and +optional modules as files on disk: + +```bash +banger image register --name base \ + --rootfs /abs/path/rootfs.ext4 \ + --kernel-ref generic-6.12 +``` + +You can mix `--kernel-ref` (a cataloged kernel) with `--rootfs` from +disk, or pass `--kernel /abs/path/vmlinux` for a one-off kernel. + +For reproducible custom images, write a Dockerfile and publish it to +an image catalog. See [`docs/image-catalog.md`](image-catalog.md). + +## Workspace primitive + +`vm run ./repo` (see README) handles the common case. For a manual +flow against an already-running VM, `vm workspace prepare` +materialises a local git checkout into the guest: + +```bash +banger vm workspace prepare ./other-repo --guest-path /root/repo +``` + +Default guest path is `/root/repo`; default mode is a shallow +metadata copy plus a tracked-files overlay. 
Untracked files are +skipped by default — pass `--include-untracked` to ship untracked +non-ignored files too. Pass `--dry-run` to list the exact file set +without touching the guest. For repositories with submodules, pass +`--mode full_copy`. + +## Inspecting boot failures + +When a VM's create flow errors ("ssh did not come up within 90s" or +similar), the VM is kept alive for inspection: + +- `banger vm logs <name>` — the firecracker serial console output, + the best window into a stuck boot (systemd unit failures, kernel + panics, missing modules). +- `banger vm ports <name>` — what's listening in the guest. Works as + long as banger's vsock agent has come up, even if SSH is wedged. +- `banger vm show <name>` — daemon-side state (IP, PID, overlay + paths). + +`--rm` on `vm run` intentionally does NOT fire when the initial ssh +wait times out, so the VM stays around for post-mortem. diff --git a/docs/config.md b/docs/config.md new file mode 100644 index 0000000..ad980b2 --- /dev/null +++ b/docs/config.md @@ -0,0 +1,153 @@ +# Config reference + +banger reads `~/.config/banger/config.toml` at daemon start; every key is +optional. Defaults are applied for anything you omit. Path: see also +[docs/privileges.md](privileges.md) > Filesystem mutations. + +--- + +## Top-level keys + +| Key | Type | Default | Description | +|-----|------|---------|-------------| +| `log_level` | string | `"info"` | Daemon log verbosity; overridden at runtime by `BANGER_LOG_LEVEL`. Accepted values are the standard slog levels: `debug`, `info`, `warn`, `error`. | +| `firecracker_bin` | string | auto-detected from `PATH` | Path to the `firecracker` binary. Accepts absolute paths or `~/`-anchored paths. If unset, banger resolves `firecracker` on `PATH` at startup. | +| `jailer_bin` | string | `"/usr/bin/jailer"` | Path to the Firecracker `jailer` binary used to sandbox each VM process. | +| `jailer_enabled` | bool | `true` | When `false`, VMs are launched directly without the jailer. 
Disabling the jailer removes the seccomp/namespace sandbox; only for debugging or environments where jailer is unavailable. | +| `jailer_chroot_base` | string | `"/jail"` | Base directory under which the jailer creates per-VM chroot trees. Must be on the same filesystem as the image store to allow hard-linking without crossing device boundaries. | +| `ssh_key_path` | string | `"/ssh/id_ed25519"` (auto-generated) | Host SSH key used to reach guest VMs. Accepts absolute paths or `~/`-anchored paths; `~/foo` expands against `$HOME`. Relative paths are rejected. If unset, banger auto-generates an ed25519 keypair on first start. | +| `default_image_name` | string | `"debian-bookworm"` | Image used when `--image` is omitted from `vm run` / `vm create`. The named image is auto-pulled from the catalog if not already local. | +| `auto_stop_stale_after` | duration | `"0"` (disabled) | If non-zero, the daemon automatically stops VMs that have not been touched within this duration. Accepts Go duration strings (`"24h"`, `"2h30m"`). | +| `stats_poll_interval` | duration | `"10s"` | How often the daemon collects CPU and memory stats for running VMs. Accepts Go duration strings (`"30s"`, `"1m"`). | +| `bridge_name` | string | `"br-fc"` | Name of the Linux bridge device banger creates for the VM network. | +| `bridge_ip` | string | `"172.16.0.1"` | IP address assigned to the host side of the bridge (the gateway VMs see). | +| `cidr` | string | `"24"` | Prefix length for the VM subnet (combined with `bridge_ip` to define the network, e.g. `172.16.0.0/24`). | +| `tap_pool_size` | int | `4` | Number of TAP network devices pre-allocated in the pool. Increase if you routinely run more concurrent VMs than this value. | +| `default_dns` | string | `"1.1.1.1"` | DNS resolver address advertised to guest VMs via DHCP. | + +--- + +## `[vm_defaults]` + +The optional `[vm_defaults]` block sets the sizing floor for every new VM. 
+When a key is omitted (or zero), banger falls back to host-derived heuristics +and then to built-in constants. `banger doctor` prints the effective defaults +with their provenance. + +| Key | Type | Default | Description | +|-----|------|---------|-------------| +| `vcpu` | int | host heuristic (≈ `cpus/4`, max 4) | Number of vCPUs assigned to each new VM. Must be ≥ 0; 0 means "let banger decide." | +| `memory_mib` | int | host heuristic (≈ `ram/8`, max 8192) | RAM in mebibytes assigned to each new VM. Must be ≥ 0; 0 means "let banger decide." | +| `disk_size` | string | `"8G"` | Size of the per-VM work disk. Accepts K/M/G suffixes (`"16G"`, `"512M"`). Maximum is 128 GiB. | +| `system_overlay_size` | string | `"8G"` | Size of the copy-on-write overlay layered over the read-only root filesystem. Accepts K/M/G suffixes. Maximum is 128 GiB. | + +--- + +## `[[file_sync]]` + +Each `[[file_sync]]` entry copies a file or directory from the host into +the VM's work disk at `vm create` time. You may declare any number of +entries; the default is none. Missing host paths are skipped with a warning +rather than failing the create. + +| Key | Type | Default | Description | +|-----|------|---------|-------------| +| `host` | string | **required** | Source path on the host. Must be absolute or `~/`-anchored, and must resolve inside the installed owner's home directory. Top-level symlinks are followed only when their target stays inside that home. | +| `guest` | string | **required** | Destination path inside the VM. Must be absolute or `~/`-anchored, and must resolve under `/root` (the work disk mount point). | +| `mode` | string | `"0600"` for files, `"0755"` for directories | Unix permission bits applied to the destination. Must be a 3- or 4-digit octal string (`"0755"`, `"600"`). | + +--- + +## Example + +A fully annotated `config.toml` showing every section. Omit any key to keep +the built-in default. 
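One rule above is easy to trip over: a `[[file_sync]]` `mode` must be a 3- or 4-digit octal string. A minimal shell sketch of that check (`valid_mode` is an illustrative helper, not part of banger):

```shell
# Accept only 3- or 4-digit octal permission strings, mirroring the
# [[file_sync]] "mode" rule above. Illustrative helper, not banger code.
valid_mode() {
  case "$1" in
    [0-7][0-7][0-7] | [0-7][0-7][0-7][0-7]) return 0 ;;
    *) return 1 ;;
  esac
}

valid_mode 0755 && echo "0755 accepted"
valid_mode 600 && echo "600 accepted"
valid_mode 99 || echo "99 rejected"
```

banger performs the authoritative validation when the daemon loads the config; a check like this is only a convenience before a restart.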
+ +```toml +# ~/.config/banger/config.toml + +# ── Binaries ────────────────────────────────────────────────────────────────── + +# Override the auto-resolved firecracker binary. +# firecracker_bin = "/usr/local/bin/firecracker" + +# Override the default jailer binary path. +# jailer_bin = "/usr/bin/jailer" + +# Disable the jailer (removes seccomp/namespace sandbox — debug only). +# jailer_enabled = false + +# Base directory for per-VM jailer chroot trees. +# jailer_chroot_base = "/var/lib/banger/jail" + +# ── Identity ────────────────────────────────────────────────────────────────── + +# SSH key used to reach VMs. Auto-generated as an ed25519 key if unset. +# ssh_key_path = "~/.local/state/banger/ssh/id_ed25519" + +# Default image for `vm run` / `vm create` when --image is omitted. +# default_image_name = "debian-bookworm" + +# ── Logging ─────────────────────────────────────────────────────────────────── + +# Daemon log verbosity: debug | info | warn | error +# log_level = "info" + +# ── Lifecycle ───────────────────────────────────────────────────────────────── + +# Automatically stop VMs not touched within this window. 0 disables auto-stop. +# auto_stop_stale_after = "24h" + +# How often to collect CPU/memory stats for running VMs. +# stats_poll_interval = "10s" + +# ── Networking ──────────────────────────────────────────────────────────────── + +# Name of the Linux bridge device created for the VM network. +# bridge_name = "br-fc" + +# Host-side IP address of the bridge (the gateway VMs see). +# bridge_ip = "172.16.0.1" + +# Subnet prefix length combined with bridge_ip. +# cidr = "24" + +# TAP device pool size — increase if you run more concurrent VMs than this. +# tap_pool_size = 4 + +# DNS resolver advertised to guests. +# default_dns = "1.1.1.1" + +# ── VM sizing defaults ──────────────────────────────────────────────────────── + +[vm_defaults] +# vCPUs per VM. 0 = let banger decide from host heuristics. +vcpu = 2 + +# RAM in MiB per VM. 
0 = let banger decide from host heuristics. +memory_mib = 2048 + +# Work disk size (K/M/G suffix). Max 128G. +disk_size = "8G" + +# Copy-on-write overlay over the root filesystem (K/M/G suffix). Max 128G. +system_overlay_size = "8G" + +# ── Host → guest file copies ────────────────────────────────────────────────── + +# Copy an entire directory (recursive). +[[file_sync]] +host = "~/.aws" +guest = "~/.aws" + +# Copy a single file with explicit permissions. +[[file_sync]] +host = "~/.config/gh/hosts.yml" +guest = "~/.config/gh/hosts.yml" + +# Copy a script and make it executable. +[[file_sync]] +host = "~/bin/my-script" +guest = "~/bin/my-script" +mode = "0755" +``` diff --git a/docs/dns-routing.md b/docs/dns-routing.md new file mode 100644 index 0000000..5321327 --- /dev/null +++ b/docs/dns-routing.md @@ -0,0 +1,161 @@ +# DNS routing — resolving `.vm` hostnames from the host + +banger's owner daemon runs a local DNS server on `127.0.0.1:42069` that +answers queries under the `.vm` zone. Every VM you create gets a +record: + +``` +devbox.vm → 172.16.0.9 (whatever guest IP it was assigned) +``` + +With that plus host-side DNS routing, you can: + +```bash +ssh root@devbox.vm +curl http://devbox.vm:3000 +``` + +from anywhere on the host without copy-pasting guest IPs. + +## Supported path + +The supported host-side path is: + +- `systemd` on the host +- `bangerd.service` running as the owner user +- `bangerd-root.service` running as the privileged host helper +- `systemd-resolved` handling `.vm` routing via `resolvectl` + +If you're on a non-`systemd` host or a host without `systemd-resolved`, +the recipes below are best-effort guidance, not the primary supported +deployment model. + +## systemd-resolved hosts — nothing to configure + +If your host uses `systemd-resolved` (most modern Linux desktops — +Ubuntu ≥18.04, Fedora, Arch with the service enabled), banger +auto-wires it. 
When the banger services start, the owner daemon asks +the root helper to apply the equivalent of: + +``` +sudo resolvectl dns 127.0.0.1:42069 +sudo resolvectl domain ~vm +sudo resolvectl default-route no +``` + +against the banger bridge (`br-fc` by default). systemd-resolved +routes only `.vm` lookups to banger's DNS; everything else goes to +your normal upstream. No other changes needed. + +Verify: `resolvectl status br-fc` should list `127.0.0.1:42069` under +**Current DNS Server** and `~vm` under **DNS Domain**. + +Stopping or uninstalling the services reverts the bridge's +`resolvectl` state on shutdown: + +```bash +sudo banger daemon stop +sudo banger system uninstall +``` + +## Non-systemd-resolved hosts + +banger detects `resolvectl`'s absence and skips the auto-wire. You +configure your own resolver. Below are recipes for the common cases. +They can be useful in local experiments, but this is outside banger's +supported host/runtime path. + +In every case the goal is the same: **route `.vm` queries to +`127.0.0.1` port `42069`, leave everything else alone**. + +### dnsmasq + +Add a stanza to your dnsmasq config (e.g. +`/etc/dnsmasq.d/banger-vm.conf`): + +``` +server=/vm/127.0.0.1#42069 +``` + +Reload dnsmasq (`sudo systemctl reload dnsmasq` or equivalent) and +test: + +``` +dig devbox.vm +``` + +### NetworkManager with dnsmasq plugin + +Same file as above; NetworkManager picks it up automatically if it's +configured to use the dnsmasq plugin (`dns=dnsmasq` in +`/etc/NetworkManager/NetworkManager.conf`). Restart NetworkManager +after editing. + +### Raw `/etc/resolv.conf` + +If you edit `resolv.conf` directly, there's no per-domain routing — +you'd have to point ALL DNS through banger, which you probably don't +want. Install `dnsmasq` instead and use the stanza above. + +### macOS (if you ever run banger on a Linux VM hosted on macOS) + +macOS supports per-TLD resolvers out of the box. 
Create +`/etc/resolver/vm` (as root): + +``` +nameserver 127.0.0.1 +port 42069 +``` + +No daemon reload needed — `scutil --dns` should list `.vm` under +"Resolver configurations" immediately. + +### Windows/WSL + +WSL2 inherits the Windows resolver by default and cannot be told to +route `.vm` anywhere. Options: + +1. Run banger inside WSL but resolve manually: `ssh root@172.16.0.9`. +2. Set up `dnsmasq` on the WSL distro and point its resolv.conf at + it; then follow the dnsmasq recipe above. + +## Verifying the DNS server + +Regardless of host-side routing, you can always query banger's DNS +server directly: + +```bash +dig @127.0.0.1 -p 42069 devbox.vm +``` + +Returns the guest IP if the VM is running. If it returns NXDOMAIN, +the VM either doesn't exist under that name or isn't running yet. + +`banger vm list` shows the VM names banger knows about. + +## Troubleshooting + +- **`resolvectl` errors about "system has not been booted with systemd + as init system"** — you're probably inside a container or on a + non-`systemd` host. Manual resolver setup may still work, but that's + outside the supported path. +- **Port 42069 already in use** — another daemon is bound there + (previous banger instance not shut down cleanly, or an unrelated + app). `ss -ulpn | grep 42069` shows who. `sudo banger daemon stop` + stops both banger services and cleans up banger's own listener. +- **`devbox.vm` resolves but SSH hangs** — DNS is fine; the VM + might not be up yet or the bridge NAT is misconfigured. + `banger vm ssh devbox` uses the guest IP directly and bypasses + DNS — try that to isolate. +- **Changes to `default_dns` don't affect `.vm` resolution** — + `default_dns` is the upstream the GUEST uses; it's unrelated to + host-side `.vm` routing. + +## Port and bridge tuning + +| Setting | Default | Notes | +|---|---|---| +| DNS listen addr | `127.0.0.1:42069` | Not configurable in v1. Edit `internal/vmdns/server.go` if you really need to change it. 
| +| Bridge name | `br-fc` | Configurable via `bridge_name` in `~/.config/banger/config.toml`. | +| Bridge IP | `172.16.0.1` | Configurable via `bridge_ip`. | +| Resolver route domain | `~vm` | Not configurable. | diff --git a/docs/image-catalog.md b/docs/image-catalog.md new file mode 100644 index 0000000..a0d81ac --- /dev/null +++ b/docs/image-catalog.md @@ -0,0 +1,123 @@ +# Image catalog + +The image catalog ships pre-built banger rootfs bundles so users don't +have to register or build anything. It's the fast path behind +`banger vm run` (auto-pull) and `banger image pull `. The +catalog is embedded into the banger binary and updated each release. + +End-user flow: + +```bash +banger image pull debian-bookworm # explicit +banger vm run --name sandbox # implicit (auto-pulls) +``` + +## Architecture + +Two parts — the same shape as the kernel catalog: + +1. **`internal/imagecat/catalog.json`** — JSON manifest embedded into + the banger binary via `go:embed`. Each entry: name, distro, arch, + kernel_ref (a `kernelcat` entry name), tarball URL, tarball + sha256, size. + +2. **Tarballs at `https://images.thaloco.com/`** — Cloudflare R2 + bucket `banger-images`, fronted by a public custom domain. Each + tarball is `--.tar.zst` (content- + addressed filename so CDN edge cache can never serve stale bytes + for the URL the catalog points at). Contents at the archive root: + `rootfs.ext4` (finalized: flattened + ownership-fixed + agent- + injected at build time) and `manifest.json`. + +The `banger image pull` bundle path streams the tarball, verifies +sha256 against the catalog entry, extracts both files into a staging +dir, resolves the kernel via `kernel_ref` (auto-pulling from +`kernelcat` if the user hasn't pulled it yet), stages boot artifacts +alongside, and registers the result as a managed image. 
+ +The same `image pull` command transparently falls through to the +existing OCI-pull path when `` doesn't match a catalog entry — +see [`docs/oci-import.md`](oci-import.md). + +## Adding or updating an entry + +The repo has no CI for bundle publishing yet. Catalog updates are +manual. + +```bash +# 1. Build the bundle + upload + patch catalog.json in one shot. +scripts/publish-golden-image.sh + +# 2. Review and commit the catalog change. +git diff -- internal/imagecat/catalog.json +git add internal/imagecat/catalog.json +git commit -m 'imagecat: publish debian-bookworm' + +# 3. Rebuild so the new catalog is embedded. +make build +``` + +`scripts/publish-golden-image.sh` wraps `scripts/make-golden-bundle.sh` +(which runs `docker build` on `images/golden/Dockerfile` then pipes +`docker export` into `banger internal make-bundle`), computes the +bundle's sha256, uses the first 12 hex chars as a cache-busting +filename suffix, uploads via `rclone` to R2, HEAD-checks the public +URL, and patches `internal/imagecat/catalog.json`. + +Environment overrides if the defaults need to change: +`RCLONE_REMOTE`, `RCLONE_BUCKET`, `BASE_URL`. + +`--skip-upload` builds the bundle into `dist/` and stops — useful for +local testing without touching R2 or the catalog. + +## Bundle format + +A bundle is a tar+zstd archive with exactly two entries at the root: + +``` +rootfs.ext4 # finalized banger rootfs +manifest.json # {name, distro, arch, kernel_ref, description} +``` + +`rootfs.ext4` is fully prepared at build time: ownership fixed via +`debugfs sif`, banger guest agents (vsock agent, network bootstrap, +first-boot unit) already injected and enabled in +`multi-user.target.wants`. The pull path only has to place the file +and register the image — no mkfs, no ownership pass, no injection on +the daemon host. + +## Removing an entry + +1. Remove the entry from `internal/imagecat/catalog.json` and commit. +2. 
Delete the tarball from R2: + `rclone delete banger-images:banger-images/--.tar.zst`. +3. Rebuild banger. + +Already-pulled local images are not invalidated — users keep using +them until they run `banger image delete `. + +## Versioning conventions + +- **Entry names**: `-` (e.g. `debian-bookworm`). + Per-release names make it trivial to publish `debian-trixie` + alongside without collisions. +- **Content-addressed filenames**: the `-` suffix is + mandatory (set by `publish-golden-image.sh`). Never reuse a URL for + different bytes. +- **Architecture**: `x86_64` only today. The `arch` field is additive + — adding `arm64` is a config change, not a schema change. + +## Trust model + +Same as the kernel catalog: the embedded `catalog.json` carries each +bundle's sha256, and `imagecat.Fetch` rejects any download whose hash +doesn't match. This protects against transport corruption and against +an attacker swapping an R2 object without landing a commit in the +banger repo. GPG/sigstore signing is deferred until banger is public +and the threat model justifies the operational overhead. + +## Hosting + +Tarballs live in Cloudflare R2 (bucket `banger-images`), served at +`images.thaloco.com`. The bucket is publicly readable; writes require +the R2 API token configured on the `banger-images` rclone remote. diff --git a/docs/kernel-catalog.md b/docs/kernel-catalog.md new file mode 100644 index 0000000..7bfea51 --- /dev/null +++ b/docs/kernel-catalog.md @@ -0,0 +1,142 @@ +# Kernel catalog + +The kernel catalog ships pre-built Firecracker-ready kernel bundles so users +don't have to compile anything. The catalog is embedded into the banger +binary and updated each release. + +End-user flow: + +```bash +banger kernel list --available # browse the catalog +banger kernel pull generic-6.12 # download a bundle (no sudo, no make) +banger image register --name myimg --rootfs … --kernel-ref generic-6.12 +``` + +## Architecture + +Two parts: + +1. 
**`internal/kernelcat/catalog.json`** — a JSON manifest embedded into the + banger binary via `go:embed`. Each entry carries a name, distro, arch, + kernel version, tarball URL, and tarball SHA256. Updating the catalog + means editing this file in the repo and rebuilding banger. + +2. **Tarballs at `https://kernels.thaloco.com/`** — Cloudflare R2 bucket + `banger-kernels`, fronted by a public custom domain. Each tarball is + `-.tar.zst` and contains `vmlinux`, optional `initrd.img`, + and an optional `modules/` tree at the archive root. + +The `banger kernel pull` flow streams the tarball, verifies its SHA256 +against the embedded catalog entry, decompresses it (zstd), extracts it +into `~/.local/state/banger/kernels//`, and writes a manifest. Path +traversal entries and unsafe symlinks are rejected. + +## Kernel types + +**`generic-`** — built from upstream kernel.org sources with +Firecracker's official config. All essential drivers (virtio_blk, +virtio_net, ext4, vsock) compiled in — no modules, no initramfs. This +is the kernel the golden image pairs with and the recommended kernel +for OCI-pulled images. Build with `scripts/make-generic-kernel.sh`. + +## Adding or updating an entry + +The repo has no CI for kernel publishing yet. Catalog updates are manual +and infrequent (kernel version bumps every few weeks at most). + +```bash +# 1. Build the kernel locally. +scripts/make-generic-kernel.sh + +# 2. Import it into the local catalog so the canonical layout exists. +banger kernel import generic-6.12 \ + --from build/manual/generic-kernel \ + --distro generic \ + --arch x86_64 + +# 3. Package, upload, patch catalog.json. +scripts/publish-kernel.sh generic-6.12 \ + --description "Generic Firecracker kernel 6.12 (all drivers built-in, no initrd)" + +# 4. Review and commit the catalog change. +git diff -- internal/kernelcat/catalog.json +git add internal/kernelcat/catalog.json +git commit -m 'kernel catalog: add/update generic-6.12' + +# 5. 
Rebuild so the new catalog is embedded. +make build +``` + +`scripts/publish-kernel.sh` reads the locally-imported entry under +`~/.local/state/banger/kernels//`, builds a tar+zstd archive, uploads +it to R2 via `rclone`, HEAD-checks the public URL, and patches +`internal/kernelcat/catalog.json` with the new URL, SHA256, and size. + +Environment overrides if the defaults need to change: +`RCLONE_REMOTE`, `RCLONE_BUCKET`, `BASE_URL`, `BANGER_KERNELS_DIR`. + +## Removing an entry + +1. Delete the line from `internal/kernelcat/catalog.json` and commit. +2. Delete the tarball from R2: `rclone delete r2:banger-kernels/-.tar.zst`. +3. Rebuild banger. + +Already-pulled local copies on user machines are not invalidated — they +keep working until the user runs `banger kernel rm `. That's +intentional: pulling is idempotent, removing should not break anyone in +the middle of a workflow. + +## Versioning conventions + +- **Entry names**: `-` (e.g. `generic-6.12`). + The major.minor is the kernel line. Patch-level bumps reuse the + entry name and replace the tarball; minor bumps create a new entry + (`generic-6.13`). +- **Architecture**: only `x86_64` is published today. The `arch` field in + the catalog schema is additive — adding `arm64` later is a config + change, not a schema change. +- **Tarball layout**: contents at the archive root (no top-level + versioned directory). `vmlinux` is required; `initrd.img` and + `modules/` are optional. Symlinks inside `modules/` are allowed but + must resolve within the archive. + +## Trust model + +The embedded `catalog.json` carries the SHA256 of each tarball. `banger +kernel pull` rejects any download whose hash doesn't match. This protects +against transport corruption and against an attacker swapping a tarball +on R2 without also pushing a banger release. + +It does **not** protect against a compromise of the banger source repo +itself — an attacker who can land a commit can change both the catalog +SHA256 and the tarball. 
GPG/sigstore signing of the published catalog +tarballs is deferred until banger is public and the threat model +justifies the operational overhead. + +Upstream kernel sources *are* verified: `scripts/make-generic-kernel.sh` +fetches the detached PGP signature alongside the tarball from +kernel.org and rejects the build if gpg can't verify it against one +of the three known release signing keys (Greg KH / Linus / Sasha +Levin). So a compromised kernel.org mirror can't slip a backdoored +tarball past a maintainer rebuilding the kernel locally. + +## Hosting + +Tarballs live in Cloudflare R2 (bucket `banger-kernels`), served at the +custom domain `kernels.thaloco.com`. The bucket is publicly readable; +writes require the `banger-kernels-publish` API token (kept locally, +never committed). R2's free tier covers the expected traffic comfortably +(zero egress fees, generous storage). + +If hosting ever moves, catalog entries can be migrated by reuploading the +tarballs and editing the URLs in `catalog.json` — no other code changes +required. + +## Tech debt + +- Kernel publishing is manual; there is no CI yet. `scripts/make-generic-kernel.sh` + plus `scripts/publish-kernel.sh` is fine while refreshes are + infrequent and maintainer-only. CI becomes relevant once banger + goes public. +- `make lint-shell` runs at `--severity=error` only. Tightening to + `--severity=warning` is a nice-to-have but low priority. diff --git a/docs/oci-import-internals.md b/docs/oci-import-internals.md new file mode 100644 index 0000000..2607aa1 --- /dev/null +++ b/docs/oci-import-internals.md @@ -0,0 +1,46 @@ +# OCI import — internals + +> **Advanced reading.** This document describes implementation details of the +> OCI import pipeline. It is not needed for day-to-day use of +> `banger image pull`. User-facing documentation is in +> [`docs/oci-import.md`](oci-import.md). 
+ +## Architecture + +`internal/imagepull/` owns the mechanics: + +- **`Pull`** wraps `go-containerregistry`'s `remote.Image` with the + `linux/amd64` platform pinned. Layer blobs cache under + `/var/cache/banger/oci/blobs/` (system install) or + `~/.cache/banger/oci/blobs/` (dev mode) and populate lazily during + flatten. +- **`Flatten`** replays layers oldest-first into a staging directory, + applies whiteouts, rejects unsafe paths plus filenames that banger's + debugfs ownership fixup cannot encode safely. Returns a `Metadata` + map of per-file uid/gid/mode from tar headers. +- **`BuildExt4`** runs `mkfs.ext4 -F -d -E root_owner=0:0` + at the size of the pre-truncated file — no mount, no sudo, no + loopback. Requires `e2fsprogs ≥ 1.43`. +- **`ApplyOwnership`** streams a batched `set_inode_field` script to + `debugfs -w` to rewrite per-file uid/gid/mode to the captured tar- + header values. +- **`InjectGuestAgents`** uses the same `debugfs` scripting to drop + banger's guest assets into the ext4 with root ownership: + vsock agent binary, network bootstrap + unit, first-boot script + + unit, `multi-user.target.wants` symlinks, vsock modules-load + config, `/var/lib/banger/first-boot-pending` marker. + +`internal/daemon/images_pull.go` orchestrates `pullFromOCI`: + +1. Parse + validate the OCI ref, derive a default name when `--name` + is omitted (`debian-bookworm` from + `docker.io/library/debian:bookworm`). +2. Resolve kernel info via `resolveKernelInputs` (auto-pulls from + `kernelcat` if `--kernel-ref` names a catalog entry that isn't + yet local). +3. Stage at `/.staging`; extract layers to a temp + tree under `$TMPDIR`. +4. `BuildExt4` → `ApplyOwnership` → `InjectGuestAgents`. +5. `imagemgr.StageBootArtifacts` stages the kernel triple alongside. +6. Atomic `os.Rename` publishes the artifact dir. +7. Persist a `model.Image{Managed: true, …}` record. 
diff --git a/docs/oci-import.md b/docs/oci-import.md new file mode 100644 index 0000000..841aed7 --- /dev/null +++ b/docs/oci-import.md @@ -0,0 +1,135 @@ +# OCI import (`banger image pull`) + +`banger image pull` has two paths. The primary one — catalog bundle — +is documented in [`docs/image-catalog.md`](image-catalog.md). This +doc covers the fallthrough: OCI-registry pull for arbitrary container +images. + +## When to use it + +Use the OCI path when you need a distro or image that isn't in the +catalog. The catalog covers the common happy path +(`debian-bookworm`); anything else (`alpine`, `fedora`, `ubuntu`, +custom corporate images) goes through OCI pull. + +```bash +banger image pull docker.io/library/alpine:3.20 --kernel-ref generic-6.12 +banger image pull ghcr.io/myorg/devimg:v2 --kernel-ref generic-6.12 +``` + +`banger image pull` dispatches based on the reference: + +- `banger image pull debian-bookworm` → catalog (fast path). +- `banger image pull docker.io/library/foo:bar` → OCI (anything not + in the catalog). + +## What works + +- Any public OCI image that exposes a `linux/amd64` manifest. +- Correct layer replay with whiteout semantics (`.wh.*` deletes, + `.wh..wh..opq` opaque-dir markers). +- Path-traversal, debugfs-hostile filename, and relative-symlink-escape protection. +- Content-aware default sizing (`content × 1.5`, floor 1 GiB). +- Layer caching on disk, keyed by blob sha256. +- **Ownership preservation** — tar-header uid/gid/mode captured + during flatten, applied to the ext4 via a `debugfs` pass, so + setuid binaries (`sudo`, `passwd`) and root-owned config + (`/etc/shadow`, `/etc/sudoers`) end up correctly owned. +- **Pre-injected banger agents** — the pulled ext4 ships with + `banger-vsock-agent`, `banger-network.service`, and the + `banger-first-boot` unit already enabled. +- **First-boot sshd install** — a one-shot systemd service installs + `openssh-server` via the guest's package manager on first boot. 
+ Dispatches on `/etc/os-release` → `apt-get` / `apk` / `dnf` / + `pacman` / `zypper`. Subsequent boots skip the install. + +## What doesn't yet work + +- **Private registries**. Anonymous pulls only. Docker Hub, GHCR + (public), quay.io (public) all work. Adding auth via + `authn.DefaultKeychain` (from `go-containerregistry`) is a cheap + follow-up when someone needs it. +- **Non-`linux/amd64`**. The kernel catalog is x86_64-only, so pulled + rootfses match. `arm64` is additive in the schema. +- **Non-systemd rootfses**. The injected units assume systemd as + PID 1. Alpine ≥3.20 ships systemd; older alpine + void + busybox- + init images won't honour the banger-* units. +- **First boot needs network access**. The first-boot sshd install + reaches out to the distro's package repo. VMs without NAT or + without the bridge reaching the internet time out. The marker file + stays in place so a later restart retries. + +## Architecture + +> Implementation details live in [`docs/oci-import-internals.md`](oci-import-internals.md). + +## Guest-side boot sequence + +On first boot of a pulled image: + +1. **`banger-network.service`** — brings the guest interface up with + the IP assigned by banger's VM-create lifecycle. +2. **`banger-first-boot.service`** (first boot only) — reads + `/etc/os-release`, dispatches to the native package manager, + installs `openssh-server`, enables `ssh.service`. +3. **`banger-vsock-agent.service`** — the health-check daemon banger + uses to confirm the VM is alive. + +Subsequent boots skip step 2. + +## Adding distro support to first-boot + +`internal/imagepull/assets/first-boot.sh` is the POSIX-sh dispatch. +Add a new `ID=` branch and its install command, then rebuild banger +(the asset is `go:embed`-ed). + +Supported `ID` values today: `debian`, `ubuntu`, `kali`, `raspbian`, +`linuxmint`, `pop`, `alpine`, `fedora`, `rhel`, `centos`, `rocky`, +`almalinux`, `arch`, `archlinux`, `manjaro`, `opensuse*`, `suse`. 
+  Dispatches on `/etc/os-release` → `apt-get` / `apk` / `dnf` /
+  `pacman` / `zypper`. Subsequent boots skip the install.
+
+## What doesn't yet work
+
+- **Private registries**. Anonymous pulls only. Docker Hub, GHCR
+  (public), quay.io (public) all work. Adding auth via
+  `authn.DefaultKeychain` (from `go-containerregistry`) is a cheap
+  follow-up when someone needs it.
+- **Non-`linux/amd64`**. The kernel catalog is x86_64-only, so pulled
+  rootfses match. `arm64` is additive in the schema.
+- **Non-systemd rootfses**. The injected units assume systemd as
+  PID 1. Alpine (OpenRC), Void (runit), and other busybox-init
+  images won't honour the banger-* units.
+- **First boot needs network access**. The first-boot sshd install
+  reaches out to the distro's package repo. VMs without NAT or
+  without the bridge reaching the internet time out. The marker file
+  stays in place so a later restart retries.
+
+## Architecture
+
+> Implementation details live in [`docs/oci-import-internals.md`](oci-import-internals.md).
+
+## Guest-side boot sequence
+
+On first boot of a pulled image:
+
+1. **`banger-network.service`** — brings the guest interface up with
+   the IP assigned by banger's VM-create lifecycle.
+2. **`banger-first-boot.service`** (first boot only) — reads
+   `/etc/os-release`, dispatches to the native package manager,
+   installs `openssh-server`, enables `ssh.service`.
+3. **`banger-vsock-agent.service`** — the health-check daemon banger
+   uses to confirm the VM is alive.
+
+Subsequent boots skip step 2.
+
+## Adding distro support to first-boot
+
+`internal/imagepull/assets/first-boot.sh` is the POSIX-sh dispatch.
+Add a new `ID=` branch and its install command, then rebuild banger
+(the asset is `go:embed`-ed).
+
+Supported `ID` values today: `debian`, `ubuntu`, `kali`, `raspbian`,
+`linuxmint`, `pop`, `alpine`, `fedora`, `rhel`, `centos`, `rocky`,
+`almalinux`, `arch`, `archlinux`, `manjaro`, `opensuse*`, `suse`.
+Unknown distros fall back to `ID_LIKE`, then error cleanly. + +## Paths + +Paths below assume the system install (`banger system install`). When +running `bangerd` directly without the helper, the same files live +under `~/.cache/banger/` and `~/.local/state/banger/` instead. + +| What | Where | +|------|-------| +| Layer blob cache | `/var/cache/banger/oci/blobs/sha256/` | +| Staging dir | `/var/lib/banger/images/.staging/` | +| Extraction scratch | `$TMPDIR/banger-pull-/` | +| Published image | `/var/lib/banger/images//rootfs.ext4` | + +## Cache lifecycle + +OCI layer blobs accumulate as you pull images. Banger flattens every +pull into a self-contained ext4, so the cache is purely a re-pull +avoidance — losing it only costs network round-trips on the next +pull of the same image. Reclaim disk with: + +``` +banger image cache prune --dry-run # report size only +banger image cache prune # remove every cached blob +``` + +Run with the daemon idle; an in-flight pull racing against prune may +fail and need a retry. + +## Tech debt + +- **Auth**. When we add private-registry support, the natural path + is `authn.DefaultKeychain`, which honours `~/.docker/config.json` + and the standard credential helpers. +- **Non-systemd rootfses**. The guest agents assume systemd. Adding + openrc / s6 / busybox-init variants means keeping parallel unit + trees keyed on `/etc/os-release`. + +## Trust model + +`image pull` (OCI path) delegates trust to the registry the user +selected. `go-containerregistry` verifies layer digests against the +manifest during download, so a tampered mirror can't ship modified +layers without breaking the sha256 chain. Banger does not verify OCI +image signatures (cosign/sigstore) — users who care should verify +references out-of-band. 
diff --git a/docs/privileges.md b/docs/privileges.md new file mode 100644 index 0000000..51da232 --- /dev/null +++ b/docs/privileges.md @@ -0,0 +1,379 @@ +# Privileges + +This document describes exactly what banger does with the privileges it +asks for, what runs where, and how to undo it. The aim is to give a +reader enough information to grant — or refuse — the privileges with +their eyes open. + +## Two services, two trust boundaries + +`banger system install` lays down two systemd units: + +| Unit | User | Socket | Purpose | +|---|---|---|---| +| `bangerd.service` | owner user (chosen at install) | `/run/banger/bangerd.sock` (0600, owner) | Orchestration: VM/image lifecycle, store, RPC to the CLI. | +| `bangerd-root.service` | `root` | `/run/banger-root/bangerd-root.sock` (0600, owner; root-owned dir at 0711) | Narrow root helper: bridge/tap, DM snapshots, NAT, Firecracker launch. | + +The owner daemon does all the business logic. It never runs as root. +The root helper runs as root but only accepts a fixed list of operations +and rejects every input that isn't a banger-managed path or name. + +The CLI (`banger ...`) talks to the owner daemon. The owner daemon +talks to the root helper for the handful of things only root can do. +Users and CI scripts never call the root helper directly. + +### Why two daemons + +Before this split the owner daemon shelled `sudo` for every device or +network operation. That meant the user's `sudo` config gated daily +work, and an attacker who compromised the owner daemon inherited +arbitrary `sudo` reach. After the split, the owner daemon has no +ambient root. The only way for it to make a privileged change is to +ask the helper, and the helper only honours requests that fit a +specific shape. + +## Authentication + +The root helper: + +- Listens on a Unix socket at `/run/banger-root/bangerd-root.sock`, + mode 0600, owned by the registered owner UID, in a root-owned + runtime dir at 0711. 
+- Reads `SO_PEERCRED` on every accepted connection and rejects any + caller whose UID is not 0 or the owner UID recorded in + `/etc/banger/install.toml`. The match is by UID, not username. +- Decodes one JSON request per connection and dispatches it through a + named-method switch. Unknown methods return `unknown_method`. + +The owner daemon: + +- Listens on `/run/banger/bangerd.sock`, mode 0600, owned by the + install-time owner user. Other host users cannot connect. +- Reads `SO_PEERCRED` on every accepted connection and rejects any + caller whose UID is not 0 or the install-time owner UID. The + filesystem perms already gate access; the peer-cred read is + belt-and-braces in case the socket FD is ever leaked to a + non-owner process. +- Resolves the helper socket path from the install metadata and + retries with backoff if the helper hasn't started yet. + +There is no network listener. Every banger control surface is a Unix +socket on the local host. + +## What the root helper will do, exactly + +The helper exposes a fixed list of RPC methods (see +`internal/roothelper/roothelper.go` for the canonical set). Each is +shaped so the owner daemon can name a banger-managed object but +cannot pass an arbitrary host path or interface name. Every input +that names a path, device, PID, or interface is checked against a +validator before the helper touches the host. + +| Method | Effect | Validation gate | +|---|---|---| +| `priv.ensure_bridge` | Create the configured Linux bridge if missing; assign the bridge IP. | Bridge name must equal `br-fc` or start with `br-fc-` (so a compromised daemon can't drive `ip link` against `eth0` / `docker0` / `lo`). Bridge IP must parse as IPv4. CIDR prefix must be a number in `[8, 32]`. | +| `priv.create_tap` | `ip link add tap NAME tuntap` and add to bridge, owned by the owner user. | Tap name must match `tap-fc-*` or `tap-pool-*`. 
Bridge config (name + IP + CIDR) passes the same banger-managed check as `priv.ensure_bridge`, otherwise the new tap could be `master`-attached to an arbitrary host iface. | +| `priv.delete_tap` | `ip link del NAME`. | Same prefix check on the tap name. | +| `priv.sync_resolver_routing` | `resolvectl dns/domain/default-route` on the configured bridge. | Bridge name must equal `br-fc` or start with `br-fc-` (same banger-managed check). Resolver address must parse via `net.ParseIP`. | +| `priv.clear_resolver_routing` | `resolvectl revert` on the bridge. | Same banger-managed bridge-name check. | +| `priv.ensure_nat` | `iptables -t nat MASQUERADE` for `(guest_ip, tap)` plus matching FORWARD rules; `enable=false` removes them. | Tap must be banger-prefixed. Guest IP must parse as IPv4. | +| `priv.create_dm_snapshot` | Create a `dmsetup` device-mapper snapshot from `rootfs.ext4` with COW backing file. | Both paths must be inside `/var/lib/banger`; DM name must start with `fc-rootfs-`. | +| `priv.cleanup_dm_snapshot` | `dmsetup remove` and `losetup -d` for a snapshot the helper itself just created. | Every non-empty `dmsnap.Handles` field is checked: DM name `fc-rootfs-*`, DM device `/dev/mapper/fc-rootfs-*`, loops `/dev/loopN`. | +| `priv.remove_dm_snapshot` | `dmsetup remove` by target. | Target must be either a `fc-rootfs-*` name or a `/dev/mapper/fc-rootfs-*` path. | +| `priv.fsck_snapshot` | `e2fsck -fy` against the DM device. | DM device path must match `/dev/mapper/fc-rootfs-*`. Exit 1 (filesystem cleaned) is tolerated. | +| `priv.read_ext4_file` | Read a file from inside an ext4 image via `debugfs cat`. | Image path must be inside `/var/lib/banger` or a managed DM device. Guest path is rejected if it contains debugfs-hostile chars (`"`/`\`/newline). | +| `priv.write_ext4_files` | Batch write files into an ext4 image, root:root, mode-controlled. | Same image-path validator. | +| `priv.resolve_firecracker_binary` | Stat and return the firecracker binary path. 
| Path is opened with `O_PATH \| O_NOFOLLOW` (refusing symlinks) and Fstat'd through the resulting fd: must be a regular file, executable, root-owned, not group/world-writable. | +| `priv.launch_firecracker` | Start the firecracker process for a VM (jailer-wrapped). | Socket and vsock paths must be inside `/run/banger`. Log/metrics/kernel/initrd paths must be inside `/var/lib/banger`. Tap name must be banger-prefixed. Drives must be inside the state dir or be a `/dev/mapper/fc-rootfs-*` device. Jailer chroot base must be inside the system state/runtime dirs; jailer UID/GID must equal the registered owner. Binary must pass the same root-owned-executable check. | +| `priv.ensure_socket_access` | `chown` and `chmod 0600` on a firecracker API or vsock socket so the owner user can talk to it. | Path must be inside `/run/banger` and not a symlink. The helper opens it with `O_PATH \| O_NOFOLLOW`, refuses anything that isn't a unix socket, and chmod/chown via the resulting fd (no symlink-follow). The local-priv fallback uses `chown -h`. | +| `priv.cleanup_jailer_chroot` | Detach every mount under the per-VM jailer chroot via direct `umount2(MNT_DETACH \| UMOUNT_NOFOLLOW)` syscalls (deepest-first), then `rm -rf` the tree. | Path must be inside the system state/runtime dirs and not a symlink — including no symlinks at intermediate components (resolved with `EvalSymlinks` and re-checked). `UMOUNT_NOFOLLOW` makes the unmounts symlink-safe even if a path is swapped after validation. A `findmnt` guard refuses to `rm -rf` if any mount remains underneath. | +| `priv.find_firecracker_pid` | Resolve a firecracker PID by API socket path. | Filters to processes whose cmdline mentions the requested API socket. | +| `priv.kill_process` / `priv.signal_process` | Send SIGKILL or a named signal to a PID. | PID must refer to a running process whose `/proc//cmdline` mentions `firecracker`. | +| `priv.process_running` | Check whether a PID is alive (no host mutation). 
| Read-only; same cmdline filter. | + +Anything outside this list returns `unknown_method` and is logged. The +helper does not run a shell, does not exec helper scripts, and does +not accept commands as strings. + +## Filesystem mutations + +Path used | Owner | What is created or changed +---|---|--- +`/etc/banger/install.toml` | root, 0644 | Written once by `banger system install`. Holds owner UID/GID/home, install timestamp, version. Read by both daemons at startup. +`/etc/systemd/system/bangerd.service` | root, 0644 | Owner-daemon unit. Contents are deterministic; see below. +`/etc/systemd/system/bangerd-root.service` | root, 0644 | Root-helper unit. +`/usr/local/bin/banger` | root, 0755 | Copy of the build output. +`/usr/local/bin/bangerd` | root, 0755 | Same binary, second name. +`/usr/local/lib/banger/banger-vsock-agent` | root, 0755 | Companion agent injected into guests at image-pull time. +`/var/lib/banger/...` | owner (via systemd `StateDirectory=banger`), 0700 | Image artifacts, VM dirs, work disks, kernels, OCI cache, SSH key + known_hosts. +`/var/cache/banger/...` | owner, 0700 | Bundle and OCI download cache. +`/run/banger/...` | owner, 0700 | Owner daemon socket and per-VM firecracker API + vsock sockets. +`/run/banger-root/...` | root, 0711 | Root-helper socket dir; the socket itself is 0600. +`~/.config/banger/config.toml` | owner | Optional user config. Read by the owner daemon at startup. + +Outside these directories, banger does not write to the host filesystem +during normal operation. The two exceptions are file-sync (the user +explicitly opts in to copying paths from their home into a guest, which +the owner daemon validates is inside the owner home before reading) +and the install/uninstall actions above. + +### Why the owner home is locked down + +The `[[file_sync]]` config lets users mirror host files into guests. 
+banger refuses to follow paths that escape the owner home, including +through symlinks: + +- `ResolveFileSyncHostPath` (`internal/config/config.go`) expands a + leading `~/` and rejects any candidate that resolves outside the + configured `OwnerHomeDir`. +- `ResolveExistingFileSyncHostPath` re-checks after `EvalSymlinks` so + a symlink inside `~/.aws` that points at `/etc/shadow` cannot leak + out. + +This means an installed banger never reads outside the owner home in +the file-sync path, even if the owner edits config to try. + +## Network mutations + +For each running VM banger creates: + +- One bridge (default `br-fc`, configurable). Created on first VM + start, never deleted automatically. +- One tap interface named `tap-fc-`. Created on VM start, + deleted on VM stop or crash recovery. +- One iptables MASQUERADE rule per VM, only when `--nat` was passed. + Removed by the symmetric `EnsureNAT(enable=false)` call at stop. +- Optionally, `resolvectl` routing entries that send `*.vm` lookups to + banger's in-process DNS server on the bridge. Reverted at stop. + +Banger does not touch UFW, firewalld, or other rule managers. It only +edits the iptables tables it created the rules in. + +## Cleanup and uninstall + +Per-VM cleanup happens at: + +- `banger vm stop ` — stops firecracker, removes the per-VM tap, + drops the NAT rule, removes the DM snapshot, removes per-VM + sockets, leaves the work disk. +- `banger vm delete ` — same as stop, plus deletes the per-VM + state directory under `/var/lib/banger/vms/` (work disk, + metadata). +- `banger vm prune` — bulk version. +- Crash recovery: on daemon start, `reconcile` runs the same teardown + for any VM whose firecracker process is no longer alive. 
+ +System-level uninstall: + +``` +sudo banger system uninstall # remove services, units, binaries +sudo banger system uninstall --purge # also remove /var/lib/banger, + # /var/cache/banger, /run/banger +``` + +Without `--purge`, the state dirs survive so a reinstall can pick up +where the previous one left off. With `--purge`, banger leaves no +files behind under `/var/lib`, `/var/cache`, or `/run`. + +What `uninstall` does, in order: + +1. `systemctl disable --now bangerd.service bangerd-root.service`. +2. Remove `/etc/systemd/system/bangerd.service` and `bangerd-root.service`. +3. Remove `/etc/banger/install.toml` and `/etc/banger/`. +4. `systemctl daemon-reload`. +5. Remove `/usr/local/bin/banger`, `/usr/local/bin/bangerd`, + `/usr/local/lib/banger/`. +6. With `--purge` only: remove the system state, cache, and runtime + dirs. + +What `uninstall` does NOT do automatically: + +- It does not delete the bridge or any iptables rules. Stop your VMs + first (`banger vm prune` or `banger vm stop ` for each VM) so + the per-VM teardown drops them. The bridge itself is intentionally + persistent — a future reinstall reuses it. To remove it manually: + `sudo ip link del br-fc`. +- It does not undo `resolvectl` routing on a bridge that no longer + exists; the entries are harmless if the bridge is gone. +- It does not remove the owner user, the owner's home, or anything + the user wrote into a guest from inside the guest. + +## Updating banger + +`banger update` is a user-triggered, manually-invoked operation. It +never runs in the background and never auto-checks for new releases. + +The flow: + +1. **Discover.** GET `https://releases.thaloco.com/banger/manifest.json` + over HTTPS. The URL is hardcoded in the binary at compile time — + a compromised daemon config can't redirect the updater. Manifest + schema_version gates forward compat: a CLI that doesn't recognise + the server's schema_version refuses to update. +2. **In-flight gate.** `daemon.operations.list` RPC. 
If any operation + is not Done, refuse with the operation list. `--force` overrides. +3. **Download.** Capped GET on the tarball + `SHA256SUMS` (≤ 256 MiB + and ≤ 16 KiB respectively). Tarball is sha256-verified on the fly + against the digest published in `SHA256SUMS`; partial files are + removed on any verification failure. +4. **Cosign signature.** `SHA256SUMS.sig` is fetched (≤ 1 KiB) and + verified against the `BangerReleasePublicKey` embedded in the + running banger binary. The signature is an ECDSA P-256 / SHA-256 + blob signature produced by `cosign sign-blob` — verified by Go's + stdlib `crypto/ecdsa.VerifyASN1`, no third-party crypto deps. A + missing signature URL or a verification failure aborts the update + before any binary is touched. +5. **Sanity-run.** Staged `banger --version` must mention the + expected version; staged `bangerd --check-migrations --system` + must exit 0 (compatible) or 1 (will auto-migrate). Exit 2 + (incompatible — DB has migrations the new binary doesn't know) + aborts the swap; the running install is untouched. +6. **Swap.** Atomic `os.Rename` for each of the three binaries + (banger-vsock-agent → bangerd → banger), with `.previous` backups. +7. **Restart.** `systemctl restart bangerd-root.service` then + `bangerd.service`. Wait for the new daemon socket to answer + `ping`. Running VMs survive the daemon restart — they're each + their own firecracker process and live in `bangerd-root.service`'s + cgroup; restart's `KillMode=control-group` doesn't reach them. + The new daemon's `reconcile` step re-attaches by reading the + per-VM `handles.json` scratch file and verifying the firecracker + process is still alive. +8. **Verify.** Run `banger doctor` against the just-installed CLI. + FAIL triggers auto-rollback: restore `.previous` backups, restart + services again so the OLD binaries take over. The original error + bubbles to the operator; `--force` skips this step. +9. 
**Finalise.** Update `/etc/banger/install.toml`'s Version / + Commit / BuiltAt. Remove `.previous` backups. Wipe the staging + directory under `/var/cache/banger/updates/`. + +What you're trusting in this flow: + +- The cosign **public key** baked into the binary you're updating + FROM. The maintainer rotates it by cutting a new release with a + new key embedded; from then on, only signatures made with the + new private key are accepted. v0.1.x predates a clean rotation + story. +- TLS to `releases.thaloco.com` for transport. The cosign signature + is the actual integrity check; TLS only protects the bytes in + transit, it is not the trust anchor. +- The systemd unit owners (root for the helper, owner for the + daemon). `banger update` requires root because it writes + `/usr/local/bin` and talks to systemctl; it does NOT run via the + helper RPC interface. + +What `banger update` deliberately does NOT do: + +- No background check timers. Operators run `banger update --check` + on a schedule themselves if they want. +- No update across MINOR boundaries without an explicit `--to` + flag. v0.x is pre-stable; we don't promise that v0.1.5 → v0.2.0 + is automatic. +- No state-DB downgrade. Schema migrations are forward-only; + `--check-migrations` refuses to swap a binary that's older than + the running schema. +- No agent re-injection into existing VMs. The vsock agent inside + each VM is the version banger had at image-pull time, not the + current install. v0.1.x doesn't enforce or detect skew here; the + agent's HTTP API is small enough that compat across MINORs is + expected. + +## Running outside the system install + +Everything above describes the supported deployment: `banger system +install` lays down both systemd units and the helper takes over every +privileged operation.
It is also possible to run `bangerd` directly without installing the +helper — the binary still works as a per-user daemon and shells out +to `sudo -n` for each privileged operation it would otherwise hand off +(`iptables`, `ip`, `mount`, `mknod`, `dmsetup`, `e2fsck`, `kill`, +`chown` (plain and `-h`), `chmod`, `losetup`, `firecracker`). +This mode is intended for ad-hoc developer machines while iterating on +banger itself. + +It carries a different trust model: + +- It needs `NOPASSWD` sudoers entries for the developer (otherwise + every VM action prompts for a password). +- Once those entries exist, **any** process running as the developer + can invoke those commands with arbitrary arguments — banger's input + validators only constrain what banger itself sends. They are no + defence against a different program on the same account. +- The helper's `SO_PEERCRED` boundary, the systemd hardening + (`NoNewPrivileges`, `ProtectSystem=strict`, the narrow + `CapabilityBoundingSet`), and the helper's own input validators are + all bypassed. + +If you care about isolating banger's blast radius from anything else +running as your user, use the system install. If you only need +banger to work on your own dev box, the non-system mode is fine — +just don't run it on a shared or production host. + +## Hardening of the systemd units + +The two units ship with restrictive defaults; they are written by +banger at install time and the contents are deterministic. + +Owner daemon (`bangerd.service`): + +- `User=` is the install-time owner; never `root`. +- `NoNewPrivileges=yes`. +- `ProtectSystem=strict` — system directories are read-only. +- `ProtectHome=read-only` — owner home is read-only to the daemon + unit. The daemon writes only to `StateDirectory`, `CacheDirectory`, + `RuntimeDirectory`, plus owner config that the user edits. +- `ProtectControlGroups`, `ProtectKernelLogs`, `ProtectKernelModules`, + `ProtectClock`, `ProtectHostname`, `RestrictSUIDSGID`, + `LockPersonality`.
+- `RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_NETLINK AF_VSOCK`. +- No `AmbientCapabilities`. + +Root helper (`bangerd-root.service`): + +- Same hardening as above, plus `ProtectHome=yes` (no host-home + visibility at all from the helper). +- `CapabilityBoundingSet=CAP_CHOWN CAP_DAC_OVERRIDE CAP_FOWNER CAP_KILL CAP_MKNOD CAP_NET_ADMIN CAP_NET_RAW CAP_SETGID CAP_SETUID CAP_SYS_ADMIN CAP_SYS_CHROOT`. + Only the capabilities required for tap/bridge, iptables, dmsetup, + loop devices, ownership fixups, device node creation, and Firecracker + process management. No `CAP_SYS_BOOT`, no `CAP_SYS_PTRACE`, + no `CAP_SYS_MODULE`, no `CAP_NET_BIND_SERVICE`. +- `ReadWritePaths=/var/lib/banger`. + +## What this leaves you trusting + +If you install banger as root, you are trusting: + +1. The two binaries banger drops under `/usr/local/bin` and the + companion agent under `/usr/local/lib/banger`. These should match + the build artifacts you reviewed. +2. The path/identifier validators in + `internal/roothelper/roothelper.go` to be tight: `validateManagedPath`, + `validateTapName`, `validateDMName`, `validateDMDevicePath`, + `validateLoopDevicePath`, `validateDMRemoveTarget`, + `validateDMSnapshotHandles`, `validateRootExecutable`, + `validateNotSymlink`, `validateExt4ImagePath`, + `validateLinuxIfaceName`, `validateBangerBridgeName`, + `validateNetworkConfig`, `validateCIDRPrefix`, `validateIPv4`, + `validateResolverAddr`, `validateSignalName`, and + `validateFirecrackerPID`. If any of these are bypassed, the helper + would carry out a privileged op against an unmanaged target. They + are unit-tested in `internal/roothelper/roothelper_test.go`. +3. The Firecracker binary banger executes. The helper refuses to launch + anything that isn't a regular, executable, root-owned, not + world-writable file — but the binary's own behaviour is your + responsibility. +4. Your own owner-user account. 
The owner can ask the helper to + create taps, run firecracker, and edit ext4 images under + `/var/lib/banger`. Anyone with the owner's UID can do those + things; treat that account as semi-privileged. + +What you do **not** have to trust: + +- The CLI process. It only talks Unix-socket RPC. +- Other host users. The helper socket is 0600 root and the owner + socket is 0700 owner. +- The contents of the user's home, except the file paths that + `[[file_sync]]` explicitly names — and even those are clamped to + the owner home. +- The guest. Guests cannot reach the helper or the owner daemon; the + only host endpoint a guest sees is the in-process DNS server on the + bridge IP and the bridge itself for outbound NAT. diff --git a/docs/release-process.md b/docs/release-process.md new file mode 100644 index 0000000..510ac06 --- /dev/null +++ b/docs/release-process.md @@ -0,0 +1,189 @@ +# Release process + +Maintainer-facing runbook for cutting and publishing a new banger +release. End users don't need any of this — they pick up new releases +through `banger update` or the curl-piped `install.sh`. + +## What ships in a release + +Each release publishes four objects to the R2 bucket served at +`https://releases.thaloco.com/banger/`: + +| Object | Path | Notes | +|---|---|---| +| Tarball | `/banger--linux-amd64.tar.gz` | `banger`, `bangerd`, `banger-vsock-agent` at the root, no subdirs | +| Hashes | `/SHA256SUMS` | One line for the tarball, GNU `sha256sum` format | +| Signature | `/SHA256SUMS.sig` | base64-encoded ASN.1 ECDSA cosign-blob signature over `SHA256SUMS` | +| Manifest | `manifest.json` (bucket root) | Describes every published release; `latest_stable` points at the most recent | + +`install.sh` lives at the bucket root too (unversioned) so the +`curl … | bash` URL stays stable across releases. + +## Trust model recap + +Every release is cosign-signed. 
The public key is pinned in two places +that MUST stay in sync: + +- `internal/updater/verify_signature.go` — `BangerReleasePublicKey` + used by `banger update`. +- `scripts/install.sh` — embedded copy used by the curl-piped installer + before any banger binary is on disk. + +`scripts/publish-banger-release.sh` aborts the upload if the two copies +diverge — that's the only mechanism keeping them coupled, so don't +edit either alone. + +The signed payload is `SHA256SUMS`, which in turn covers the tarball. +Verification uses the Go standard library (`crypto/ecdsa.VerifyASN1`) +on the update path and `openssl dgst -verify` on the install-script +path. cosign is needed only for **signing**. + +## Pre-flight checklist + +Run these before tagging or publishing: + +1. **`make smoke`** — the full systemd-driven scenario suite must be + green. The smoke harness exercises the real install + update path + end to end; if it's red, do not cut. +2. **CHANGELOG entry.** Add a `## [vX.Y.Z] - YYYY-MM-DD` section under + `## [Unreleased]` describing what changed. Use the + [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) sub-headings + (`### Added`, `### Fixed`, `### Notes`). +3. **Bump the link table** at the bottom of `CHANGELOG.md`: + ```markdown + [Unreleased]: …/compare/vX.Y.Z...HEAD + [vX.Y.Z]: …/releases/tag/vX.Y.Z + ``` +4. **Note unit-file changes loudly** in the CHANGELOG entry. `banger + update` swaps binaries only — it does NOT rewrite + `/etc/systemd/system/bangerd*.service`. If this release changed + `renderSystemdUnit` / `renderRootHelperSystemdUnit`, the entry must + tell existing-install users to run `sudo banger system install` + once after updating to pick up the new units. v0.1.4 and v0.1.6 + are reference examples. + +Commit the CHANGELOG change, push to `main`, and confirm CI is green. + +## Cutting the release + +Order matters: publish first, then tag. + +1. 
**Run the publish script:** + + ```sh + scripts/publish-banger-release.sh vX.Y.Z + ``` + + The script: + - Builds `banger`, `bangerd`, `banger-vsock-agent` with `-ldflags` + baking the version, the current commit SHA, and a UTC build + timestamp into `internal/buildinfo`. + - Tarballs the three binaries (bare basenames at the tar root — + `internal/updater/StageTarball` rejects anything else). + - Computes `SHA256SUMS`, signs it with `cosign sign-blob` (no + transparency log, no bundle format — banger verifies the bare + ASN.1 DER signature directly). + - Verifies the signature against the public key extracted from + `internal/updater/verify_signature.go`, then diffs that against + the public key embedded in `scripts/install.sh`. Either failure + aborts before upload. + - Pulls the existing `manifest.json` from the bucket, appends the + new release entry, points `latest_stable` at it, and uploads + everything via rclone. + - Uploads `scripts/install.sh` to the bucket root so the curl-piped + installer stays current. + +2. **Tag and push:** + + ```sh + git tag vX.Y.Z + git push --tags + ``` + + Tagging happens AFTER publishing so the tag only exists if the + release actually shipped. + +3. **Verify from a clean machine:** + + ```sh + curl -fsSL https://releases.thaloco.com/banger/manifest.json | jq .latest_stable + curl -fsSL https://releases.thaloco.com/banger/install.sh | head -20 + banger update --check # on an existing install + ``` + +## Verification releases + +If a release fixes anything in the update flow itself — +`runUpdate` (`internal/cli/commands_update.go`), the systemd unit +templates, or the helper/daemon restart sequencing — cut a follow-up +no-op verification release immediately. The reason: `banger update` +runs the OLD binary as the driver of the swap. A fix in vN can't be +observed end-to-end on a vN-1 host updating to vN, because vN-1 is +still in the driver seat. 
vN+1 with no functional changes lets a host +on vN update to it and observe the fix live with vN as the driver. + +Examples in CHANGELOG.md: v0.1.3 follows v0.1.2's update-flow fix; +v0.1.5 follows v0.1.4's daemon-restart fix. + +The verification-release CHANGELOG section is short and explicit: +> No functional changes. Verification release for vN: … + +## Patch vs minor + +banger follows [SemVer](https://semver.org/spec/v2.0.0.html). For +v0.1.x, the practical contract: + +- **Patch (v0.1.x):** bug fixes, internal refactors, anything that + doesn't change the exposed API/CLI behavior. +- **Minor (v0.2.x):** any change to the **exposed API behavior or + contract**. The vsock guest-agent protocol is the canonical example — + a minor bump means existing VMs created against the older minor need + to be re-pulled. Other minor-trigger changes: removing a CLI flag, + changing a stable RPC method's request/response shape, breaking the + on-disk store schema in a non-forward-compatible way. + +If in doubt, prefer the higher bump. Patch releases that turn out to +have broken a contract are the worst-of-both: users update without +warning, then break. + +## Sibling catalogs + +Kernel and golden-image releases ship through the same gate. The +`internal/kernelcat/catalog.json` and `internal/imagecat/catalog.json` +manifests are `go:embed`-ed at build time, so a new entry only +reaches users when banger itself is re-released. In practice: + +1. Run `scripts/publish-kernel.sh ` or + `scripts/publish-golden-image.sh …` to upload the artefact and + patch the appropriate `catalog.json` in the working tree. +2. Commit the catalog change with whatever banger fix or feature it's + landing alongside. +3. Cut a banger release the normal way; the new catalog entry ships + with the next `banger` binary. 
+ +The kernel and image catalogs each have their own R2 bucket +(`kernels.thaloco.com`, `images.thaloco.com`) so versioning of the +artefacts is independent of banger's release cadence — but +**discoverability** is gated by the banger release that embeds the +catalog pointer. + +## When something goes wrong mid-release + +- **Signature verification fails locally** in + `publish-banger-release.sh`: confirm `internal/updater/verify_signature.go` + contains the same public key as `cosign.pub` in the repo root. If + the script reports drift between `verify_signature.go` and + `install.sh`, run `diff` between the two `BEGIN PUBLIC KEY` blocks + and resolve before rerunning. +- **rclone upload fails partway through:** the script uploads tarball, + hashes, signature, and manifest in that order. Re-running is safe; + rclone will overwrite. Until the manifest is uploaded, no client + sees the new release — so a partial upload is invisible. +- **Manifest already names the version** (re-cutting): the publish + script's `jq` filter dedupes by `version`, so re-running with the + same `vX.Y.Z` cleanly replaces the entry. +- **Already tagged but the release is bad:** delete the tag locally + AND on the remote (`git push --delete origin vX.Y.Z`), revert the + CHANGELOG entry, fix the bug, and start the cycle over with a fresh + patch number. Do NOT re-use the version — installed clients have + already cached its `SHA256SUMS` against the manifest. diff --git a/examples/alpine.config.toml b/examples/alpine.config.toml deleted file mode 100644 index c4e1011..0000000 --- a/examples/alpine.config.toml +++ /dev/null @@ -1,9 +0,0 @@ -# Experimental Alpine Linux guest profile for local testing. -# -# Register or promote a complete `alpine` image first, then point the daemon -# at it by name. Firecracker is resolved from PATH by default; set -# `firecracker_bin` only if you need an override. 
- -default_image_name = "alpine" -# firecracker_bin = "/usr/bin/firecracker" -# ssh_key_path = "/abs/path/to/private/key" diff --git a/examples/void-exp.config.toml b/examples/void-exp.config.toml deleted file mode 100644 index 1266ada..0000000 --- a/examples/void-exp.config.toml +++ /dev/null @@ -1,9 +0,0 @@ -# Experimental Void Linux guest profile for local testing. -# -# Register or promote a complete `void-exp` image first, then point the daemon -# at it by name. Firecracker is resolved from PATH by default; set -# `firecracker_bin` only if you need an override. - -default_image_name = "void-exp" -# firecracker_bin = "/usr/bin/firecracker" -# ssh_key_path = "/abs/path/to/private/key" diff --git a/go.mod b/go.mod index 3a07334..6067e9e 100644 --- a/go.mod +++ b/go.mod @@ -4,12 +4,14 @@ go 1.25.0 require ( github.com/firecracker-microvm/firecracker-go-sdk v1.0.0 + github.com/google/go-containerregistry v0.21.5 + github.com/klauspost/compress v1.18.5 github.com/miekg/dns v1.1.72 github.com/pelletier/go-toml v1.9.5 github.com/sirupsen/logrus v1.9.4 - github.com/spf13/cobra v1.8.1 - golang.org/x/crypto v0.46.0 - golang.org/x/sys v0.39.0 + github.com/spf13/cobra v1.10.2 + golang.org/x/crypto v0.50.0 + golang.org/x/sys v0.43.0 modernc.org/sqlite v1.38.2 ) @@ -18,8 +20,11 @@ require ( github.com/PuerkitoBio/urlesc v0.0.0-20170810143723-de5bf2ad4578 // indirect github.com/asaskevich/govalidator v0.0.0-20210307081110-f21760c49a8d // indirect github.com/containerd/fifo v1.0.0 // indirect + github.com/containerd/stargz-snapshotter/estargz v0.18.2 // indirect github.com/containernetworking/cni v1.0.1 // indirect github.com/containernetworking/plugins v1.0.1 // indirect + github.com/docker/cli v29.4.0+incompatible // indirect + github.com/docker/docker-credential-helpers v0.9.3 // indirect github.com/dustin/go-humanize v1.0.1 // indirect github.com/go-openapi/analysis v0.21.2 // indirect github.com/go-openapi/errors v0.20.2 // indirect @@ -41,22 +46,26 @@ require ( 
github.com/mattn/go-isatty v0.0.20 // indirect github.com/mdlayher/socket v0.2.0 // indirect github.com/mdlayher/vsock v1.1.1 // indirect + github.com/mitchellh/go-homedir v1.1.0 // indirect github.com/mitchellh/mapstructure v1.4.3 // indirect github.com/ncruces/go-strftime v0.1.9 // indirect github.com/oklog/ulid v1.3.1 // indirect + github.com/opencontainers/go-digest v1.0.0 // indirect + github.com/opencontainers/image-spec v1.1.1 // indirect github.com/opentracing/opentracing-go v1.2.0 // indirect github.com/pkg/errors v0.9.1 // indirect github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec // indirect - github.com/spf13/pflag v1.0.5 // indirect + github.com/spf13/pflag v1.0.10 // indirect + github.com/vbatts/tar-split v0.12.2 // indirect github.com/vishvananda/netlink v1.1.1-0.20210330154013-f5de75959ad5 // indirect github.com/vishvananda/netns v0.0.0-20210104183010-2eb08e3e575f // indirect go.mongodb.org/mongo-driver v1.8.3 // indirect golang.org/x/exp v0.0.0-20250620022241-b7579e27df2b // indirect - golang.org/x/mod v0.31.0 // indirect - golang.org/x/net v0.48.0 // indirect - golang.org/x/sync v0.19.0 // indirect - golang.org/x/text v0.32.0 // indirect - golang.org/x/tools v0.40.0 // indirect + golang.org/x/mod v0.35.0 // indirect + golang.org/x/net v0.53.0 // indirect + golang.org/x/sync v0.20.0 // indirect + golang.org/x/text v0.36.0 // indirect + golang.org/x/tools v0.44.0 // indirect gopkg.in/yaml.v2 v2.4.0 // indirect modernc.org/libc v1.66.3 // indirect modernc.org/mathutil v1.7.1 // indirect diff --git a/go.sum b/go.sum index 44fbb17..ca330a4 100644 --- a/go.sum +++ b/go.sum @@ -162,6 +162,8 @@ github.com/containerd/imgcrypt v1.1.1/go.mod h1:xpLnwiQmEUJPvQoAapeb2SNCxz7Xr6PJ github.com/containerd/nri v0.0.0-20201007170849-eb1350a75164/go.mod h1:+2wGSDGFYfE5+So4M5syatU0N0f0LbWpuqyMi4/BE8c= github.com/containerd/nri v0.0.0-20210316161719-dbaa18c31c14/go.mod h1:lmxnXF6oMkbqs39FiCt1s0R2HSMhcLel9vNL3m4AaeY= github.com/containerd/nri 
v0.1.0/go.mod h1:lmxnXF6oMkbqs39FiCt1s0R2HSMhcLel9vNL3m4AaeY= +github.com/containerd/stargz-snapshotter/estargz v0.18.2 h1:yXkZFYIzz3eoLwlTUZKz2iQ4MrckBxJjkmD16ynUTrw= +github.com/containerd/stargz-snapshotter/estargz v0.18.2/go.mod h1:XyVU5tcJ3PRpkA9XS2T5us6Eg35yM0214Y+wvrZTBrY= github.com/containerd/ttrpc v0.0.0-20190828154514-0e0f228740de/go.mod h1:PvCDdDGpgqzQIzDW1TphrGLssLDZp2GuS+X5DkEJB8o= github.com/containerd/ttrpc v0.0.0-20190828172938-92c8520ef9f8/go.mod h1:PvCDdDGpgqzQIzDW1TphrGLssLDZp2GuS+X5DkEJB8o= github.com/containerd/ttrpc v0.0.0-20191028202541-4f1b8fe65a5c/go.mod h1:LPm1u0xBw8r8NOKoOdNMeVHSawSsltak+Ihv+etqsE8= @@ -206,7 +208,7 @@ github.com/coreos/pkg v0.0.0-20160727233714-3ac0863d7acf/go.mod h1:E3G3o1h8I7cfc github.com/coreos/pkg v0.0.0-20180928190104-399ea9e2e55f/go.mod h1:E3G3o1h8I7cfcXa63jLwjI0eiQQMgzzUDFVpN/nH/eA= github.com/cpuguy83/go-md2man/v2 v2.0.0-20190314233015-f79a8a8ca69d/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU= github.com/cpuguy83/go-md2man/v2 v2.0.0/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU= -github.com/cpuguy83/go-md2man/v2 v2.0.4/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o= +github.com/cpuguy83/go-md2man/v2 v2.0.6/go.mod h1:oOW0eioCTA6cOiMLiUPZOpcVxMig6NIQQ7OS05n1F4g= github.com/creack/pty v1.1.7/go.mod h1:lj5s0c3V2DBrqTV7llrYr5NG6My20zk30Fl46Y7DoTY= github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E= github.com/cyphar/filepath-securejoin v0.2.2/go.mod h1:FpkQEhXnPnOthhzymB7CGsFk2G9VLXONKD9G7QGMM+4= @@ -222,9 +224,13 @@ github.com/dgrijalva/jwt-go v0.0.0-20170104182250-a601269ab70c/go.mod h1:E3ru+11 github.com/dgrijalva/jwt-go v3.2.0+incompatible/go.mod h1:E3ru+11k8xSBh+hMPgOLZmtrrCbhqsmaPHjLKYnJCaQ= github.com/dgryski/go-sip13 v0.0.0-20181026042036-e10d5fee7954/go.mod h1:vAd38F8PWV+bWy6jNmig1y/TA+kYO4g3RSRF0IAv0no= github.com/dnaeon/go-vcr v1.0.1/go.mod h1:aBB1+wY4s93YsC3HHjMBMrwTj2R9FHDzUr9KyGc8n1E= +github.com/docker/cli v29.4.0+incompatible 
h1:+IjXULMetlvWJiuSI0Nbor36lcJ5BTcVpUmB21KBoVM= +github.com/docker/cli v29.4.0+incompatible/go.mod h1:JLrzqnKDaYBop7H2jaqPtU4hHvMKP+vjCwu2uszcLI8= github.com/docker/distribution v0.0.0-20190905152932-14b96e55d84c/go.mod h1:0+TTO4EOBfRPhZXAeF1Vu+W3hHZ8eLp8PgKVZlcvtFY= github.com/docker/distribution v2.7.1-0.20190205005809-0d3efadf0154+incompatible/go.mod h1:J2gT2udsDAN96Uj4KfcMRqY0/ypR+oyYUYmja8H+y+w= github.com/docker/distribution v2.7.1+incompatible/go.mod h1:J2gT2udsDAN96Uj4KfcMRqY0/ypR+oyYUYmja8H+y+w= +github.com/docker/docker-credential-helpers v0.9.3 h1:gAm/VtF9wgqJMoxzT3Gj5p4AqIjCBS4wrsOh9yRqcz8= +github.com/docker/docker-credential-helpers v0.9.3/go.mod h1:x+4Gbw9aGmChi3qTLZj8Dfn0TD20M/fuWy0E5+WDeCo= github.com/docker/go-events v0.0.0-20170721190031-9461782956ad/go.mod h1:Uw6UezgYA44ePAFQYUehOuCzmy5zmg/+nl2ZfMWGkpA= github.com/docker/go-events v0.0.0-20190806004212-e31b211e4f1c/go.mod h1:Uw6UezgYA44ePAFQYUehOuCzmy5zmg/+nl2ZfMWGkpA= github.com/docker/go-metrics v0.0.0-20180209012529-399ea8c73916/go.mod h1:/u0gXw0Gay3ceNrsHubL3BtdOL2fHf93USgMTe0W5dI= @@ -384,8 +390,10 @@ github.com/google/go-cmp v0.5.4/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/ github.com/google/go-cmp v0.5.5/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE= github.com/google/go-cmp v0.5.6/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE= github.com/google/go-cmp v0.5.7/go.mod h1:n+brtR0CgQNWTVd5ZUFpTBC8YFBDLK/h/bpaJ8/DtOE= -github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI= -github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY= +github.com/google/go-cmp v0.7.0 h1:wk8382ETsv4JYUZwIsn6YpYiWiBsYLSJiTsyBybVuN8= +github.com/google/go-cmp v0.7.0/go.mod h1:pXiqmnSA92OHEEa9HXL2W4E7lf9JzCmGVUdgjX3N/iU= +github.com/google/go-containerregistry v0.21.5 h1:KTJG9Pn/jC0VdZR6ctV3/jcN+q6/Iqlx0sTVz3ywZlM= +github.com/google/go-containerregistry v0.21.5/go.mod h1:ySvMuiWg+dOsRW0Hw8GYwfMwBlNRTmpYBFJPlkco5zU= github.com/google/gofuzz 
v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg= github.com/google/gofuzz v1.1.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg= github.com/google/martian v2.1.0+incompatible/go.mod h1:9I4somxYTbIHy5NJKHRl3wXiIaQGbYVAs8BPL6v8lEs= @@ -462,6 +470,8 @@ github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+o github.com/klauspost/compress v1.11.3/go.mod h1:aoV0uJVorq1K+umq18yTdKaF57EivdYsUV+/s2qKfXs= github.com/klauspost/compress v1.11.13/go.mod h1:aoV0uJVorq1K+umq18yTdKaF57EivdYsUV+/s2qKfXs= github.com/klauspost/compress v1.13.6/go.mod h1:/3/Vjq9QcHkK5uEr5lBEmyoZ1iFhe47etQ6QUkpK6sk= +github.com/klauspost/compress v1.18.5 h1:/h1gH5Ce+VWNLSWqPzOVn6XBO+vJbCNGvjoaGBFW2IE= +github.com/klauspost/compress v1.18.5/go.mod h1:cwPg85FWrGar70rWktvGQj8/hthj3wpl0PGDogxkrSQ= github.com/konsorten/go-windows-terminal-sequences v1.0.1/go.mod h1:T0+1ngSBFLxvqU3pZ+m/2kptfBszLMUkC4ZK/EgS/cQ= github.com/konsorten/go-windows-terminal-sequences v1.0.2/go.mod h1:T0+1ngSBFLxvqU3pZ+m/2kptfBszLMUkC4ZK/EgS/cQ= github.com/konsorten/go-windows-terminal-sequences v1.0.3/go.mod h1:T0+1ngSBFLxvqU3pZ+m/2kptfBszLMUkC4ZK/EgS/cQ= @@ -501,6 +511,7 @@ github.com/miekg/dns v1.1.72 h1:vhmr+TF2A3tuoGNkLDFK9zi36F2LS+hKTRW0Uf8kbzI= github.com/miekg/dns v1.1.72/go.mod h1:+EuEPhdHOsfk6Wk5TT2CzssZdqkmFhf8r+aVyDEToIs= github.com/miekg/pkcs11 v1.0.3/go.mod h1:XsNlhZGX73bx86s2hdc/FuaLm2CPZJemRLMA+WTFxgs= github.com/mistifyio/go-zfs v2.1.2-0.20190413222219-f784269be439+incompatible/go.mod h1:8AuVvqP/mXw1px98n46wfvcGfQ4ci2FwoAjKYxuo3Z4= +github.com/mitchellh/go-homedir v1.1.0 h1:lukF9ziXFxDFPkA1vsr5zpc1XuPDn/wFntq5mG+4E0Y= github.com/mitchellh/go-homedir v1.1.0/go.mod h1:SfyaCUpYCn1Vlf4IUYiD9fPX4A5wJrkLzIz1N1q0pr0= github.com/mitchellh/mapstructure v1.1.2/go.mod h1:FVVH3fgwuzCH5S8UJGiWEs2h04kUh9fWfEaFds41c1Y= github.com/mitchellh/mapstructure v1.3.3/go.mod h1:bFUtVrKA4DC2yAKiSyO/QUcy7e+RRV2QTWOzhPopBRo= @@ -556,9 +567,12 @@ github.com/opencontainers/go-digest 
v0.0.0-20170106003457-a6d0ee40d420/go.mod h1 github.com/opencontainers/go-digest v0.0.0-20180430190053-c9281466c8b2/go.mod h1:cMLVZDEM3+U2I4VmLI6N8jQYUd2OVphdqWwCJHrFt2s= github.com/opencontainers/go-digest v1.0.0-rc1/go.mod h1:cMLVZDEM3+U2I4VmLI6N8jQYUd2OVphdqWwCJHrFt2s= github.com/opencontainers/go-digest v1.0.0-rc1.0.20180430190053-c9281466c8b2/go.mod h1:cMLVZDEM3+U2I4VmLI6N8jQYUd2OVphdqWwCJHrFt2s= +github.com/opencontainers/go-digest v1.0.0 h1:apOUWs51W5PlhuyGyz9FCeeBIOUDA/6nW8Oi/yOhh5U= github.com/opencontainers/go-digest v1.0.0/go.mod h1:0JzlMkj0TRzQZfJkVvzbP0HBR3IKzErnv2BNG4W4MAM= github.com/opencontainers/image-spec v1.0.0/go.mod h1:BtxoFyWECRxE4U/7sNtV5W15zMzWCbyJoFRP3s7yZA0= github.com/opencontainers/image-spec v1.0.1/go.mod h1:BtxoFyWECRxE4U/7sNtV5W15zMzWCbyJoFRP3s7yZA0= +github.com/opencontainers/image-spec v1.1.1 h1:y0fUlFfIZhPF1W537XOLg0/fcx6zcHCJwooC2xJA040= +github.com/opencontainers/image-spec v1.1.1/go.mod h1:qpqAh3Dmcf36wStyyWU+kCeDgrGnAve2nCC8+7h8Q0M= github.com/opencontainers/runc v0.0.0-20190115041553-12f6a991201f/go.mod h1:qT5XzbpPznkRYVz/mWwUaVBUv2rmF59PVA73FjuZG0U= github.com/opencontainers/runc v0.1.1/go.mod h1:qT5XzbpPznkRYVz/mWwUaVBUv2rmF59PVA73FjuZG0U= github.com/opencontainers/runc v1.0.0-rc8.0.20190926000215-3e425f80a8c9/go.mod h1:qT5XzbpPznkRYVz/mWwUaVBUv2rmF59PVA73FjuZG0U= @@ -652,15 +666,17 @@ github.com/spf13/cast v1.3.0/go.mod h1:Qx5cxh0v+4UWYiBimWS+eyWzqEqokIECu5etghLkU github.com/spf13/cobra v0.0.2-0.20171109065643-2da4a54c5cee/go.mod h1:1l0Ry5zgKvJasoi3XT1TypsSe7PqH0Sj9dhYf7v3XqQ= github.com/spf13/cobra v0.0.3/go.mod h1:1l0Ry5zgKvJasoi3XT1TypsSe7PqH0Sj9dhYf7v3XqQ= github.com/spf13/cobra v1.0.0/go.mod h1:/6GTrnGXV9HjY+aR4k0oJ5tcvakLuG6EuKReYlHNrgE= -github.com/spf13/cobra v1.8.1 h1:e5/vxKd/rZsfSJMUX1agtjeTDf+qv1/JdBF8gg5k9ZM= -github.com/spf13/cobra v1.8.1/go.mod h1:wHxEcudfqmLYa8iTfL+OuZPbBZkmvliBWKIezN3kD9Y= +github.com/spf13/cobra v1.10.2 h1:DMTTonx5m65Ic0GOoRY2c16WCbHxOOw6xxezuLaBpcU= +github.com/spf13/cobra v1.10.2/go.mod 
h1:7C1pvHqHw5A4vrJfjNwvOdzYu0Gml16OCs2GRiTUUS4= github.com/spf13/jwalterweatherman v1.0.0/go.mod h1:cQK4TGJAtQXfYWX+Ddv3mKDzgVb68N+wFjFa4jdeBTo= github.com/spf13/pflag v0.0.0-20170130214245-9ff6c6923cff/go.mod h1:DYY7MBk1bdzusC3SYhjObp+wFpr4gzcvqqNjLnInEg4= github.com/spf13/pflag v1.0.1-0.20171106142849-4c012f6dcd95/go.mod h1:DYY7MBk1bdzusC3SYhjObp+wFpr4gzcvqqNjLnInEg4= github.com/spf13/pflag v1.0.1/go.mod h1:DYY7MBk1bdzusC3SYhjObp+wFpr4gzcvqqNjLnInEg4= github.com/spf13/pflag v1.0.3/go.mod h1:DYY7MBk1bdzusC3SYhjObp+wFpr4gzcvqqNjLnInEg4= -github.com/spf13/pflag v1.0.5 h1:iy+VFUOCP1a+8yFto/drg2CJ5u0yRoB7fZw3DKv/JXA= github.com/spf13/pflag v1.0.5/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg= +github.com/spf13/pflag v1.0.9/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg= +github.com/spf13/pflag v1.0.10 h1:4EBh2KAYBwaONj6b2Ye1GiHfwjqyROoF4RwYO+vPwFk= +github.com/spf13/pflag v1.0.10/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg= github.com/spf13/viper v1.4.0/go.mod h1:PTJ7Z/lr49W6bUbkmS1V3by4uWynFiR9p7+dSq/yZzE= github.com/stefanberger/go-pkcs11uri v0.0.0-20201008174630-78d3cae3a980/go.mod h1:AO3tvPzVZ/ayst6UlUKUv6rcPQInYe3IknH3jYhAKu8= github.com/stretchr/objx v0.0.0-20180129172003-8a3f7159479f/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME= @@ -692,6 +708,8 @@ github.com/urfave/cli v0.0.0-20171014202726-7bc6a0acffa5/go.mod h1:70zkFmudgCuE/ github.com/urfave/cli v1.20.0/go.mod h1:70zkFmudgCuE/ngEzBv17Jvp/497gISqfk5gWijbERA= github.com/urfave/cli v1.22.1/go.mod h1:Gos4lmkARVdJ6EkW0WaNv/tZAAMe9V7XWyB60NtXRu0= github.com/urfave/cli v1.22.2/go.mod h1:Gos4lmkARVdJ6EkW0WaNv/tZAAMe9V7XWyB60NtXRu0= +github.com/vbatts/tar-split v0.12.2 h1:w/Y6tjxpeiFMR47yzZPlPj/FcPLpXbTUi/9H7d3CPa4= +github.com/vbatts/tar-split v0.12.2/go.mod h1:eF6B6i6ftWQcDqEn3/iGFRFRo8cBIMSJVOpnNdfTMFA= github.com/vishvananda/netlink v0.0.0-20181108222139-023a6dafdcdf/go.mod h1:+SR5DhBJrl6ZM7CoCKvpw5BKroDKQ+PJqOg65H/2ktk= github.com/vishvananda/netlink v1.1.0/go.mod 
h1:cTgwzPIzzgDAYoQrMm0EdrjRUBkTqKYppBueQtXaqoE= github.com/vishvananda/netlink v1.1.1-0.20201029203352-d40f9887b852/go.mod h1:twkDnbuQxJYemMlGd4JFIcuhgX83tXhKS2B/PRMpOho= @@ -735,6 +753,7 @@ go.uber.org/atomic v1.3.2/go.mod h1:gD2HeocX3+yG+ygLZcrzQJaqmWj9AIm7n08wl/qW/PE= go.uber.org/atomic v1.4.0/go.mod h1:gD2HeocX3+yG+ygLZcrzQJaqmWj9AIm7n08wl/qW/PE= go.uber.org/multierr v1.1.0/go.mod h1:wR5kodmAFQ0UK8QlbwjlSNy0Z68gJhDJUG5sjR94q/0= go.uber.org/zap v1.10.0/go.mod h1:vwi/ZaCAaUcBkycHslxD9B2zi4UTXhF60s6SWpuDF0Q= +go.yaml.in/yaml/v3 v3.0.4/go.mod h1:DhzuOOF2ATzADvBadXxruRBLzYTpT36CKvDb3+aBEFg= golang.org/x/crypto v0.0.0-20171113213409-9f005a07e0d3/go.mod h1:6SG95UA2DQfeDnfUPMdvaQW0Q7yPrPDi9nlGo2tz2b4= golang.org/x/crypto v0.0.0-20180904163835-0709b304e793/go.mod h1:6SG95UA2DQfeDnfUPMdvaQW0Q7yPrPDi9nlGo2tz2b4= golang.org/x/crypto v0.0.0-20181009213950-7c1a557ab941/go.mod h1:6SG95UA2DQfeDnfUPMdvaQW0Q7yPrPDi9nlGo2tz2b4= @@ -752,8 +771,8 @@ golang.org/x/crypto v0.0.0-20201002170205-7f63de1d35b0/go.mod h1:LzIPMQfyMNhhGPh golang.org/x/crypto v0.0.0-20201216223049-8b5274cf687f/go.mod h1:jdWPYTVW3xRLrWPugEBEK3UY2ZEsg3UU495nc5E+M+I= golang.org/x/crypto v0.0.0-20210322153248-0c34fe9e7dc2/go.mod h1:T9bdIzuCu7OtxOm1hfPfRQxPLYneinmdGuTeoZ9dtd4= golang.org/x/crypto v0.0.0-20220622213112-05595931fe9d/go.mod h1:IxCIyHEi3zRg3s0A5j5BB6A9Jmi73HwBIUl50j+osU4= -golang.org/x/crypto v0.46.0 h1:cKRW/pmt1pKAfetfu+RCEvjvZkA9RimPbh7bhFjGVBU= -golang.org/x/crypto v0.46.0/go.mod h1:Evb/oLKmMraqjZ2iQTwDwvCtJkczlDuTmdJXoZVzqU0= +golang.org/x/crypto v0.50.0 h1:zO47/JPrL6vsNkINmLoo/PH1gcxpls50DNogFvB5ZGI= +golang.org/x/crypto v0.50.0/go.mod h1:3muZ7vA7PBCE6xgPX7nkzzjiUq87kRItoJQM1Yo8S+Q= golang.org/x/exp v0.0.0-20190121172915-509febef88a4/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA= golang.org/x/exp v0.0.0-20190306152737-a1d7652674e8/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA= golang.org/x/exp v0.0.0-20190510132918-efd6b22b2522/go.mod 
h1:ZjyILWgesfNpC6sMxTJOJm9Kp84zZh5NQWvqDGG3Qr8= @@ -786,8 +805,8 @@ golang.org/x/mod v0.1.1-0.20191105210325-c90efee705ee/go.mod h1:QqPTAvyqsEbceGzB golang.org/x/mod v0.1.1-0.20191107180719-034126e5016b/go.mod h1:QqPTAvyqsEbceGzBzNggFXnrqF1CaUcvgkdR5Ot7KZg= golang.org/x/mod v0.2.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA= golang.org/x/mod v0.3.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA= -golang.org/x/mod v0.31.0 h1:HaW9xtz0+kOcWKwli0ZXy79Ix+UW/vOfmWI5QVd2tgI= -golang.org/x/mod v0.31.0/go.mod h1:43JraMp9cGx1Rx3AqioxrbrhNsLl2l/iNAvuBkrezpg= +golang.org/x/mod v0.35.0 h1:Ww1D637e6Pg+Zb2KrWfHQUnH2dQRLBQyAtpr/haaJeM= +golang.org/x/mod v0.35.0/go.mod h1:+GwiRhIInF8wPm+4AoT6L0FA1QWAad3OMdTRx4tFYlU= golang.org/x/net v0.0.0-20180724234803-3673e40ba225/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= golang.org/x/net v0.0.0-20180826012351-8a410e7b638d/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= golang.org/x/net v0.0.0-20180906233101-161cd47e91fd/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4= @@ -828,8 +847,8 @@ golang.org/x/net v0.0.0-20210421230115-4e50805a0758/go.mod h1:72T/g9IO56b78aLF+1 golang.org/x/net v0.0.0-20210428140749-89ef3d95e781/go.mod h1:OJAsFXCWl8Ukc7SiCT/9KSuxbyM7479/AVlXFRxuMCk= golang.org/x/net v0.0.0-20211112202133-69e39bad7dc2/go.mod h1:9nx3DQGgdP8bBQD5qxJ1jj9UTztislL4KSBs9R2vV5Y= golang.org/x/net v0.0.0-20220127200216-cd36cc0744dd/go.mod h1:CfG3xpIq0wQ8r1q4Su4UZFWDARRcnwPjda9FqA0JpMk= -golang.org/x/net v0.48.0 h1:zyQRTTrjc33Lhh0fBgT/H3oZq9WuvRR5gPC70xpDiQU= -golang.org/x/net v0.48.0/go.mod h1:+ndRgGjkh8FGtu1w1FGbEC31if4VrNVMuKTgcAAnQRY= +golang.org/x/net v0.53.0 h1:d+qAbo5L0orcWAr0a9JweQpjXF19LMXJE8Ey7hwOdUA= +golang.org/x/net v0.53.0/go.mod h1:JvMuJH7rrdiCfbeHoo3fCQU24Lf5JJwT9W3sJFulfgs= golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U= golang.org/x/oauth2 v0.0.0-20190226205417-e64efc72b421/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw= 
golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw= @@ -846,8 +865,8 @@ golang.org/x/sync v0.0.0-20200625203802-6e8e738ad208/go.mod h1:RxMgew5VJxzue5/jJ golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= golang.org/x/sync v0.0.0-20201207232520-09787c993a3a/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= golang.org/x/sync v0.0.0-20210220032951-036812b2e83c/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= -golang.org/x/sync v0.19.0 h1:vV+1eWNmZ5geRlYjzm2adRgW2/mcpevXNg50YZtPCE4= -golang.org/x/sync v0.19.0/go.mod h1:9KTHXmSnoGruLpwFjVSX0lNNA75CykiMECbovNTZqGI= +golang.org/x/sync v0.20.0 h1:e0PTpb7pjO8GAtTs2dQ6jYa5BWYlMuX047Dco/pItO4= +golang.org/x/sync v0.20.0/go.mod h1:9xrNwdLfx4jkKbNva9FpL6vEN7evnE43NNNJQ2LF3+0= golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= golang.org/x/sys v0.0.0-20180905080454-ebe1bf3edb33/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= golang.org/x/sys v0.0.0-20180909124046-d0be0721c37e/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= @@ -921,13 +940,13 @@ golang.org/x/sys v0.0.0-20210927094055-39ccf1dd6fa6/go.mod h1:oPkhp1MJrh7nUepCBc golang.org/x/sys v0.0.0-20211216021012-1d35b9e2eb4e/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.0.0-20220204135822-1c1b9b1eba6a/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= -golang.org/x/sys v0.39.0 h1:CvCKL8MeisomCi6qNZ+wbb0DN9E5AATixKsvNtMoMFk= -golang.org/x/sys v0.39.0/go.mod h1:OgkHotnGiDImocRcuBABYBEXf8A9a87e/uXjp9XT3ks= +golang.org/x/sys v0.43.0 h1:Rlag2XtaFTxp19wS8MXlJwTvoh8ArU6ezoyFsMyCTNI= +golang.org/x/sys v0.43.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw= golang.org/x/term v0.0.0-20201117132131-f5c789dd3221/go.mod h1:Nr5EML6q2oocZ2LXRh80K7BxOlk5/8JxuGnuhpl+muw= 
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo= golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8= -golang.org/x/term v0.38.0 h1:PQ5pkm/rLO6HnxFR7N2lJHOZX6Kez5Y1gDSJla6jo7Q= -golang.org/x/term v0.38.0/go.mod h1:bSEAKrOT1W+VSu9TSCMtoGEOUcKxOKgl3LE5QEF/xVg= +golang.org/x/term v0.42.0 h1:UiKe+zDFmJobeJ5ggPwOshJIVt6/Ft0rcfrXZDLWAWY= +golang.org/x/term v0.42.0/go.mod h1:Dq/D+snpsbazcBG5+F9Q1n2rXV8Ma+71xEjTRufARgY= golang.org/x/text v0.0.0-20170915032832-14c0d48ead0c/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= golang.org/x/text v0.3.1-0.20180807135948-17ff2d5776d2/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= @@ -937,8 +956,8 @@ golang.org/x/text v0.3.4/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ= golang.org/x/text v0.3.5/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ= golang.org/x/text v0.3.6/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ= golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ= -golang.org/x/text v0.32.0 h1:ZD01bjUt1FQ9WJ0ClOL5vxgxOI/sVCNgX1YtKwcY0mU= -golang.org/x/text v0.32.0/go.mod h1:o/rUWzghvpD5TXrTIBuJU77MTaN0ljMWE47kxGJQ7jY= +golang.org/x/text v0.36.0 h1:JfKh3XmcRPqZPKevfXVpI1wXPTqbkE5f7JA92a55Yxg= +golang.org/x/text v0.36.0/go.mod h1:NIdBknypM8iqVmPiuco0Dh6P5Jcdk8lJL0CUebqK164= golang.org/x/time v0.0.0-20180412165947-fbb02b2291d2/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ= golang.org/x/time v0.0.0-20181108054448-85acf8d2951c/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ= golang.org/x/time v0.0.0-20190308202827-9d24e82272b4/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ= @@ -986,8 +1005,8 @@ golang.org/x/tools v0.0.0-20200304193943-95d2e580d8eb/go.mod h1:o4KQGtdN14AW+yjs golang.org/x/tools v0.0.0-20200619180055-7c47624df98f/go.mod 
h1:EkVYQZoAsY45+roYkvgYkIh4xh/qjgUK9TdY2XT94GE= golang.org/x/tools v0.0.0-20201224043029-2b0845dc783e/go.mod h1:emZCQorbCU4vsT4fOWvOPXz4eW1wZW4PmDk9uLelYpA= golang.org/x/tools v0.0.0-20210106214847-113979e3529a/go.mod h1:emZCQorbCU4vsT4fOWvOPXz4eW1wZW4PmDk9uLelYpA= -golang.org/x/tools v0.40.0 h1:yLkxfA+Qnul4cs9QA3KnlFu0lVmd8JJfoq+E41uSutA= -golang.org/x/tools v0.40.0/go.mod h1:Ik/tzLRlbscWpqqMRjyWYDisX8bG13FrdXp3o4Sr9lc= +golang.org/x/tools v0.44.0 h1:UP4ajHPIcuMjT1GqzDWRlalUEoY+uzoZKnhOjbIPD2c= +golang.org/x/tools v0.44.0/go.mod h1:KA0AfVErSdxRZIsOVipbv3rQhVXTnlU6UhKxHd1seDI= golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= @@ -1092,8 +1111,10 @@ gopkg.in/yaml.v3 v3.0.0-20200615113413-eeeca48fe776/go.mod h1:K4uyk7z7BCEPqu6E+C gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA= gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= +gotest.tools v2.2.0+incompatible h1:VsBPFP1AI068pPrMxtb/S8Zkgf9xEmTLJjfM+P5UIEo= gotest.tools v2.2.0+incompatible/go.mod h1:DsYFclhRJ6vuDpmuTbkuFWG+y2sxOXAzmJt81HFBacw= gotest.tools/v3 v3.0.2/go.mod h1:3SzNCllyD9/Y+b5r9JIKQ474KzkZyqLqEfYqMsX94Bk= +gotest.tools/v3 v3.0.3 h1:4AuOwCGf4lLR9u3YOe2awrHygurzhO/HeQ6laiA6Sx0= gotest.tools/v3 v3.0.3/go.mod h1:Z7Lb0S5l+klDB31fvDQX8ss/FlKDxtlFlw3Oa8Ymbl8= honnef.co/go/tools v0.0.0-20190102054323-c2f93a96b099/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4= honnef.co/go/tools v0.0.0-20190106161140-3f1c8253044a/go.mod h1:rf3lG4BRIbNafJWhAfAdb/ePZxsR/4RtNHQocxwk9r4= diff --git a/images/golden/Dockerfile b/images/golden/Dockerfile new file mode 100644 index 
0000000..51c7b3e --- /dev/null +++ b/images/golden/Dockerfile @@ -0,0 +1,149 @@ +# banger golden image — Debian bookworm sandbox for development + testing. +# +# Two sections: +# 1. ESSENTIAL — what banger's lifecycle requires to boot the guest. +# 2. OPINION — developer conveniences curated for banger sandboxes. +# +# Banger's guest agents (vsock agent, network bootstrap, first-boot unit) +# are injected at `banger image pull` time, not baked here. Keeping them +# out means this image stays portable enough to run in other contexts. + +FROM debian:bookworm-slim + +ENV DEBIAN_FRONTEND=noninteractive \ + LANG=C.UTF-8 \ + LC_ALL=C.UTF-8 + +# -------- 1. ESSENTIAL -------- +# Banger needs: an init (systemd + udev + dbus), sshd (the only +# control channel), TLS roots + curl (first-boot installs + mise +# installer), gnupg (build-time signing-key verification for the +# Docker apt repo), iproute2 (debugging; `ip` is still useful even +# when the kernel sets IP via cmdline). +# +# udev is a Recommends of the systemd package on Debian. With +# --no-install-recommends it's skipped — and without it systemd never +# activates device units, so fstab mounts of /dev/vdb (banger's work +# disk) hang forever waiting for a device that is already enumerated +# by the kernel but never "seen" by systemd. dbus gets the same +# treatment for the same reason (services that depend on the system +# bus wedge without it). +RUN apt-get update \ + && apt-get install -y --no-install-recommends \ + systemd systemd-sysv udev dbus \ + openssh-server \ + ca-certificates \ + curl \ + gnupg \ + iproute2 \ + && rm -rf /var/lib/apt/lists/* + +# -------- 2. OPINION -------- +# Developer sandbox conveniences. Language runtimes are deliberately +# absent — `mise` (below) handles per-repo `.mise.toml`/`.tool-versions` +# on first `vm run`. + +# Core CLI + search/nav + build toolchain + lint/debug + editor/session.
+RUN apt-get update \ + && apt-get install -y --no-install-recommends \ + git jq less tree file unzip zip rsync \ + ripgrep fd-find \ + build-essential pkg-config make \ + shellcheck sqlite3 \ + iputils-ping dnsutils \ + vim-tiny tmux htop \ + && rm -rf /var/lib/apt/lists/* + +# Docker CE (with Compose v2 + buildx) from the official apt repo. +# Nested-VM docker gives Compose workflows hostname/port isolation +# per banger VM, which is a big part of the sandbox story. +# +# The apt key is verified against its published fingerprint before +# we commit it to the signed-by keyring, so a tampered download (or +# a TLS compromise against download.docker.com) cannot silently +# swap in an attacker-controlled signing key. Fingerprint source: +# https://docs.docker.com/engine/install/debian/#install-using-the-repository +RUN set -eu; \ + expected_fpr=9DC858229FC7DD38854AE2D88D81803C0EBFCD88; \ + install -m 0755 -d /etc/apt/keyrings; \ + curl -fsSL https://download.docker.com/linux/debian/gpg -o /tmp/docker.asc; \ + got="$(gpg --with-colons --show-keys --fingerprint /tmp/docker.asc | awk -F: '/^fpr:/ {print $10; exit}')"; \ + if [ "$got" != "$expected_fpr" ]; then \ + echo "docker apt key fingerprint mismatch: got $got, want $expected_fpr" >&2; \ + exit 1; \ + fi; \ + mv /tmp/docker.asc /etc/apt/keyrings/docker.asc; \ + chmod a+r /etc/apt/keyrings/docker.asc; \ + printf 'deb [arch=%s signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian bookworm stable\n' \ + "$(dpkg --print-architecture)" > /etc/apt/sources.list.d/docker.list; \ + apt-get update; \ + apt-get install -y --no-install-recommends \ + docker-ce docker-ce-cli containerd.io \ + docker-buildx-plugin docker-compose-plugin; \ + rm -rf /var/lib/apt/lists/* + +# mise — per-repo version manager. Installed from a pinned GitHub +# release asset rather than `curl https://mise.run | sh` so a compromise +# of the installer endpoint can't silently push arbitrary code into +# the golden image. 
+# +# Update protocol: bump MISE_VERSION + MISE_SHA256_AMD64 together. Source +# for the hash is the `digest` field on the release asset from +# `gh release view --repo jdx/mise --json assets`, or compute from +# the downloaded file and cross-reference against SHASUMS256.txt on +# the release page. +ARG MISE_VERSION=v2026.4.18 +ARG MISE_SHA256_AMD64=6ae2d5f0f23a2f2149bc5d9bf264fe0922a1da843f1903e453516c462b23cc1f +RUN set -eux; \ + arch="$(dpkg --print-architecture)"; \ + if [ "$arch" != "amd64" ]; then \ + echo "mise pin only tracks amd64; add a ${arch} hash to refresh" >&2; \ + exit 1; \ + fi; \ + curl -fsSL -o /tmp/mise "https://github.com/jdx/mise/releases/download/${MISE_VERSION}/mise-${MISE_VERSION}-linux-x64"; \ + echo "${MISE_SHA256_AMD64} /tmp/mise" | sha256sum -c -; \ + install -m 0755 /tmp/mise /usr/local/bin/mise; \ + rm /tmp/mise; \ + install -d /etc/profile.d; \ + printf '%s\n' 'if [ -x /usr/local/bin/mise ]; then eval "$(/usr/local/bin/mise activate bash)"; fi' \ + > /etc/profile.d/mise.sh; \ + chmod 0644 /etc/profile.d/mise.sh + +# Default branch for any git init inside the sandbox. +RUN git config --system init.defaultBranch main + +# `fd-find` installs as `fdfind` on Debian to avoid a long-standing name +# clash. Expose the ergonomic name for interactive use. +RUN ln -s /usr/bin/fdfind /usr/local/bin/fd + +# Strip per-image identity so every banger VM gets its own. +# - /etc/machine-id: systemd-firstboot regenerates at boot when empty. +# - SSH host keys: removed here; an ssh.service drop-in (below) runs +# `ssh-keygen -A` before sshd so the VM's first boot generates a +# unique set. +# - /run/sshd tmpfiles entry: Debian's openssh-server package doesn't +# ship one, and ssh.service's own `RuntimeDirectory=sshd` fires too +# late for the ExecStartPre config test, so sshd -t blows up with +# "Missing privilege separation directory: /run/sshd" before the +# daemon ever starts.
Creating the dir via tmpfiles.d runs early in +# systemd-tmpfiles-setup, well before ssh.service kicks off. +RUN : > /etc/machine-id \ + && rm -f /etc/ssh/ssh_host_*_key /etc/ssh/ssh_host_*_key.pub \ + && install -d /etc/systemd/system/ssh.service.d \ + && printf '%s\n' \ + '[Service]' \ + '# Reset main unit ExecStartPre list: Debian ships `sshd -t` as' \ + '# the first ExecStartPre, which fails on missing host keys and' \ + '# short-circuits the service before ours gets a chance to run.' \ + 'ExecStartPre=' \ + 'ExecStartPre=/usr/bin/mkdir -p /run/sshd' \ + 'ExecStartPre=/usr/bin/ssh-keygen -A' \ + 'ExecStartPre=/usr/sbin/sshd -t' \ + 'StandardOutput=journal+console' \ + 'StandardError=journal+console' \ + > /etc/systemd/system/ssh.service.d/banger.conf \ + && rm -f /etc/systemd/system/ssh.service.d/regen-host-keys.conf \ + && printf 'd /run/sshd 0755 root root -\n' > /usr/lib/tmpfiles.d/sshd.conf + +# No CMD / ENTRYPOINT: banger boots this via systemd as PID 1 after +# first-boot, not via `docker run`. 
diff --git a/internal/api/types.go b/internal/api/types.go index fcd6961..7cfd6b1 100644 --- a/internal/api/types.go +++ b/internal/api/types.go @@ -9,9 +9,11 @@ import ( type Empty struct{} type PingResult struct { - Status string `json:"status"` - PID int `json:"pid"` - WebURL string `json:"web_url,omitempty"` + Status string `json:"status"` + PID int `json:"pid"` + Version string `json:"version,omitempty"` + Commit string `json:"commit,omitempty"` + BuiltAt string `json:"built_at,omitempty"` } type ShutdownResult struct { @@ -55,33 +57,6 @@ type VMCreateStatusResult struct { Operation VMCreateOperation `json:"operation"` } -type ImageBuildStatusParams struct { - ID string `json:"id"` -} - -type ImageBuildOperation struct { - ID string `json:"id"` - ImageID string `json:"image_id,omitempty"` - ImageName string `json:"image_name,omitempty"` - Stage string `json:"stage,omitempty"` - Detail string `json:"detail,omitempty"` - BuildLogPath string `json:"build_log_path,omitempty"` - StartedAt time.Time `json:"started_at,omitempty"` - UpdatedAt time.Time `json:"updated_at,omitempty"` - Done bool `json:"done"` - Success bool `json:"success"` - Error string `json:"error,omitempty"` - Image *model.Image `json:"image,omitempty"` -} - -type ImageBuildBeginResult struct { - Operation ImageBuildOperation `json:"operation"` -} - -type ImageBuildStatusResult struct { - Operation ImageBuildOperation `json:"operation"` -} - type VMRefParams struct { IDOrName string `json:"id_or_name"` } @@ -147,14 +122,32 @@ type VMPortsResult struct { Ports []VMPort `json:"ports"` } -type ImageBuildParams struct { - Name string `json:"name,omitempty"` - FromImage string `json:"from_image,omitempty"` - Size string `json:"size,omitempty"` - KernelPath string `json:"kernel_path,omitempty"` - InitrdPath string `json:"initrd_path,omitempty"` - ModulesDir string `json:"modules_dir,omitempty"` - Docker bool `json:"docker,omitempty"` +type WorkspaceExportParams struct { + IDOrName string 
`json:"id_or_name"` + GuestPath string `json:"guest_path,omitempty"` + BaseCommit string `json:"base_commit,omitempty"` +} + +type WorkspaceExportResult struct { + GuestPath string `json:"guest_path"` + BaseCommit string `json:"base_commit"` + Patch []byte `json:"patch"` + ChangedFiles []string `json:"changed_files"` + HasChanges bool `json:"has_changes"` +} + +type VMWorkspacePrepareParams struct { + IDOrName string `json:"id_or_name"` + SourcePath string `json:"source_path"` + GuestPath string `json:"guest_path,omitempty"` + Branch string `json:"branch,omitempty"` + From string `json:"from,omitempty"` + Mode string `json:"mode,omitempty"` + IncludeUntracked bool `json:"include_untracked,omitempty"` +} + +type VMWorkspacePrepareResult struct { + Workspace model.WorkspacePrepareResult `json:"workspace"` } type ImageRegisterParams struct { @@ -164,13 +157,48 @@ type ImageRegisterParams struct { KernelPath string `json:"kernel_path,omitempty"` InitrdPath string `json:"initrd_path,omitempty"` ModulesDir string `json:"modules_dir,omitempty"` - Docker bool `json:"docker,omitempty"` + KernelRef string `json:"kernel_ref,omitempty"` +} + +type ImagePullParams struct { + Ref string `json:"ref"` + Name string `json:"name,omitempty"` + KernelPath string `json:"kernel_path,omitempty"` + InitrdPath string `json:"initrd_path,omitempty"` + ModulesDir string `json:"modules_dir,omitempty"` + KernelRef string `json:"kernel_ref,omitempty"` + SizeBytes int64 `json:"size_bytes,omitempty"` } type ImageRefParams struct { IDOrName string `json:"id_or_name"` } +type OperationSummary struct { + ID string `json:"id"` + Kind string `json:"kind"` + Stage string `json:"stage,omitempty"` + Detail string `json:"detail,omitempty"` + Done bool `json:"done"` + StartedAt time.Time `json:"started_at,omitempty"` + UpdatedAt time.Time `json:"updated_at,omitempty"` +} + +type OperationsListResult struct { + Operations []OperationSummary `json:"operations"` +} + +type ImageCachePruneParams struct { + 
DryRun bool `json:"dry_run,omitempty"` +} + +type ImageCachePruneResult struct { + BytesFreed int64 `json:"bytes_freed"` + BlobsFreed int `json:"blobs_freed"` + DryRun bool `json:"dry_run"` + CacheDir string `json:"cache_dir"` +} + type ImageListResult struct { Images []model.Image `json:"images"` } @@ -179,41 +207,53 @@ type ImageShowResult struct { Image model.Image `json:"image"` } -type SudoStatus struct { - Available bool `json:"available"` - Command string `json:"command,omitempty"` - Error string `json:"error,omitempty"` +type KernelEntry struct { + Name string `json:"name"` + Distro string `json:"distro,omitempty"` + Arch string `json:"arch,omitempty"` + KernelVersion string `json:"kernel_version,omitempty"` + SHA256 string `json:"sha256,omitempty"` + Source string `json:"source,omitempty"` + ImportedAt string `json:"imported_at,omitempty"` + KernelPath string `json:"kernel_path,omitempty"` + InitrdPath string `json:"initrd_path,omitempty"` + ModulesDir string `json:"modules_dir,omitempty"` } -type HostSummary struct { - CPUCount int `json:"cpu_count"` - TotalMemoryBytes int64 `json:"total_memory_bytes"` - StateFilesystemTotalBytes int64 `json:"state_filesystem_total_bytes"` - StateFilesystemFreeBytes int64 `json:"state_filesystem_free_bytes"` +type KernelListResult struct { + Entries []KernelEntry `json:"entries"` } -type BangerSummary struct { - ImageCount int `json:"image_count"` - ManagedImageCount int `json:"managed_image_count"` - VMCount int `json:"vm_count"` - RunningVMCount int `json:"running_vm_count"` - ConfiguredVCPUCount int `json:"configured_vcpu_count"` - ConfiguredMemoryBytes int64 `json:"configured_memory_bytes"` - ConfiguredDiskBytes int64 `json:"configured_disk_bytes"` - UsedSystemOverlayBytes int64 `json:"used_system_overlay_bytes"` - UsedWorkDiskBytes int64 `json:"used_work_disk_bytes"` - RunningCPUPercent float64 `json:"running_cpu_percent"` - RunningRSSBytes int64 `json:"running_rss_bytes"` - RunningVSZBytes int64 
`json:"running_vsz_bytes"` +type KernelRefParams struct { + Name string `json:"name"` } -type DashboardSummary struct { - GeneratedAt time.Time `json:"generated_at"` - Host HostSummary `json:"host"` - Sudo SudoStatus `json:"sudo"` - Banger BangerSummary `json:"banger"` +type KernelShowResult struct { + Entry KernelEntry `json:"entry"` } -type DashboardSummaryResult struct { - Summary DashboardSummary `json:"summary"` +type KernelImportParams struct { + Name string `json:"name"` + FromDir string `json:"from_dir"` + Distro string `json:"distro,omitempty"` + Arch string `json:"arch,omitempty"` +} + +type KernelPullParams struct { + Name string `json:"name"` + Force bool `json:"force,omitempty"` +} + +type KernelCatalogEntry struct { + Name string `json:"name"` + Distro string `json:"distro,omitempty"` + Arch string `json:"arch,omitempty"` + KernelVersion string `json:"kernel_version,omitempty"` + SizeBytes int64 `json:"size_bytes,omitempty"` + Description string `json:"description,omitempty"` + Pulled bool `json:"pulled"` +} + +type KernelCatalogResult struct { + Entries []KernelCatalogEntry `json:"entries"` } diff --git a/internal/buildinfo/buildinfo.go b/internal/buildinfo/buildinfo.go new file mode 100644 index 0000000..61bc6c2 --- /dev/null +++ b/internal/buildinfo/buildinfo.go @@ -0,0 +1,34 @@ +package buildinfo + +import "strings" + +var ( + Version = "dev" + Commit = "unknown" + BuiltAt = "unknown" +) + +type Info struct { + Version string + Commit string + BuiltAt string +} + +func Current() Info { + return Normalize(Version, Commit, BuiltAt) +} + +func Normalize(version, commit, builtAt string) Info { + return Info{ + Version: normalizedValue(version, "dev"), + Commit: normalizedValue(commit, "unknown"), + BuiltAt: normalizedValue(builtAt, "unknown"), + } +} + +func normalizedValue(value, fallback string) string { + if trimmed := strings.TrimSpace(value); trimmed != "" { + return trimmed + } + return fallback +} diff --git 
a/internal/buildinfo/buildinfo_test.go b/internal/buildinfo/buildinfo_test.go new file mode 100644 index 0000000..51b1ce2 --- /dev/null +++ b/internal/buildinfo/buildinfo_test.go @@ -0,0 +1,33 @@ +package buildinfo + +import "testing" + +func TestNormalizeUsesFallbacks(t *testing.T) { + t.Parallel() + + info := Normalize("", " ", "\t") + if info.Version != "dev" { + t.Fatalf("Version = %q, want dev", info.Version) + } + if info.Commit != "unknown" { + t.Fatalf("Commit = %q, want unknown", info.Commit) + } + if info.BuiltAt != "unknown" { + t.Fatalf("BuiltAt = %q, want unknown", info.BuiltAt) + } +} + +func TestNormalizeTrimsValues(t *testing.T) { + t.Parallel() + + info := Normalize(" v1.2.3 ", " abc123 ", " 2026-03-22T12:00:00Z ") + if info.Version != "v1.2.3" { + t.Fatalf("Version = %q, want v1.2.3", info.Version) + } + if info.Commit != "abc123" { + t.Fatalf("Commit = %q, want abc123", info.Commit) + } + if info.BuiltAt != "2026-03-22T12:00:00Z" { + t.Fatalf("BuiltAt = %q, want 2026-03-22T12:00:00Z", info.BuiltAt) + } +} diff --git a/internal/cli/aliases_test.go b/internal/cli/aliases_test.go new file mode 100644 index 0000000..ed1cbe3 --- /dev/null +++ b/internal/cli/aliases_test.go @@ -0,0 +1,102 @@ +package cli + +import ( + "testing" + + "github.com/spf13/cobra" +) + +// findSubcommand walks root's subtree along path and returns the +// matching command, or nil.
+func findSubcommand(root *cobra.Command, path ...string) *cobra.Command { + cur := root + for _, name := range path { + var next *cobra.Command + for _, sub := range cur.Commands() { + if sub.Name() == name { + next = sub + break + } + } + if next == nil { + return nil + } + cur = next + } + return cur +} + +func assertHasAlias(t *testing.T, cmd *cobra.Command, alias string) { + t.Helper() + if cmd == nil { + t.Fatal("command is nil") + } + for _, a := range cmd.Aliases { + if a == alias { + return + } + } + t.Errorf("%q missing alias %q; have %v", cmd.Name(), alias, cmd.Aliases) +} + +func TestListCommandsHaveLsAlias(t *testing.T) { + root := NewBangerCommand() + + cases := [][]string{ + {"vm", "list"}, + {"image", "list"}, + {"kernel", "list"}, + } + for _, path := range cases { + t.Run(path[len(path)-1], func(t *testing.T) { + cmd := findSubcommand(root, path...) + if cmd == nil { + t.Fatalf("missing command: %v", path) + } + assertHasAlias(t, cmd, "ls") + }) + } +} + +func TestDeleteCommandsHaveRmAlias(t *testing.T) { + root := NewBangerCommand() + + cases := [][]string{ + {"vm", "delete"}, + {"image", "delete"}, + } + for _, path := range cases { + t.Run(path[len(path)-1], func(t *testing.T) { + cmd := findSubcommand(root, path...) + if cmd == nil { + t.Fatalf("missing command: %v", path) + } + assertHasAlias(t, cmd, "rm") + }) + } +} + +func TestVMCommandRegistersPrune(t *testing.T) { + root := NewBangerCommand() + cmd := findSubcommand(root, "vm", "prune") + if cmd == nil { + t.Fatal("vm prune not registered") + } + if flag := cmd.Flags().Lookup("force"); flag == nil { + t.Error("vm prune missing --force flag") + } + if flag := cmd.Flags().ShorthandLookup("f"); flag == nil { + t.Error("vm prune missing -f shorthand") + } +} + +func TestKernelRmHasDeleteAlias(t *testing.T) { + // This already existed prior to this feature — guard against regressions. 
+ root := NewBangerCommand() + cmd := findSubcommand(root, "kernel", "rm") + if cmd == nil { + t.Fatal("kernel rm missing") + } + assertHasAlias(t, cmd, "delete") + assertHasAlias(t, cmd, "remove") +} diff --git a/internal/cli/banger.go b/internal/cli/banger.go index cb595b6..7c40e5a 100644 --- a/internal/cli/banger.go +++ b/internal/cli/banger.go @@ -1,141 +1,107 @@ package cli import ( - "bytes" - "context" - "encoding/json" "errors" "fmt" - "io" - "net" - "os" - "os/exec" "path/filepath" - "sort" "strings" - "sync" - "syscall" - "text/tabwriter" - "time" "banger/internal/api" - "banger/internal/config" - "banger/internal/daemon" - "banger/internal/guest" - "banger/internal/hostnat" - "banger/internal/imagepreset" - "banger/internal/model" - "banger/internal/paths" - "banger/internal/rpc" - "banger/internal/system" - "banger/internal/vmdns" - "banger/internal/vsockagent" + "banger/internal/buildinfo" "github.com/spf13/cobra" ) -var ( - bangerdPathFunc = paths.BangerdPath - daemonExePath = func(pid int) string { - return filepath.Join("/proc", fmt.Sprintf("%d", pid), "exe") - } - doctorFunc = daemon.Doctor - sshExecFunc = func(ctx context.Context, stdin io.Reader, stdout, stderr io.Writer, args []string) error { - sshCmd := exec.CommandContext(ctx, "ssh", args...) - sshCmd.Stdout = stdout - sshCmd.Stderr = stderr - sshCmd.Stdin = stdin - return sshCmd.Run() - } - opencodeExecFunc = func(ctx context.Context, stdin io.Reader, stdout, stderr io.Writer, args []string) error { - opencodeCmd := exec.CommandContext(ctx, "opencode", args...) - opencodeCmd.Stdout = stdout - opencodeCmd.Stderr = stderr - opencodeCmd.Stdin = stdin - return opencodeCmd.Run() - } - hostCommandOutputFunc = func(ctx context.Context, name string, args ...string) ([]byte, error) { - cmd := exec.CommandContext(ctx, name, args...) 
- output, err := cmd.CombinedOutput() - if err == nil { - return output, nil - } - command := strings.TrimSpace(strings.Join(append([]string{name}, args...), " ")) - detail := strings.TrimSpace(string(output)) - if detail == "" { - return output, fmt.Errorf("%s: %w", command, err) - } - return output, fmt.Errorf("%s: %w: %s", command, err, detail) - } - vmHealthFunc = func(ctx context.Context, socketPath, idOrName string) (api.VMHealthResult, error) { - return rpc.Call[api.VMHealthResult](ctx, socketPath, "vm.health", api.VMRefParams{IDOrName: idOrName}) - } - vmCreateBeginFunc = func(ctx context.Context, socketPath string, params api.VMCreateParams) (api.VMCreateBeginResult, error) { - return rpc.Call[api.VMCreateBeginResult](ctx, socketPath, "vm.create.begin", params) - } - vmCreateStatusFunc = func(ctx context.Context, socketPath, operationID string) (api.VMCreateStatusResult, error) { - return rpc.Call[api.VMCreateStatusResult](ctx, socketPath, "vm.create.status", api.VMCreateStatusParams{ID: operationID}) - } - vmCreateCancelFunc = func(ctx context.Context, socketPath, operationID string) error { - _, err := rpc.Call[api.Empty](ctx, socketPath, "vm.create.cancel", api.VMCreateStatusParams{ID: operationID}) - return err - } - vmPortsFunc = func(ctx context.Context, socketPath, idOrName string) (api.VMPortsResult, error) { - return rpc.Call[api.VMPortsResult](ctx, socketPath, "vm.ports", api.VMRefParams{IDOrName: idOrName}) - } - guestWaitForSSHFunc = func(ctx context.Context, address, privateKeyPath string, interval time.Duration) error { - return guest.WaitForSSH(ctx, address, privateKeyPath, interval) - } - guestDialFunc = func(ctx context.Context, address, privateKeyPath string) (vmRunGuestClient, error) { - return guest.Dial(ctx, address, privateKeyPath) - } - cwdFunc = os.Getwd -) - -type vmRunGuestClient interface { - Close() error - UploadFile(ctx context.Context, remotePath string, mode os.FileMode, data []byte, logWriter io.Writer) error - 
RunScript(ctx context.Context, script string, logWriter io.Writer) error
-	StreamTarEntries(ctx context.Context, sourceDir string, entries []string, remoteCommand string, logWriter io.Writer) error
-}
-
-type vmRunRepoSpec struct {
-	SourcePath    string
-	RepoRoot      string
-	RepoName      string
-	HeadCommit    string
-	CurrentBranch string
-	BranchName    string
-	BaseCommit    string
-	OverlayPaths  []string
-}
-
-const vmRunGuestBundlePath = "/tmp/banger-vm-run.bundle"
-
+// NewBangerCommand builds the top-level cobra tree with production
+// defaults wired into the dependency struct. Tests reach into the
+// package directly — see newRootCommand + defaultDeps.
 func NewBangerCommand() *cobra.Command {
+	return defaultDeps().newRootCommand()
+}
+
+func (d *deps) newRootCommand() *cobra.Command {
 	root := &cobra.Command{
-		Use:   "banger",
-		Short: "Manage development VMs and images",
+		Use:     "banger",
+		Version: formatVersionLine(buildinfo.Current()),
+		Short:   "Run development sandboxes as Firecracker microVMs",
+		Long: strings.TrimSpace(`
+banger runs disposable development sandboxes as Firecracker microVMs.
+Each sandbox boots in a few seconds, gets its own root filesystem and
+network, and exits on demand.
+
+The most common workflow is one command:
+
+  banger vm run                        bare sandbox, drops into ssh
+  banger vm run ./repo                 ships a repo into /root/repo, drops into ssh
+  banger vm run ./repo -- make test    ships a repo, runs the command, exits with its status
+  banger vm run --rm -- script.sh      --rm: VM auto-deletes when the session/command exits
+  banger vm run --nat ./repo           --nat: outbound internet (required when .mise.toml installs tools)
+  banger vm run -d ./repo --nat        -d/--detach: prep workspace + bootstrap, exit without ssh
+
+For a longer-lived VM, use 'banger vm create' to provision and
+'banger vm ssh <name>' to attach. 'banger ps' lists running VMs;
+'banger vm list --all' shows stopped ones too. Guests are reachable
+at <name>.vm from the host once 'banger ssh-config --install' is run.
+ +First-time setup, in order: + sudo banger system install install the systemd services + banger doctor confirm the host is ready + banger image pull debian-bookworm fetch a default image + +Run 'banger --help' for any subcommand. Run 'banger doctor' +to diagnose host readiness problems. +`), SilenceUsage: true, SilenceErrors: true, RunE: helpNoArgs, } - root.CompletionOptions.DisableDefaultCmd = true - root.AddCommand(newDaemonCommand(), newDoctorCommand(), newVMCommand(), newImageCommand(), newInternalCommand()) + // Drop cobra's default "{{.Name}} version {{.Version}}" wrapper — + // our Version string is already a complete sentence. + root.SetVersionTemplate("{{.Version}}\n") + root.AddCommand( + d.newDaemonCommand(), + d.newDoctorCommand(), + d.newImageCommand(), + d.newInternalCommand(), + d.newKernelCommand(), + newSSHConfigCommand(), + d.newSystemCommand(), + d.newUpdateCommand(), + newVersionCommand(), + d.newPSCommand(), + d.newVMCommand(), + ) return root } -func newDoctorCommand() *cobra.Command { - return &cobra.Command{ +func (d *deps) newDoctorCommand() *cobra.Command { + var verbose bool + cmd := &cobra.Command{ Use: "doctor", Short: "Check host and runtime readiness", - Args: noArgsUsage("usage: banger doctor"), + Long: strings.TrimSpace(` +Check that the host has everything banger needs to boot guests: +required tools (mkfs.ext4, debugfs, dmsetup, ip, iptables, ...), KVM +access, daemon reachability, and per-feature preflight (NAT, DNS +routing, work-disk seeding). + +Run 'banger doctor': + - after 'banger system install' to confirm the install took + - after upgrading the host kernel or banger itself + - when 'banger vm run' fails with an unclear error + +By default, prints failing and warning checks only and a summary +footer; a healthy host collapses to a single line. Pass --verbose to +print every check with its details. Exit code is non-zero if any +check fails. Warnings are reported but do not fail the run. 
+`), + Args: noArgsUsage("usage: banger doctor"), RunE: func(cmd *cobra.Command, args []string) error { - report, err := doctorFunc(cmd.Context()) + report, err := d.doctor(cmd.Context()) if err != nil { return err } - if err := printDoctorReport(cmd.OutOrStdout(), report); err != nil { + if err := printDoctorReport(cmd.OutOrStdout(), report, verbose); err != nil { return err } if report.HasFailures() { @@ -144,825 +110,22 @@ func newDoctorCommand() *cobra.Command { return nil }, } -} - -func newInternalCommand() *cobra.Command { - cmd := &cobra.Command{ - Use: "internal", - Hidden: true, - RunE: helpNoArgs, - } - cmd.AddCommand( - newInternalNATCommand(), - newInternalWorkSeedCommand(), - newInternalSSHKeyPathCommand(), - newInternalFirecrackerPathCommand(), - newInternalVSockAgentPathCommand(), - newInternalPackagesCommand(), - ) + cmd.Flags().BoolVarP(&verbose, "verbose", "v", false, "show every check (default: only failures and warnings)") return cmd } -func newInternalSSHKeyPathCommand() *cobra.Command { +func newVersionCommand() *cobra.Command { return &cobra.Command{ - Use: "ssh-key-path", - Hidden: true, - Args: noArgsUsage("usage: banger internal ssh-key-path"), + Use: "version", + Short: "Show banger build information", + Args: noArgsUsage("usage: banger version"), RunE: func(cmd *cobra.Command, args []string) error { - layout, err := paths.Resolve() - if err != nil { - return err - } - cfg, err := config.Load(layout) - if err != nil { - return err - } - _, err = fmt.Fprintln(cmd.OutOrStdout(), cfg.SSHKeyPath) + _, err := fmt.Fprint(cmd.OutOrStdout(), formatBuildInfoBlock(buildinfo.Current())) return err }, } } -func newInternalFirecrackerPathCommand() *cobra.Command { - return &cobra.Command{ - Use: "firecracker-path", - Hidden: true, - Args: noArgsUsage("usage: banger internal firecracker-path"), - RunE: func(cmd *cobra.Command, args []string) error { - layout, err := paths.Resolve() - if err != nil { - return err - } - cfg, err := config.Load(layout) - 
if err != nil {
-				return err
-			}
-			if strings.TrimSpace(cfg.FirecrackerBin) == "" {
-				return errors.New("firecracker binary not configured; install firecracker or set firecracker_bin")
-			}
-			_, err = fmt.Fprintln(cmd.OutOrStdout(), cfg.FirecrackerBin)
-			return err
-		},
-	}
-}
-
-func newInternalVSockAgentPathCommand() *cobra.Command {
-	return &cobra.Command{
-		Use:    "vsock-agent-path",
-		Hidden: true,
-		Args:   noArgsUsage("usage: banger internal vsock-agent-path"),
-		RunE: func(cmd *cobra.Command, args []string) error {
-			path, err := paths.CompanionBinaryPath("banger-vsock-agent")
-			if err != nil {
-				return err
-			}
-			_, err = fmt.Fprintln(cmd.OutOrStdout(), path)
-			return err
-		},
-	}
-}
-
-func newInternalPackagesCommand() *cobra.Command {
-	var docker bool
-	cmd := &cobra.Command{
-		Use:    "packages <preset>",
-		Hidden: true,
-		Args:   exactArgsUsage(1, "usage: banger internal packages <preset> [--docker]"),
-		RunE: func(cmd *cobra.Command, args []string) error {
-			var packages []string
-			switch strings.TrimSpace(args[0]) {
-			case "debian":
-				packages = imagepreset.DebianBasePackages()
-				if docker {
-					packages = append(packages, "docker.io")
-				}
-			case "void":
-				packages = imagepreset.VoidBasePackages()
-			case "alpine":
-				packages = imagepreset.AlpineBasePackages()
-			default:
-				return fmt.Errorf("unknown package preset %q", args[0])
-			}
-			for _, pkg := range packages {
-				if _, err := fmt.Fprintln(cmd.OutOrStdout(), pkg); err != nil {
-					return err
-				}
-			}
-			return nil
-		},
-	}
-	cmd.Flags().BoolVar(&docker, "docker", false, "include docker-specific additions")
-	return cmd
-}
-
-func newInternalWorkSeedCommand() *cobra.Command {
-	var rootfsPath string
-	var outPath string
-	cmd := &cobra.Command{
-		Use:    "work-seed",
-		Hidden: true,
-		Args:   noArgsUsage("usage: banger internal work-seed --rootfs <path> [--out <path>]"),
-		RunE: func(cmd *cobra.Command, args []string) error {
-			rootfsPath = strings.TrimSpace(rootfsPath)
-			outPath = strings.TrimSpace(outPath)
-			if rootfsPath == "" {
-				return
errors.New("rootfs path is required")
-			}
-			if outPath == "" {
-				outPath = system.WorkSeedPath(rootfsPath)
-			}
-			if err := system.EnsureSudo(cmd.Context()); err != nil {
-				return err
-			}
-			return system.BuildWorkSeedImage(cmd.Context(), system.NewRunner(), rootfsPath, outPath)
-		},
-	}
-	cmd.Flags().StringVar(&rootfsPath, "rootfs", "", "rootfs image path")
-	cmd.Flags().StringVar(&outPath, "out", "", "output work-seed image path")
-	return cmd
-}
-
-func newInternalNATCommand() *cobra.Command {
-	cmd := &cobra.Command{
-		Use:    "nat",
-		Hidden: true,
-		RunE:   helpNoArgs,
-	}
-	cmd.AddCommand(
-		newInternalNATActionCommand("up", true),
-		newInternalNATActionCommand("down", false),
-	)
-	return cmd
-}
-
-func newInternalNATActionCommand(use string, enable bool) *cobra.Command {
-	var guestIP string
-	var tapDevice string
-	cmd := &cobra.Command{
-		Use:    use,
-		Hidden: true,
-		Args:   noArgsUsage("usage: banger internal nat " + use + " --guest-ip <ip> --tap <device>"),
-		RunE: func(cmd *cobra.Command, args []string) error {
-			guestIP = strings.TrimSpace(guestIP)
-			tapDevice = strings.TrimSpace(tapDevice)
-			if guestIP == "" {
-				return errors.New("guest IP is required")
-			}
-			if tapDevice == "" {
-				return errors.New("tap device is required")
-			}
-			if err := system.EnsureSudo(cmd.Context()); err != nil {
-				return err
-			}
-			return hostnat.Ensure(cmd.Context(), system.NewRunner(), guestIP, tapDevice, enable)
-		},
-	}
-	cmd.Flags().StringVar(&guestIP, "guest-ip", "", "guest IPv4 address")
-	cmd.Flags().StringVar(&tapDevice, "tap", "", "tap device name")
-	return cmd
-}
-
-func newDaemonCommand() *cobra.Command {
-	cmd := &cobra.Command{
-		Use:   "daemon",
-		Short: "Manage the banger daemon",
-		RunE:  helpNoArgs,
-	}
-	cmd.AddCommand(
-		&cobra.Command{
-			Use:   "status",
-			Short: "Show daemon status",
-			Args:  noArgsUsage("usage: banger daemon status"),
-			RunE: func(cmd *cobra.Command, args []string) error {
-				layout, err := paths.Resolve()
-				if err != nil {
-					return err
-				}
-				cfg, err :=
config.Load(layout) - if err != nil { - return err - } - ping, pingErr := rpc.Call[api.PingResult](cmd.Context(), layout.SocketPath, "ping", api.Empty{}) - if pingErr != nil { - if strings.TrimSpace(cfg.WebListenAddr) != "" { - _, err = fmt.Fprintf(cmd.OutOrStdout(), "stopped\nsocket: %s\nlog: %s\ndns: %s\nweb: http://%s\n", layout.SocketPath, layout.DaemonLog, vmdns.DefaultListenAddr, cfg.WebListenAddr) - return err - } - _, err = fmt.Fprintf(cmd.OutOrStdout(), "stopped\nsocket: %s\nlog: %s\ndns: %s\n", layout.SocketPath, layout.DaemonLog, vmdns.DefaultListenAddr) - return err - } - if strings.TrimSpace(ping.WebURL) != "" { - _, err = fmt.Fprintf(cmd.OutOrStdout(), "running\npid: %d\nsocket: %s\nlog: %s\ndns: %s\nweb: %s\n", ping.PID, layout.SocketPath, layout.DaemonLog, vmdns.DefaultListenAddr, ping.WebURL) - return err - } - _, err = fmt.Fprintf(cmd.OutOrStdout(), "running\npid: %d\nsocket: %s\nlog: %s\ndns: %s\n", ping.PID, layout.SocketPath, layout.DaemonLog, vmdns.DefaultListenAddr) - return err - }, - }, - &cobra.Command{ - Use: "stop", - Short: "Stop the daemon", - Args: noArgsUsage("usage: banger daemon stop"), - RunE: func(cmd *cobra.Command, args []string) error { - if err := system.EnsureSudo(cmd.Context()); err != nil { - return err - } - layout, err := paths.Resolve() - if err != nil { - return err - } - _, err = rpc.Call[api.ShutdownResult](cmd.Context(), layout.SocketPath, "shutdown", api.Empty{}) - if err != nil { - if os.IsNotExist(err) || strings.Contains(err.Error(), "connect") { - _, writeErr := fmt.Fprintln(cmd.OutOrStdout(), "daemon not running") - return writeErr - } - return err - } - _, err = fmt.Fprintln(cmd.OutOrStdout(), "stopping") - return err - }, - }, - &cobra.Command{ - Use: "socket", - Short: "Print the daemon socket path", - Args: noArgsUsage("usage: banger daemon socket"), - RunE: func(cmd *cobra.Command, args []string) error { - layout, err := paths.Resolve() - if err != nil { - return err - } - _, err = 
fmt.Fprintln(cmd.OutOrStdout(), layout.SocketPath) - return err - }, - }, - ) - return cmd -} - -func newVMCommand() *cobra.Command { - cmd := &cobra.Command{ - Use: "vm", - Short: "Manage virtual machines", - RunE: helpNoArgs, - } - cmd.AddCommand( - newVMCreateCommand(), - newVMRunCommand(), - newVMListCommand(), - newVMShowCommand(), - newVMActionCommand("start", "Start a VM", "vm.start"), - newVMActionCommand("stop", "Stop a VM", "vm.stop"), - newVMKillCommand(), - newVMActionCommand("restart", "Restart a VM", "vm.restart"), - newVMActionCommand("delete", "Delete a VM", "vm.delete"), - newVMSetCommand(), - newVMSSHCommand(), - newVMLogsCommand(), - newVMStatsCommand(), - newVMPortsCommand(), - ) - return cmd -} - -func newVMRunCommand() *cobra.Command { - var ( - name string - imageName string - vcpu = model.DefaultVCPUCount - memory = model.DefaultMemoryMiB - systemOverlaySize = model.FormatSizeBytes(model.DefaultSystemOverlaySize) - workDiskSize = model.FormatSizeBytes(model.DefaultWorkDiskSize) - natEnabled bool - branchName string - fromRef = "HEAD" - ) - cmd := &cobra.Command{ - Use: "run [path]", - Short: "Create a repo-backed VM session and attach opencode", - Args: maxArgsUsage(1, "usage: banger vm run [path]"), - RunE: func(cmd *cobra.Command, args []string) error { - if cmd.Flags().Changed("branch") && strings.TrimSpace(branchName) == "" { - return errors.New("--branch requires a branch name") - } - if cmd.Flags().Changed("from") && strings.TrimSpace(branchName) == "" { - return errors.New("--from requires --branch") - } - - sourcePath := "" - if len(args) == 1 { - sourcePath = args[0] - } - spec, err := inspectVMRunRepo(cmd.Context(), sourcePath, branchName, fromRef) - if err != nil { - return err - } - - layout, err := paths.Resolve() - if err != nil { - return err - } - cfg, err := config.Load(layout) - if err != nil { - return err - } - if err := validateVMRunPrereqs(cfg); err != nil { - return err - } - params, err := vmCreateParamsFromFlags(cmd, 
name, imageName, vcpu, memory, systemOverlaySize, workDiskSize, natEnabled, false) - if err != nil { - return err - } - if err := system.EnsureSudo(cmd.Context()); err != nil { - return err - } - layout, cfg, err = ensureDaemon(cmd.Context()) - if err != nil { - return err - } - return runVMRun(cmd.Context(), layout.SocketPath, cfg, cmd.InOrStdin(), cmd.OutOrStdout(), cmd.ErrOrStderr(), params, spec) - }, - } - cmd.Flags().StringVar(&name, "name", "", "vm name") - cmd.Flags().StringVar(&imageName, "image", "", "image name or id") - cmd.Flags().IntVar(&vcpu, "vcpu", model.DefaultVCPUCount, "vcpu count") - cmd.Flags().IntVar(&memory, "memory", model.DefaultMemoryMiB, "memory in MiB") - cmd.Flags().StringVar(&systemOverlaySize, "system-overlay-size", model.FormatSizeBytes(model.DefaultSystemOverlaySize), "system overlay size") - cmd.Flags().StringVar(&workDiskSize, "disk-size", model.FormatSizeBytes(model.DefaultWorkDiskSize), "work disk size") - cmd.Flags().BoolVar(&natEnabled, "nat", false, "enable NAT") - cmd.Flags().StringVar(&branchName, "branch", "", "create and switch to a new guest branch") - cmd.Flags().StringVar(&fromRef, "from", "HEAD", "base ref for --branch") - return cmd -} - -func newVMKillCommand() *cobra.Command { - var signal string - cmd := &cobra.Command{ - Use: "kill ...", - Short: "Send a signal to a VM process", - Args: minArgsUsage(1, "usage: banger vm kill [--signal SIGTERM|SIGKILL|...] 
..."), - RunE: func(cmd *cobra.Command, args []string) error { - if err := system.EnsureSudo(cmd.Context()); err != nil { - return err - } - layout, _, err := ensureDaemon(cmd.Context()) - if err != nil { - return err - } - if len(args) > 1 { - return runVMBatchAction(cmd, layout.SocketPath, args, func(ctx context.Context, id string) (model.VMRecord, error) { - result, err := rpc.Call[api.VMShowResult]( - ctx, - layout.SocketPath, - "vm.kill", - api.VMKillParams{IDOrName: id, Signal: signal}, - ) - if err != nil { - return model.VMRecord{}, err - } - return result.VM, nil - }) - } - result, err := rpc.Call[api.VMShowResult]( - cmd.Context(), - layout.SocketPath, - "vm.kill", - api.VMKillParams{IDOrName: args[0], Signal: signal}, - ) - if err != nil { - return err - } - return printVMSummary(cmd.OutOrStdout(), result.VM) - }, - } - cmd.Flags().StringVar(&signal, "signal", "TERM", "signal name to send") - return cmd -} - -func newVMCreateCommand() *cobra.Command { - var ( - name string - imageName string - vcpu = model.DefaultVCPUCount - memory = model.DefaultMemoryMiB - systemOverlaySize = model.FormatSizeBytes(model.DefaultSystemOverlaySize) - workDiskSize = model.FormatSizeBytes(model.DefaultWorkDiskSize) - natEnabled bool - noStart bool - ) - cmd := &cobra.Command{ - Use: "create", - Short: "Create a VM", - Args: noArgsUsage("usage: banger vm create"), - RunE: func(cmd *cobra.Command, args []string) error { - params, err := vmCreateParamsFromFlags(cmd, name, imageName, vcpu, memory, systemOverlaySize, workDiskSize, natEnabled, noStart) - if err != nil { - return err - } - if err := system.EnsureSudo(cmd.Context()); err != nil { - return err - } - layout, _, err := ensureDaemon(cmd.Context()) - if err != nil { - return err - } - vm, err := runVMCreate(cmd.Context(), layout.SocketPath, cmd.ErrOrStderr(), params) - if err != nil { - return err - } - return printVMSummary(cmd.OutOrStdout(), vm) - }, - } - cmd.Flags().StringVar(&name, "name", "", "vm name") - 
cmd.Flags().StringVar(&imageName, "image", "", "image name or id") - cmd.Flags().IntVar(&vcpu, "vcpu", model.DefaultVCPUCount, "vcpu count") - cmd.Flags().IntVar(&memory, "memory", model.DefaultMemoryMiB, "memory in MiB") - cmd.Flags().StringVar(&systemOverlaySize, "system-overlay-size", model.FormatSizeBytes(model.DefaultSystemOverlaySize), "system overlay size") - cmd.Flags().StringVar(&workDiskSize, "disk-size", model.FormatSizeBytes(model.DefaultWorkDiskSize), "work disk size") - cmd.Flags().BoolVar(&natEnabled, "nat", false, "enable NAT") - cmd.Flags().BoolVar(&noStart, "no-start", false, "create without starting") - return cmd -} - -func newVMListCommand() *cobra.Command { - return &cobra.Command{ - Use: "list", - Short: "List VMs", - Args: noArgsUsage("usage: banger vm list"), - RunE: func(cmd *cobra.Command, args []string) error { - layout, _, err := ensureDaemon(cmd.Context()) - if err != nil { - return err - } - result, err := rpc.Call[api.VMListResult](cmd.Context(), layout.SocketPath, "vm.list", api.Empty{}) - if err != nil { - return err - } - images, err := rpc.Call[api.ImageListResult](cmd.Context(), layout.SocketPath, "image.list", api.Empty{}) - if err != nil { - return err - } - return printVMListTable(cmd.OutOrStdout(), result.VMs, imageNameIndex(images.Images)) - }, - } -} - -func newVMShowCommand() *cobra.Command { - return &cobra.Command{ - Use: "show ", - Short: "Show VM details", - Args: exactArgsUsage(1, "usage: banger vm show "), - RunE: func(cmd *cobra.Command, args []string) error { - layout, _, err := ensureDaemon(cmd.Context()) - if err != nil { - return err - } - result, err := rpc.Call[api.VMShowResult](cmd.Context(), layout.SocketPath, "vm.show", api.VMRefParams{IDOrName: args[0]}) - if err != nil { - return err - } - return printJSON(cmd.OutOrStdout(), result.VM) - }, - } -} - -func newVMActionCommand(use, short, method string) *cobra.Command { - return &cobra.Command{ - Use: use + " ...", - Short: short, - Args: minArgsUsage(1, 
fmt.Sprintf("usage: banger vm %s ...", use)), - RunE: func(cmd *cobra.Command, args []string) error { - if err := system.EnsureSudo(cmd.Context()); err != nil { - return err - } - layout, _, err := ensureDaemon(cmd.Context()) - if err != nil { - return err - } - if len(args) > 1 { - return runVMBatchAction(cmd, layout.SocketPath, args, func(ctx context.Context, id string) (model.VMRecord, error) { - result, err := rpc.Call[api.VMShowResult](ctx, layout.SocketPath, method, api.VMRefParams{IDOrName: id}) - if err != nil { - return model.VMRecord{}, err - } - return result.VM, nil - }) - } - result, err := rpc.Call[api.VMShowResult](cmd.Context(), layout.SocketPath, method, api.VMRefParams{IDOrName: args[0]}) - if err != nil { - return err - } - return printVMSummary(cmd.OutOrStdout(), result.VM) - }, - } -} - -func newVMSetCommand() *cobra.Command { - var ( - vcpu int - memory int - diskSize string - nat bool - noNat bool - ) - cmd := &cobra.Command{ - Use: "set ...", - Short: "Update stopped VM settings", - Args: minArgsUsage(1, "usage: banger vm set [--vcpu N] [--memory MiB] [--disk-size SIZE] [--nat|--no-nat] ..."), - RunE: func(cmd *cobra.Command, args []string) error { - params, err := vmSetParamsFromFlags(args[0], vcpu, memory, diskSize, nat, noNat) - if err != nil { - return err - } - if err := system.EnsureSudo(cmd.Context()); err != nil { - return err - } - layout, _, err := ensureDaemon(cmd.Context()) - if err != nil { - return err - } - if len(args) > 1 { - return runVMBatchAction(cmd, layout.SocketPath, args, func(ctx context.Context, id string) (model.VMRecord, error) { - batchParams := params - batchParams.IDOrName = id - result, err := rpc.Call[api.VMShowResult](ctx, layout.SocketPath, "vm.set", batchParams) - if err != nil { - return model.VMRecord{}, err - } - return result.VM, nil - }) - } - result, err := rpc.Call[api.VMShowResult](cmd.Context(), layout.SocketPath, "vm.set", params) - if err != nil { - return err - } - return 
printVMSummary(cmd.OutOrStdout(), result.VM)
-		},
-	}
-	cmd.Flags().IntVar(&vcpu, "vcpu", -1, "vcpu count")
-	cmd.Flags().IntVar(&memory, "memory", -1, "memory in MiB")
-	cmd.Flags().StringVar(&diskSize, "disk-size", "", "new work disk size")
-	cmd.Flags().BoolVar(&nat, "nat", false, "enable NAT")
-	cmd.Flags().BoolVar(&noNat, "no-nat", false, "disable NAT")
-	return cmd
-}
-
-func newVMSSHCommand() *cobra.Command {
-	return &cobra.Command{
-		Use:   "ssh <vm> [ssh args...]",
-		Short: "SSH into a running VM",
-		Args:  minArgsUsage(1, "usage: banger vm ssh <vm> [ssh args...]"),
-		RunE: func(cmd *cobra.Command, args []string) error {
-			layout, cfg, err := ensureDaemon(cmd.Context())
-			if err != nil {
-				return err
-			}
-			if err := validateSSHPrereqs(cfg); err != nil {
-				return err
-			}
-			result, err := rpc.Call[api.VMSSHResult](cmd.Context(), layout.SocketPath, "vm.ssh", api.VMRefParams{IDOrName: args[0]})
-			if err != nil {
-				return err
-			}
-			sshArgs, err := sshCommandArgs(cfg, result.GuestIP, args[1:])
-			if err != nil {
-				return err
-			}
-			return runSSHSession(cmd.Context(), layout.SocketPath, result.Name, cmd.InOrStdin(), cmd.OutOrStdout(), cmd.ErrOrStderr(), sshArgs)
-		},
-	}
-}
-
-func newVMLogsCommand() *cobra.Command {
-	var follow bool
-	cmd := &cobra.Command{
-		Use:   "logs <vm>",
-		Short: "Show VM logs",
-		Args:  exactArgsUsage(1, "usage: banger vm logs [-f] <vm>"),
-		RunE: func(cmd *cobra.Command, args []string) error {
-			layout, _, err := ensureDaemon(cmd.Context())
-			if err != nil {
-				return err
-			}
-			result, err := rpc.Call[api.VMLogsResult](cmd.Context(), layout.SocketPath, "vm.logs", api.VMRefParams{IDOrName: args[0]})
-			if err != nil {
-				return err
-			}
-			if result.LogPath == "" {
-				return errors.New("vm has no log path")
-			}
-			return system.CopyStream(cmd.OutOrStdout(), system.TailCommand(result.LogPath, follow))
-		},
-	}
-	cmd.Flags().BoolVarP(&follow, "follow", "f", false, "follow logs")
-	return cmd
-}
-
-func newVMStatsCommand() *cobra.Command {
-	return &cobra.Command{
-		
Use:   "stats <vm>",
-		Short: "Show VM stats",
-		Args:  exactArgsUsage(1, "usage: banger vm stats <vm>"),
-		RunE: func(cmd *cobra.Command, args []string) error {
-			layout, _, err := ensureDaemon(cmd.Context())
-			if err != nil {
-				return err
-			}
-			result, err := rpc.Call[api.VMStatsResult](cmd.Context(), layout.SocketPath, "vm.stats", api.VMRefParams{IDOrName: args[0]})
-			if err != nil {
-				return err
-			}
-			return printJSON(cmd.OutOrStdout(), result)
-		},
-	}
-}
-
-func newVMPortsCommand() *cobra.Command {
-	return &cobra.Command{
-		Use:   "ports <vm>",
-		Short: "Show host-reachable listening guest ports",
-		Args:  exactArgsUsage(1, "usage: banger vm ports <vm>"),
-		RunE: func(cmd *cobra.Command, args []string) error {
-			layout, _, err := ensureDaemon(cmd.Context())
-			if err != nil {
-				return err
-			}
-			result, err := vmPortsFunc(cmd.Context(), layout.SocketPath, args[0])
-			if err != nil {
-				return err
-			}
-			return printVMPortsTable(cmd.OutOrStdout(), result)
-		},
-	}
-}
-
-func newImageCommand() *cobra.Command {
-	cmd := &cobra.Command{
-		Use:   "image",
-		Short: "Manage images",
-		RunE:  helpNoArgs,
-	}
-	cmd.AddCommand(
-		newImageBuildCommand(),
-		newImageRegisterCommand(),
-		newImagePromoteCommand(),
-		newImageListCommand(),
-		newImageShowCommand(),
-		newImageDeleteCommand(),
-	)
-	return cmd
-}
-
-func newImageBuildCommand() *cobra.Command {
-	var params api.ImageBuildParams
-	cmd := &cobra.Command{
-		Use:   "build",
-		Short: "Build an image",
-		Args:  noArgsUsage("usage: banger image build"),
-		RunE: func(cmd *cobra.Command, args []string) error {
-			if err := absolutizeImageBuildPaths(&params); err != nil {
-				return err
-			}
-			if err := system.EnsureSudo(cmd.Context()); err != nil {
-				return err
-			}
-			layout, _, err := ensureDaemon(cmd.Context())
-			if err != nil {
-				return err
-			}
-			result, err := rpc.Call[api.ImageShowResult](cmd.Context(), layout.SocketPath, "image.build", params)
-			if err != nil {
-				return err
-			}
-			return printImageSummary(cmd.OutOrStdout(), result.Image)
-		},
-	}
-	
cmd.Flags().StringVar(&params.Name, "name", "", "image name")
-	cmd.Flags().StringVar(&params.FromImage, "from-image", "", "registered base image id or name")
-	cmd.Flags().StringVar(&params.Size, "size", "", "output image size")
-	cmd.Flags().StringVar(&params.KernelPath, "kernel", "", "kernel path")
-	cmd.Flags().StringVar(&params.InitrdPath, "initrd", "", "initrd path")
-	cmd.Flags().StringVar(&params.ModulesDir, "modules", "", "modules dir")
-	cmd.Flags().BoolVar(&params.Docker, "docker", false, "install docker")
-	return cmd
-}
-
-func newImageRegisterCommand() *cobra.Command {
-	var params api.ImageRegisterParams
-	cmd := &cobra.Command{
-		Use:   "register",
-		Short: "Register or update an unmanaged image",
-		Args:  noArgsUsage("usage: banger image register --name <name> --rootfs <path> [--work-seed <path>] --kernel <path> [--initrd <path>] [--modules <dir>]"),
-		RunE: func(cmd *cobra.Command, args []string) error {
-			if err := absolutizeImageRegisterPaths(&params); err != nil {
-				return err
-			}
-			if err := system.EnsureSudo(cmd.Context()); err != nil {
-				return err
-			}
-			layout, _, err := ensureDaemon(cmd.Context())
-			if err != nil {
-				return err
-			}
-			result, err := rpc.Call[api.ImageShowResult](cmd.Context(), layout.SocketPath, "image.register", params)
-			if err != nil {
-				return err
-			}
-			return printImageSummary(cmd.OutOrStdout(), result.Image)
-		},
-	}
-	cmd.Flags().StringVar(&params.Name, "name", "", "image name")
-	cmd.Flags().StringVar(&params.RootfsPath, "rootfs", "", "rootfs path")
-	cmd.Flags().StringVar(&params.WorkSeedPath, "work-seed", "", "work-seed path")
-	cmd.Flags().StringVar(&params.KernelPath, "kernel", "", "kernel path")
-	cmd.Flags().StringVar(&params.InitrdPath, "initrd", "", "initrd path")
-	cmd.Flags().StringVar(&params.ModulesDir, "modules", "", "modules dir")
-	cmd.Flags().BoolVar(&params.Docker, "docker", false, "mark image as docker-prepared")
-	return cmd
-}
-
-func newImagePromoteCommand() *cobra.Command {
-	return &cobra.Command{
-		Use:   "promote <image>",
-		Short: "Promote an unmanaged image to a managed artifact",
-		Args:
exactArgsUsage(1, "usage: banger image promote <image>"),
-		RunE: func(cmd *cobra.Command, args []string) error {
-			if err := system.EnsureSudo(cmd.Context()); err != nil {
-				return err
-			}
-			layout, _, err := ensureDaemon(cmd.Context())
-			if err != nil {
-				return err
-			}
-			result, err := rpc.Call[api.ImageShowResult](cmd.Context(), layout.SocketPath, "image.promote", api.ImageRefParams{IDOrName: args[0]})
-			if err != nil {
-				return err
-			}
-			return printImageSummary(cmd.OutOrStdout(), result.Image)
-		},
-	}
-}
-
-func newImageListCommand() *cobra.Command {
-	return &cobra.Command{
-		Use:   "list",
-		Short: "List images",
-		Args:  noArgsUsage("usage: banger image list"),
-		RunE: func(cmd *cobra.Command, args []string) error {
-			layout, _, err := ensureDaemon(cmd.Context())
-			if err != nil {
-				return err
-			}
-			result, err := rpc.Call[api.ImageListResult](cmd.Context(), layout.SocketPath, "image.list", api.Empty{})
-			if err != nil {
-				return err
-			}
-			return printImageListTable(cmd.OutOrStdout(), result.Images)
-		},
-	}
-}
-
-func newImageShowCommand() *cobra.Command {
-	return &cobra.Command{
-		Use:   "show <image>",
-		Short: "Show image details",
-		Args:  exactArgsUsage(1, "usage: banger image show <image>"),
-		RunE: func(cmd *cobra.Command, args []string) error {
-			layout, _, err := ensureDaemon(cmd.Context())
-			if err != nil {
-				return err
-			}
-			result, err := rpc.Call[api.ImageShowResult](cmd.Context(), layout.SocketPath, "image.show", api.ImageRefParams{IDOrName: args[0]})
-			if err != nil {
-				return err
-			}
-			return printJSON(cmd.OutOrStdout(), result.Image)
-		},
-	}
-}
-
-func newImageDeleteCommand() *cobra.Command {
-	return &cobra.Command{
-		Use:   "delete <image>",
-		Short: "Delete an image",
-		Args:  exactArgsUsage(1, "usage: banger image delete <image>"),
-		RunE: func(cmd *cobra.Command, args []string) error {
-			if err := system.EnsureSudo(cmd.Context()); err != nil {
-				return err
-			}
-			layout, _, err := ensureDaemon(cmd.Context())
-			if err != nil {
-				return err
-			}
-			result, err :=
rpc.Call[api.ImageShowResult](cmd.Context(), layout.SocketPath, "image.delete", api.ImageRefParams{IDOrName: args[0]}) - if err != nil { - return err - } - return printImageSummary(cmd.OutOrStdout(), result.Image) - }, - } -} - func helpNoArgs(cmd *cobra.Command, args []string) error { if len(args) != 0 { return fmt.Errorf("unknown arguments: %s", strings.Join(args, " ")) @@ -1006,692 +169,10 @@ func maxArgsUsage(n int, usage string) cobra.PositionalArgs { } } -type resolvedVMTarget struct { - Index int - Ref string - VM model.VMRecord -} - -type vmRefResolutionError struct { - Index int - Ref string - Err error -} - -type vmBatchActionResult struct { - Target resolvedVMTarget - VM model.VMRecord - Err error -} - -func runVMBatchAction(cmd *cobra.Command, socketPath string, refs []string, action func(context.Context, string) (model.VMRecord, error)) error { - listResult, err := rpc.Call[api.VMListResult](cmd.Context(), socketPath, "vm.list", api.Empty{}) - if err != nil { - return err - } - targets, resolutionErrs := resolveVMTargets(listResult.VMs, refs) - results := executeVMActionBatch(cmd.Context(), targets, action) - - failed := false - for _, resolutionErr := range resolutionErrs { - if _, err := fmt.Fprintf(cmd.ErrOrStderr(), "%s: %v\n", resolutionErr.Ref, resolutionErr.Err); err != nil { - return err - } - failed = true - } - for _, result := range results { - if result.Err != nil { - if _, err := fmt.Fprintf(cmd.ErrOrStderr(), "%s: %v\n", result.Target.Ref, result.Err); err != nil { - return err - } - failed = true - continue - } - if err := printVMSummary(cmd.OutOrStdout(), result.VM); err != nil { - return err - } - } - if failed { - return errors.New("one or more VM operations failed") - } - return nil -} - -func resolveVMTargets(vms []model.VMRecord, refs []string) ([]resolvedVMTarget, []vmRefResolutionError) { - targets := make([]resolvedVMTarget, 0, len(refs)) - resolutionErrs := make([]vmRefResolutionError, 0) - seen := make(map[string]struct{}, 
len(refs)) - for index, ref := range refs { - vm, err := resolveVMRef(vms, ref) - if err != nil { - resolutionErrs = append(resolutionErrs, vmRefResolutionError{Index: index, Ref: ref, Err: err}) - continue - } - if _, ok := seen[vm.ID]; ok { - continue - } - seen[vm.ID] = struct{}{} - targets = append(targets, resolvedVMTarget{Index: index, Ref: ref, VM: vm}) - } - return targets, resolutionErrs -} - -func resolveVMRef(vms []model.VMRecord, ref string) (model.VMRecord, error) { - ref = strings.TrimSpace(ref) - if ref == "" { - return model.VMRecord{}, errors.New("vm id or name is required") - } - exactMatches := make([]model.VMRecord, 0, 1) - for _, vm := range vms { - if vm.ID == ref || vm.Name == ref { - exactMatches = append(exactMatches, vm) - } - } - switch len(exactMatches) { - case 1: - return exactMatches[0], nil - case 0: - default: - return model.VMRecord{}, fmt.Errorf("multiple VMs match %q", ref) - } - - prefixMatches := make([]model.VMRecord, 0, 1) - for _, vm := range vms { - if strings.HasPrefix(vm.ID, ref) || strings.HasPrefix(vm.Name, ref) { - prefixMatches = append(prefixMatches, vm) - } - } - switch len(prefixMatches) { - case 1: - return prefixMatches[0], nil - case 0: - return model.VMRecord{}, fmt.Errorf("vm %q not found", ref) - default: - return model.VMRecord{}, fmt.Errorf("multiple VMs match %q", ref) - } -} - -func executeVMActionBatch(ctx context.Context, targets []resolvedVMTarget, action func(context.Context, string) (model.VMRecord, error)) []vmBatchActionResult { - results := make([]vmBatchActionResult, len(targets)) - var wg sync.WaitGroup - wg.Add(len(targets)) - for index, target := range targets { - index := index - target := target - go func() { - defer wg.Done() - vm, err := action(ctx, target.VM.ID) - results[index] = vmBatchActionResult{ - Target: target, - VM: vm, - Err: err, - } - }() - } - wg.Wait() - return results -} - -func ensureDaemon(ctx context.Context) (paths.Layout, model.DaemonConfig, error) { - layout, err := 
paths.Resolve() - if err != nil { - return paths.Layout{}, model.DaemonConfig{}, err - } - cfg, err := config.Load(layout) - if err != nil { - return paths.Layout{}, model.DaemonConfig{}, err - } - if ping, err := rpc.Call[api.PingResult](ctx, layout.SocketPath, "ping", api.Empty{}); err == nil { - if daemonOutdated(ping.PID) { - if err := restartDaemon(ctx, layout, ping.PID); err != nil { - return paths.Layout{}, model.DaemonConfig{}, err - } - return layout, cfg, nil - } - return layout, cfg, nil - } - if err := startDaemon(ctx, layout); err != nil { - return paths.Layout{}, model.DaemonConfig{}, err - } - return layout, cfg, nil -} - -func daemonOutdated(pid int) bool { - if pid <= 0 { - return false - } - daemonBin, err := bangerdPathFunc() - if err != nil { - return false - } - currentInfo, err := os.Stat(daemonBin) - if err != nil { - return false - } - runningInfo, err := os.Stat(daemonExePath(pid)) - if err != nil { - return false - } - return !os.SameFile(currentInfo, runningInfo) -} - -func restartDaemon(ctx context.Context, layout paths.Layout, pid int) error { - stopCtx, cancel := context.WithTimeout(ctx, 2*time.Second) - defer cancel() - - _, _ = rpc.Call[api.ShutdownResult](stopCtx, layout.SocketPath, "shutdown", api.Empty{}) - if waitForPIDExit(pid, 2*time.Second) { - return startDaemon(ctx, layout) - } - if proc, err := os.FindProcess(pid); err == nil { - _ = proc.Signal(syscall.SIGTERM) - } - if !waitForPIDExit(pid, 2*time.Second) { - return fmt.Errorf("timed out restarting stale daemon pid %d", pid) - } - return startDaemon(ctx, layout) -} - -func waitForPIDExit(pid int, timeout time.Duration) bool { - deadline := time.Now().Add(timeout) - for time.Now().Before(deadline) { - if !pidRunning(pid) { - return true - } - time.Sleep(50 * time.Millisecond) - } - return !pidRunning(pid) -} - -func pidRunning(pid int) bool { - if pid <= 0 { - return false - } - proc, err := os.FindProcess(pid) - if err != nil { - return false - } - return 
proc.Signal(syscall.Signal(0)) == nil -} - -func startDaemon(ctx context.Context, layout paths.Layout) error { - if err := paths.Ensure(layout); err != nil { - return err - } - logFile, err := os.OpenFile(layout.DaemonLog, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644) - if err != nil { - return err - } - defer logFile.Close() - - daemonBin, err := paths.BangerdPath() - if err != nil { - return err - } - cmd := buildDaemonCommand(daemonBin) - cmd.Stdout = logFile - cmd.Stderr = logFile - cmd.Stdin = nil - cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true} - if err := cmd.Start(); err != nil { - return err - } - if err := rpc.WaitForSocket(layout.SocketPath, 5*time.Second); err != nil { - return fmt.Errorf("daemon failed to start; inspect %s: %w", layout.DaemonLog, err) - } - return nil -} - -func buildDaemonCommand(daemonBin string) *exec.Cmd { - return exec.Command(daemonBin) -} - -func vmSetParamsFromFlags(idOrName string, vcpu, memory int, diskSize string, nat, noNat bool) (api.VMSetParams, error) { - if nat && noNat { - return api.VMSetParams{}, errors.New("use only one of --nat or --no-nat") - } - params := api.VMSetParams{IDOrName: idOrName, WorkDiskSize: diskSize} - if vcpu >= 0 { - if err := validatePositiveSetting("vcpu", vcpu); err != nil { - return api.VMSetParams{}, err - } - params.VCPUCount = &vcpu - } - if memory >= 0 { - if err := validatePositiveSetting("memory", memory); err != nil { - return api.VMSetParams{}, err - } - params.MemoryMiB = &memory - } - if nat || noNat { - value := nat && !noNat - params.NATEnabled = &value - } - if params.VCPUCount == nil && params.MemoryMiB == nil && params.WorkDiskSize == "" && params.NATEnabled == nil { - return api.VMSetParams{}, errors.New("no VM settings changed") - } - return params, nil -} - -func vmCreateParamsFromFlags(cmd *cobra.Command, name, imageName string, vcpu, memory int, systemOverlaySize, workDiskSize string, natEnabled, noStart bool) (api.VMCreateParams, error) { - params := 
api.VMCreateParams{ - Name: name, - ImageName: imageName, - NATEnabled: natEnabled, - NoStart: noStart, - } - if cmd.Flags().Changed("vcpu") { - if err := validatePositiveSetting("vcpu", vcpu); err != nil { - return api.VMCreateParams{}, err - } - params.VCPUCount = &vcpu - } - if cmd.Flags().Changed("memory") { - if err := validatePositiveSetting("memory", memory); err != nil { - return api.VMCreateParams{}, err - } - params.MemoryMiB = &memory - } - if cmd.Flags().Changed("system-overlay-size") { - params.SystemOverlaySize = systemOverlaySize - } - if cmd.Flags().Changed("disk-size") { - params.WorkDiskSize = workDiskSize - } - return params, nil -} - -func validatePositiveSetting(label string, value int) error { - if value <= 0 { - return fmt.Errorf("%s must be a positive integer", label) - } - return nil -} - -func runSSHSession(ctx context.Context, socketPath, vmRef string, stdin io.Reader, stdout, stderr io.Writer, sshArgs []string) error { - sshErr := sshExecFunc(ctx, stdin, stdout, stderr, sshArgs) - if !shouldCheckSSHReminder(sshErr) || ctx.Err() != nil { - return sshErr - } - pingCtx, cancel := context.WithTimeout(context.Background(), 3*time.Second) - defer cancel() - health, err := vmHealthFunc(pingCtx, socketPath, vmRef) - if err != nil { - _, _ = fmt.Fprintln(stderr, vsockagent.WarningMessage(vmRef, err)) - return sshErr - } - if health.Healthy { - name := health.Name - if strings.TrimSpace(name) == "" { - name = vmRef - } - _, _ = fmt.Fprintln(stderr, vsockagent.ReminderMessage(name)) - } - return sshErr -} - -func shouldCheckSSHReminder(err error) bool { - if err == nil { - return true - } - var exitErr *exec.ExitError - if !errors.As(err, &exitErr) { - return false - } - return exitErr.ExitCode() != 255 -} - -func sshCommandArgs(cfg model.DaemonConfig, guestIP string, extra []string) ([]string, error) { - if guestIP == "" { - return nil, errors.New("vm has no guest IP") - } - args := []string{} - args = append(args, "-F", "/dev/null") - if 
cfg.SSHKeyPath != "" { - args = append(args, "-i", cfg.SSHKeyPath) - } - args = append( - args, - "-o", "IdentitiesOnly=yes", - "-o", "BatchMode=yes", - "-o", "PreferredAuthentications=publickey", - "-o", "PasswordAuthentication=no", - "-o", "KbdInteractiveAuthentication=no", - "-o", "StrictHostKeyChecking=no", - "-o", "UserKnownHostsFile=/dev/null", - "root@"+guestIP, - ) - args = append(args, extra...) - return args, nil -} - -func validateSSHPrereqs(cfg model.DaemonConfig) error { - checks := system.NewPreflight() - checks.RequireCommand("ssh", "install openssh-client") - if strings.TrimSpace(cfg.SSHKeyPath) != "" { - checks.RequireFile(cfg.SSHKeyPath, "ssh private key", `set "ssh_key_path" or let banger create its default key`) - } - return checks.Err("ssh preflight failed") -} - -func validateVMRunPrereqs(cfg model.DaemonConfig) error { - checks := system.NewPreflight() - checks.RequireCommand("git", "install git") - checks.RequireCommand("opencode", "install opencode") - if strings.TrimSpace(cfg.SSHKeyPath) != "" { - checks.RequireFile(cfg.SSHKeyPath, "ssh private key", `set "ssh_key_path" or let banger create its default key`) - } - return checks.Err("vm run preflight failed") -} - -func inspectVMRunRepo(ctx context.Context, rawPath, branchName, fromRef string) (vmRunRepoSpec, error) { - sourcePath, err := resolveVMRunSourcePath(rawPath) - if err != nil { - return vmRunRepoSpec{}, err - } - - repoRoot, err := gitTrimmedOutput(ctx, sourcePath, "rev-parse", "--show-toplevel") - if err != nil { - return vmRunRepoSpec{}, fmt.Errorf("%s is not inside a git repository", sourcePath) - } - isBare, err := gitTrimmedOutput(ctx, repoRoot, "rev-parse", "--is-bare-repository") - if err != nil { - return vmRunRepoSpec{}, fmt.Errorf("inspect git repository %s: %w", repoRoot, err) - } - if isBare == "true" { - return vmRunRepoSpec{}, fmt.Errorf("vm run requires a non-bare git repository: %s", repoRoot) - } - if err := ensureVMRunRepoHasNoSubmodules(ctx, repoRoot); err != 
nil { - return vmRunRepoSpec{}, err - } - - headCommit, err := gitTrimmedOutput(ctx, repoRoot, "rev-parse", "HEAD^{commit}") - if err != nil { - return vmRunRepoSpec{}, fmt.Errorf("git repository %s must have at least one commit", repoRoot) - } - currentBranch, err := gitTrimmedOutput(ctx, repoRoot, "branch", "--show-current") - if err != nil { - return vmRunRepoSpec{}, fmt.Errorf("resolve current branch for %s: %w", repoRoot, err) - } - - baseCommit := headCommit - branchName = strings.TrimSpace(branchName) - if branchName != "" { - fromRef = strings.TrimSpace(fromRef) - if fromRef == "" { - return vmRunRepoSpec{}, errors.New("--from cannot be empty") - } - baseCommit, err = gitTrimmedOutput(ctx, repoRoot, "rev-parse", fromRef+"^{commit}") - if err != nil { - return vmRunRepoSpec{}, fmt.Errorf("resolve --from %q: %w", fromRef, err) - } - } - - overlayPaths, err := listVMRunOverlayPaths(ctx, repoRoot) - if err != nil { - return vmRunRepoSpec{}, err - } - - return vmRunRepoSpec{ - SourcePath: sourcePath, - RepoRoot: repoRoot, - RepoName: filepath.Base(repoRoot), - HeadCommit: headCommit, - CurrentBranch: currentBranch, - BranchName: branchName, - BaseCommit: baseCommit, - OverlayPaths: overlayPaths, - }, nil -} - -func resolveVMRunSourcePath(rawPath string) (string, error) { - if strings.TrimSpace(rawPath) == "" { - wd, err := cwdFunc() - if err != nil { - return "", err - } - rawPath = wd - } - absPath, err := filepath.Abs(rawPath) - if err != nil { - return "", err - } - info, err := os.Stat(absPath) - if err != nil { - return "", err - } - if !info.IsDir() { - return "", fmt.Errorf("%s is not a directory", absPath) - } - return absPath, nil -} - -func ensureVMRunRepoHasNoSubmodules(ctx context.Context, repoRoot string) error { - output, err := gitOutput(ctx, repoRoot, "ls-files", "--stage", "-z") - if err != nil { - return fmt.Errorf("inspect git index for %s: %w", repoRoot, err) - } - for _, record := range parseNullSeparatedOutput(output) { - if 
strings.HasPrefix(record, "160000 ") { - return fmt.Errorf("vm run does not yet support git submodules: %s", repoRoot) - } - } - return nil -} - -func listVMRunOverlayPaths(ctx context.Context, repoRoot string) ([]string, error) { - trackedOutput, err := gitOutput(ctx, repoRoot, "ls-files", "-z") - if err != nil { - return nil, fmt.Errorf("list tracked files for %s: %w", repoRoot, err) - } - untrackedOutput, err := gitOutput(ctx, repoRoot, "ls-files", "--others", "--exclude-standard", "-z") - if err != nil { - return nil, fmt.Errorf("list untracked files for %s: %w", repoRoot, err) - } - - paths := make([]string, 0) - seen := make(map[string]struct{}) - for _, relPath := range parseNullSeparatedOutput(trackedOutput) { - if relPath == "" { - continue - } - if _, err := os.Lstat(filepath.Join(repoRoot, relPath)); err != nil { - if os.IsNotExist(err) { - continue - } - return nil, err - } - seen[relPath] = struct{}{} - paths = append(paths, relPath) - } - for _, relPath := range parseNullSeparatedOutput(untrackedOutput) { - if relPath == "" { - continue - } - if _, ok := seen[relPath]; ok { - continue - } - seen[relPath] = struct{}{} - paths = append(paths, relPath) - } - sort.Strings(paths) - return paths, nil -} - -func gitOutput(ctx context.Context, dir string, args ...string) ([]byte, error) { - fullArgs := make([]string, 0, len(args)+2) - if strings.TrimSpace(dir) != "" { - fullArgs = append(fullArgs, "-C", dir) - } - fullArgs = append(fullArgs, args...) - return hostCommandOutputFunc(ctx, "git", fullArgs...) -} - -func gitTrimmedOutput(ctx context.Context, dir string, args ...string) (string, error) { - output, err := gitOutput(ctx, dir, args...) 
- if err != nil { - return "", err - } - return strings.TrimSpace(string(output)), nil -} - -func parseNullSeparatedOutput(output []byte) []string { - chunks := bytes.Split(output, []byte{0}) - values := make([]string, 0, len(chunks)) - for _, chunk := range chunks { - value := strings.TrimSpace(string(chunk)) - if value == "" { - continue - } - values = append(values, value) - } - return values -} - -func runVMRun(ctx context.Context, socketPath string, cfg model.DaemonConfig, stdin io.Reader, stdout, stderr io.Writer, params api.VMCreateParams, spec vmRunRepoSpec) error { - vm, err := runVMCreate(ctx, socketPath, stderr, params) - if err != nil { - return err - } - vmRef := strings.TrimSpace(vm.Name) - if vmRef == "" { - vmRef = shortID(vm.ID) - } - sshAddress := net.JoinHostPort(vm.Runtime.GuestIP, "22") - if err := guestWaitForSSHFunc(ctx, sshAddress, cfg.SSHKeyPath, 250*time.Millisecond); err != nil { - return fmt.Errorf("vm %q is running but guest ssh is unavailable: %w", vmRef, err) - } - client, err := guestDialFunc(ctx, sshAddress, cfg.SSHKeyPath) - if err != nil { - return fmt.Errorf("vm %q is running but guest ssh is unavailable: %w", vmRef, err) - } - defer client.Close() - if err := importVMRunRepoToGuest(ctx, client, spec); err != nil { - return fmt.Errorf("vm %q is running but repo import failed: %w", vmRef, err) - } - if err := runVMRunAttach(ctx, stdin, stdout, stderr, vm.Runtime.GuestIP, vmRunGuestDir(spec.RepoName)); err != nil { - return fmt.Errorf("vm %q is running but opencode attach failed: %w", vmRef, err) - } - return nil -} - -func importVMRunRepoToGuest(ctx context.Context, client vmRunGuestClient, spec vmRunRepoSpec) error { - bundleData, err := createVMRunBundle(ctx, spec) - if err != nil { - return err - } - var uploadLog bytes.Buffer - if err := client.UploadFile(ctx, vmRunGuestBundlePath, 0o600, bundleData, &uploadLog); err != nil { - return formatVMRunStepError("upload git bundle", err, uploadLog.String()) - } - var scriptLog 
bytes.Buffer - if err := client.RunScript(ctx, vmRunCloneScript(spec), &scriptLog); err != nil { - return formatVMRunStepError("prepare guest checkout", err, scriptLog.String()) - } - var overlayLog bytes.Buffer - remoteCommand := fmt.Sprintf("tar -C %s --strip-components=1 -xf -", shellQuote(vmRunGuestDir(spec.RepoName))) - if err := client.StreamTarEntries(ctx, spec.RepoRoot, spec.OverlayPaths, remoteCommand, &overlayLog); err != nil { - return formatVMRunStepError("overlay host working tree", err, overlayLog.String()) - } - return nil -} - -func createVMRunBundle(ctx context.Context, spec vmRunRepoSpec) ([]byte, error) { - tempFile, err := os.CreateTemp("", "banger-vm-run-*.bundle") - if err != nil { - return nil, err - } - tempPath := tempFile.Name() - if err := tempFile.Close(); err != nil { - _ = os.Remove(tempPath) - return nil, err - } - defer os.Remove(tempPath) - - args := []string{"-C", spec.RepoRoot, "bundle", "create", tempPath, "--all"} - for _, rev := range uniqueNonEmptyStrings(spec.HeadCommit, spec.BaseCommit) { - args = append(args, rev) - } - if _, err := hostCommandOutputFunc(ctx, "git", args...); err != nil { - return nil, fmt.Errorf("create git bundle: %w", err) - } - data, err := os.ReadFile(tempPath) - if err != nil { - return nil, fmt.Errorf("read git bundle: %w", err) - } - return data, nil -} - -func vmRunCloneScript(spec vmRunRepoSpec) string { - guestDir := vmRunGuestDir(spec.RepoName) - var script strings.Builder - script.WriteString("set -euo pipefail\n") - fmt.Fprintf(&script, "DIR=%s\n", shellQuote(guestDir)) - fmt.Fprintf(&script, "BUNDLE=%s\n", shellQuote(vmRunGuestBundlePath)) - script.WriteString("rm -rf \"$DIR\"\n") - script.WriteString("git clone \"$BUNDLE\" \"$DIR\"\n") - script.WriteString("rm -f \"$BUNDLE\"\n") - switch { - case strings.TrimSpace(spec.BranchName) != "": - fmt.Fprintf(&script, "git -C \"$DIR\" checkout -B %s %s\n", shellQuote(spec.BranchName), shellQuote(spec.BaseCommit)) - case 
strings.TrimSpace(spec.CurrentBranch) != "": - fmt.Fprintf(&script, "git -C \"$DIR\" checkout -B %s %s\n", shellQuote(spec.CurrentBranch), shellQuote(spec.HeadCommit)) - default: - fmt.Fprintf(&script, "git -C \"$DIR\" checkout --detach %s\n", shellQuote(spec.HeadCommit)) - } - script.WriteString("find \"$DIR\" -mindepth 1 -maxdepth 1 ! -name .git -exec rm -rf {} +\n") - return script.String() -} - -func vmRunGuestDir(repoName string) string { - return filepath.ToSlash(filepath.Join("/root", repoName)) -} - -func runVMRunAttach(ctx context.Context, stdin io.Reader, stdout, stderr io.Writer, guestIP, guestDir string) error { - guestIP = strings.TrimSpace(guestIP) - if guestIP == "" { - return errors.New("vm has no guest IP") - } - return opencodeExecFunc(ctx, stdin, stdout, stderr, []string{ - "attach", - "--dir", guestDir, - "http://" + net.JoinHostPort(guestIP, "4096"), - }) -} - -func formatVMRunStepError(action string, err error, log string) error { - log = strings.TrimSpace(log) - if log == "" { - return fmt.Errorf("%s: %w", action, err) - } - return fmt.Errorf("%s: %w: %s", action, err, log) -} - -func uniqueNonEmptyStrings(values ...string) []string { - unique := make([]string, 0, len(values)) - seen := make(map[string]struct{}, len(values)) - for _, value := range values { - value = strings.TrimSpace(value) - if value == "" { - continue - } - if _, ok := seen[value]; ok { - continue - } - seen[value] = struct{}{} - unique = append(unique, value) - } - return unique -} - func shellQuote(value string) string { return "'" + strings.ReplaceAll(value, "'", `'"'"'`) + "'" } -func absolutizeImageBuildPaths(params *api.ImageBuildParams) error { - return absolutizePaths(¶ms.KernelPath, ¶ms.InitrdPath, ¶ms.ModulesDir) -} - func absolutizeImageRegisterPaths(params *api.ImageRegisterParams) error { return absolutizePaths( ¶ms.RootfsPath, @@ -1716,353 +197,20 @@ func absolutizePaths(values ...*string) error { return nil } -func printJSON(out anyWriter, v any) error { - 
data, err := json.MarshalIndent(v, "", " ") - if err != nil { - return err - } - _, err = fmt.Fprintln(out, string(data)) - return err +func formatBuildInfoBlock(info buildinfo.Info) string { + return fmt.Sprintf("version: %s\ncommit: %s\nbuilt_at: %s\n", info.Version, info.Commit, info.BuiltAt) } -func printVMSummary(out anyWriter, vm model.VMRecord) error { - _, err := fmt.Fprintf( - out, - "%s\t%s\t%s\t%s\t%s\t%s\n", - shortID(vm.ID), - vm.Name, - vm.State, - vm.Runtime.GuestIP, - model.FormatSizeBytes(vm.Spec.WorkDiskSizeBytes), - vm.Runtime.DNSName, - ) - return err -} - -func printVMListTable(out anyWriter, vms []model.VMRecord, imageNames map[string]string) error { - w := tabwriter.NewWriter(out, 0, 8, 2, ' ', 0) - if _, err := fmt.Fprintln(w, "ID\tNAME\tSTATE\tIMAGE\tIP\tVCPU\tMEM\tDISK\tCREATED"); err != nil { - return err - } - for _, vm := range vms { - if _, err := fmt.Fprintf( - w, - "%s\t%s\t%s\t%s\t%s\t%d\t%d MiB\t%s\t%s\n", - shortID(vm.ID), - vm.Name, - vm.State, - vmImageLabel(vm.ImageID, imageNames), - vm.Runtime.GuestIP, - vm.Spec.VCPUCount, - vm.Spec.MemoryMiB, - model.FormatSizeBytes(vm.Spec.WorkDiskSizeBytes), - relativeTime(vm.CreatedAt), - ); err != nil { - return err - } - } - return w.Flush() -} - -func printImageSummary(out anyWriter, image model.Image) error { - _, err := fmt.Fprintf(out, "%s\t%s\t%t\t%s\n", shortID(image.ID), image.Name, image.Managed, image.RootfsPath) - return err -} - -func imageNameIndex(images []model.Image) map[string]string { - index := make(map[string]string, len(images)) - for _, image := range images { - index[image.ID] = image.Name - } - return index -} - -func vmImageLabel(imageID string, imageNames map[string]string) string { - if name := strings.TrimSpace(imageNames[imageID]); name != "" { - return name - } - return shortID(imageID) -} - -func printImageListTable(out anyWriter, images []model.Image) error { - w := tabwriter.NewWriter(out, 0, 8, 2, ' ', 0) - if _, err := fmt.Fprintln(w, 
"ID\tNAME\tMANAGED\tROOTFS SIZE\tCREATED"); err != nil { - return err - } - for _, image := range images { - if _, err := fmt.Fprintf( - w, - "%s\t%s\t%t\t%s\t%s\n", - shortID(image.ID), - image.Name, - image.Managed, - rootfsSizeLabel(image.RootfsPath), - relativeTime(image.CreatedAt), - ); err != nil { - return err - } - } - return w.Flush() -} - -func rootfsSizeLabel(path string) string { - info, err := os.Stat(path) - if err != nil { - return "-" - } - if info.Size() <= 0 { - return "0" - } - return model.FormatSizeBytes(info.Size()) -} - -func printVMPortsTable(out anyWriter, result api.VMPortsResult) error { - type portRow struct { - Proto string - Endpoint string - Process string - Command string - Port int - } - rows := make([]portRow, 0, len(result.Ports)) - for _, port := range result.Ports { - rows = append(rows, portRow{ - Proto: port.Proto, - Endpoint: port.Endpoint, - Process: port.Process, - Command: port.Command, - Port: port.Port, - }) - } - sort.Slice(rows, func(i, j int) bool { - if rows[i].Proto != rows[j].Proto { - return rows[i].Proto < rows[j].Proto - } - if rows[i].Port != rows[j].Port { - return rows[i].Port < rows[j].Port - } - if rows[i].Process != rows[j].Process { - return rows[i].Process < rows[j].Process - } - return rows[i].Command < rows[j].Command - }) - if len(rows) == 0 { - return nil - } - - w := tabwriter.NewWriter(out, 0, 8, 2, ' ', 0) - if _, err := fmt.Fprintln(w, "PROTO\tENDPOINT\tPROCESS\tCOMMAND"); err != nil { - return err - } - for _, row := range rows { - if _, err := fmt.Fprintf( - w, - "%s\t%s\t%s\t%s\n", - row.Proto, - emptyDash(row.Endpoint), - emptyDash(row.Process), - emptyDash(row.Command), - ); err != nil { - return err - } - } - return w.Flush() -} - -func printDoctorReport(out anyWriter, report system.Report) error { - for _, check := range report.Checks { - status := strings.ToUpper(string(check.Status)) - if _, err := fmt.Fprintf(out, "%s\t%s\n", status, check.Name); err != nil { - return err - } - for _, 
detail := range check.Details { - if _, err := fmt.Fprintf(out, " - %s\n", detail); err != nil { - return err - } - } - } - return nil -} - -func emptyDash(value string) string { - value = strings.TrimSpace(value) - if value == "" { - return "-" - } - return value -} - -type anyWriter interface { - Write(p []byte) (n int, err error) -} - -func runVMCreate(ctx context.Context, socketPath string, stderr io.Writer, params api.VMCreateParams) (model.VMRecord, error) { - begin, err := vmCreateBeginFunc(ctx, socketPath, params) - if err != nil { - return model.VMRecord{}, err - } - renderer := newVMCreateProgressRenderer(stderr) - renderer.render(begin.Operation) - - op := begin.Operation - for { - if op.Done { - renderer.render(op) - if op.Success && op.VM != nil { - return *op.VM, nil - } - if strings.TrimSpace(op.Error) == "" { - return model.VMRecord{}, errors.New("vm create failed") - } - return model.VMRecord{}, errors.New(op.Error) - } - - select { - case <-ctx.Done(): - cancelCtx, cancel := context.WithTimeout(context.Background(), time.Second) - defer cancel() - _ = vmCreateCancelFunc(cancelCtx, socketPath, op.ID) - return model.VMRecord{}, ctx.Err() - case <-time.After(200 * time.Millisecond): - } - - status, err := vmCreateStatusFunc(ctx, socketPath, op.ID) - if err != nil { - if ctx.Err() != nil { - cancelCtx, cancel := context.WithTimeout(context.Background(), time.Second) - defer cancel() - _ = vmCreateCancelFunc(cancelCtx, socketPath, op.ID) - return model.VMRecord{}, ctx.Err() - } - return model.VMRecord{}, err - } - op = status.Operation - renderer.render(op) - } -} - -type vmCreateProgressRenderer struct { - out io.Writer - enabled bool - lastLine string -} - -func newVMCreateProgressRenderer(out io.Writer) *vmCreateProgressRenderer { - return &vmCreateProgressRenderer{ - out: out, - enabled: writerSupportsProgress(out), - } -} - -func (r *vmCreateProgressRenderer) render(op api.VMCreateOperation) { - if r == nil || !r.enabled { - return - } - line := 
formatVMCreateProgress(op) - if line == "" || line == r.lastLine { - return - } - r.lastLine = line - _, _ = fmt.Fprintln(r.out, line) -} - -func writerSupportsProgress(out io.Writer) bool { - file, ok := out.(*os.File) - if !ok { - return false - } - info, err := file.Stat() - if err != nil { - return false - } - return info.Mode()&os.ModeCharDevice != 0 -} - -func formatVMCreateProgress(op api.VMCreateOperation) string { - stage := strings.TrimSpace(op.Stage) - detail := strings.TrimSpace(op.Detail) - label := vmCreateStageLabel(stage) - if label == "" && detail == "" { - return "" - } - if label == "" { - return "[vm create] " + detail - } - if detail == "" { - return "[vm create] " + label - } - return "[vm create] " + label + ": " + detail -} - -func vmCreateStageLabel(stage string) string { - switch strings.TrimSpace(stage) { - case "queued": - return "queued" - case "resolve_image": - return "resolving image" - case "reserve_vm": - return "allocating vm" - case "preflight": - return "checking host prerequisites" - case "prepare_rootfs": - return "preparing root filesystem" - case "prepare_host_features": - return "preparing host features" - case "prepare_work_disk": - return "preparing work disk" - case "boot_firecracker": - return "starting firecracker" - case "wait_vsock_agent": - return "waiting for vsock agent" - case "wait_guest_ready": - return "waiting for guest services" - case "wait_opencode": - return "waiting for opencode" - case "apply_dns": - return "publishing dns" - case "apply_nat": - return "configuring nat" - case "finalize": - return "finalizing" - case "ready": - return "ready" - default: - return strings.ReplaceAll(stage, "_", " ") - } -} - -func shortID(id string) string { - if len(id) <= 12 { - return id - } - return id[:12] -} - -func relativeTime(t time.Time) string { - if t.IsZero() { - return "-" - } - delta := time.Since(t) - switch { - case delta < 30*time.Second: - return "moments ago" - case delta < time.Minute: - return 
fmt.Sprintf("%d seconds ago", int(delta.Seconds())) - case delta < 2*time.Minute: - return "1 minute ago" - case delta < time.Hour: - return fmt.Sprintf("%d minutes ago", int(delta.Minutes())) - case delta < 2*time.Hour: - return "1 hour ago" - case delta < 24*time.Hour: - return fmt.Sprintf("%d hours ago", int(delta.Hours())) - case delta < 48*time.Hour: - return "1 day ago" - case delta < 7*24*time.Hour: - return fmt.Sprintf("%d days ago", int(delta.Hours()/24)) - case delta < 14*24*time.Hour: - return "1 week ago" - default: - return fmt.Sprintf("%d weeks ago", int(delta.Hours()/(24*7))) - } +// formatVersionLine renders a buildinfo.Info as a single line — +// "banger v0.1.0 (commit abcd1234, built 2026-04-28T20:45:50Z)" — for +// the `--version` flag. Long commit strings are truncated to the +// first 8 hex chars so the line stays scannable. The verbose +// multi-line form lives on `banger version` for callers that want +// the full SHA / built_at on separate lines. +func formatVersionLine(info buildinfo.Info) string { + commit := info.Commit + if len(commit) > 8 { + commit = commit[:8] + } + return fmt.Sprintf("banger %s (commit %s, built %s)", info.Version, commit, info.BuiltAt) } diff --git a/internal/cli/bangerd.go b/internal/cli/bangerd.go index 13c55a1..c1d2867 100644 --- a/internal/cli/bangerd.go +++ b/internal/cli/bangerd.go @@ -1,20 +1,55 @@ package cli import ( + "errors" + "fmt" + "os" + "strings" + + "banger/internal/buildinfo" "banger/internal/daemon" + "banger/internal/paths" + "banger/internal/roothelper" + "banger/internal/store" "github.com/spf13/cobra" ) +// bangerdExit is var-injected so tests can capture the exit code +// without terminating the test process. Production points at os.Exit. 
+var bangerdExit = os.Exit + func NewBangerdCommand() *cobra.Command { + var systemMode bool + var rootHelperMode bool + var checkMigrations bool cmd := &cobra.Command{ Use: "bangerd", + Version: strings.Replace(formatVersionLine(buildinfo.Current()), "banger ", "bangerd ", 1), Short: "Run the banger daemon", SilenceUsage: true, SilenceErrors: true, Args: noArgsUsage("usage: bangerd"), RunE: func(cmd *cobra.Command, args []string) error { - d, err := daemon.Open(cmd.Context()) + if systemMode && rootHelperMode { + return errors.New("choose only one of --system or --root-helper") + } + if checkMigrations { + return runCheckMigrations(cmd, systemMode) + } + if rootHelperMode { + server, err := roothelper.Open() + if err != nil { + return err + } + defer server.Close() + return server.Serve(cmd.Context()) + } + open := daemon.Open + if systemMode { + open = daemon.OpenSystem + } + d, err := open(cmd.Context()) if err != nil { return err } @@ -22,6 +57,71 @@ func NewBangerdCommand() *cobra.Command { return d.Serve(cmd.Context()) }, } + cmd.Flags().BoolVar(&systemMode, "system", false, "run as the owner-user system service") + cmd.Flags().BoolVar(&rootHelperMode, "root-helper", false, "run as the privileged root helper service") + cmd.Flags().BoolVar(&checkMigrations, "check-migrations", false, "inspect the state DB and report whether this binary's schema matches; exit 0=compatible, 1=migrations needed, 2=incompatible") + cmd.SetVersionTemplate("{{.Version}}\n") cmd.CompletionOptions.DisableDefaultCmd = true return cmd } + +// runCheckMigrations is the entry point for `bangerd --check-migrations`. +// Used by `banger update` to gate a binary swap on a staged binary +// before service restart: if the staged binary doesn't recognise the +// running install's schema, the swap is aborted before any host state +// changes. 
+//
+// Exit codes are part of the contract:
+//
+//	0 — compatible (no migrations to apply on Open)
+//	1 — migrations needed (binary newer than DB; safe to swap)
+//	2 — incompatible (DB has migrations this binary doesn't know;
+//	    swapping would leave the daemon unable to open the store)
+func runCheckMigrations(cmd *cobra.Command, systemMode bool) error {
+	layout := paths.ResolveSystem()
+	if !systemMode {
+		userLayout, err := paths.Resolve()
+		if err != nil {
+			return err
+		}
+		layout = userLayout
+	}
+	state, err := store.InspectSchemaState(layout.DBPath)
+	if err != nil {
+		return fmt.Errorf("inspect %s: %w", layout.DBPath, err)
+	}
+	out := cmd.OutOrStdout()
+	switch state.Compatibility {
+	case store.SchemaCompatible:
+		fmt.Fprintf(out, "compatible: db at v%d, binary knows up to v%d\n", lastID(state.AppliedIDs), state.KnownMaxID)
+		return nil
+	case store.SchemaMigrationsNeeded:
+		fmt.Fprintf(out, "migrations needed: pending %v (binary will apply on first Open)\n", state.Pending)
+		// Distinct exit code so callers can tell "safe to swap, will
+		// auto-migrate" apart from "compatible, no work pending".
+		// Returning a cobra error would also exit non-zero, but we
+		// need the specific code 1, and the error path would print
+		// the message a second time on top of the line above.
+		bangerdExit(1)
+		return nil
+	case store.SchemaIncompatible:
+		fmt.Fprintf(out, "incompatible: db has unknown migrations %v (binary knows up to v%d)\n", state.Unknown, state.KnownMaxID)
+		bangerdExit(2)
+		return nil
+	default:
+		return fmt.Errorf("unexpected schema-state classification %d", state.Compatibility)
+	}
+}
+
+// lastID returns the largest int in xs, or 0 when empty. The schema-
+// migrations table doesn't guarantee insert order, so we scan rather
+// than trusting xs[len-1].
+func lastID(xs []int) int {
+	max := 0
+	for _, x := range xs {
+		if x > max {
+			max = x
+		}
+	}
+	return max
+}
diff --git a/internal/cli/bangerd_test.go b/internal/cli/bangerd_test.go
new file mode 100644
index 0000000..fa60b76
--- /dev/null
+++ b/internal/cli/bangerd_test.go
@@ -0,0 +1,194 @@
+package cli
+
+import (
+	"bytes"
+	"database/sql"
+	"os"
+	"path/filepath"
+	"strings"
+	"testing"
+
+	"banger/internal/store"
+
+	"github.com/spf13/cobra"
+	_ "modernc.org/sqlite"
+)
+
+func TestNewBangerdCommandFlags(t *testing.T) {
+	cmd := NewBangerdCommand()
+	if cmd.Use != "bangerd" {
+		t.Errorf("Use = %q, want bangerd", cmd.Use)
+	}
+	for _, flag := range []string{"system", "root-helper", "check-migrations"} {
+		if cmd.Flag(flag) == nil {
+			t.Errorf("flag %q missing", flag)
+		}
+	}
+}
+
+func TestLastID(t *testing.T) {
+	tests := []struct {
+		name string
+		in   []int
+		want int
+	}{
+		{"nil", nil, 0},
+		{"empty", []int{}, 0},
+		{"single", []int{7}, 7},
+		{"sorted ascending", []int{1, 2, 3}, 3},
+		{"unsorted, max in middle", []int{1, 99, 5}, 99},
+		{"duplicates", []int{4, 4, 2, 4}, 4},
+		{"negative ignored", []int{-3, -1, 0}, 0},
+	}
+	for _, tc := range tests {
+		t.Run(tc.name, func(t *testing.T) {
+			if got := lastID(tc.in); got != tc.want {
+				t.Fatalf("lastID(%v) = %d, want %d", tc.in, got, tc.want)
+			}
+		})
+	}
+}
+
+// stubExit replaces bangerdExit for the duration of the test and
+// returns a pointer to the captured exit code (-1 = not called); the
+// original function is restored via t.Cleanup.
+func stubExit(t *testing.T) *int {
+	t.Helper()
+	called := -1
+	prev := bangerdExit
+	bangerdExit = func(code int) { called = code }
+	t.Cleanup(func() { bangerdExit = prev })
+	return &called
+}
+
+// pointHomeAtTempDB sets XDG_STATE_HOME (and HOME, which Resolve falls
+// back to) so that paths.Resolve().DBPath lands at
+// $XDG_STATE_HOME/banger/state.db. Returns the DB path.
+func pointHomeAtTempDB(t *testing.T) string {
+	t.Helper()
+	tmp := t.TempDir()
+	t.Setenv("HOME", tmp)
+	t.Setenv("XDG_STATE_HOME", tmp)
+	t.Setenv("XDG_CONFIG_HOME", tmp)
+	t.Setenv("XDG_CACHE_HOME", tmp)
+	t.Setenv("XDG_RUNTIME_DIR", tmp)
+	dir := filepath.Join(tmp, "banger")
+	if err := os.MkdirAll(dir, 0o700); err != nil {
+		t.Fatalf("mkdir state dir: %v", err)
+	}
+	return filepath.Join(dir, "state.db")
+}
+
+func TestRunCheckMigrationsCompatible(t *testing.T) {
+	dbPath := pointHomeAtTempDB(t)
+	s, err := store.Open(dbPath)
+	if err != nil {
+		t.Fatalf("store.Open: %v", err)
+	}
+	_ = s.Close()
+
+	exit := stubExit(t)
+	cmd := &cobra.Command{}
+	var out bytes.Buffer
+	cmd.SetOut(&out)
+
+	if err := runCheckMigrations(cmd, false); err != nil {
+		t.Fatalf("runCheckMigrations: %v", err)
+	}
+	if *exit != -1 {
+		t.Errorf("bangerdExit called with %d, want no call", *exit)
+	}
+	if !strings.HasPrefix(out.String(), "compatible:") {
+		t.Errorf("stdout = %q, want prefix \"compatible:\"", out.String())
+	}
+}
+
+func TestRunCheckMigrationsMigrationsNeeded(t *testing.T) {
+	dbPath := pointHomeAtTempDB(t)
+	// Hand-craft a DB that has schema_migrations with only the baseline
+	// row — InspectSchemaState classifies this as "migrations needed".
+	dsn := "file:" + dbPath + "?_pragma=foreign_keys(1)"
+	db, err := sql.Open("sqlite", dsn)
+	if err != nil {
+		t.Fatalf("sql.Open: %v", err)
+	}
+	if _, err := db.Exec(`CREATE TABLE schema_migrations (id INTEGER PRIMARY KEY, name TEXT NOT NULL, applied_at TEXT NOT NULL)`); err != nil {
+		t.Fatalf("create table: %v", err)
+	}
+	if _, err := db.Exec(`INSERT INTO schema_migrations VALUES (1, 'baseline', '2026-01-01T00:00:00Z')`); err != nil {
+		t.Fatalf("insert baseline: %v", err)
+	}
+	_ = db.Close()
+
+	exit := stubExit(t)
+	cmd := &cobra.Command{}
+	var out bytes.Buffer
+	cmd.SetOut(&out)
+
+	if err := runCheckMigrations(cmd, false); err != nil {
+		t.Fatalf("runCheckMigrations: %v", err)
+	}
+	if *exit != 1 {
+		t.Errorf("bangerdExit called with %d, want 1", *exit)
+	}
+	if !strings.HasPrefix(out.String(), "migrations needed:") {
+		t.Errorf("stdout = %q, want prefix \"migrations needed:\"", out.String())
+	}
+}
+
+func TestRunCheckMigrationsIncompatible(t *testing.T) {
+	dbPath := pointHomeAtTempDB(t)
+	s, err := store.Open(dbPath)
+	if err != nil {
+		t.Fatalf("store.Open: %v", err)
+	}
+	_ = s.Close()
+
+	// Inject an unknown migration id directly so the binary's known set
+	// is a strict subset — InspectSchemaState classifies as incompatible.
+	dsn := "file:" + dbPath
+	db, err := sql.Open("sqlite", dsn)
+	if err != nil {
+		t.Fatalf("sql.Open: %v", err)
+	}
+	if _, err := db.Exec(`INSERT INTO schema_migrations VALUES (9999, 'from_the_future', '2030-01-01T00:00:00Z')`); err != nil {
+		t.Fatalf("insert future row: %v", err)
+	}
+	_ = db.Close()
+
+	exit := stubExit(t)
+	cmd := &cobra.Command{}
+	var out bytes.Buffer
+	cmd.SetOut(&out)
+
+	if err := runCheckMigrations(cmd, false); err != nil {
+		t.Fatalf("runCheckMigrations: %v", err)
+	}
+	if *exit != 2 {
+		t.Errorf("bangerdExit called with %d, want 2", *exit)
+	}
+	if !strings.HasPrefix(out.String(), "incompatible:") {
+		t.Errorf("stdout = %q, want prefix \"incompatible:\"", out.String())
+	}
+}
+
+func TestRunCheckMigrationsInspectError(t *testing.T) {
+	// Point at a state dir with a non-DB file at state.db so Inspect
+	// fails to open it. The function should wrap the error with the path.
+	dbPath := pointHomeAtTempDB(t)
+	if err := os.WriteFile(dbPath, []byte("not a sqlite file"), 0o600); err != nil {
+		t.Fatalf("write garbage: %v", err)
+	}
+
+	stubExit(t)
+	cmd := &cobra.Command{}
+	var out bytes.Buffer
+	cmd.SetOut(&out)
+
+	err := runCheckMigrations(cmd, false)
+	if err == nil {
+		t.Fatal("runCheckMigrations: nil error, want wrapped inspect error")
+	}
+	if !strings.Contains(err.Error(), dbPath) {
+		t.Errorf("error %q does not mention DB path %q", err.Error(), dbPath)
+	}
+}
diff --git a/internal/cli/cli_test.go b/internal/cli/cli_test.go
index 5083811..f39a962 100644
--- a/internal/cli/cli_test.go
+++ b/internal/cli/cli_test.go
@@ -15,8 +15,13 @@ import (
 	"time"
 
 	"banger/internal/api"
+	"banger/internal/buildinfo"
+	"banger/internal/daemon/workspace"
 	"banger/internal/model"
 	"banger/internal/system"
+	"banger/internal/toolingplan"
+
+	"github.com/spf13/cobra"
 )
 
 func TestNewBangerCommandHasExpectedSubcommands(t *testing.T) {
@@ -25,27 +30,90 @@
 	for _, sub := range cmd.Commands() {
 		names = append(names, sub.Name())
 	}
-	want := []string{"daemon", "doctor", "image", "internal", "vm"}
+	want := []string{"daemon", "doctor", "image", "internal", "kernel", "ps", "ssh-config", "system", "update", "version", "vm"}
 	if !reflect.DeepEqual(names, want) {
 		t.Fatalf("subcommands = %v, want %v", names, want)
 	}
 }
 
-func TestLegacyRemovedCommandIsRejected(t *testing.T) {
+func TestVersionCommandPrintsBuildInfo(t *testing.T) {
 	cmd := NewBangerCommand()
-	cmd.SetArgs([]string{"tui"})
-	err := cmd.Execute()
-	if err == nil || !strings.Contains(err.Error(), "unknown command \"tui\"") {
-		t.Fatalf("Execute() error = %v, want unknown legacy command", err)
+	var stdout bytes.Buffer
+	cmd.SetOut(&stdout)
+	cmd.SetErr(&stdout)
+	cmd.SetArgs([]string{"version"})
+
+	if err := cmd.Execute(); err != nil {
+		t.Fatalf("Execute: %v", err)
+	}
+
+	info := buildinfo.Current()
+	output := stdout.String()
+	for _, want := range []string{
+		"version: " + info.Version,
+		"commit: " + info.Commit,
+		"built_at: " + info.BuiltAt,
+	} {
+		if !strings.Contains(output, want) {
+			t.Fatalf("output = %q, want %q", output, want)
+		}
+	}
+}
+
+func TestImageCommandIncludesPull(t *testing.T) {
+	cmd := NewBangerCommand()
+	var image *cobra.Command
+	for _, sub := range cmd.Commands() {
+		if sub.Name() == "image" {
+			image = sub
+			break
+		}
+	}
+	if image == nil {
+		t.Fatalf("image command missing from root")
+	}
+	hasPull := false
+	for _, sub := range image.Commands() {
+		if sub.Name() == "pull" {
+			hasPull = true
+			if flag := sub.Flags().Lookup("kernel-ref"); flag == nil {
+				t.Errorf("image pull missing --kernel-ref flag")
+			}
+			if flag := sub.Flags().Lookup("size"); flag == nil {
+				t.Errorf("image pull missing --size flag")
+			}
+		}
+	}
+	if !hasPull {
+		t.Fatalf("image pull subcommand missing")
+	}
+}
+
+func TestKernelCommandExposesSubcommands(t *testing.T) {
+	cmd := NewBangerCommand()
+	var kernel *cobra.Command
+	for _, sub := range cmd.Commands() {
+		if sub.Name() == "kernel" {
+			kernel = sub
+			break
+		}
+	}
+	if kernel == nil {
+		t.Fatalf("kernel command missing from root")
+	}
+	names := []string{}
+	for _, sub := range kernel.Commands() {
+		names = append(names, sub.Name())
+	}
+	want := []string{"import", "list", "pull", "rm", "show"}
+	if !reflect.DeepEqual(names, want) {
+		t.Fatalf("kernel subcommands = %v, want %v", names, want)
 	}
 }
 
 func TestDoctorCommandPrintsReportAndFailsOnHardFailures(t *testing.T) {
-	original := doctorFunc
-	t.Cleanup(func() {
-		doctorFunc = original
-	})
-	doctorFunc = func(context.Context) (system.Report, error) {
+	d := defaultDeps()
+	d.doctor = func(context.Context) (system.Report, error) {
 		return system.Report{
 			Checks: []system.CheckResult{
 				{Name: "runtime bundle", Status: system.CheckStatusPass, Details: []string{"runtime dir /tmp/runtime"}},
@@ -54,7 +122,7 @@
 		}, nil
 	}
 
-	cmd := NewBangerCommand()
+	cmd := d.newRootCommand()
 	var stdout bytes.Buffer
 	cmd.SetOut(&stdout)
 	cmd.SetErr(&stdout)
@@ -65,24 +133,24 @@
 		t.Fatalf("Execute() error = %v, want doctor failure", err)
 	}
 	output := stdout.String()
-	if !strings.Contains(output, "PASS\truntime bundle") {
-		t.Fatalf("output = %q, want runtime bundle pass", output)
+	if strings.Contains(output, "PASS\truntime bundle") {
+		t.Fatalf("output = %q, brief default should hide PASS rows", output)
 	}
 	if !strings.Contains(output, "FAIL\tfeature nat") {
 		t.Fatalf("output = %q, want feature nat fail", output)
 	}
+	if !strings.Contains(output, "1 passed, 0 warnings, 1 failure") {
+		t.Fatalf("output = %q, want summary footer", output)
+	}
 }
 
 func TestDoctorCommandReturnsUnderlyingError(t *testing.T) {
-	original := doctorFunc
-	t.Cleanup(func() {
-		doctorFunc = original
-	})
-	doctorFunc = func(context.Context) (system.Report, error) {
+	d := defaultDeps()
+	d.doctor = func(context.Context) (system.Report, error) {
 		return system.Report{}, errors.New("load failed")
 	}
 
-	cmd := NewBangerCommand()
+	cmd := d.newRootCommand()
 	cmd.SetArgs([]string{"doctor"})
 	err := cmd.Execute()
 	if err == nil || !strings.Contains(err.Error(), "load failed") {
@@ -111,22 +179,45 @@ func TestInternalNATFlagsExist(t *testing.T) {
 	}
 }
 
-func TestInternalPackagesCommandSupportsAlpine(t *testing.T) {
-	cmd := NewBangerCommand()
-	var stdout bytes.Buffer
-	cmd.SetOut(&stdout)
-	cmd.SetArgs([]string{"internal", "packages", "alpine"})
-
-	if err := cmd.Execute(); err != nil {
-		t.Fatalf("Execute(): %v", err)
+func TestPSAndVMListAliasesAndFlagsExist(t *testing.T) {
+	root := NewBangerCommand()
+	ps, _, err := root.Find([]string{"ps"})
+	if err != nil {
+		t.Fatalf("find ps: %v", err)
 	}
-
-	output := stdout.String()
-	for _, want := range []string{"alpine-base", "docker", "libgcc", "libstdc++", "mkinitfs", "openssh"} {
-		if !strings.Contains(output, want+"\n") {
-			t.Fatalf("output = %q, want package %q", output, want)
+	for _, flagName := range []string{"all", "latest", "quiet"} {
+		if ps.Flags().Lookup(flagName) == nil {
+			t.Fatalf("missing ps flag %q", flagName)
 		}
 	}
+	vm, _, err := root.Find([]string{"vm"})
+	if err != nil {
+		t.Fatalf("find vm: %v", err)
+	}
+	list, _, err := vm.Find([]string{"list"})
+	if err != nil {
+		t.Fatalf("find list: %v", err)
+	}
+	if _, _, err := vm.Find([]string{"ls"}); err != nil {
+		t.Fatalf("find ls alias: %v", err)
+	}
+	if _, _, err := vm.Find([]string{"ps"}); err != nil {
+		t.Fatalf("find ps alias: %v", err)
+	}
+	for _, flagName := range []string{"all", "latest", "quiet"} {
+		if list.Flags().Lookup(flagName) == nil {
+			t.Fatalf("missing vm list flag %q", flagName)
+		}
+	}
+}
+
+func TestPSCommandRejectsArgs(t *testing.T) {
+	cmd := NewBangerCommand()
+	cmd.SetArgs([]string{"ps", "extra"})
+	err := cmd.Execute()
+	if err == nil || !strings.Contains(err.Error(), "usage: banger ps") {
+		t.Fatalf("Execute() error = %v, want ps usage error", err)
+	}
 }
 
 func TestVMCreateFlagsExist(t *testing.T) {
@@ -166,7 +257,11 @@
 	}
 }
 
-func TestVMCreateFlagsShowStaticDefaults(t *testing.T) {
+func TestVMCreateFlagsShowResolvedDefaults(t *testing.T) {
+	// Defaults are resolved at command-build time from config + host
+	// heuristics. Guarantee only that the values are sensible-positive
+	// and match the resolver's output — the exact numbers depend on
+	// the host the tests run on.
 	root := NewBangerCommand()
 	vm, _, err := root.Find([]string{"vm"})
 	if err != nil {
@@ -177,17 +272,23 @@
 		t.Fatalf("find create: %v", err)
 	}
 
-	if got := create.Flags().Lookup("vcpu").DefValue; got != fmt.Sprintf("%d", model.DefaultVCPUCount) {
-		t.Fatalf("vcpu default = %q, want %d", got, model.DefaultVCPUCount)
+	for _, flagName := range []string{"vcpu", "memory"} {
+		flag := create.Flags().Lookup(flagName)
+		if flag == nil {
+			t.Fatalf("flag %q missing", flagName)
+		}
+		if flag.DefValue == "" || flag.DefValue == "0" {
+			t.Errorf("flag %q default = %q, want a positive integer", flagName, flag.DefValue)
+		}
 	}
-	if got := create.Flags().Lookup("memory").DefValue; got != fmt.Sprintf("%d", model.DefaultMemoryMiB) {
-		t.Fatalf("memory default = %q, want %d", got, model.DefaultMemoryMiB)
-	}
-	if got := create.Flags().Lookup("system-overlay-size").DefValue; got != model.FormatSizeBytes(model.DefaultSystemOverlaySize) {
-		t.Fatalf("system-overlay-size default = %q, want %q", got, model.FormatSizeBytes(model.DefaultSystemOverlaySize))
-	}
-	if got := create.Flags().Lookup("disk-size").DefValue; got != model.FormatSizeBytes(model.DefaultWorkDiskSize) {
-		t.Fatalf("disk-size default = %q, want %q", got, model.FormatSizeBytes(model.DefaultWorkDiskSize))
+	for _, flagName := range []string{"system-overlay-size", "disk-size"} {
+		flag := create.Flags().Lookup(flagName)
+		if flag == nil {
+			t.Fatalf("flag %q missing", flagName)
+		}
+		if !strings.ContainsAny(flag.DefValue, "GMK") {
+			t.Errorf("flag %q default = %q, want a formatted size like '8G'", flagName, flag.DefValue)
+		}
 	}
 }
 
@@ -211,7 +312,7 @@ func TestImageRegisterFlagsExist(t *testing.T) {
 	if err != nil {
 		t.Fatalf("find register: %v", err)
 	}
-	for _, flagName := range []string{"name", "rootfs", "work-seed", "kernel", "initrd", "modules", "docker"} {
+	for _, flagName := range []string{"name", "rootfs", "work-seed", "kernel", "initrd", "modules"} {
 		if register.Flags().Lookup(flagName) == nil {
 			t.Fatalf("missing flag %q", flagName)
 		}
@@ -283,7 +384,11 @@
 	}
 }
 
-func TestVMCreateParamsFromFlagsOmitsStaticDefaultsWhenFlagsAreUnchanged(t *testing.T) {
+func TestVMCreateParamsFromFlagsAlwaysPopulatesResolvedValues(t *testing.T) {
+	// Post-resolver behavior: the CLI is the single source of truth for
+	// effective defaults. Whether or not the user changed a flag, the
+	// daemon receives the explicit value so the spec printed to the
+	// user matches the VM that gets created.
 	cmd := NewBangerCommand()
 	vm, _, err := cmd.Find([]string{"vm"})
 	if err != nil {
@@ -298,18 +403,87 @@
 		create,
 		"devbox",
 		"default",
-		model.DefaultVCPUCount,
-		model.DefaultMemoryMiB,
-		model.FormatSizeBytes(model.DefaultSystemOverlaySize),
-		model.FormatSizeBytes(model.DefaultWorkDiskSize),
+		3,
+		4096,
+		"10G",
+		"20G",
 		false,
 		false,
 	)
 	if err != nil {
 		t.Fatalf("vmCreateParamsFromFlags: %v", err)
 	}
-	if params.VCPUCount != nil || params.MemoryMiB != nil || params.SystemOverlaySize != "" || params.WorkDiskSize != "" {
-		t.Fatalf("expected unchanged defaults to stay omitted: %+v", params)
+	if params.VCPUCount == nil || *params.VCPUCount != 3 {
+		t.Errorf("VCPUCount = %v, want 3", params.VCPUCount)
+	}
+	if params.MemoryMiB == nil || *params.MemoryMiB != 4096 {
+		t.Errorf("MemoryMiB = %v, want 4096", params.MemoryMiB)
+	}
+	if params.SystemOverlaySize != "10G" {
+		t.Errorf("SystemOverlaySize = %q, want 10G", params.SystemOverlaySize)
+	}
+	if params.WorkDiskSize != "20G" {
+		t.Errorf("WorkDiskSize = %q, want 20G", params.WorkDiskSize)
+	}
+}
+
+func TestVMCreateParamsFromFlagsRejectsNonPositive(t *testing.T) {
+	cmd := NewBangerCommand()
+	vm, _, err := cmd.Find([]string{"vm"})
+	if err != nil {
+		t.Fatalf("find vm: %v", err)
+	}
+	create, _, err := vm.Find([]string{"create"})
+	if err != nil {
+		t.Fatalf("find create: %v", err)
+	}
+
+	if _, err := vmCreateParamsFromFlags(create, "x", "", 0, 1024, "8G", "8G", false, false); err == nil {
+		t.Error("expected error for vcpu=0")
+	}
+	if _, err := vmCreateParamsFromFlags(create, "x", "", 2, 0, "8G", "8G", false, false); err == nil {
+		t.Error("expected error for memory=0")
+	}
+}
+
+func TestVMCreateParamsFromFlagsRejectsInvalidName(t *testing.T) {
+	cmd := NewBangerCommand()
+	vm, _, err := cmd.Find([]string{"vm"})
+	if err != nil {
+		t.Fatalf("find vm: %v", err)
+	}
+	create, _, err := vm.Find([]string{"create"})
+	if err != nil {
+		t.Fatalf("find create: %v", err)
+	}
+
+	// A sampling of failure modes; the exhaustive character-class
+	// matrix lives in internal/model/vm_name_test.go. Here we just
+	// prove the CLI wires the validator in and surfaces its errors
+	// before any RPC call is made.
+	cases := []struct {
+		name  string
+		input string
+	}{
+		{"space", "my box"},
+		{"uppercase", "MyBox"},
+		{"dot", "box.vm"},
+		{"leading hyphen", "-box"},
+		{"newline", "my\nbox"},
+	}
+	for _, tc := range cases {
+		t.Run(tc.name, func(t *testing.T) {
+			if _, err := vmCreateParamsFromFlags(create, tc.input, "", 2, 1024, "8G", "8G", false, false); err == nil {
+				t.Fatalf("vmCreateParamsFromFlags(%q) = nil error, want rejection", tc.input)
+			}
+		})
+	}
+
+	// Empty name must STILL be accepted at the CLI layer — the daemon
+	// generates one when the flag is unset. Rejecting here would
+	// break `banger vm create` with no --name.
+	if _, err := vmCreateParamsFromFlags(create, "", "", 2, 1024, "8G", "8G", false, false); err != nil {
+		t.Fatalf("vmCreateParamsFromFlags(empty name) = %v, want nil (daemon generates)", err)
 	}
 }
 
@@ -364,14 +538,7 @@ func TestVMCreateParamsFromFlagsRejectsNonPositiveCPUAndMemory(t *testing.T) {
 }
 
 func TestRunVMCreatePollsUntilDone(t *testing.T) {
-	origBegin := vmCreateBeginFunc
-	origStatus := vmCreateStatusFunc
-	origCancel := vmCreateCancelFunc
-	t.Cleanup(func() {
-		vmCreateBeginFunc = origBegin
-		vmCreateStatusFunc = origStatus
-		vmCreateCancelFunc = origCancel
-	})
+	d := defaultDeps()
 
 	vm := model.VMRecord{
 		ID: "vm-id",
@@ -383,7 +550,7 @@
 			DNSName: "devbox.vm",
 		},
 	}
-	vmCreateBeginFunc = func(context.Context, string, api.VMCreateParams) (api.VMCreateBeginResult, error) {
+	d.vmCreateBegin = func(context.Context, string, api.VMCreateParams) (api.VMCreateBeginResult, error) {
 		return api.VMCreateBeginResult{
 			Operation: api.VMCreateOperation{
 				ID: "op-1",
@@ -393,14 +560,14 @@ func TestRunVMCreatePollsUntilDone(t *testing.T) {
 		}, nil
 	}
 	statusCalls := 0
-	vmCreateStatusFunc = func(context.Context, string, string) (api.VMCreateStatusResult, error) {
+	d.vmCreateStatus = func(context.Context, string, string) (api.VMCreateStatusResult, error) {
 		statusCalls++
 		if statusCalls == 1 {
 			return api.VMCreateStatusResult{
 				Operation: api.VMCreateOperation{
 					ID: "op-1",
-					Stage: "wait_opencode",
-					Detail: "waiting for opencode on guest port 4096",
+					Stage: "wait_vsock_agent",
+					Detail: "waiting for guest vsock agent",
 				},
 			}, nil
 		}
@@ -415,14 +582,15 @@
 			},
 		}, nil
 	}
-	vmCreateCancelFunc = func(context.Context, string, string) error {
+	d.vmCreateCancel = func(context.Context, string, string) error {
 		t.Fatal("cancel should not be called")
 		return nil
 	}
 
-	got, err := runVMCreate(context.Background(), "/tmp/bangerd.sock", &bytes.Buffer{}, api.VMCreateParams{Name: "devbox"})
+	var stderr bytes.Buffer
+	got, err := d.runVMCreate(context.Background(), "/tmp/bangerd.sock", &stderr, api.VMCreateParams{Name: "devbox"}, false)
 	if err != nil {
-		t.Fatalf("runVMCreate: %v", err)
+		t.Fatalf("d.runVMCreate: %v", err)
 	}
 	if got.Name != vm.Name || got.Runtime.GuestIP != vm.Runtime.GuestIP {
 		t.Fatalf("vm = %+v, want %+v", got, vm)
@@ -430,6 +598,27 @@
 	if statusCalls != 2 {
 		t.Fatalf("statusCalls = %d, want 2", statusCalls)
 	}
+	if !strings.Contains(stderr.String(), "[vm create] ready in ") {
+		t.Fatalf("stderr missing elapsed line:\n%s", stderr.String())
+	}
+}
+
+func TestFormatVMCreateElapsed(t *testing.T) {
+	cases := []struct {
+		in   time.Duration
+		want string
+	}{
+		{350 * time.Millisecond, "350ms"},
+		{4*time.Second + 700*time.Millisecond, "4.7s"},
+		{59*time.Second + 900*time.Millisecond, "59.9s"},
+		{62 * time.Second, "1m02s"},
+		{2*time.Minute + 5*time.Second, "2m05s"},
+	}
+	for _, tc := range cases {
+		if got := formatVMCreateElapsed(tc.in); got != tc.want {
+			t.Errorf("formatVMCreateElapsed(%s) = %q, want %q", tc.in, got, tc.want)
+		}
+	}
 }
 
 func TestVMCreateProgressRendererSuppressesDuplicateLines(t *testing.T) {
@@ -438,7 +627,7 @@
 	renderer.render(api.VMCreateOperation{Stage: "prepare_work_disk", Detail: "cloning work seed"})
 	renderer.render(api.VMCreateOperation{Stage: "prepare_work_disk", Detail: "cloning work seed"})
-	renderer.render(api.VMCreateOperation{Stage: "wait_opencode", Detail: "waiting for opencode on guest port 4096"})
+	renderer.render(api.VMCreateOperation{Stage: "wait_vsock_agent", Detail: "waiting for guest vsock agent"})
 
 	lines := strings.Split(strings.TrimSpace(stderr.String()), "\n")
 	if len(lines) != 2 {
@@ -447,11 +636,119 @@
 	if lines[0] != "[vm create] preparing work disk: cloning work seed" {
 		t.Fatalf("first line = %q", lines[0])
 	}
-	if lines[1] != "[vm create] waiting for opencode: waiting for opencode on guest port 4096" {
+	if lines[1] != "[vm create] waiting for vsock agent: waiting for guest vsock agent" {
 		t.Fatalf("second line = %q", lines[1])
 	}
 }
 
+func TestVMRunProgressRendererSuppressesDuplicateLines(t *testing.T) {
+	var stderr bytes.Buffer
+	renderer := newVMRunProgressRenderer(&stderr, true)
+
+	renderer.render("waiting for guest ssh")
+	renderer.render("waiting for guest ssh")
+	renderer.render("overlaying host working tree")
+
+	lines := strings.Split(strings.TrimSpace(stderr.String()), "\n")
+	if len(lines) != 2 {
+		t.Fatalf("rendered lines = %q, want 2 lines", stderr.String())
+	}
+	if lines[0] != "[vm run] waiting for guest ssh" {
+		t.Fatalf("first line = %q", lines[0])
+	}
+	if lines[1] != "[vm run] overlaying host working tree" {
+		t.Fatalf("second line = %q", lines[1])
+	}
+}
+
+// TestVMRunProgressRendererInlineRewrites covers the TTY default: each
+// render call rewrites the same line via \r + clear-to-EOL instead of
+// emitting a newline, so the user sees one moving status line until
+// commitLine / clear / the caller's own newline closes it out.
+func TestVMRunProgressRendererInlineRewrites(t *testing.T) {
+	var stderr bytes.Buffer
+	renderer := &vmRunProgressRenderer{out: &stderr, enabled: true, inline: true}
+
+	renderer.render("waiting for guest ssh")
+	renderer.render("preparing guest workspace")
+	renderer.commitLine("vm devbox running; reconnect with: banger vm ssh devbox")
+
+	got := stderr.String()
+	wantPrefix := "\r\x1b[K[vm run] waiting for guest ssh" +
+		"\r\x1b[K[vm run] preparing guest workspace" +
+		"\r\x1b[K[vm run] vm devbox running; reconnect with: banger vm ssh devbox\n"
+	if got != wantPrefix {
+		t.Fatalf("inline output = %q, want %q", got, wantPrefix)
+	}
+}
+
+// TestVMRunProgressRendererClearWipesActiveLine guards the path used
+// before sshExec/runSSHSession: clear() must erase the live inline
+// line so the next writer (the ssh session, a warning, the user's
+// command output) starts from column 0 without a trailing status.
+func TestVMRunProgressRendererClearWipesActiveLine(t *testing.T) {
+	var stderr bytes.Buffer
+	renderer := &vmRunProgressRenderer{out: &stderr, enabled: true, inline: true}
+
+	renderer.render("attaching to guest")
+	renderer.clear()
+	// clear() on an already-cleared renderer is a no-op (active=false).
+	renderer.clear()
+
+	got := stderr.String()
+	want := "\r\x1b[K[vm run] attaching to guest\r\x1b[K"
+	if got != want {
+		t.Fatalf("after clear stderr = %q, want %q", got, want)
+	}
+}
+
+// TestVMCreateProgressRendererInlineRewrites mirrors the vm_run inline
+// test for the create-side renderer so both progress paths stay in
+// sync if either is touched in isolation.
+func TestVMCreateProgressRendererInlineRewrites(t *testing.T) {
+	var stderr bytes.Buffer
+	renderer := &vmCreateProgressRenderer{out: &stderr, enabled: true, inline: true}
+
+	renderer.render(api.VMCreateOperation{Stage: "prepare_work_disk", Detail: "cloning work seed"})
+	renderer.render(api.VMCreateOperation{Stage: "wait_vsock_agent", Detail: "waiting for guest vsock agent"})
+	renderer.clear()
+
+	got := stderr.String()
+	want := "\r\x1b[K[vm create] preparing work disk: cloning work seed" +
+		"\r\x1b[K[vm create] waiting for vsock agent: waiting for guest vsock agent" +
+		"\r\x1b[K"
+	if got != want {
+		t.Fatalf("inline output = %q, want %q", got, want)
+	}
+}
+
+func TestWithHeartbeatNoOpForNonTTY(t *testing.T) {
+	var buf bytes.Buffer
+	called := false
+	err := withHeartbeat(&buf, "image pull", func() error {
+		called = true
+		return nil
+	})
+	if err != nil {
+		t.Fatalf("withHeartbeat: %v", err)
+	}
+	if !called {
+		t.Fatal("fn should have been called")
+	}
+	if buf.Len() != 0 {
+		t.Fatalf("stderr = %q, want empty for non-TTY", buf.String())
+	}
+}
+
+func TestWithHeartbeatPropagatesError(t *testing.T) {
+	sentinel := errors.New("boom")
+	var buf bytes.Buffer
+	err := withHeartbeat(&buf, "image pull", func() error { return sentinel })
+	if !errors.Is(err, sentinel) {
+		t.Fatalf("withHeartbeat error = %v, want %v", err, sentinel)
+	}
+}
+
 func TestVMSetParamsFromFlagsConflict(t *testing.T) {
 	if _, err := vmSetParamsFromFlags("devbox", -1, -1, "", true, true); err == nil {
 		t.Fatal("expected nat conflict error")
@@ -504,6 +801,50 @@ func TestAbsolutizeImageRegisterPaths(t *testing.T) {
 	}
 }
 
+func TestAbsolutizePaths(t *testing.T) {
+	tmp := t.TempDir()
+	wd, err := os.Getwd()
+	if err != nil {
+		t.Fatalf("Getwd: %v", err)
+	}
+	if err := os.Chdir(tmp); err != nil {
+		t.Fatalf("Chdir: %v", err)
+	}
+	t.Cleanup(func() { _ = os.Chdir(wd) })
+
+	empty := ""
+	abs := "/already/absolute/path"
+	rel1 := filepath.Join("a", "b")
+	rel2 := "./c/d"
+
+	if err := absolutizePaths(&empty, &abs, &rel1, &rel2); err != nil {
+		t.Fatalf("absolutizePaths: %v", err)
+	}
+
+	if empty != "" {
+		t.Errorf("empty value mutated: %q", empty)
+	}
+	if abs != "/already/absolute/path" {
+		t.Errorf("absolute value mutated: %q", abs)
+	}
+	if !filepath.IsAbs(rel1) {
+		t.Errorf("rel1 not absolutized: %q", rel1)
+	}
+	if !filepath.IsAbs(rel2) {
+		t.Errorf("rel2 not absolutized: %q", rel2)
+	}
+	// Sanity: relative paths should land under tmp.
+	if !strings.HasPrefix(rel1, tmp) {
+		t.Errorf("rel1 = %q, want prefix %q", rel1, tmp)
+	}
+}
+
+func TestAbsolutizePathsNoArgs(t *testing.T) {
+	if err := absolutizePaths(); err != nil {
+		t.Fatalf("absolutizePaths() with no args: %v", err)
+	}
+}
+
 func TestPrintImageListTableShowsRootfsSizes(t *testing.T) {
 	rootfs := filepath.Join(t.TempDir(), "rootfs.ext4")
 	if err := os.WriteFile(rootfs, nil, 0o644); err != nil {
@@ -549,6 +890,49 @@
 	}
 }
 
+func TestSelectVMListVMsDefaultsToRunning(t *testing.T) {
+	now := time.Now()
+	vms := []model.VMRecord{
+		{ID: "running-1", State: model.VMStateRunning, CreatedAt: now.Add(-3 * time.Hour)},
+		{ID: "stopped-1", State: model.VMStateStopped, CreatedAt: now.Add(-2 * time.Hour)},
+		{ID: "running-2", State: model.VMStateRunning, CreatedAt: now.Add(-1 * time.Hour)},
+	}
+	got := selectVMListVMs(vms, false, false)
+	if len(got) != 2 || got[0].ID != "running-1" || got[1].ID != "running-2" {
+		t.Fatalf("selectVMListVMs() = %#v, want only running VMs in original order", got)
+	}
+}
+
+func TestSelectVMListVMsLatestUsesFilteredSet(t *testing.T) {
+	now := time.Now()
+	vms := []model.VMRecord{
+		{ID: "running-old", State: model.VMStateRunning, CreatedAt: now.Add(-3 * time.Hour)},
+		{ID: "stopped-new", State: model.VMStateStopped, CreatedAt: now.Add(-30 * time.Minute)},
+		{ID: "running-new", State: model.VMStateRunning, CreatedAt: now.Add(-1 * time.Hour)},
+	}
+	got := selectVMListVMs(vms, false, true)
+	if len(got) != 1 || got[0].ID != "running-new" {
+		t.Fatalf("selectVMListVMs(default latest) = %#v, want latest running VM", got)
+	}
+	got = selectVMListVMs(vms, true, true)
+	if len(got) != 1 || got[0].ID != "stopped-new" {
+		t.Fatalf("selectVMListVMs(all latest) = %#v, want latest VM across all states", got)
+	}
+}
+
+func TestPrintVMIDListShowsFullIDs(t *testing.T) {
+	var out bytes.Buffer
+	err := printVMIDList(&out, []model.VMRecord{{ID: "0123456789abcdef0123456789abcdef"}, {ID: "fedcba9876543210fedcba9876543210"}})
+	if err != nil {
+		t.Fatalf("printVMIDList() error = %v", err)
+	}
+	lines := strings.Split(strings.TrimSpace(out.String()), "\n")
+	want := []string{"0123456789abcdef0123456789abcdef", "fedcba9876543210fedcba9876543210"}
+	if !reflect.DeepEqual(lines, want) {
+		t.Fatalf("lines = %v, want %v", lines, want)
+	}
+}
+
 func TestPrintVMListTableShowsImageNames(t *testing.T) {
 	var out bytes.Buffer
 	err := printVMListTable(&out, []model.VMRecord{
@@ -643,23 +1027,18 @@ func TestPrintVMPortsTableSortsAndRendersURLEndpoints(t *testing.T) {
 	}
 }
 
 func TestRunSSHSessionPrintsReminderWhenHealthCheckPasses(t *testing.T) {
-	origSSHExec := sshExecFunc
-	origHealth := vmHealthFunc
-	t.Cleanup(func() {
-		sshExecFunc = origSSHExec
-		vmHealthFunc = origHealth
-	})
+	d := defaultDeps()
 
-	sshExecFunc = func(ctx context.Context, stdin io.Reader, stdout, stderr io.Writer, args []string) error {
+	d.sshExec = func(ctx context.Context, stdin io.Reader, stdout, stderr io.Writer, args []string) error {
 		return nil
 	}
-	vmHealthFunc = func(ctx context.Context, socketPath, idOrName string) (api.VMHealthResult, error) {
+	d.vmHealth = func(ctx context.Context, socketPath, idOrName string) (api.VMHealthResult, error) {
 		return api.VMHealthResult{Name: "devbox", Healthy: true}, nil
 	}
 
 	var stderr bytes.Buffer
-	if err := runSSHSession(context.Background(), "/tmp/bangerd.sock", "devbox", strings.NewReader(""), &bytes.Buffer{}, &stderr, []string{"root@127.0.0.1"}); err != nil {
-		t.Fatalf("runSSHSession: %v", err)
+	if err := d.runSSHSession(context.Background(), "/tmp/bangerd.sock", "devbox", strings.NewReader(""), &bytes.Buffer{}, &stderr, []string{"root@127.0.0.1"}, false); err != nil {
+		t.Fatalf("d.runSSHSession: %v", err)
 	}
 	if !strings.Contains(stderr.String(), "devbox is still running") {
 		t.Fatalf("stderr = %q, want reminder", stderr.String())
@@ -667,25 +1046,20 @@
 	}
 }
 
 func TestRunSSHSessionPreservesSSHExitStatusOnHealthWarning(t *testing.T) {
-	origSSHExec := sshExecFunc
-	origHealth := vmHealthFunc
-	t.Cleanup(func() {
-		sshExecFunc = origSSHExec
-		vmHealthFunc = origHealth
-	})
+	d := defaultDeps()
 
-	sshExecFunc = func(ctx context.Context, stdin io.Reader, stdout, stderr io.Writer, args []string) error {
+	d.sshExec = func(ctx context.Context, stdin io.Reader, stdout, stderr io.Writer, args []string) error {
 		return exitErrorWithCode(t, 1)
 	}
-	vmHealthFunc = func(ctx context.Context, socketPath, idOrName string) (api.VMHealthResult, error) {
+	d.vmHealth = func(ctx context.Context, socketPath, idOrName string) (api.VMHealthResult, error) {
 		return api.VMHealthResult{}, errors.New("dial failed")
 	}
 
 	var stderr bytes.Buffer
-	err := runSSHSession(context.Background(), "/tmp/bangerd.sock", "devbox", strings.NewReader(""), &bytes.Buffer{}, &stderr, []string{"root@127.0.0.1"})
+	err := d.runSSHSession(context.Background(), "/tmp/bangerd.sock", "devbox", strings.NewReader(""), &bytes.Buffer{}, &stderr, []string{"root@127.0.0.1"}, false)
 	var exitErr *exec.ExitError
 	if !errors.As(err, &exitErr) {
-		t.Fatalf("runSSHSession error = %v, want exit error", err)
+		t.Fatalf("d.runSSHSession error = %v, want exit error", err)
 	}
 	if !strings.Contains(stderr.String(), "failed to check whether devbox is still running") {
 		t.Fatalf("stderr = %q, want warning", stderr.String())
@@ -693,27 +1067,22 @@
 	}
 }
 
 func TestRunSSHSessionSkipsReminderOnSSHAuthFailure(t *testing.T) {
-	origSSHExec := sshExecFunc
-	origHealth := vmHealthFunc
-	t.Cleanup(func() {
-		sshExecFunc = origSSHExec
-		vmHealthFunc = origHealth
-	})
+	d := defaultDeps()
 
 	healthCalled := false
-	sshExecFunc = func(ctx context.Context, stdin io.Reader, stdout, stderr io.Writer, args []string) error {
+	d.sshExec = func(ctx context.Context, stdin io.Reader, stdout, stderr io.Writer, args []string) error {
 		return exitErrorWithCode(t, 255)
 	}
-	vmHealthFunc = func(ctx context.Context, socketPath, idOrName string) (api.VMHealthResult, error) {
+	d.vmHealth = func(ctx context.Context, socketPath, idOrName string) (api.VMHealthResult, error) {
 		healthCalled = true
 		return api.VMHealthResult{Name: "devbox", Healthy: true}, nil
 	}
 
 	var stderr bytes.Buffer
-	err := runSSHSession(context.Background(), "/tmp/bangerd.sock", "devbox", strings.NewReader(""), &bytes.Buffer{}, &stderr, []string{"root@127.0.0.1"})
+	err := d.runSSHSession(context.Background(), "/tmp/bangerd.sock", "devbox", strings.NewReader(""), &bytes.Buffer{}, &stderr, []string{"root@127.0.0.1"}, false)
 	var exitErr *exec.ExitError
 	if !errors.As(err, &exitErr) || exitErr.ExitCode() != 255 {
-		t.Fatalf("runSSHSession error = %v, want exit 255", err)
+		t.Fatalf("d.runSSHSession error = %v, want exit 255", err)
 	}
 	if healthCalled {
 		t.Fatal("vm health should not run after ssh auth failure")
@@ -815,25 +1184,65 @@
 	}
 }
 
 func TestSSHCommandArgs(t *testing.T) {
+	// sshCommandArgs wires banger's own known_hosts into the shell
+	// SSH invocation — never /dev/null. Assert the shape and the
+	// posture rather than the exact path (which is host-XDG-derived).
args, err := sshCommandArgs(model.DaemonConfig{SSHKeyPath: "/bundle/id_ed25519"}, "172.16.0.2", []string{"--", "uname", "-a"}) if err != nil { t.Fatalf("sshCommandArgs: %v", err) } - want := []string{ + + wantSubstrings := []string{ "-F", "/dev/null", "-i", "/bundle/id_ed25519", "-o", "IdentitiesOnly=yes", - "-o", "BatchMode=yes", - "-o", "PreferredAuthentications=publickey", "-o", "PasswordAuthentication=no", "-o", "KbdInteractiveAuthentication=no", - "-o", "StrictHostKeyChecking=no", - "-o", "UserKnownHostsFile=/dev/null", "root@172.16.0.2", - "--", "uname", "-a", } - if !reflect.DeepEqual(args, want) { - t.Fatalf("args = %v, want %v", args, want) + for _, s := range wantSubstrings { + found := false + for _, a := range args { + if a == s { + found = true + break + } + } + if !found { + t.Errorf("args missing %q: %v", s, args) + } + } + + // The trailing argument is the user's command, shell-quoted and + // joined so ssh(1)'s space-concatenation produces the exact argv + // the user typed on the remote shell. Without this, multi-word + // args like `sh -c 'exit 42'` re-tokenise on the remote and lose + // exit codes. + if got, want := args[len(args)-1], `'--' 'uname' '-a'`; got != want { + t.Errorf("trailing arg = %q, want %q (ssh needs a single shell-quoted string)", got, want) + } + + // Host-key verification posture: accept-new + a real path into + // banger state, not /dev/null. + joined := strings.Join(args, " ") + if !strings.Contains(joined, "StrictHostKeyChecking=accept-new") { + t.Errorf("args missing accept-new posture: %v", args) + } + if strings.Contains(joined, "UserKnownHostsFile=/dev/null") { + t.Errorf("args leaked UserKnownHostsFile=/dev/null: %v", args) + } + if strings.Contains(joined, "StrictHostKeyChecking=no") { + t.Errorf("args leaked StrictHostKeyChecking=no: %v", args) + } + // Must reference a known_hosts file ending in "known_hosts". 
+ sawKnownHosts := false + for _, a := range args { + if strings.HasPrefix(a, "UserKnownHostsFile=") && strings.HasSuffix(a, "known_hosts") { + sawKnownHosts = true + } + } + if !sawKnownHosts { + t.Errorf("args missing UserKnownHostsFile=: %v", args) } } @@ -867,139 +1276,141 @@ func TestValidateSSHPrereqsFailsForMissingKey(t *testing.T) { } } -func TestResolveVMRunSourcePathDefaultsToCWD(t *testing.T) { - origCWD := cwdFunc - t.Cleanup(func() { - cwdFunc = origCWD - }) +// CLI-side git inspection moved to internal/daemon/workspace; the +// CLI now runs only a minimal preflight. Those tests live in the +// workspace package. What we still guard here is the preflight +// policy: reject submodules before the VM is created so the user +// gets a fast error instead of an orphaned VM. - want := t.TempDir() - cwdFunc = func() (string, error) { - return want, nil - } - - got, err := resolveVMRunSourcePath("") - if err != nil { - t.Fatalf("resolveVMRunSourcePath: %v", err) - } - if got != want { - t.Fatalf("resolveVMRunSourcePath() = %q, want %q", got, want) - } -} - -func TestInspectVMRunRepoUsesRepoRootAndOverlayPaths(t *testing.T) { - if _, err := exec.LookPath("git"); err != nil { - t.Skip("git not installed") - } - - repoRoot := t.TempDir() - testRunGit(t, repoRoot, "init") - testRunGit(t, repoRoot, "config", "user.email", "test@example.com") - testRunGit(t, repoRoot, "config", "user.name", "Banger Test") - - if err := os.MkdirAll(filepath.Join(repoRoot, "dir"), 0o755); err != nil { - t.Fatalf("MkdirAll(dir): %v", err) - } - if err := os.WriteFile(filepath.Join(repoRoot, ".gitignore"), []byte("ignored.txt\n"), 0o644); err != nil { - t.Fatalf("WriteFile(.gitignore): %v", err) - } - if err := os.WriteFile(filepath.Join(repoRoot, "tracked.txt"), []byte("tracked\n"), 0o644); err != nil { - t.Fatalf("WriteFile(tracked.txt): %v", err) - } - if err := os.WriteFile(filepath.Join(repoRoot, "dir", "keep.txt"), []byte("keep\n"), 0o644); err != nil { - 
t.Fatalf("WriteFile(keep.txt): %v", err) - } - testRunGit(t, repoRoot, "add", ".") - testRunGit(t, repoRoot, "commit", "-m", "init") - testRunGit(t, repoRoot, "checkout", "-b", "trunk") - - if err := os.WriteFile(filepath.Join(repoRoot, "tracked.txt"), []byte("tracked local\n"), 0o644); err != nil { - t.Fatalf("WriteFile(tracked.txt local): %v", err) - } - if err := os.WriteFile(filepath.Join(repoRoot, "untracked.txt"), []byte("untracked\n"), 0o644); err != nil { - t.Fatalf("WriteFile(untracked.txt): %v", err) - } - if err := os.WriteFile(filepath.Join(repoRoot, "ignored.txt"), []byte("ignored\n"), 0o644); err != nil { - t.Fatalf("WriteFile(ignored.txt): %v", err) - } - - spec, err := inspectVMRunRepo(context.Background(), filepath.Join(repoRoot, "dir"), "", "HEAD") - if err != nil { - t.Fatalf("inspectVMRunRepo: %v", err) - } - - if spec.RepoRoot != repoRoot { - t.Fatalf("RepoRoot = %q, want %q", spec.RepoRoot, repoRoot) - } - if spec.RepoName != filepath.Base(repoRoot) { - t.Fatalf("RepoName = %q, want %q", spec.RepoName, filepath.Base(repoRoot)) - } - if spec.CurrentBranch != "trunk" { - t.Fatalf("CurrentBranch = %q, want trunk", spec.CurrentBranch) - } - if spec.HeadCommit == "" { - t.Fatal("HeadCommit should not be empty") - } - if spec.BaseCommit != spec.HeadCommit { - t.Fatalf("BaseCommit = %q, want head %q", spec.BaseCommit, spec.HeadCommit) - } - wantOverlay := []string{".gitignore", "dir/keep.txt", "tracked.txt", "untracked.txt"} - if !reflect.DeepEqual(spec.OverlayPaths, wantOverlay) { - t.Fatalf("OverlayPaths = %v, want %v", spec.OverlayPaths, wantOverlay) - } -} - -func TestInspectVMRunRepoRejectsSubmodules(t *testing.T) { +func TestVMRunPreflightRejectsSubmodules(t *testing.T) { + d := defaultDeps() repoRoot := t.TempDir() - origHostCommandOutput := hostCommandOutputFunc - t.Cleanup(func() { - hostCommandOutputFunc = origHostCommandOutput - }) - - hostCommandOutputFunc = func(ctx context.Context, name string, args ...string) ([]byte, error) { - 
t.Helper() - if name != "git" { - t.Fatalf("command = %q, want git", name) - } - switch { - case reflect.DeepEqual(args, []string{"-C", repoRoot, "rev-parse", "--show-toplevel"}): - return []byte(repoRoot + "\n"), nil - case reflect.DeepEqual(args, []string{"-C", repoRoot, "rev-parse", "--is-bare-repository"}): - return []byte("false\n"), nil - case reflect.DeepEqual(args, []string{"-C", repoRoot, "ls-files", "--stage", "-z"}): - return []byte("160000 deadbeef 0\tvendor/submodule\x00"), nil - default: - t.Fatalf("unexpected git args: %v", args) - return nil, nil - } + // Stub the CLI's repo-inspector with a scripted runner. Per-deps + // injection keeps this test free of package globals, so t.Parallel + // is safe to add here in the future without racing another test's + // fake runner. + d.repoInspector = &workspace.Inspector{ + Runner: func(ctx context.Context, name string, args ...string) ([]byte, error) { + t.Helper() + if name != "git" { + t.Fatalf("command = %q, want git", name) + } + switch { + case reflect.DeepEqual(args, []string{"-C", repoRoot, "rev-parse", "--show-toplevel"}): + return []byte(repoRoot + "\n"), nil + case reflect.DeepEqual(args, []string{"-C", repoRoot, "rev-parse", "--is-bare-repository"}): + return []byte("false\n"), nil + case reflect.DeepEqual(args, []string{"-C", repoRoot, "ls-files", "--stage", "-z"}): + return []byte("160000 deadbeef 0\tvendor/submodule\x00"), nil + default: + t.Fatalf("unexpected git args: %v", args) + return nil, nil + } + }, } - _, err := inspectVMRunRepo(context.Background(), repoRoot, "", "HEAD") + _, err := d.vmRunPreflightRepo(context.Background(), repoRoot) if err == nil || !strings.Contains(err.Error(), "submodules") { - t.Fatalf("inspectVMRunRepo() error = %v, want submodule rejection", err) + t.Fatalf("d.vmRunPreflightRepo() error = %v, want submodule rejection", err) } } -func TestRunVMRunCreatesImportsAndAttaches(t *testing.T) { +func TestRunVMRunWorkspacePreparesAndAttaches(t *testing.T) { + d := 
defaultDeps() repoRoot := t.TempDir() - origBegin := vmCreateBeginFunc - origStatus := vmCreateStatusFunc - origCancel := vmCreateCancelFunc - origWaitForSSH := guestWaitForSSHFunc - origGuestDial := guestDialFunc - origHostCommandOutput := hostCommandOutputFunc - origOpencodeExec := opencodeExecFunc - t.Cleanup(func() { - vmCreateBeginFunc = origBegin - vmCreateStatusFunc = origStatus - vmCreateCancelFunc = origCancel - guestWaitForSSHFunc = origWaitForSSH - guestDialFunc = origGuestDial - hostCommandOutputFunc = origHostCommandOutput - opencodeExecFunc = origOpencodeExec - }) + vm := model.VMRecord{ + ID: "vm-id", + Name: "devbox", + Runtime: model.VMRuntime{ + State: model.VMStateRunning, + GuestIP: "172.16.0.2", + DNSName: "devbox.vm", + }, + } + d.vmCreateBegin = func(context.Context, string, api.VMCreateParams) (api.VMCreateBeginResult, error) { + return api.VMCreateBeginResult{ + Operation: api.VMCreateOperation{ + ID: "op-1", Stage: "ready", Detail: "vm is ready", + Done: true, Success: true, VM: &vm, + }, + }, nil + } + d.vmCreateStatus = func(context.Context, string, string) (api.VMCreateStatusResult, error) { + t.Fatal("d.vmCreateStatus should not be called") + return api.VMCreateStatusResult{}, nil + } + d.vmCreateCancel = func(context.Context, string, string) error { + t.Fatal("d.vmCreateCancel should not be called") + return nil + } + + fakeClient := &testVMRunGuestClient{} + d.guestWaitForSSH = func(ctx context.Context, address, privateKeyPath string, interval time.Duration) error { + return nil + } + d.guestDial = func(ctx context.Context, address, privateKeyPath string) (vmRunGuestClient, error) { + return fakeClient, nil + } + var workspaceParams api.VMWorkspacePrepareParams + d.vmWorkspacePrepare = func(ctx context.Context, socketPath string, params api.VMWorkspacePrepareParams) (api.VMWorkspacePrepareResult, error) { + workspaceParams = params + return api.VMWorkspacePrepareResult{Workspace: model.WorkspacePrepareResult{VMID: vm.ID, GuestPath: 
"/root/repo", RepoName: "repo", RepoRoot: "/tmp/repo"}}, nil + } + d.buildVMRunToolingPlan = func(context.Context, string) toolingplan.Plan { + return toolingplan.Plan{ + RepoManagedTools: []string{"go"}, + Steps: []toolingplan.InstallStep{{Tool: "go", Version: "1.25.0", Source: "go.mod"}}, + } + } + var sshArgsSeen []string + d.sshExec = func(ctx context.Context, stdin io.Reader, stdout, stderr io.Writer, args []string) error { + sshArgsSeen = args + return nil + } + d.vmHealth = func(context.Context, string, string) (api.VMHealthResult, error) { + return api.VMHealthResult{Name: "devbox", Healthy: false}, nil + } + + repo := vmRunRepo{sourcePath: repoRoot} + var stdout, stderr bytes.Buffer + err := d.runVMRun( + context.Background(), + "/tmp/bangerd.sock", + model.DaemonConfig{SSHKeyPath: "/tmp/id_ed25519"}, + strings.NewReader(""), + &stdout, &stderr, + api.VMCreateParams{Name: "devbox"}, + &repo, + nil, + false, + false, + false, + false, + ) + if err != nil { + t.Fatalf("d.runVMRun: %v", err) + } + if workspaceParams.IDOrName != "devbox" || workspaceParams.SourcePath != repoRoot { + t.Fatalf("workspaceParams = %+v", workspaceParams) + } + if len(fakeClient.uploads) != 1 { + t.Fatalf("uploads = %d, want tooling harness upload", len(fakeClient.uploads)) + } + if !fakeClient.closed { + t.Fatal("guest client should be closed after tooling bootstrap") + } + if len(sshArgsSeen) == 0 || sshArgsSeen[len(sshArgsSeen)-1] != "root@172.16.0.2" { + t.Fatalf("sshArgsSeen = %v, want interactive ssh to 172.16.0.2 (no trailing command)", sshArgsSeen) + } + if got := stdout.String(); strings.Contains(got, "VM ready.") { + t.Fatalf("stdout = %q, want no next-steps block", got) + } +} + +func TestVMRunPrintsPostCreateProgress(t *testing.T) { + d := defaultDeps() vm := model.VMRecord{ ID: "vm-id", @@ -1009,133 +1420,486 @@ func TestRunVMRunCreatesImportsAndAttaches(t *testing.T) { GuestIP: "172.16.0.2", }, } - vmCreateBeginFunc = func(context.Context, string, api.VMCreateParams) 
(api.VMCreateBeginResult, error) { + d.vmCreateBegin = func(context.Context, string, api.VMCreateParams) (api.VMCreateBeginResult, error) { return api.VMCreateBeginResult{ Operation: api.VMCreateOperation{ - ID: "op-1", - Stage: "ready", - Detail: "vm is ready", - Done: true, - Success: true, - VM: &vm, + ID: "op-1", Stage: "ready", Detail: "vm is ready", + Done: true, Success: true, VM: &vm, }, }, nil } - vmCreateStatusFunc = func(context.Context, string, string) (api.VMCreateStatusResult, error) { - t.Fatal("vmCreateStatusFunc should not be called") + d.vmCreateStatus = func(context.Context, string, string) (api.VMCreateStatusResult, error) { + t.Fatal("d.vmCreateStatus should not be called") return api.VMCreateStatusResult{}, nil } - vmCreateCancelFunc = func(context.Context, string, string) error { - t.Fatal("vmCreateCancelFunc should not be called") + d.vmCreateCancel = func(context.Context, string, string) error { + t.Fatal("d.vmCreateCancel should not be called") return nil } + d.guestWaitForSSH = func(ctx context.Context, address, privateKeyPath string, interval time.Duration) error { + return nil + } + d.guestDial = func(ctx context.Context, address, privateKeyPath string) (vmRunGuestClient, error) { + return &testVMRunGuestClient{}, nil + } + d.vmWorkspacePrepare = func(ctx context.Context, socketPath string, params api.VMWorkspacePrepareParams) (api.VMWorkspacePrepareResult, error) { + return api.VMWorkspacePrepareResult{Workspace: model.WorkspacePrepareResult{VMID: vm.ID, GuestPath: "/root/repo", RepoName: "repo", RepoRoot: "/tmp/repo"}}, nil + } + d.sshExec = func(context.Context, io.Reader, io.Writer, io.Writer, []string) error { + return nil + } + d.vmHealth = func(context.Context, string, string) (api.VMHealthResult, error) { + return api.VMHealthResult{Name: "devbox", Healthy: false}, nil + } - fakeClient := &testVMRunGuestClient{} - waitAddress := "" - waitKeyPath := "" - waitInterval := time.Duration(0) - guestWaitForSSHFunc = func(ctx 
context.Context, address, privateKeyPath string, interval time.Duration) error { - waitAddress = address - waitKeyPath = privateKeyPath - waitInterval = interval - return nil - } - dialAddress := "" - dialKeyPath := "" - guestDialFunc = func(ctx context.Context, address, privateKeyPath string) (vmRunGuestClient, error) { - dialAddress = address - dialKeyPath = privateKeyPath - return fakeClient, nil - } - hostCommandOutputFunc = func(ctx context.Context, name string, args ...string) ([]byte, error) { - if name != "git" { - t.Fatalf("command = %q, want git", name) - } - if len(args) < 7 || args[0] != "-C" || args[1] != repoRoot || args[2] != "bundle" || args[3] != "create" || args[5] != "--all" { - t.Fatalf("unexpected bundle args: %v", args) - } - if !reflect.DeepEqual(args[6:], []string{"deadbeef", "cafebabe"}) { - t.Fatalf("bundle revs = %v, want deadbeef/cafebabe", args[6:]) - } - if err := os.WriteFile(args[4], []byte("bundle-data"), 0o600); err != nil { - t.Fatalf("WriteFile(bundle): %v", err) - } - return nil, nil - } - var attachArgs []string - opencodeExecFunc = func(ctx context.Context, stdin io.Reader, stdout, stderr io.Writer, args []string) error { - attachArgs = append([]string(nil), args...) 
- return nil - } - - spec := vmRunRepoSpec{ - RepoRoot: repoRoot, - RepoName: "repo", - HeadCommit: "deadbeef", - CurrentBranch: "main", - BranchName: "feature", - BaseCommit: "cafebabe", - OverlayPaths: []string{"tracked.txt", "nested/keep.txt"}, - } - err := runVMRun( + repo := vmRunRepo{sourcePath: t.TempDir()} + var stdout, stderr bytes.Buffer + err := d.runVMRun( context.Background(), "/tmp/bangerd.sock", model.DaemonConfig{SSHKeyPath: "/tmp/id_ed25519"}, strings.NewReader(""), - &bytes.Buffer{}, - &bytes.Buffer{}, + &stdout, &stderr, api.VMCreateParams{Name: "devbox"}, - spec, + &repo, + nil, + false, + false, + false, + false, ) if err != nil { - t.Fatalf("runVMRun: %v", err) + t.Fatalf("d.runVMRun: %v", err) } - if waitAddress != "172.16.0.2:22" { - t.Fatalf("waitAddress = %q, want 172.16.0.2:22", waitAddress) + output := stderr.String() + for _, want := range []string{ + "[vm run] waiting for guest ssh", + "[vm run] preparing guest workspace", + "[vm run] starting guest tooling bootstrap", + "[vm run] guest tooling log: /root/.cache/banger/vm-run-tooling-repo.log", + "[vm run] attaching to guest", + } { + if !strings.Contains(output, want) { + t.Fatalf("stderr = %q, want %q", output, want) + } } - if waitKeyPath != "/tmp/id_ed25519" { - t.Fatalf("waitKeyPath = %q, want /tmp/id_ed25519", waitKeyPath) + if strings.Contains(output, "[vm run] printing next steps") { + t.Fatalf("stderr = %q, should not print next-steps progress", output) } - if waitInterval <= 0 { - t.Fatalf("waitInterval = %s, want positive interval", waitInterval) +} + +func TestRunVMRunWarnsWhenToolingHarnessStartFails(t *testing.T) { + d := defaultDeps() + + vm := model.VMRecord{ + ID: "vm-id", + Name: "devbox", + Runtime: model.VMRuntime{ + State: model.VMStateRunning, + GuestIP: "172.16.0.2", + }, } - if dialAddress != waitAddress { - t.Fatalf("dialAddress = %q, want %q", dialAddress, waitAddress) + d.vmCreateBegin = func(context.Context, string, api.VMCreateParams) 
(api.VMCreateBeginResult, error) { + return api.VMCreateBeginResult{Operation: api.VMCreateOperation{ID: "op-1", Stage: "ready", Detail: "vm is ready", Done: true, Success: true, VM: &vm}}, nil } - if dialKeyPath != waitKeyPath { - t.Fatalf("dialKeyPath = %q, want %q", dialKeyPath, waitKeyPath) + d.vmCreateStatus = func(context.Context, string, string) (api.VMCreateStatusResult, error) { + t.Fatal("d.vmCreateStatus should not be called") + return api.VMCreateStatusResult{}, nil } - if fakeClient.uploadPath != vmRunGuestBundlePath { - t.Fatalf("uploadPath = %q, want %q", fakeClient.uploadPath, vmRunGuestBundlePath) + d.vmCreateCancel = func(context.Context, string, string) error { + t.Fatal("d.vmCreateCancel should not be called") + return nil } - if fakeClient.uploadMode != 0o600 { - t.Fatalf("uploadMode = %v, want 0600", fakeClient.uploadMode) + d.guestWaitForSSH = func(ctx context.Context, address, privateKeyPath string, interval time.Duration) error { + return nil } - if string(fakeClient.uploadData) != "bundle-data" { - t.Fatalf("uploadData = %q, want bundle-data", string(fakeClient.uploadData)) + fakeClient := &testVMRunGuestClient{launchErr: errors.New("launch failed")} + d.guestDial = func(ctx context.Context, address, privateKeyPath string) (vmRunGuestClient, error) { + return fakeClient, nil } - if !strings.Contains(fakeClient.script, `git clone "$BUNDLE" "$DIR"`) { - t.Fatalf("script = %q, want clone command", fakeClient.script) + d.vmWorkspacePrepare = func(ctx context.Context, socketPath string, params api.VMWorkspacePrepareParams) (api.VMWorkspacePrepareResult, error) { + return api.VMWorkspacePrepareResult{Workspace: model.WorkspacePrepareResult{VMID: vm.ID, GuestPath: "/root/repo", RepoName: "repo", RepoRoot: "/tmp/repo"}}, nil } - if !strings.Contains(fakeClient.script, `git -C "$DIR" checkout -B 'feature' 'cafebabe'`) { - t.Fatalf("script = %q, want guest branch checkout", fakeClient.script) + sshExecCalls := 0 + d.sshExec = func(context.Context, 
io.Reader, io.Writer, io.Writer, []string) error { + sshExecCalls++ + return nil } - if fakeClient.streamSourceDir != repoRoot { - t.Fatalf("streamSourceDir = %q, want %q", fakeClient.streamSourceDir, repoRoot) + d.vmHealth = func(context.Context, string, string) (api.VMHealthResult, error) { + return api.VMHealthResult{Healthy: false}, nil } - if !reflect.DeepEqual(fakeClient.streamEntries, spec.OverlayPaths) { - t.Fatalf("streamEntries = %v, want %v", fakeClient.streamEntries, spec.OverlayPaths) + + repo := vmRunRepo{sourcePath: t.TempDir()} + var stdout, stderr bytes.Buffer + err := d.runVMRun( + context.Background(), + "/tmp/bangerd.sock", + model.DaemonConfig{SSHKeyPath: "/tmp/id_ed25519"}, + strings.NewReader(""), + &stdout, &stderr, + api.VMCreateParams{Name: "devbox"}, + &repo, + nil, + false, + false, + false, + false, + ) + if err != nil { + t.Fatalf("d.runVMRun: %v", err) } - if fakeClient.streamCommand != "tar -C '/root/repo' --strip-components=1 -xf -" { - t.Fatalf("streamCommand = %q", fakeClient.streamCommand) + if !strings.Contains(stderr.String(), "[vm run] warning: guest tooling bootstrap start failed: launch guest tooling bootstrap") { + t.Fatalf("stderr = %q, want tooling bootstrap warning", stderr.String()) } - wantAttach := []string{"attach", "--dir", "/root/repo", "http://172.16.0.2:4096"} - if !reflect.DeepEqual(attachArgs, wantAttach) { - t.Fatalf("attachArgs = %v, want %v", attachArgs, wantAttach) + if sshExecCalls != 1 { + t.Fatalf("sshExec calls = %d, want 1 (interactive attach still runs)", sshExecCalls) } - if !fakeClient.closed { - t.Fatal("guest client should be closed") +} + +func TestRunVMRunBareModeSkipsWorkspaceAndTooling(t *testing.T) { + d := defaultDeps() + + vm := model.VMRecord{ + ID: "vm-id", Name: "bare", + Runtime: model.VMRuntime{State: model.VMStateRunning, GuestIP: "172.16.0.2"}, + } + d.vmCreateBegin = func(context.Context, string, api.VMCreateParams) (api.VMCreateBeginResult, error) { + return 
api.VMCreateBeginResult{Operation: api.VMCreateOperation{ID: "op-1", Stage: "ready", Done: true, Success: true, VM: &vm}}, nil + } + d.guestWaitForSSH = func(context.Context, string, string, time.Duration) error { return nil } + d.guestDial = func(context.Context, string, string) (vmRunGuestClient, error) { + t.Fatal("d.guestDial should not be called in bare mode") + return nil, nil + } + d.vmWorkspacePrepare = func(context.Context, string, api.VMWorkspacePrepareParams) (api.VMWorkspacePrepareResult, error) { + t.Fatal("d.vmWorkspacePrepare should not be called in bare mode") + return api.VMWorkspacePrepareResult{}, nil + } + sshExecCalls := 0 + d.sshExec = func(context.Context, io.Reader, io.Writer, io.Writer, []string) error { + sshExecCalls++ + return nil + } + d.vmHealth = func(context.Context, string, string) (api.VMHealthResult, error) { + return api.VMHealthResult{Healthy: false}, nil + } + + var stdout, stderr bytes.Buffer + err := d.runVMRun( + context.Background(), + "/tmp/bangerd.sock", + model.DaemonConfig{SSHKeyPath: "/tmp/id_ed25519"}, + strings.NewReader(""), + &stdout, &stderr, + api.VMCreateParams{Name: "bare"}, + nil, + nil, + false, + false, + false, + false, + ) + if err != nil { + t.Fatalf("d.runVMRun: %v", err) + } + if sshExecCalls != 1 { + t.Fatalf("sshExec calls = %d, want 1", sshExecCalls) + } + if !strings.Contains(stderr.String(), "[vm run] attaching to guest") { + t.Fatalf("stderr = %q, want attach progress", stderr.String()) + } +} + +func TestRunVMRunRMDeletesAfterSessionExits(t *testing.T) { + d := defaultDeps() + + vm := model.VMRecord{ + ID: "vm-id", Name: "tmpbox", + Runtime: model.VMRuntime{State: model.VMStateRunning, GuestIP: "172.16.0.2"}, + } + d.vmCreateBegin = func(context.Context, string, api.VMCreateParams) (api.VMCreateBeginResult, error) { + return api.VMCreateBeginResult{Operation: api.VMCreateOperation{ID: "op-1", Stage: "ready", Done: true, Success: true, VM: &vm}}, nil + } + d.guestWaitForSSH = func(context.Context, 
string, string, time.Duration) error { return nil } + d.sshExec = func(context.Context, io.Reader, io.Writer, io.Writer, []string) error { return nil } + d.vmHealth = func(context.Context, string, string) (api.VMHealthResult, error) { + return api.VMHealthResult{Healthy: false}, nil + } + deletedRef := "" + d.vmDelete = func(_ context.Context, _, idOrName string) error { + deletedRef = idOrName + return nil + } + + var stdout, stderr bytes.Buffer + err := d.runVMRun( + context.Background(), + "/tmp/bangerd.sock", + model.DaemonConfig{SSHKeyPath: "/tmp/id_ed25519"}, + strings.NewReader(""), + &stdout, &stderr, + api.VMCreateParams{Name: "tmpbox"}, + nil, + nil, + true, // --rm + false, + false, + false, + ) + if err != nil { + t.Fatalf("d.runVMRun: %v", err) + } + if deletedRef != "tmpbox" { + t.Fatalf("deletedRef = %q, want tmpbox", deletedRef) + } + // The "VM is still running" reminder would be misleading when + // the VM is about to be deleted; it must be suppressed. + if strings.Contains(stderr.String(), "is still running") { + t.Fatalf("stderr = %q, should not print still-running reminder under --rm", stderr.String()) + } +} + +func TestRunVMRunRMSkipsDeleteOnSSHWaitTimeout(t *testing.T) { + d := defaultDeps() + origTimeout := vmRunSSHTimeout + vmRunSSHTimeout = 50 * time.Millisecond + t.Cleanup(func() { + vmRunSSHTimeout = origTimeout + }) + + vm := model.VMRecord{ + ID: "vm-id", Name: "slowvm", + Runtime: model.VMRuntime{State: model.VMStateRunning, GuestIP: "172.16.0.2"}, + } + d.vmCreateBegin = func(context.Context, string, api.VMCreateParams) (api.VMCreateBeginResult, error) { + return api.VMCreateBeginResult{Operation: api.VMCreateOperation{ID: "op-1", Stage: "ready", Done: true, Success: true, VM: &vm}}, nil + } + d.guestWaitForSSH = func(ctx context.Context, _, _ string, _ time.Duration) error { + <-ctx.Done() + return ctx.Err() + } + deleteCalled := false + d.vmDelete = func(context.Context, string, string) error { + deleteCalled = true + return nil
+ } + + var stdout, stderr bytes.Buffer + err := d.runVMRun( + context.Background(), + "/tmp/bangerd.sock", + model.DaemonConfig{SSHKeyPath: "/tmp/id_ed25519"}, + strings.NewReader(""), + &stdout, &stderr, + api.VMCreateParams{Name: "slowvm"}, + nil, + nil, + true, // --rm + false, + false, + false, + ) + if err == nil { + t.Fatal("want timeout error") + } + if deleteCalled { + t.Fatal("VM should NOT be deleted on ssh-wait timeout even with --rm (keep for debugging)") + } +} + +func TestRunVMRunSSHTimeoutReturnsActionableError(t *testing.T) { + d := defaultDeps() + origTimeout := vmRunSSHTimeout + vmRunSSHTimeout = 50 * time.Millisecond + t.Cleanup(func() { + vmRunSSHTimeout = origTimeout + }) + + vm := model.VMRecord{ + ID: "vm-id", Name: "slowvm", + Runtime: model.VMRuntime{State: model.VMStateRunning, GuestIP: "172.16.0.2"}, + } + d.vmCreateBegin = func(context.Context, string, api.VMCreateParams) (api.VMCreateBeginResult, error) { + return api.VMCreateBeginResult{Operation: api.VMCreateOperation{ID: "op-1", Stage: "ready", Done: true, Success: true, VM: &vm}}, nil + } + // Simulate the guest never bringing sshd up — the wait-for-ssh + // child context fires its deadline, returning a DeadlineExceeded.
+ d.guestWaitForSSH = func(ctx context.Context, _, _ string, _ time.Duration) error { + <-ctx.Done() + return ctx.Err() + } + + var stdout, stderr bytes.Buffer + err := d.runVMRun( + context.Background(), + "/tmp/bangerd.sock", + model.DaemonConfig{SSHKeyPath: "/tmp/id_ed25519"}, + strings.NewReader(""), + &stdout, &stderr, + api.VMCreateParams{Name: "slowvm"}, + nil, + nil, + false, + false, + false, + false, + ) + if err == nil { + t.Fatal("want timeout error") + } + msg := err.Error() + for _, want := range []string{ + "slowvm", + "did not come up", + "banger vm logs slowvm", + "banger vm delete slowvm", + } { + if !strings.Contains(msg, want) { + t.Fatalf("err = %q, want contains %q", msg, want) + } + } +} + +func TestRunVMRunCommandModePropagatesExitCode(t *testing.T) { + d := defaultDeps() + + vm := model.VMRecord{ + ID: "vm-id", Name: "cmdbox", + Runtime: model.VMRuntime{State: model.VMStateRunning, GuestIP: "172.16.0.2"}, + } + d.vmCreateBegin = func(context.Context, string, api.VMCreateParams) (api.VMCreateBeginResult, error) { + return api.VMCreateBeginResult{Operation: api.VMCreateOperation{ID: "op-1", Stage: "ready", Done: true, Success: true, VM: &vm}}, nil + } + d.guestWaitForSSH = func(context.Context, string, string, time.Duration) error { return nil } + d.vmWorkspacePrepare = func(context.Context, string, api.VMWorkspacePrepareParams) (api.VMWorkspacePrepareResult, error) { + t.Fatal("workspace prepare should not run without spec") + return api.VMWorkspacePrepareResult{}, nil + } + var sshArgsSeen []string + d.sshExec = func(_ context.Context, _ io.Reader, _, _ io.Writer, args []string) error { + sshArgsSeen = args + return exitErrorWithCode(t, 7) + } + + var stdout, stderr bytes.Buffer + err := d.runVMRun( + context.Background(), + "/tmp/bangerd.sock", + model.DaemonConfig{SSHKeyPath: "/tmp/id_ed25519"}, + strings.NewReader(""), + &stdout, &stderr, + api.VMCreateParams{Name: "cmdbox"}, + nil, + []string{"false"}, + false, + false, + false, + 
false, + ) + var exitErr ExitCodeError + if !errors.As(err, &exitErr) || exitErr.Code != 7 { + t.Fatalf("d.runVMRun error = %v, want ExitCodeError{7}", err) + } + if len(sshArgsSeen) == 0 || sshArgsSeen[len(sshArgsSeen)-1] != "'false'" { + t.Fatalf("sshArgsSeen = %v, want trailing shell-quoted command 'false'", sshArgsSeen) + } + if !strings.Contains(stderr.String(), "[vm run] running command in guest") { + t.Fatalf("stderr = %q, want command progress", stderr.String()) + } +} + +func TestVMRunCommandRejectsBranchWithoutPath(t *testing.T) { + cmd := NewBangerCommand() + cmd.SetArgs([]string{"vm", "run", "--branch", "feat"}) + cmd.SetOut(&bytes.Buffer{}) + cmd.SetErr(&bytes.Buffer{}) + err := cmd.Execute() + if err == nil || !strings.Contains(err.Error(), "--branch requires a path") { + t.Fatalf("Execute() error = %v, want --branch requires a path", err) + } +} + +func TestSplitVMRunArgsPartitionsOnDash(t *testing.T) { + cases := []struct { + name string + argv []string + wantPath []string + wantCmd []string + }{ + {"empty", []string{}, []string{}, nil}, + {"path only", []string{"./repo"}, []string{"./repo"}, nil}, + {"cmd only", []string{"--", "make", "test"}, []string{}, []string{"make", "test"}}, + {"path and cmd", []string{"./repo", "--", "ls"}, []string{"./repo"}, []string{"ls"}}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + // Parse through cobra so ArgsLenAtDash is populated. 
+ var seenPath, seenCmd []string + root := &cobra.Command{Use: "root"} + run := &cobra.Command{ + Use: "run", + Args: cobra.ArbitraryArgs, + RunE: func(cmd *cobra.Command, args []string) error { + seenPath, seenCmd = splitVMRunArgs(cmd, args) + return nil + }, + } + root.AddCommand(run) + root.SetArgs(append([]string{"run"}, tc.argv...)) + root.SetOut(&bytes.Buffer{}) + root.SetErr(&bytes.Buffer{}) + if err := root.Execute(); err != nil { + t.Fatalf("execute: %v", err) + } + if len(seenPath) != len(tc.wantPath) { + t.Fatalf("path = %v, want %v", seenPath, tc.wantPath) + } + for i := range seenPath { + if seenPath[i] != tc.wantPath[i] { + t.Fatalf("path = %v, want %v", seenPath, tc.wantPath) + } + } + if len(seenCmd) != len(tc.wantCmd) { + t.Fatalf("cmd = %v, want %v", seenCmd, tc.wantCmd) + } + for i := range seenCmd { + if seenCmd[i] != tc.wantCmd[i] { + t.Fatalf("cmd = %v, want %v", seenCmd, tc.wantCmd) + } + } + }) + } +} + +func TestVMRunToolingHarnessScriptUsesMiseOnly(t *testing.T) { + script := vmRunToolingHarnessScript(toolingplan.Plan{ + RepoManagedTools: []string{"node"}, + Steps: []toolingplan.InstallStep{{Tool: "go", Version: "1.25.0", Source: "go.mod"}}, + Skips: []toolingplan.SkipNote{{Target: "python", Reason: "no .python-version"}}, + }) + + for _, want := range []string{ + `repo-managed mise tools: node`, + `run_best_effort "$MISE_BIN" install`, + `run_bounded_best_effort "$INSTALL_TIMEOUT_SECS" "$MISE_BIN" use -g --pin 'go@1.25.0'`, + `deterministic skip: python (no .python-version)`, + `run_best_effort "$MISE_BIN" reshim`, + } { + if !strings.Contains(script, want) { + t.Fatalf("script = %q, want %q", script, want) + } + } + for _, unwanted := range []string{`opencode run`, `PROMPT_FILE=`, `--format json`, `mimo-v2-pro-free`} { + if strings.Contains(script, unwanted) { + t.Fatalf("script = %q, want no %q", script, unwanted) + } + } +} + +func TestVMRunGuestDirIsFixed(t *testing.T) { + if got := vmRunGuestDir(); got != "/root/repo" { + 
t.Fatalf("vmRunGuestDir() = %q, want /root/repo", got) } } @@ -1147,54 +1911,7 @@ func TestNewBangerdCommandRejectsArgs(t *testing.T) { } } -func TestDaemonOutdated(t *testing.T) { - dir := t.TempDir() - current := filepath.Join(dir, "bangerd-current") - same := filepath.Join(dir, "bangerd-same") - stale := filepath.Join(dir, "bangerd-stale") - if err := os.WriteFile(current, []byte("current"), 0o755); err != nil { - t.Fatalf("write current: %v", err) - } - if err := os.Link(current, same); err != nil { - t.Fatalf("hard link: %v", err) - } - if err := os.WriteFile(stale, []byte("stale"), 0o755); err != nil { - t.Fatalf("write stale: %v", err) - } - - origBangerdPath := bangerdPathFunc - origDaemonExePath := daemonExePath - t.Cleanup(func() { - bangerdPathFunc = origBangerdPath - daemonExePath = origDaemonExePath - }) - - bangerdPathFunc = func() (string, error) { - return current, nil - } - daemonExePath = func(pid int) string { - if pid == 1 { - return same - } - return stale - } - - if daemonOutdated(1) { - t.Fatal("expected matching daemon executable to be current") - } - if !daemonOutdated(2) { - t.Fatal("expected replaced daemon executable to be outdated") - } -} - func TestDaemonStatusIncludesLogPathWhenStopped(t *testing.T) { - configHome := filepath.Join(t.TempDir(), "config") - stateHome := filepath.Join(t.TempDir(), "state") - runtimeHome := filepath.Join(t.TempDir(), "runtime") - t.Setenv("XDG_CONFIG_HOME", configHome) - t.Setenv("XDG_STATE_HOME", stateHome) - t.Setenv("XDG_RUNTIME_DIR", runtimeHome) - cmd := NewBangerCommand() var stdout bytes.Buffer cmd.SetOut(&stdout) @@ -1205,62 +1922,57 @@ func TestDaemonStatusIncludesLogPathWhenStopped(t *testing.T) { } output := stdout.String() - if !strings.Contains(output, "stopped\n") { - t.Fatalf("output = %q, want stopped status", output) - } - if !strings.Contains(output, "log: "+filepath.Join(stateHome, "banger", "bangerd.log")) { - t.Fatalf("output = %q, want daemon log path", output) - } - if 
!strings.Contains(output, "dns: 127.0.0.1:42069") { - t.Fatalf("output = %q, want dns listener", output) - } - if !strings.Contains(output, "web: http://127.0.0.1:7777") { - t.Fatalf("output = %q, want default web listener", output) + // Output is tabwriter-formatted (key TAB value, padded). Assert + // the key and value land on the same line rather than pinning a + // specific separator. + for _, want := range []string{ + "service", + "bangerd.service", + "/run/banger/bangerd.sock", + "journalctl -u bangerd.service", + } { + if !strings.Contains(output, want) { + t.Fatalf("output = %q, want %q", output, want) + } } } -func TestBuildDaemonCommandIsDetachedFromCallerContext(t *testing.T) { - cmd := buildDaemonCommand("/tmp/bangerd") +func TestDaemonStatusIncludesDaemonBuildInfoWhenRunning(t *testing.T) { + d := defaultDeps() - if cmd.Path != "/tmp/bangerd" { - t.Fatalf("command path = %q", cmd.Path) - } - if cmd.Cancel != nil { - t.Fatal("daemon process should not be tied to a CLI request context") - } -} - -func TestAbsolutizeImageBuildPaths(t *testing.T) { - dir := t.TempDir() - prev, err := os.Getwd() - if err != nil { - t.Fatalf("getwd: %v", err) - } - if err := os.Chdir(dir); err != nil { - t.Fatalf("chdir: %v", err) - } - t.Cleanup(func() { - _ = os.Chdir(prev) - }) - - params := api.ImageBuildParams{ - FromImage: "base-image", - KernelPath: "/kernel", - InitrdPath: "boot/initrd.img", - ModulesDir: "modules", - } - if err := absolutizeImageBuildPaths(¶ms); err != nil { - t.Fatalf("absolutizeImageBuildPaths: %v", err) + d.daemonPing = func(context.Context, string) (api.PingResult, error) { + return api.PingResult{ + Status: "ok", + PID: 42, + Version: "v1.2.3", + Commit: "abc123", + BuiltAt: "2026-03-22T12:00:00Z", + }, nil } - want := api.ImageBuildParams{ - FromImage: "base-image", - KernelPath: "/kernel", - InitrdPath: filepath.Join(dir, "boot/initrd.img"), - ModulesDir: filepath.Join(dir, "modules"), + cmd := d.newRootCommand() + var stdout bytes.Buffer + 
cmd.SetOut(&stdout) + cmd.SetErr(&stdout) + cmd.SetArgs([]string{"daemon", "status"}) + if err := cmd.Execute(); err != nil { + t.Fatalf("Execute: %v", err) } - if !reflect.DeepEqual(params, want) { - t.Fatalf("params = %+v, want %+v", params, want) + + output := stdout.String() + for _, want := range []string{ + "service", + "bangerd.service", + "/run/banger/bangerd.sock", + "journalctl -u bangerd.service", + "42", + "v1.2.3", + "abc123", + "2026-03-22T12:00:00Z", + } { + if !strings.Contains(output, want) { + t.Fatalf("output = %q, want %q", output, want) + } } } @@ -1278,12 +1990,26 @@ func testRunGit(t *testing.T, dir string, args ...string) string { return string(output) } +type testVMRunUpload struct { + path string + mode os.FileMode + data []byte +} + type testVMRunGuestClient struct { closed bool + uploads []testVMRunUpload uploadPath string uploadMode os.FileMode uploadData []byte + uploadErr error + checkoutErr error + launchErr error script string + launchScript string + runScriptCalls int + tarSourceDir string + tarCommand string streamSourceDir string streamEntries []string streamCommand string @@ -1295,15 +2021,32 @@ func (c *testVMRunGuestClient) Close() error { } func (c *testVMRunGuestClient) UploadFile(ctx context.Context, remotePath string, mode os.FileMode, data []byte, logWriter io.Writer) error { + copyData := append([]byte(nil), data...) + c.uploads = append(c.uploads, testVMRunUpload{path: remotePath, mode: mode, data: copyData}) c.uploadPath = remotePath c.uploadMode = mode - c.uploadData = append([]byte(nil), data...) 
+ c.uploadData = copyData + return c.uploadErr +} + +func (c *testVMRunGuestClient) StreamTar(ctx context.Context, sourceDir, remoteCommand string, logWriter io.Writer) error { + c.tarSourceDir = sourceDir + c.tarCommand = remoteCommand return nil } func (c *testVMRunGuestClient) RunScript(ctx context.Context, script string, logWriter io.Writer) error { - c.script = script - return nil + c.runScriptCalls++ + if c.runScriptCalls == 1 { + c.script = script + c.launchScript = script + if c.checkoutErr != nil { + return c.checkoutErr + } + return c.launchErr + } + c.launchScript = script + return c.launchErr } func (c *testVMRunGuestClient) StreamTarEntries(ctx context.Context, sourceDir string, entries []string, remoteCommand string, logWriter io.Writer) error { @@ -1312,3 +2055,181 @@ func (c *testVMRunGuestClient) StreamTarEntries(ctx context.Context, sourceDir s c.streamCommand = remoteCommand return nil } + +// stubEnsureDaemonForSend isolates XDG dirs and installs a daemon-ping +// fake onto the caller's *deps so `ensureDaemon` short-circuits without +// trying to spawn bangerd. 
+func stubEnsureDaemonForSend(t *testing.T, d *deps) { + t.Helper() + t.Setenv("XDG_CONFIG_HOME", filepath.Join(t.TempDir(), "config")) + t.Setenv("XDG_STATE_HOME", filepath.Join(t.TempDir(), "state")) + t.Setenv("XDG_RUNTIME_DIR", filepath.Join(t.TempDir(), "run")) + d.daemonPing = func(context.Context, string) (api.PingResult, error) { + return api.PingResult{Status: "ok", PID: os.Getpid()}, nil + } +} + +func TestVMWorkspaceExportCommandExists(t *testing.T) { + root := NewBangerCommand() + vm, _, err := root.Find([]string{"vm"}) + if err != nil { + t.Fatalf("find vm: %v", err) + } + workspace, _, err := vm.Find([]string{"workspace"}) + if err != nil { + t.Fatalf("find workspace: %v", err) + } + if _, _, err := workspace.Find([]string{"export"}); err != nil { + t.Fatalf("find workspace export: %v", err) + } +} + +func TestVMWorkspaceExportRejectsMissingArg(t *testing.T) { + cmd := NewBangerCommand() + cmd.SetArgs([]string{"vm", "workspace", "export"}) + err := cmd.Execute() + if err == nil || !strings.Contains(err.Error(), "usage: banger vm workspace export") { + t.Fatalf("Execute() error = %v, want usage error", err) + } +} + +func TestVMWorkspaceExportWritesToStdout(t *testing.T) { + d := defaultDeps() + stubEnsureDaemonForSend(t, d) + + patch := []byte("diff --git a/main.go b/main.go\nindex 0000000..1111111 100644\n") + d.vmWorkspaceExport = func(_ context.Context, _ string, params api.WorkspaceExportParams) (api.WorkspaceExportResult, error) { + return api.WorkspaceExportResult{ + GuestPath: params.GuestPath, + Patch: patch, + ChangedFiles: []string{"main.go"}, + HasChanges: true, + }, nil + } + + cmd := d.newRootCommand() + var out bytes.Buffer + cmd.SetOut(&out) + cmd.SetErr(io.Discard) + cmd.SetArgs([]string{"vm", "workspace", "export", "devbox"}) + if err := cmd.Execute(); err != nil { + t.Fatalf("Execute: %v", err) + } + if !bytes.Equal(out.Bytes(), patch) { + t.Fatalf("stdout = %q, want %q", out.Bytes(), patch) + } +} + +func 
TestVMWorkspaceExportWritesToFile(t *testing.T) { + d := defaultDeps() + stubEnsureDaemonForSend(t, d) + + patch := []byte("diff --git a/main.go b/main.go\n") + d.vmWorkspaceExport = func(_ context.Context, _ string, _ api.WorkspaceExportParams) (api.WorkspaceExportResult, error) { + return api.WorkspaceExportResult{ + GuestPath: "/root/repo", + Patch: patch, + ChangedFiles: []string{"main.go"}, + HasChanges: true, + }, nil + } + + outFile := filepath.Join(t.TempDir(), "worker.diff") + cmd := d.newRootCommand() + cmd.SetOut(io.Discard) + var stderr bytes.Buffer + cmd.SetErr(&stderr) + cmd.SetArgs([]string{"vm", "workspace", "export", "devbox", "--output", outFile}) + if err := cmd.Execute(); err != nil { + t.Fatalf("Execute: %v", err) + } + + got, err := os.ReadFile(outFile) + if err != nil { + t.Fatalf("ReadFile: %v", err) + } + if !bytes.Equal(got, patch) { + t.Fatalf("file content = %q, want %q", got, patch) + } + if !strings.Contains(stderr.String(), "worker.diff") { + t.Fatalf("stderr = %q, want output path mentioned", stderr.String()) + } +} + +func TestVMWorkspaceExportNoChanges(t *testing.T) { + d := defaultDeps() + stubEnsureDaemonForSend(t, d) + + d.vmWorkspaceExport = func(_ context.Context, _ string, _ api.WorkspaceExportParams) (api.WorkspaceExportResult, error) { + return api.WorkspaceExportResult{ + GuestPath: "/root/repo", + HasChanges: false, + }, nil + } + + cmd := d.newRootCommand() + var out bytes.Buffer + var stderr bytes.Buffer + cmd.SetOut(&out) + cmd.SetErr(&stderr) + cmd.SetArgs([]string{"vm", "workspace", "export", "devbox"}) + if err := cmd.Execute(); err != nil { + t.Fatalf("Execute: %v", err) + } + if out.Len() != 0 { + t.Fatalf("stdout = %q, want empty when no changes", out.String()) + } + if !strings.Contains(stderr.String(), "no changes") { + t.Fatalf("stderr = %q, want 'no changes'", stderr.String()) + } +} + +func TestVMWorkspaceExportGuestPathFlag(t *testing.T) { + d := defaultDeps() + stubEnsureDaemonForSend(t, d) + + var 
capturedParams api.WorkspaceExportParams + d.vmWorkspaceExport = func(_ context.Context, _ string, params api.WorkspaceExportParams) (api.WorkspaceExportResult, error) { + capturedParams = params + return api.WorkspaceExportResult{HasChanges: false}, nil + } + + cmd := d.newRootCommand() + cmd.SetOut(io.Discard) + cmd.SetErr(io.Discard) + cmd.SetArgs([]string{"vm", "workspace", "export", "devbox", "--guest-path", "/root/project"}) + if err := cmd.Execute(); err != nil { + t.Fatalf("Execute: %v", err) + } + if capturedParams.GuestPath != "/root/project" { + t.Fatalf("GuestPath = %q, want /root/project", capturedParams.GuestPath) + } + if capturedParams.IDOrName != "devbox" { + t.Fatalf("IDOrName = %q, want devbox", capturedParams.IDOrName) + } +} + +func TestVMWorkspaceExportBaseCommitFlag(t *testing.T) { + d := defaultDeps() + stubEnsureDaemonForSend(t, d) + + var capturedParams api.WorkspaceExportParams + d.vmWorkspaceExport = func(_ context.Context, _ string, params api.WorkspaceExportParams) (api.WorkspaceExportResult, error) { + capturedParams = params + return api.WorkspaceExportResult{ + HasChanges: false, + BaseCommit: params.BaseCommit, + }, nil + } + + cmd := d.newRootCommand() + cmd.SetOut(io.Discard) + cmd.SetErr(io.Discard) + cmd.SetArgs([]string{"vm", "workspace", "export", "devbox", "--base-commit", "abc1234deadbeef"}) + if err := cmd.Execute(); err != nil { + t.Fatalf("Execute: %v", err) + } + if capturedParams.BaseCommit != "abc1234deadbeef" { + t.Fatalf("BaseCommit = %q, want abc1234deadbeef", capturedParams.BaseCommit) + } +} diff --git a/internal/cli/commands_daemon.go b/internal/cli/commands_daemon.go new file mode 100644 index 0000000..7669118 --- /dev/null +++ b/internal/cli/commands_daemon.go @@ -0,0 +1,55 @@ +package cli + +import ( + "fmt" + + "banger/internal/installmeta" + "banger/internal/paths" + + "github.com/spf13/cobra" +) + +func (d *deps) newDaemonCommand() *cobra.Command { + cmd := &cobra.Command{ + Use: "daemon", + Short: "Manage 
the installed banger services", + RunE: helpNoArgs, + } + cmd.AddCommand( + &cobra.Command{ + Use: "status", + Short: "Show owner-daemon and root-helper status", + Args: noArgsUsage("usage: banger daemon status"), + RunE: func(cmd *cobra.Command, args []string) error { + return d.runSystemStatus(cmd.Context(), cmd.OutOrStdout()) + }, + }, + &cobra.Command{ + Use: "stop", + Short: "Stop the installed banger services", + Args: noArgsUsage("usage: banger daemon stop"), + RunE: func(cmd *cobra.Command, args []string) error { + if err := requireRoot(); err != nil { + return err + } + if err := d.runSystemctl(cmd.Context(), "stop", installmeta.DefaultService, installmeta.DefaultRootHelperService); err != nil { + return err + } + _, err := fmt.Fprintln(cmd.OutOrStdout(), "stopped") + return err + }, + }, + &cobra.Command{ + Use: "socket", + Short: "Print the daemon socket path", + Args: noArgsUsage("usage: banger daemon socket"), + RunE: func(cmd *cobra.Command, args []string) error { + layout := paths.ResolveSystem() + var err error + _, err = fmt.Fprintln(cmd.OutOrStdout(), layout.SocketPath) + return err + }, + }, + ) + return cmd +} diff --git a/internal/cli/commands_image.go b/internal/cli/commands_image.go new file mode 100644 index 0000000..095482d --- /dev/null +++ b/internal/cli/commands_image.go @@ -0,0 +1,302 @@ +package cli + +import ( + "errors" + "fmt" + "strings" + + "banger/internal/api" + "banger/internal/model" + "banger/internal/rpc" + + "github.com/spf13/cobra" +) + +func (d *deps) newImageCommand() *cobra.Command { + cmd := &cobra.Command{ + Use: "image", + Short: "Pull and manage banger images (rootfs + kernel + work-seed)", + Long: strings.TrimSpace(` +A banger image bundles a rootfs.ext4, a kernel, an optional initrd ++ modules, and an optional work-seed (the snapshot used to populate +each new VM's /root). Most users only need 'banger image pull +' for the cataloged paths (see internal/imagecat), +or 'banger image pull ' for an OCI image. 
+ +Subcommands: + pull fetch a bundle by catalog name OR pull an OCI image + register point banger at an existing local rootfs (advanced) + promote copy a registered image's files into banger's managed dir + list show what's installed + show print one image's full record as JSON + delete remove an image (no VMs may reference it) +`), + Example: strings.TrimSpace(` + banger image pull debian-bookworm + banger image pull docker.io/library/alpine:3.20 --kernel-ref generic-6.12 + banger image list +`), + RunE: helpNoArgs, + } + cmd.AddCommand( + d.newImageRegisterCommand(), + d.newImagePullCommand(), + d.newImagePromoteCommand(), + d.newImageListCommand(), + d.newImageShowCommand(), + d.newImageDeleteCommand(), + d.newImageCacheCommand(), + ) + return cmd +} + +// newImageCacheCommand groups OCI-cache lifecycle subcommands. Today +// the only one is `prune`; future additions (size, list, etc.) plug +// in here without polluting the top-level `image` namespace. +func (d *deps) newImageCacheCommand() *cobra.Command { + cmd := &cobra.Command{ + Use: "cache", + Short: "Manage banger's OCI layer-blob cache", + Long: strings.TrimSpace(` +banger keeps a local copy of every OCI layer it downloads so a re-pull +of the same image (or any image that shares a base layer) skips the +network round-trip. The cache lives under the daemon's CacheDir +(see 'banger doctor' or docs/config.md). Layers accumulate forever; +'banger image cache prune' is the cheap way to reclaim disk. +`), + Example: strings.TrimSpace(` + banger image cache prune --dry-run + banger image cache prune +`), + RunE: helpNoArgs, + } + cmd.AddCommand(d.newImageCachePruneCommand()) + return cmd +} + +func (d *deps) newImageCachePruneCommand() *cobra.Command { + var dryRun bool + cmd := &cobra.Command{ + Use: "prune", + Short: "Remove every cached OCI layer blob", + Long: strings.TrimSpace(` +Removes every layer blob under the OCI cache. 
Registered banger +images are independent of the cache (each pull flattens layers into +a self-contained ext4), so prune only loses re-pull avoidance — the +next pull of the same image re-downloads the layers it needs. + +Safe to run any time the daemon is idle. If you have an image pull +in flight when you run prune, that pull may fail and need a retry. + +--dry-run reports the byte count without removing anything. +`), + Args: noArgsUsage("usage: banger image cache prune [--dry-run]"), + RunE: func(cmd *cobra.Command, args []string) error { + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + result, err := rpc.Call[api.ImageCachePruneResult](cmd.Context(), layout.SocketPath, "image.cache.prune", api.ImageCachePruneParams{DryRun: dryRun}) + if err != nil { + return err + } + out := cmd.OutOrStdout() + verb := "freed" + if result.DryRun { + verb = "would free" + } + _, err = fmt.Fprintf(out, "%s %s across %d blob(s) in %s\n", + verb, humanSize(result.BytesFreed), result.BlobsFreed, result.CacheDir) + return err + }, + } + cmd.Flags().BoolVar(&dryRun, "dry-run", false, "report the size that would be freed without deleting anything") + return cmd +} + +func (d *deps) newImageRegisterCommand() *cobra.Command { + var params api.ImageRegisterParams + cmd := &cobra.Command{ + Use: "register", + Short: "Register or update an unmanaged image", + Args: noArgsUsage("usage: banger image register --name --rootfs [--work-seed ] (--kernel [--initrd ] [--modules ] | --kernel-ref )"), + RunE: func(cmd *cobra.Command, args []string) error { + if strings.TrimSpace(params.KernelRef) != "" && (params.KernelPath != "" || params.InitrdPath != "" || params.ModulesDir != "") { + return errors.New("--kernel-ref is mutually exclusive with --kernel/--initrd/--modules") + } + if err := absolutizeImageRegisterPaths(¶ms); err != nil { + return err + } + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + result, err := 
rpc.Call[api.ImageShowResult](cmd.Context(), layout.SocketPath, "image.register", params) + if err != nil { + return err + } + return printImageSummary(cmd.OutOrStdout(), result.Image) + }, + } + cmd.Flags().StringVar(¶ms.Name, "name", "", "image name") + cmd.Flags().StringVar(¶ms.RootfsPath, "rootfs", "", "rootfs path") + cmd.Flags().StringVar(¶ms.WorkSeedPath, "work-seed", "", "work-seed path") + cmd.Flags().StringVar(¶ms.KernelPath, "kernel", "", "kernel path") + cmd.Flags().StringVar(¶ms.InitrdPath, "initrd", "", "initrd path") + cmd.Flags().StringVar(¶ms.ModulesDir, "modules", "", "modules dir") + cmd.Flags().StringVar(¶ms.KernelRef, "kernel-ref", "", "name of a cataloged kernel (see 'banger kernel list')") + _ = cmd.RegisterFlagCompletionFunc("kernel-ref", d.completeKernelNames) + return cmd +} + +func (d *deps) newImagePullCommand() *cobra.Command { + var ( + params api.ImagePullParams + sizeRaw string + ) + cmd := &cobra.Command{ + Use: "pull ", + Short: "Pull an image bundle (catalog name) or OCI image and register it", + ValidArgsFunction: d.completeImageCatalogNameOnlyAtPos0, + Long: strings.TrimSpace(` +Pull an image into banger. Two paths: + + • Catalog name (e.g. 'debian-bookworm') + Fetches a pre-built bundle from the embedded imagecat catalog. + Kernel-ref comes from the catalog entry; --kernel-ref still + overrides. + + • OCI reference (e.g. 'docker.io/library/debian:bookworm') + Pulls the image, flattens its layers, fixes ownership, injects + banger's guest agents. --kernel-ref or direct --kernel/--initrd/ + --modules are required. + +Use 'banger image list' to see installed images. 
+`), + Example: strings.TrimSpace(` + banger image pull debian-bookworm + banger image pull debian-bookworm --name sandbox + banger image pull docker.io/library/debian:bookworm --kernel-ref generic-6.12 +`), + Args: exactArgsUsage(1, "usage: banger image pull [--name ] [--kernel-ref ] [--kernel ] [--initrd ] [--modules ] [--size ]"), + RunE: func(cmd *cobra.Command, args []string) error { + params.Ref = args[0] + if strings.TrimSpace(params.KernelRef) != "" && (params.KernelPath != "" || params.InitrdPath != "" || params.ModulesDir != "") { + return errors.New("--kernel-ref is mutually exclusive with --kernel/--initrd/--modules") + } + if strings.TrimSpace(sizeRaw) != "" { + size, err := model.ParseSize(sizeRaw) + if err != nil { + return fmt.Errorf("--size: %w", err) + } + params.SizeBytes = size + } + if err := absolutizePaths(¶ms.KernelPath, ¶ms.InitrdPath, ¶ms.ModulesDir); err != nil { + return err + } + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + var result api.ImageShowResult + err = withHeartbeat(cmd.ErrOrStderr(), "image pull", func() error { + var callErr error + result, callErr = rpc.Call[api.ImageShowResult](cmd.Context(), layout.SocketPath, "image.pull", params) + return callErr + }) + if err != nil { + return err + } + return printImageSummary(cmd.OutOrStdout(), result.Image) + }, + } + cmd.Flags().StringVar(¶ms.Name, "name", "", "image name (defaults to the ref's repo+tag, sanitised)") + cmd.Flags().StringVar(¶ms.KernelPath, "kernel", "", "kernel path") + cmd.Flags().StringVar(¶ms.InitrdPath, "initrd", "", "initrd path") + cmd.Flags().StringVar(¶ms.ModulesDir, "modules", "", "modules dir") + cmd.Flags().StringVar(¶ms.KernelRef, "kernel-ref", "", "name of a cataloged kernel (see 'banger kernel list')") + cmd.Flags().StringVar(&sizeRaw, "size", "", "ext4 image size, e.g. 
4GiB, 512M, 2G (defaults to content + 25%, min 1GiB)") + _ = cmd.RegisterFlagCompletionFunc("kernel-ref", d.completeKernelNames) + return cmd +} + +func (d *deps) newImagePromoteCommand() *cobra.Command { + return &cobra.Command{ + Use: "promote ", + Short: "Promote an unmanaged image to a managed artifact", + Args: exactArgsUsage(1, "usage: banger image promote "), + ValidArgsFunction: d.completeImageNameOnlyAtPos0, + RunE: func(cmd *cobra.Command, args []string) error { + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + result, err := rpc.Call[api.ImageShowResult](cmd.Context(), layout.SocketPath, "image.promote", api.ImageRefParams{IDOrName: args[0]}) + if err != nil { + return err + } + return printImageSummary(cmd.OutOrStdout(), result.Image) + }, + } +} + +func (d *deps) newImageListCommand() *cobra.Command { + return &cobra.Command{ + Use: "list", + Aliases: []string{"ls"}, + Short: "List images", + Args: noArgsUsage("usage: banger image list"), + RunE: func(cmd *cobra.Command, args []string) error { + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + result, err := rpc.Call[api.ImageListResult](cmd.Context(), layout.SocketPath, "image.list", api.Empty{}) + if err != nil { + return err + } + return printImageListTable(cmd.OutOrStdout(), result.Images) + }, + } +} + +func (d *deps) newImageShowCommand() *cobra.Command { + return &cobra.Command{ + Use: "show ", + Short: "Show image details", + Args: exactArgsUsage(1, "usage: banger image show "), + ValidArgsFunction: d.completeImageNameOnlyAtPos0, + RunE: func(cmd *cobra.Command, args []string) error { + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + result, err := rpc.Call[api.ImageShowResult](cmd.Context(), layout.SocketPath, "image.show", api.ImageRefParams{IDOrName: args[0]}) + if err != nil { + return err + } + return printJSON(cmd.OutOrStdout(), result.Image) + }, + } +} + +func (d *deps) 
newImageDeleteCommand() *cobra.Command { + return &cobra.Command{ + Use: "delete ", + Aliases: []string{"rm"}, + Short: "Delete an image", + Args: exactArgsUsage(1, "usage: banger image delete "), + ValidArgsFunction: d.completeImageNameOnlyAtPos0, + RunE: func(cmd *cobra.Command, args []string) error { + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + result, err := rpc.Call[api.ImageShowResult](cmd.Context(), layout.SocketPath, "image.delete", api.ImageRefParams{IDOrName: args[0]}) + if err != nil { + return err + } + return printImageSummary(cmd.OutOrStdout(), result.Image) + }, + } +} diff --git a/internal/cli/commands_internal.go b/internal/cli/commands_internal.go new file mode 100644 index 0000000..2201b21 --- /dev/null +++ b/internal/cli/commands_internal.go @@ -0,0 +1,441 @@ +package cli + +import ( + "archive/tar" + "crypto/sha256" + "encoding/hex" + "encoding/json" + "errors" + "fmt" + "io" + "io/fs" + "os" + "path/filepath" + "strings" + + "banger/internal/config" + "banger/internal/hostnat" + "banger/internal/imagecat" + "banger/internal/imagepull" + "banger/internal/model" + "banger/internal/paths" + "banger/internal/system" + + "github.com/klauspost/compress/zstd" + "github.com/spf13/cobra" +) + +func (d *deps) newInternalCommand() *cobra.Command { + cmd := &cobra.Command{ + Use: "internal", + Hidden: true, + RunE: helpNoArgs, + } + cmd.AddCommand( + newInternalNATCommand(), + newInternalWorkSeedCommand(), + newInternalSSHKeyPathCommand(), + newInternalFirecrackerPathCommand(), + newInternalVSockAgentPathCommand(), + newInternalMakeBundleCommand(), + ) + return cmd +} + +func newInternalSSHKeyPathCommand() *cobra.Command { + return &cobra.Command{ + Use: "ssh-key-path", + Hidden: true, + Args: noArgsUsage("usage: banger internal ssh-key-path"), + RunE: func(cmd *cobra.Command, args []string) error { + layout, err := paths.Resolve() + if err != nil { + return err + } + cfg, err := config.Load(layout) + if err != 
nil { + return err + } + _, err = fmt.Fprintln(cmd.OutOrStdout(), cfg.SSHKeyPath) + return err + }, + } +} + +func newInternalFirecrackerPathCommand() *cobra.Command { + return &cobra.Command{ + Use: "firecracker-path", + Hidden: true, + Args: noArgsUsage("usage: banger internal firecracker-path"), + RunE: func(cmd *cobra.Command, args []string) error { + layout, err := paths.Resolve() + if err != nil { + return err + } + cfg, err := config.Load(layout) + if err != nil { + return err + } + if strings.TrimSpace(cfg.FirecrackerBin) == "" { + return errors.New("firecracker binary not configured; install firecracker or set firecracker_bin") + } + _, err = fmt.Fprintln(cmd.OutOrStdout(), cfg.FirecrackerBin) + return err + }, + } +} + +func newInternalVSockAgentPathCommand() *cobra.Command { + return &cobra.Command{ + Use: "vsock-agent-path", + Hidden: true, + Args: noArgsUsage("usage: banger internal vsock-agent-path"), + RunE: func(cmd *cobra.Command, args []string) error { + path, err := paths.CompanionBinaryPath("banger-vsock-agent") + if err != nil { + return err + } + _, err = fmt.Fprintln(cmd.OutOrStdout(), path) + return err + }, + } +} + +func newInternalMakeBundleCommand() *cobra.Command { + var ( + rootfsTarPath string + name string + distro string + arch string + kernelRef string + description string + sizeSpec string + outPath string + ) + cmd := &cobra.Command{ + Use: "make-bundle", + Hidden: true, + Short: "Build a banger image bundle (.tar.zst) from a flat rootfs tar", + Args: noArgsUsage("usage: banger internal make-bundle --rootfs-tar --name --out "), + RunE: func(cmd *cobra.Command, args []string) error { + return runInternalMakeBundle(cmd, internalMakeBundleOpts{ + rootfsTarPath: rootfsTarPath, + name: name, + distro: distro, + arch: arch, + kernelRef: kernelRef, + description: description, + sizeSpec: sizeSpec, + outPath: outPath, + }) + }, + } + cmd.Flags().StringVar(&rootfsTarPath, "rootfs-tar", "", "flat rootfs tar file, or '-' for stdin") + 
cmd.Flags().StringVar(&name, "name", "", "bundle name (filesystem-safe identifier)") + cmd.Flags().StringVar(&distro, "distro", "", "distro label (e.g. debian)") + cmd.Flags().StringVar(&arch, "arch", "x86_64", "architecture label") + cmd.Flags().StringVar(&kernelRef, "kernel-ref", "", "kernelcat entry name this image pairs with") + cmd.Flags().StringVar(&description, "description", "", "short description") + cmd.Flags().StringVar(&sizeSpec, "size", "", "rootfs ext4 size (e.g. 4G); defaults to tree size + 50%") + cmd.Flags().StringVar(&outPath, "out", "", "output bundle path (.tar.zst)") + return cmd +} + +type internalMakeBundleOpts struct { + rootfsTarPath string + name string + distro string + arch string + kernelRef string + description string + sizeSpec string + outPath string +} + +func runInternalMakeBundle(cmd *cobra.Command, opts internalMakeBundleOpts) error { + if err := imagecat.ValidateName(opts.name); err != nil { + return err + } + if strings.TrimSpace(opts.rootfsTarPath) == "" { + return errors.New("--rootfs-tar is required") + } + if strings.TrimSpace(opts.outPath) == "" { + return errors.New("--out is required") + } + if strings.TrimSpace(opts.arch) == "" { + opts.arch = "x86_64" + } + + var sizeBytes int64 + if s := strings.TrimSpace(opts.sizeSpec); s != "" { + n, err := model.ParseSize(s) + if err != nil { + return fmt.Errorf("parse --size: %w", err) + } + sizeBytes = n + } + + ctx := cmd.Context() + stagingRoot, err := os.MkdirTemp("", "banger-mkbundle-") + if err != nil { + return err + } + defer os.RemoveAll(stagingRoot) + rootfsTree := filepath.Join(stagingRoot, "rootfs") + if err := os.MkdirAll(rootfsTree, 0o755); err != nil { + return err + } + + var tarReader io.Reader + if opts.rootfsTarPath == "-" { + tarReader = cmd.InOrStdin() + } else { + f, err := os.Open(opts.rootfsTarPath) + if err != nil { + return fmt.Errorf("open rootfs tar: %w", err) + } + defer f.Close() + tarReader = f + } + + fmt.Fprintln(cmd.ErrOrStderr(), "[make-bundle] 
extracting rootfs") + meta, err := imagepull.FlattenTar(ctx, tarReader, rootfsTree) + if err != nil { + return fmt.Errorf("flatten rootfs: %w", err) + } + + // docker create drops /.dockerenv (and containerd drops + // /run/.containerenv) into the container's writable layer, so + // `docker export` includes them in the tar. systemd-detect-virt + // reads those files and flags the boot as virtualization=docker, + // which disables udev device-unit activation (including the work- + // disk dev-vdb.device) and leaves systemd waiting forever. Strip + // them before building the ext4. + for _, marker := range []string{".dockerenv", "run/.containerenv"} { + path := filepath.Join(rootfsTree, marker) + if err := os.Remove(path); err != nil && !os.IsNotExist(err) { + return fmt.Errorf("strip %s: %w", marker, err) + } + delete(meta.Entries, marker) + } + + if sizeBytes <= 0 { + treeSize, err := dirSize(rootfsTree) + if err != nil { + return fmt.Errorf("size rootfs tree: %w", err) + } + // +50% headroom for ext4 overhead (inode tables, block-group + // descriptors, journal, 5% reserved margin). 
+ sizeBytes = treeSize + treeSize/2 + if sizeBytes < imagepull.MinExt4Size { + sizeBytes = imagepull.MinExt4Size + } + } + + ext4Path := filepath.Join(stagingRoot, imagecat.RootfsFilename) + runner := system.NewRunner() + fmt.Fprintf(cmd.ErrOrStderr(), "[make-bundle] building rootfs.ext4 (%d bytes)\n", sizeBytes) + if err := imagepull.BuildExt4(ctx, runner, rootfsTree, ext4Path, sizeBytes); err != nil { + return fmt.Errorf("build ext4: %w", err) + } + fmt.Fprintln(cmd.ErrOrStderr(), "[make-bundle] applying ownership fixup") + if err := imagepull.ApplyOwnership(ctx, runner, ext4Path, meta); err != nil { + return fmt.Errorf("apply ownership: %w", err) + } + fmt.Fprintln(cmd.ErrOrStderr(), "[make-bundle] injecting guest agents") + vsockBin, err := paths.CompanionBinaryPath("banger-vsock-agent") + if err != nil { + return fmt.Errorf("locate vsock agent: %w", err) + } + if err := imagepull.InjectGuestAgents(ctx, runner, ext4Path, imagepull.GuestAgentAssets{VsockAgentBin: vsockBin}); err != nil { + return fmt.Errorf("inject guest agents: %w", err) + } + + manifest := imagecat.Manifest{ + Name: opts.name, + Distro: strings.TrimSpace(opts.distro), + Arch: opts.arch, + KernelRef: strings.TrimSpace(opts.kernelRef), + Description: strings.TrimSpace(opts.description), + } + manifestPath := filepath.Join(stagingRoot, imagecat.ManifestFilename) + manifestData, err := json.MarshalIndent(manifest, "", " ") + if err != nil { + return err + } + if err := os.WriteFile(manifestPath, append(manifestData, '\n'), 0o644); err != nil { + return err + } + + fmt.Fprintln(cmd.ErrOrStderr(), "[make-bundle] packaging bundle") + if err := writeBundleTarZst(opts.outPath, ext4Path, manifestPath); err != nil { + return fmt.Errorf("write bundle: %w", err) + } + + sum, err := sha256HexFile(opts.outPath) + if err != nil { + return err + } + stat, err := os.Stat(opts.outPath) + if err != nil { + return err + } + fmt.Fprintf(cmd.OutOrStdout(), "bundle: %s\nsha256: %s\nsize: %d\n", opts.outPath, sum, 
stat.Size()) + return nil +} + +func dirSize(root string) (int64, error) { + var total int64 + err := filepath.WalkDir(root, func(_ string, d fs.DirEntry, err error) error { + if err != nil { + return err + } + if !d.Type().IsRegular() { + return nil + } + info, err := d.Info() + if err != nil { + return err + } + total += info.Size() + return nil + }) + return total, err +} + +func writeBundleTarZst(outPath, rootfsPath, manifestPath string) error { + if err := os.MkdirAll(filepath.Dir(outPath), 0o755); err != nil { + return err + } + out, err := os.OpenFile(outPath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o644) + if err != nil { + return err + } + defer out.Close() + zw, err := zstd.NewWriter(out, zstd.WithEncoderLevel(zstd.SpeedBestCompression)) + if err != nil { + return err + } + tw := tar.NewWriter(zw) + for _, src := range []struct{ path, name string }{ + {rootfsPath, imagecat.RootfsFilename}, + {manifestPath, imagecat.ManifestFilename}, + } { + if err := writeBundleFile(tw, src.path, src.name); err != nil { + _ = tw.Close() + _ = zw.Close() + return err + } + } + if err := tw.Close(); err != nil { + _ = zw.Close() + return err + } + if err := zw.Close(); err != nil { + return err + } + return out.Close() +} + +func writeBundleFile(tw *tar.Writer, src, name string) error { + f, err := os.Open(src) + if err != nil { + return err + } + defer f.Close() + fi, err := f.Stat() + if err != nil { + return err + } + if err := tw.WriteHeader(&tar.Header{ + Name: name, + Size: fi.Size(), + Mode: 0o644, + Typeflag: tar.TypeReg, + ModTime: fi.ModTime(), + }); err != nil { + return err + } + _, err = io.Copy(tw, f) + return err +} + +func sha256HexFile(path string) (string, error) { + f, err := os.Open(path) + if err != nil { + return "", err + } + defer f.Close() + h := sha256.New() + if _, err := io.Copy(h, f); err != nil { + return "", err + } + return hex.EncodeToString(h.Sum(nil)), nil +} + +func newInternalWorkSeedCommand() *cobra.Command { + var rootfsPath string + var 
outPath string + cmd := &cobra.Command{ + Use: "work-seed", + Hidden: true, + Args: noArgsUsage("usage: banger internal work-seed --rootfs <path> [--out <path>]"), + RunE: func(cmd *cobra.Command, args []string) error { + rootfsPath = strings.TrimSpace(rootfsPath) + outPath = strings.TrimSpace(outPath) + if rootfsPath == "" { + return errors.New("rootfs path is required") + } + if outPath == "" { + outPath = system.WorkSeedPath(rootfsPath) + } + if err := system.EnsureSudo(cmd.Context()); err != nil { + return err + } + return system.BuildWorkSeedImage(cmd.Context(), system.NewRunner(), rootfsPath, outPath) + }, + } + cmd.Flags().StringVar(&rootfsPath, "rootfs", "", "rootfs image path") + cmd.Flags().StringVar(&outPath, "out", "", "output work-seed image path") + return cmd +} + +func newInternalNATCommand() *cobra.Command { + cmd := &cobra.Command{ + Use: "nat", + Hidden: true, + RunE: helpNoArgs, + } + cmd.AddCommand( + newInternalNATActionCommand("up", true), + newInternalNATActionCommand("down", false), + ) + return cmd +} + +func newInternalNATActionCommand(use string, enable bool) *cobra.Command { + var guestIP string + var tapDevice string + cmd := &cobra.Command{ + Use: use, + Hidden: true, + Args: noArgsUsage("usage: banger internal nat " + use + " --guest-ip <ip> --tap <device>"), + RunE: func(cmd *cobra.Command, args []string) error { + guestIP = strings.TrimSpace(guestIP) + tapDevice = strings.TrimSpace(tapDevice) + if guestIP == "" { + return errors.New("guest IP is required") + } + if tapDevice == "" { + return errors.New("tap device is required") + } + if err := system.EnsureSudo(cmd.Context()); err != nil { + return err + } + return hostnat.Ensure(cmd.Context(), system.NewRunner(), guestIP, tapDevice, enable) + }, + } + cmd.Flags().StringVar(&guestIP, "guest-ip", "", "guest IPv4 address") + cmd.Flags().StringVar(&tapDevice, "tap", "", "tap device name") + return cmd +} diff --git a/internal/cli/commands_kernel.go b/internal/cli/commands_kernel.go new file mode 100644 index
0000000..a4afd55 --- /dev/null +++ b/internal/cli/commands_kernel.go @@ -0,0 +1,185 @@ +package cli + +import ( + "errors" + "fmt" + "path/filepath" + "strings" + + "banger/internal/api" + "banger/internal/rpc" + + "github.com/spf13/cobra" +) + +func (d *deps) newKernelCommand() *cobra.Command { + cmd := &cobra.Command{ + Use: "kernel", + Short: "Pull and manage Firecracker-compatible kernels", + Long: strings.TrimSpace(` +Banger boots guests with a separate kernel artifact (vmlinux, plus +optional initrd + modules). Kernels are tracked by name in a local +catalog so multiple images can share one. + +Most users never run these commands directly: 'banger image pull' +auto-pulls the kernel referenced by the catalog entry. Use these +commands when you want to inspect what's installed, switch a VM to +a different kernel via 'image register --kernel-ref', or import a +kernel built locally with scripts/make-*-kernel.sh. + +Subcommands: + pull download a cataloged kernel by name + list show what's installed (or --available for the catalog) + show inspect one entry as JSON + rm remove a local kernel + import register a kernel built from scripts/make-*-kernel.sh +`), + Example: strings.TrimSpace(` + banger kernel list --available + banger kernel pull generic-6.12 + banger kernel import void-kernel --from build/manual/void-kernel +`), + RunE: helpNoArgs, + } + cmd.AddCommand( + d.newKernelListCommand(), + d.newKernelShowCommand(), + d.newKernelRmCommand(), + d.newKernelImportCommand(), + d.newKernelPullCommand(), + ) + return cmd +} + +func (d *deps) newKernelPullCommand() *cobra.Command { + var force bool + cmd := &cobra.Command{ + Use: "pull <name>", + Short: "Download a cataloged kernel bundle", + Args: exactArgsUsage(1, "usage: banger kernel pull <name> [--force]"), + ValidArgsFunction: d.completeKernelCatalogNameOnlyAtPos0, + RunE: func(cmd *cobra.Command, args []string) error { + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + var result
api.KernelShowResult + err = withHeartbeat(cmd.ErrOrStderr(), "kernel pull", func() error { + var callErr error + result, callErr = rpc.Call[api.KernelShowResult](cmd.Context(), layout.SocketPath, "kernel.pull", api.KernelPullParams{Name: args[0], Force: force}) + return callErr + }) + if err != nil { + return err + } + return printJSON(cmd.OutOrStdout(), result.Entry) + }, + } + cmd.Flags().BoolVar(&force, "force", false, "re-pull even if already present") + return cmd +} + +func (d *deps) newKernelImportCommand() *cobra.Command { + var params api.KernelImportParams + cmd := &cobra.Command{ + Use: "import <name>", + Short: "Import a kernel bundle produced by scripts/make-*-kernel.sh", + Long: "Copy the kernel, optional initrd, and optional modules directory from <dir> into the local kernel catalog keyed by <name>. <dir> is usually build/manual/void-kernel or build/manual/alpine-kernel.", + Args: exactArgsUsage(1, "usage: banger kernel import <name> --from <dir>"), + RunE: func(cmd *cobra.Command, args []string) error { + params.Name = args[0] + if strings.TrimSpace(params.FromDir) == "" { + return errors.New("--from is required") + } + abs, err := filepath.Abs(params.FromDir) + if err != nil { + return err + } + params.FromDir = abs + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + result, err := rpc.Call[api.KernelShowResult](cmd.Context(), layout.SocketPath, "kernel.import", params) + if err != nil { + return err + } + return printJSON(cmd.OutOrStdout(), result.Entry) + }, + } + cmd.Flags().StringVar(&params.FromDir, "from", "", "directory produced by make-*-kernel.sh (e.g. build/manual/void-kernel)") + cmd.Flags().StringVar(&params.Distro, "distro", "", "distribution label stored in the manifest (e.g. void, alpine)") + cmd.Flags().StringVar(&params.Arch, "arch", "", "architecture label stored in the manifest (e.g.
x86_64)") + return cmd +} + +func (d *deps) newKernelListCommand() *cobra.Command { + var available bool + cmd := &cobra.Command{ + Use: "list", + Aliases: []string{"ls"}, + Short: "List kernels (local by default, or --available for the catalog)", + Args: noArgsUsage("usage: banger kernel list [--available]"), + RunE: func(cmd *cobra.Command, args []string) error { + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + if available { + result, err := rpc.Call[api.KernelCatalogResult](cmd.Context(), layout.SocketPath, "kernel.catalog", api.Empty{}) + if err != nil { + return err + } + return printKernelCatalogTable(cmd.OutOrStdout(), result.Entries) + } + result, err := rpc.Call[api.KernelListResult](cmd.Context(), layout.SocketPath, "kernel.list", api.Empty{}) + if err != nil { + return err + } + return printKernelListTable(cmd.OutOrStdout(), result.Entries) + }, + } + cmd.Flags().BoolVar(&available, "available", false, "show the built-in catalog (with pulled/available status) instead of local entries") + return cmd +} + +func (d *deps) newKernelShowCommand() *cobra.Command { + return &cobra.Command{ + Use: "show <name>", + Short: "Show kernel catalog entry details", + Args: exactArgsUsage(1, "usage: banger kernel show <name>"), + ValidArgsFunction: d.completeKernelNameOnlyAtPos0, + RunE: func(cmd *cobra.Command, args []string) error { + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + result, err := rpc.Call[api.KernelShowResult](cmd.Context(), layout.SocketPath, "kernel.show", api.KernelRefParams{Name: args[0]}) + if err != nil { + return err + } + return printJSON(cmd.OutOrStdout(), result.Entry) + }, + } +} + +func (d *deps) newKernelRmCommand() *cobra.Command { + return &cobra.Command{ + Use: "rm <name>", + Aliases: []string{"remove", "delete"}, + Short: "Remove a kernel catalog entry", + Args: exactArgsUsage(1, "usage: banger kernel rm <name>"), + ValidArgsFunction: d.completeKernelNameOnlyAtPos0, + RunE: func(cmd
*cobra.Command, args []string) error { + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + if _, err := rpc.Call[api.Empty](cmd.Context(), layout.SocketPath, "kernel.delete", api.KernelRefParams{Name: args[0]}); err != nil { + return err + } + _, err = fmt.Fprintf(cmd.OutOrStdout(), "removed %s\n", args[0]) + return err + }, + } +} diff --git a/internal/cli/commands_ssh_config.go b/internal/cli/commands_ssh_config.go new file mode 100644 index 0000000..87001ef --- /dev/null +++ b/internal/cli/commands_ssh_config.go @@ -0,0 +1,102 @@ +package cli + +import ( + "fmt" + "strings" + + "banger/internal/config" + "banger/internal/daemon" + "banger/internal/paths" + + "github.com/spf13/cobra" +) + +// newSSHConfigCommand exposes the opt-in ergonomics for `ssh <name>.vm`. +// Default mode prints current status + the exact Include line the user +// can paste into ~/.ssh/config themselves. --install does the include +// for them inside a marker-fenced block; --uninstall reverses it. +func newSSHConfigCommand() *cobra.Command { + var ( + install bool + uninstall bool + ) + cmd := &cobra.Command{ + Use: "ssh-config", + Short: "Enable plain 'ssh <name>.vm' from any terminal", + Long: `Banger keeps a self-contained SSH client config under its own config +directory (never touching ~/.ssh/config on its own). Opt in to the +convenience shortcut that lets you run 'ssh <name>.vm' from any +terminal, bypassing 'banger vm ssh': + + banger ssh-config # print status + copy-paste snippet + banger ssh-config --install # add an Include line to ~/.ssh/config + banger ssh-config --uninstall # remove banger's Include from ~/.ssh/config + +After --install, 'ssh agent.vm' works the same as 'banger vm ssh +agent', including for tools like rsync, scp, and editor remotes.
+`, + Example: strings.TrimSpace(` + banger ssh-config --install + ssh agent.vm + rsync -avz ./code agent.vm:/root/repo/ +`), + Args: noArgsUsage("usage: banger ssh-config [--install|--uninstall]"), + RunE: func(cmd *cobra.Command, args []string) error { + if install && uninstall { + return fmt.Errorf("use only one of --install or --uninstall") + } + layout, err := paths.Resolve() + if err != nil { + return err + } + cfg, err := config.Load(layout) + if err != nil { + return err + } + if err := daemon.SyncVMSSHClientConfig(layout, cfg.SSHKeyPath); err != nil { + return err + } + bangerConfig := daemon.BangerSSHConfigPath(layout) + switch { + case install: + if err := daemon.InstallUserSSHInclude(layout); err != nil { + return err + } + _, err = fmt.Fprintf(cmd.OutOrStdout(), + "added Include %s to ~/.ssh/config — `ssh <name>.vm` will now route through banger\n", + bangerConfig, + ) + return err + case uninstall: + if err := daemon.UninstallUserSSHInclude(); err != nil { + return err + } + _, err = fmt.Fprintln(cmd.OutOrStdout(), "removed banger's entries from ~/.ssh/config") + return err + default: + installed, err := daemon.UserSSHIncludeInstalled() + if err != nil { + return err + } + out := cmd.OutOrStdout() + fmt.Fprintf(out, "banger ssh_config: %s\n", bangerConfig) + if installed { + fmt.Fprintln(out, "status: included from ~/.ssh/config") + fmt.Fprintln(out, "") + fmt.Fprintln(out, "`ssh <name>.vm` is enabled.
Run `banger ssh-config --uninstall` to revert.") + } else { + fmt.Fprintln(out, "status: not included (opt-in)") + fmt.Fprintln(out, "") + fmt.Fprintln(out, "Enable `ssh <name>.vm` in two ways:") + fmt.Fprintln(out, " banger ssh-config --install") + fmt.Fprintln(out, "or add this line to ~/.ssh/config yourself:") + fmt.Fprintf(out, " Include %s\n", bangerConfig) + } + return nil + } + }, + } + cmd.Flags().BoolVar(&install, "install", false, "add an Include line to ~/.ssh/config") + cmd.Flags().BoolVar(&uninstall, "uninstall", false, "remove banger's Include from ~/.ssh/config") + return cmd +} diff --git a/internal/cli/commands_system.go b/internal/cli/commands_system.go new file mode 100644 index 0000000..f1099ac --- /dev/null +++ b/internal/cli/commands_system.go @@ -0,0 +1,485 @@ +package cli + +import ( + "context" + "errors" + "fmt" + "io" + "os" + "path/filepath" + "strconv" + "strings" + "text/tabwriter" + + "banger/internal/buildinfo" + "banger/internal/installmeta" + "banger/internal/model" + "banger/internal/paths" + "banger/internal/system" + + "github.com/spf13/cobra" +) + +const ( + systemBangerBin = "/usr/local/bin/banger" + systemBangerdBin = "/usr/local/bin/bangerd" + systemCompanionDir = "/usr/local/lib/banger" + systemCompanionAgent = systemCompanionDir + "/banger-vsock-agent" + systemdUserUnitPath = "/etc/systemd/system/" + installmeta.DefaultService + systemdRootUnitPath = "/etc/systemd/system/" + installmeta.DefaultRootHelperService + systemCoverDirEnv = "BANGER_SYSTEM_GOCOVERDIR" + rootCoverDirEnv = "BANGER_ROOT_HELPER_GOCOVERDIR" +) + +func (d *deps) newSystemCommand() *cobra.Command { + var owner string + var purge bool + cmd := &cobra.Command{ + Use: "system", + Short: "Install banger's owner-daemon and root-helper systemd units", + Long: strings.TrimSpace(` +Banger ships as two services: an owner-user daemon for +orchestration and a narrow root helper for bridge/tap, NAT, and +Firecracker launch.
'banger system' installs, restarts, inspects, +and removes them. + +First-run flow (must be run as root): + + sudo banger system install --owner $USER install both services + banger system status confirm they're up + banger doctor check host readiness + +After 'install', the owner user can run 'banger ...' day to day +without sudo. Subsequent invocations: + + sudo banger system restart bounce both services + sudo banger system uninstall remove services + binaries + sudo banger system uninstall --purge also delete /var/lib/banger + +See docs/privileges.md for the full trust model. +`), + Example: strings.TrimSpace(` + sudo banger system install --owner alice + banger system status + sudo banger system uninstall --purge +`), + RunE: helpNoArgs, + } + installCmd := &cobra.Command{ + Use: "install", + Short: "Install or refresh the owner daemon and root helper", + Args: noArgsUsage("usage: banger system install [--owner USER]"), + RunE: func(cmd *cobra.Command, args []string) error { + return d.runSystemInstall(cmd.Context(), cmd.OutOrStdout(), owner) + }, + } + installCmd.Flags().StringVar(&owner, "owner", "", "login user who will operate banger day-to-day") + + statusCmd := &cobra.Command{ + Use: "status", + Short: "Show owner-daemon and root-helper status", + Args: noArgsUsage("usage: banger system status"), + RunE: func(cmd *cobra.Command, args []string) error { + return d.runSystemStatus(cmd.Context(), cmd.OutOrStdout()) + }, + } + + restartCmd := &cobra.Command{ + Use: "restart", + Short: "Restart the installed banger services", + Args: noArgsUsage("usage: banger system restart"), + RunE: func(cmd *cobra.Command, args []string) error { + if err := requireRoot(); err != nil { + return err + } + if err := d.runSystemctl(cmd.Context(), "restart", installmeta.DefaultRootHelperService); err != nil { + return err + } + if err := d.runSystemctl(cmd.Context(), "restart", installmeta.DefaultService); err != nil { + return err + } + if err := 
d.waitForDaemonReady(cmd.Context(), paths.ResolveSystem().SocketPath); err != nil { + return err + } + _, err := fmt.Fprintln(cmd.OutOrStdout(), "restarted") + return err + }, + } + + uninstallCmd := &cobra.Command{ + Use: "uninstall", + Short: "Remove the installed banger services", + Args: noArgsUsage("usage: banger system uninstall [--purge]"), + RunE: func(cmd *cobra.Command, args []string) error { + return d.runSystemUninstall(cmd.Context(), cmd.OutOrStdout(), purge) + }, + } + uninstallCmd.Flags().BoolVar(&purge, "purge", false, "also delete system-owned banger state and cache") + + cmd.AddCommand(installCmd, statusCmd, restartCmd, uninstallCmd) + return cmd +} + +func (d *deps) runSystemInstall(ctx context.Context, out io.Writer, ownerFlag string) error { + if err := requireRoot(); err != nil { + return err + } + meta, err := resolveInstallOwner(ownerFlag) + if err != nil { + return err + } + info := buildinfo.Current() + meta.Version = info.Version + meta.Commit = info.Commit + meta.BuiltAt = info.BuiltAt + meta.InstalledAt = model.Now() + + bangerBin, err := paths.BangerPath() + if err != nil { + return err + } + bangerdBin, err := paths.BangerdPath() + if err != nil { + return err + } + agentBin, err := paths.CompanionBinaryPath("banger-vsock-agent") + if err != nil { + return err + } + if err := os.MkdirAll(filepath.Dir(systemBangerBin), 0o755); err != nil { + return err + } + if err := os.MkdirAll(systemCompanionDir, 0o755); err != nil { + return err + } + if err := installFile(bangerBin, systemBangerBin, 0o755); err != nil { + return err + } + if err := installFile(bangerdBin, systemBangerdBin, 0o755); err != nil { + return err + } + if err := installFile(agentBin, systemCompanionAgent, 0o755); err != nil { + return err + } + if err := installmeta.Save(installmeta.DefaultPath, meta); err != nil { + return err + } + if err := paths.EnsureSystem(paths.ResolveSystem()); err != nil { + return err + } + if err := os.WriteFile(systemdRootUnitPath, 
[]byte(renderRootHelperSystemdUnit()), 0o644); err != nil { + return err + } + if err := os.WriteFile(systemdUserUnitPath, []byte(renderSystemdUnit(meta)), 0o644); err != nil { + return err + } + if err := d.runSystemctl(ctx, "daemon-reload"); err != nil { + return err + } + if err := d.runSystemctl(ctx, "enable", installmeta.DefaultRootHelperService); err != nil { + return err + } + if err := d.runSystemctl(ctx, "enable", installmeta.DefaultService); err != nil { + return err + } + if err := d.runSystemctl(ctx, "restart", installmeta.DefaultRootHelperService); err != nil { + return err + } + if err := d.runSystemctl(ctx, "restart", installmeta.DefaultService); err != nil { + return err + } + if err := d.waitForDaemonReady(ctx, installmeta.DefaultSocketPath); err != nil { + return err + } + if _, err := fmt.Fprintln(out, "installed"); err != nil { + return err + } + w := tabwriter.NewWriter(out, 0, 8, 2, ' ', 0) + fmt.Fprintf(w, "owner\t%s\n", meta.OwnerUser) + fmt.Fprintf(w, "socket\t%s\n", installmeta.DefaultSocketPath) + fmt.Fprintf(w, "helper_socket\t%s\n", installmeta.DefaultRootHelperSocketPath) + fmt.Fprintf(w, "service\t%s\n", installmeta.DefaultService) + fmt.Fprintf(w, "helper_service\t%s\n", installmeta.DefaultRootHelperService) + return w.Flush() +} + +func (d *deps) runSystemStatus(ctx context.Context, out io.Writer) error { + layout := paths.ResolveSystem() + active := d.systemctlQuery(ctx, "is-active", installmeta.DefaultService) + if active == "" { + active = "unknown" + } + enabled := d.systemctlQuery(ctx, "is-enabled", installmeta.DefaultService) + if enabled == "" { + enabled = "unknown" + } + helperActive := d.systemctlQuery(ctx, "is-active", installmeta.DefaultRootHelperService) + if helperActive == "" { + helperActive = "unknown" + } + helperEnabled := d.systemctlQuery(ctx, "is-enabled", installmeta.DefaultRootHelperService) + if helperEnabled == "" { + helperEnabled = "unknown" + } + w := tabwriter.NewWriter(out, 0, 8, 2, ' ', 0) + 
fmt.Fprintf(w, "service\t%s\n", installmeta.DefaultService) + fmt.Fprintf(w, "enabled\t%s\n", enabled) + fmt.Fprintf(w, "active\t%s\n", active) + fmt.Fprintf(w, "helper_service\t%s\n", installmeta.DefaultRootHelperService) + fmt.Fprintf(w, "helper_enabled\t%s\n", helperEnabled) + fmt.Fprintf(w, "helper_active\t%s\n", helperActive) + fmt.Fprintf(w, "socket\t%s\n", layout.SocketPath) + fmt.Fprintf(w, "helper_socket\t%s\n", installmeta.DefaultRootHelperSocketPath) + fmt.Fprintf(w, "log\tjournalctl -u %s -u %s\n", installmeta.DefaultService, installmeta.DefaultRootHelperService) + if ping, err := d.daemonPing(ctx, layout.SocketPath); err == nil { + info := buildinfo.Normalize(ping.Version, ping.Commit, ping.BuiltAt) + fmt.Fprintf(w, "pid\t%d\n", ping.PID) + fmt.Fprintf(w, "version\t%s\n", info.Version) + if info.Commit != "" { + fmt.Fprintf(w, "commit\t%s\n", info.Commit) + } + if info.BuiltAt != "" { + fmt.Fprintf(w, "built_at\t%s\n", info.BuiltAt) + } + } + return w.Flush() +} + +func (d *deps) runSystemUninstall(ctx context.Context, out io.Writer, purge bool) error { + if err := requireRoot(); err != nil { + return err + } + _ = d.runSystemctl(ctx, "disable", "--now", installmeta.DefaultService, installmeta.DefaultRootHelperService) + _ = os.Remove(systemdUserUnitPath) + _ = os.Remove(systemdRootUnitPath) + _ = os.Remove(installmeta.DefaultPath) + _ = os.Remove(installmeta.DefaultDir) + _ = d.runSystemctl(ctx, "daemon-reload") + _ = os.Remove(systemBangerdBin) + _ = os.Remove(systemBangerBin) + _ = os.RemoveAll(systemCompanionDir) + if purge { + _ = os.RemoveAll(paths.ResolveSystem().StateDir) + _ = os.RemoveAll(paths.ResolveSystem().CacheDir) + _ = os.RemoveAll(paths.ResolveSystem().RuntimeDir) + } + msg := "uninstalled" + if purge { + msg += " (purged state)" + } + _, err := fmt.Fprintln(out, msg) + return err +} + +func resolveInstallOwner(ownerFlag string) (installmeta.Metadata, error) { + owner := strings.TrimSpace(ownerFlag) + if owner == "" { + owner = 
strings.TrimSpace(os.Getenv("SUDO_USER")) + } + if owner == "" { + return installmeta.Metadata{}, errors.New("owner is required; pass --owner USER when installing without sudo") + } + if owner == "root" { + return installmeta.Metadata{}, errors.New("refusing to install with root as the banger owner") + } + return installmeta.LookupOwner(owner) +} + +func renderSystemdUnit(meta installmeta.Metadata) string { + lines := []string{ + "[Unit]", + "Description=banger daemon", + "After=network-online.target", + "Wants=network-online.target " + installmeta.DefaultRootHelperService, + "After=" + installmeta.DefaultRootHelperService, + "Requires=" + installmeta.DefaultRootHelperService, + "", + "[Service]", + "Type=simple", + "User=" + meta.OwnerUser, + "ExecStart=" + systemBangerdBin + " --system", + "Restart=on-failure", + "RestartSec=1s", + // KillMode=process: only signal the main PID on stop/restart. + // The default (control-group) sends SIGKILL to every process in + // the unit's cgroup, including descendants — and during `banger + // update` we restart this unit, which would terminate any + // in-flight subprocesses spawned by the daemon. The daemon + // shuts its own children down explicitly when needed. 
+ "KillMode=process", + "Environment=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin", + "Environment=TMPDIR=/run/banger", + "UMask=0077", + "NoNewPrivileges=yes", + "PrivateMounts=yes", + "ProtectSystem=strict", + "ProtectHome=read-only", + "ProtectControlGroups=yes", + "ProtectKernelLogs=yes", + "ProtectKernelModules=yes", + "ProtectClock=yes", + "ProtectHostname=yes", + "RestrictSUIDSGID=yes", + "LockPersonality=yes", + "SystemCallArchitectures=native", + "RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_NETLINK AF_VSOCK", + "StateDirectory=banger", + "StateDirectoryMode=0700", + "CacheDirectory=banger", + "CacheDirectoryMode=0700", + "RuntimeDirectory=banger", + "RuntimeDirectoryMode=0700", + // Keep /run/banger across stop/restart so the api-sock symlinks + // the helper creates for live VMs aren't wiped between the daemon + // stopping and the new daemon's reconcile re-attaching to them. + // Without this, `banger update` restarts the daemon, /run/banger + // is wiped, the api-sock symlinks vanish, and rediscoverHandles + // can't resolve the chroot path it needs to read jailer's pidfile. 
+ "RuntimeDirectoryPreserve=yes", + } + if coverDir := strings.TrimSpace(os.Getenv(systemCoverDirEnv)); coverDir != "" { + lines = append(lines, "Environment=GOCOVERDIR="+systemdQuote(coverDir)) + } + if home := strings.TrimSpace(meta.OwnerHome); home != "" { + lines = append(lines, "ReadOnlyPaths="+systemdQuote(home)) + } + lines = append(lines, + "", + "[Install]", + "WantedBy=multi-user.target", + "", + ) + return strings.Join(lines, "\n") +} + +func renderRootHelperSystemdUnit() string { + lines := []string{ + "[Unit]", + "Description=banger root helper", + "After=network-online.target", + "Wants=network-online.target", + "", + "[Service]", + "Type=simple", + "ExecStart=" + systemBangerdBin + " --root-helper", + "Restart=on-failure", + "RestartSec=1s", + // KillMode=process + SendSIGKILL=no together make the helper + // safe to restart while banger-launched firecrackers are + // running. firecracker lives in this unit's cgroup (jailer + // doesn't open a sub-cgroup), so: + // + // - Default control-group mode SIGKILLs every process in + // the cgroup on stop. + // - KillMode=process limits the initial SIGTERM to the + // helper main PID; systemd leaves remaining cgroup + // processes alone (and logs "Unit process N (firecracker) + // remains running after unit stopped"). + // - SendSIGKILL=no disables the FinalKillSignal escalation + // that would otherwise SIGKILL leftovers after the timeout. + // + // One more pitfall: the firecracker SDK installs a default + // signal-forwarding goroutine in the helper that catches + // SIGTERM (etc.) and forwards it to every firecracker child. + // We disable that explicitly via ForwardSignals: []os.Signal{} + // in firecracker.buildConfig — without that override, systemd + // signaling the helper main would propagate to every running + // VM regardless of what these directives do. 
+ // + // `banger system uninstall` and the daemon's vm-stop path + // explicitly stop firecracker processes when actually needed, + // so we don't lose the systemd-driven kill as a real safety + // net — banger drives those kills itself. + "KillMode=process", + "SendSIGKILL=no", + "Environment=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin", + "Environment=TMPDIR=" + installmeta.DefaultRootHelperRuntimeDir, + "UMask=0077", + "NoNewPrivileges=yes", + "PrivateTmp=yes", + "PrivateMounts=yes", + "ProtectSystem=strict", + "ProtectHome=yes", + "ProtectControlGroups=yes", + "ProtectKernelLogs=yes", + "ProtectKernelModules=yes", + "ProtectClock=yes", + "ProtectHostname=yes", + "RestrictSUIDSGID=yes", + "LockPersonality=yes", + "SystemCallArchitectures=native", + "RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_NETLINK AF_VSOCK", + "CapabilityBoundingSet=CAP_CHOWN CAP_DAC_OVERRIDE CAP_FOWNER CAP_KILL CAP_MKNOD CAP_NET_ADMIN CAP_NET_RAW CAP_SETGID CAP_SETUID CAP_SYS_ADMIN CAP_SYS_CHROOT", + "ReadWritePaths=/var/lib/banger", + "RuntimeDirectory=banger-root", + "RuntimeDirectoryMode=0711", + // Same rationale as bangerd.service: the helper-managed + // /run/banger-root holds the helper's RPC socket and any + // per-VM scratch state; preserving it across restart keeps + // the daemon's reconnect path and reconcile re-attachment + // from racing against systemd's runtime-dir cleanup. 
+ "RuntimeDirectoryPreserve=yes", + } + if coverDir := strings.TrimSpace(os.Getenv(rootCoverDirEnv)); coverDir != "" { + lines = append(lines, "Environment=GOCOVERDIR="+systemdQuote(coverDir)) + } + lines = append(lines, + "", + "[Install]", + "WantedBy=multi-user.target", + "", + ) + return strings.Join(lines, "\n") +} + +func systemdQuote(value string) string { + return strconv.Quote(strings.TrimSpace(value)) +} + +func installFile(sourcePath, targetPath string, mode os.FileMode) error { + if err := os.MkdirAll(filepath.Dir(targetPath), 0o755); err != nil { + return err + } + tempPath := targetPath + ".tmp" + _ = os.Remove(tempPath) + if err := system.CopyFilePreferClone(sourcePath, tempPath); err != nil { + return err + } + if err := os.Chmod(tempPath, mode); err != nil { + _ = os.Remove(tempPath) + return err + } + if err := os.Rename(tempPath, targetPath); err != nil { + _ = os.Remove(tempPath) + return err + } + return nil +} + +func requireRoot() error { + if os.Geteuid() == 0 { + return nil + } + return errors.New("this command requires root; run it with sudo") +} + +func (d *deps) runSystemctl(ctx context.Context, args ...string) error { + _, err := d.hostCommandOutput(ctx, "systemctl", args...) + return err +} + +func (d *deps) systemctlQuery(ctx context.Context, args ...string) string { + output, err := d.hostCommandOutput(ctx, "systemctl", args...) 
+ if err == nil { + return strings.TrimSpace(string(output)) + } + msg := strings.TrimSpace(string(output)) + if msg != "" { + return msg + } + msg = strings.TrimSpace(err.Error()) + if idx := strings.LastIndex(msg, ": "); idx >= 0 { + return strings.TrimSpace(msg[idx+2:]) + } + return msg +} diff --git a/internal/cli/commands_update.go b/internal/cli/commands_update.go new file mode 100644 index 0000000..d4313ac --- /dev/null +++ b/internal/cli/commands_update.go @@ -0,0 +1,420 @@ +package cli + +import ( + "context" + "errors" + "fmt" + "io" + "net/http" + "os" + "os/exec" + "path/filepath" + "strings" + "time" + + "banger/internal/api" + "banger/internal/buildinfo" + "banger/internal/installmeta" + "banger/internal/paths" + "banger/internal/rpc" + "banger/internal/updater" + + "github.com/spf13/cobra" +) + +// stagingTarballName is what the staged release tarball is saved as +// inside the staging dir. Doesn't really matter (the path is internal +// and ephemeral) but a stable name makes it easy to find for +// debugging a stuck update. +const stagingTarballName = "release.tar.gz" + +func (d *deps) newUpdateCommand() *cobra.Command { + var ( + checkOnly bool + dryRun bool + force bool + toVersion string + manifestURL string + pubkeyFile string + ) + cmd := &cobra.Command{ + Use: "update", + Short: "Download and install a newer banger release", + Long: strings.TrimSpace(` +Replace the running banger install with a newer release published +to ` + updater.ManifestURL() + `. + +Flow: + 1. Fetch the release manifest. + 2. Refuse if any banger operation is in flight (use --force to skip). + 3. Download tarball + SHA256SUMS, verify hashes. + 4. Sanity-run the staged binaries; refuse if --check-migrations + reports the new bangerd can't open this host's state DB. + 5. Atomically swap binaries; restart bangerd-root + bangerd. + 6. Run banger doctor; auto-roll back on failure. + 7. Update install metadata with the new version triple. 
+ +Steps 1-4 are non-destructive — failures abort with the install +untouched. Step 5+ is the cutover; auto-rollback in step 6 covers +the half-failed-update case. + +Requires root: the swap writes /usr/local/bin and the restart +talks to systemd. Run with sudo. +`), + Example: strings.TrimSpace(` + banger update --check + sudo banger update + sudo banger update --to v0.1.1 + sudo banger update --dry-run +`), + Args: noArgsUsage("usage: banger update [--check] [--dry-run] [--force] [--to vX.Y.Z]"), + RunE: func(cmd *cobra.Command, args []string) error { + return d.runUpdate(cmd, runUpdateOpts{ + checkOnly: checkOnly, + dryRun: dryRun, + force: force, + toVersion: toVersion, + manifestURL: manifestURL, + pubkeyFile: pubkeyFile, + }) + }, + } + cmd.Flags().BoolVar(&checkOnly, "check", false, "report whether a newer release is available, then exit") + cmd.Flags().BoolVar(&dryRun, "dry-run", false, "fetch and verify, but do not swap or restart anything") + cmd.Flags().BoolVar(&force, "force", false, "skip in-flight-op refusal and post-restart doctor verification") + cmd.Flags().StringVar(&toVersion, "to", "", "specific release version to install (default: latest_stable from manifest)") + // Hidden test/dev hooks: redirect the updater at a non-default + // manifest URL and trust a non-default cosign public key. Used by + // the smoke suite to drive a real update against locally-built + // release artefacts. Production users have no reason to touch + // these; they are not advertised in --help. 
+ cmd.Flags().StringVar(&manifestURL, "manifest-url", "", "") + cmd.Flags().StringVar(&pubkeyFile, "pubkey-file", "", "") + _ = cmd.Flags().MarkHidden("manifest-url") + _ = cmd.Flags().MarkHidden("pubkey-file") + return cmd +} + +type runUpdateOpts struct { + checkOnly bool + dryRun bool + force bool + toVersion string + manifestURL string + pubkeyFile string +} + +func (d *deps) runUpdate(cmd *cobra.Command, opts runUpdateOpts) error { + ctx := cmd.Context() + out := cmd.OutOrStdout() + + // Resolve the test/dev override flags up front so a bad + // --pubkey-file fails fast before any network round-trips. + pubKeyPEM := updater.BangerReleasePublicKey + if strings.TrimSpace(opts.pubkeyFile) != "" { + body, err := os.ReadFile(opts.pubkeyFile) + if err != nil { + return fmt.Errorf("read --pubkey-file: %w", err) + } + pubKeyPEM = string(body) + } + + // Discover. + client := &http.Client{Timeout: 30 * time.Second} + var ( + manifest updater.Manifest + err error + ) + if strings.TrimSpace(opts.manifestURL) != "" { + manifest, err = updater.FetchManifestFrom(ctx, client, opts.manifestURL) + } else { + manifest, err = updater.FetchManifest(ctx, client) + } + if err != nil { + return fmt.Errorf("discover: %w", err) + } + var target updater.Release + if strings.TrimSpace(opts.toVersion) != "" { + target, err = manifest.LookupRelease(opts.toVersion) + } else { + target, err = manifest.Latest() + } + if err != nil { + return fmt.Errorf("resolve target release: %w", err) + } + + currentVersion := buildinfo.Current().Version + if opts.checkOnly { + return reportCheckResult(out, currentVersion, target.Version) + } + if currentVersion == target.Version { + fmt.Fprintf(out, "already on %s\n", target.Version) + return nil + } + + // Past this point we're going to mutate the host. Require root. + if err := requireRoot(); err != nil { + return err + } + socketPath := paths.ResolveSystem().SocketPath + + // Refuse if anything is in flight. 
+ if !opts.force {
+ if err := refuseIfInFlight(ctx, socketPath); err != nil {
+ return err
+ }
+ }
+
+ // Stage the download.
+ stagingDir := updater.DefaultStagingDir(paths.ResolveSystem().CacheDir)
+ if err := updater.PrepareCleanStaging(stagingDir); err != nil {
+ return fmt.Errorf("staging: %w", err)
+ }
+ tarballPath := filepath.Join(stagingDir, stagingTarballName)
+ fmt.Fprintf(out, "downloading %s …\n", target.TarballURL)
+ sumsBody, err := updater.DownloadRelease(ctx, client, target, tarballPath)
+ if err != nil {
+ return fmt.Errorf("download: %w", err)
+ }
+ if err := updater.FetchAndVerifySignatureWithKey(ctx, client, target, sumsBody, pubKeyPEM); err != nil {
+ // Don't leave the staged tarball around — it failed
+ // signature verification and must not be reused on a retry.
+ _ = os.Remove(tarballPath)
+ return fmt.Errorf("signature: %w", err)
+ }
+ stagedDir := filepath.Join(stagingDir, "staged")
+ if err := os.RemoveAll(stagedDir); err != nil && !os.IsNotExist(err) {
+ return err
+ }
+ staged, err := updater.StageTarball(tarballPath, stagedDir)
+ if err != nil {
+ return fmt.Errorf("stage: %w", err)
+ }
+
+ // Sanity-run the staged binaries.
+ if err := sanityRunStaged(ctx, staged, target.Version); err != nil {
+ return fmt.Errorf("sanity check: %w", err)
+ }
+
+ if opts.dryRun {
+ fmt.Fprintf(out, "dry-run: would install %s → %s, restart services, run doctor\n", currentVersion, target.Version)
+ return nil
+ }
+
+ // Swap.
+ targets := updater.DefaultInstallTargets()
+ swap, err := updater.Swap(staged, targets)
+ if err != nil {
+ // Best-effort rollback of any partial swap that did land
+ // before failure; rollbackAndWrap surfaces both errors if
+ // the rollback itself also fails. The restart hasn't fired
+ // yet, so the old daemon is still running.
+ return rollbackAndWrap(swap, "swap", err)
+ }
+
+ // Restart services + wait for the new daemon. 
A `systemctl restart` + // that fails has typically already STOPPED the unit, so the prior + // binary on disk isn't running anywhere — Rollback() must be paired + // with a re-restart to bring the rolled-back binary back into a + // running state. That's rollbackAndRestart's job; rollbackAndWrap + // is for the swap-step failures earlier where the restart never + // fired and the old binary is still in memory. + if err := d.runSystemctl(ctx, "restart", installmeta.DefaultRootHelperService); err != nil { + return rollbackAndRestart(ctx, d, swap, "restart helper", err) + } + if err := d.runSystemctl(ctx, "restart", installmeta.DefaultService); err != nil { + return rollbackAndRestart(ctx, d, swap, "restart daemon", err) + } + if err := d.waitForDaemonReady(ctx, socketPath); err != nil { + return rollbackAndRestart(ctx, d, swap, "wait daemon ready", err) + } + + // Verify with doctor unless --force says otherwise. + if !opts.force { + if err := runPostUpdateDoctor(ctx, d, cmd); err != nil { + return rollbackAndRestart(ctx, d, swap, "post-update doctor", err) + } + } + + // Finalise: refresh install metadata, drop backups, clean staging. + // Read the new binary's identity by exec'ing it; buildinfo.Current() + // reflects the OLD running CLI (we're it), so the commit + built_at + // have to come from the freshly-swapped /usr/local/bin/banger or + // install.toml ends up with mixed-version fields. + newInfo, err := readInstalledBuildinfo(ctx, targets.Banger) + if err != nil { + fmt.Fprintf(out, "warning: read installed buildinfo: %v\n", err) + // Fall back to the manifest version + the running binary's + // commit/built_at. install.toml drift is a doctor warning, + // not a broken host, so don't fail the update. 
+ old := buildinfo.Current() + newInfo = buildinfo.Info{Version: target.Version, Commit: old.Commit, BuiltAt: old.BuiltAt} + } + if err := installmeta.UpdateBuildInfo(installmeta.DefaultPath, newInfo.Version, newInfo.Commit, newInfo.BuiltAt); err != nil { + fmt.Fprintf(out, "warning: update install metadata: %v\n", err) + } + if err := updater.CleanupBackups(swap); err != nil { + fmt.Fprintf(out, "warning: cleanup backups: %v\n", err) + } + _ = os.RemoveAll(stagingDir) + + fmt.Fprintf(out, "updated %s → %s\n", currentVersion, target.Version) + return nil +} + +func reportCheckResult(out io.Writer, current, latest string) error { + if current == latest { + fmt.Fprintf(out, "up to date (%s)\n", current) + return nil + } + fmt.Fprintf(out, "update available: %s → %s\n", current, latest) + return nil +} + +// refuseIfInFlight asks the running daemon for in-flight operations +// and refuses the update if any are not Done. Per the v0.1.0 plan: +// no wait, no drain — the operator runs `banger update` on an idle +// host or passes --force. +func refuseIfInFlight(ctx context.Context, socketPath string) error { + res, err := rpc.Call[api.OperationsListResult](ctx, socketPath, "daemon.operations.list", nil) + if err != nil { + // A daemon that's down or unreachable is itself a reason to + // refuse — we'd be unable to verify anything. Surface that + // clearly rather than blindly proceeding. + return fmt.Errorf("contact daemon: %w (use --force to override)", err) + } + pending := []string{} + for _, op := range res.Operations { + if op.Done { + continue + } + pending = append(pending, fmt.Sprintf("%s/%s (stage=%s)", op.Kind, op.ID, op.Stage)) + } + if len(pending) > 0 { + return fmt.Errorf("refusing update: %d in-flight operation(s): %s", len(pending), strings.Join(pending, ", ")) + } + return nil +} + +// sanityRunStaged executes the staged banger and bangerd to confirm +// they can at least print their own version + report schema state. 
+// Catches obviously broken binaries (wrong arch, missing libs,
+// embedded panics) before we swap them into place.
+func sanityRunStaged(ctx context.Context, staged updater.StagedRelease, expectedVersion string) error {
+ // banger --version: must succeed and mention the expected version
+ // somewhere (the format is "banger vX.Y.Z (commit ..., built ...)").
+ out, err := exec.CommandContext(ctx, staged.BangerPath, "--version").CombinedOutput()
+ if err != nil {
+ return fmt.Errorf("staged banger --version: %w (%s)", err, strings.TrimSpace(string(out)))
+ }
+ if !strings.Contains(string(out), expectedVersion) {
+ return fmt.Errorf("staged banger --version reported %q, expected to mention %s", strings.TrimSpace(string(out)), expectedVersion)
+ }
+
+ // bangerd --check-migrations against the configured DB. Exit 2
+ // means incompatible — we refuse to swap. Exit 0 (compatible) and
+ // exit 1 (migrations needed; will auto-apply on first Open) are
+ // both acceptable.
+ out, err = exec.CommandContext(ctx, staged.BangerdPath, "--check-migrations", "--system").CombinedOutput()
+ if err != nil {
+ var exitErr *exec.ExitError
+ if errors.As(err, &exitErr) {
+ switch exitErr.ExitCode() {
+ case 1:
+ return nil // migrations needed; safe to proceed
+ case 2:
+ return fmt.Errorf("staged bangerd would not open this host's state DB: %s", strings.TrimSpace(string(out)))
+ }
+ }
+ return fmt.Errorf("staged bangerd --check-migrations: %w (%s)", err, strings.TrimSpace(string(out)))
+ }
+ return nil
+}
+
+// readInstalledBuildinfo execs the just-swapped banger binary, parses
+// its three-line `version` output, and returns the parsed identity.
+// Used to refresh install.toml after an update so the on-disk record
+// reflects the binary that's actually installed — buildinfo.Current()
+// in the running process is the OLD binary's identity, not the one we
+// just put on disk. 
+//
+// Output shape (from internal/cli/banger.go versionString):
+//
+// version: vX.Y.Z
+// commit: <sha>
+// built_at: <timestamp>
+func readInstalledBuildinfo(ctx context.Context, bangerPath string) (buildinfo.Info, error) {
+ out, err := exec.CommandContext(ctx, bangerPath, "version").Output()
+ if err != nil {
+ return buildinfo.Info{}, fmt.Errorf("exec %s version: %w", bangerPath, err)
+ }
+ return parseVersionOutput(string(out))
+}
+
+// parseVersionOutput extracts the three identity fields from
+// `banger version`. Split out of readInstalledBuildinfo so it can be
+// unit-tested without exec'ing a real binary.
+func parseVersionOutput(out string) (buildinfo.Info, error) {
+ var info buildinfo.Info
+ for _, line := range strings.Split(out, "\n") {
+ k, v, ok := strings.Cut(line, ":")
+ if !ok {
+ continue
+ }
+ switch strings.TrimSpace(k) {
+ case "version":
+ info.Version = strings.TrimSpace(v)
+ case "commit":
+ info.Commit = strings.TrimSpace(v)
+ case "built_at":
+ info.BuiltAt = strings.TrimSpace(v)
+ }
+ }
+ if info.Version == "" || info.Commit == "" || info.BuiltAt == "" {
+ return buildinfo.Info{}, fmt.Errorf("could not parse version/commit/built_at from %q", strings.TrimSpace(out))
+ }
+ return info, nil
+}
+
+// runPostUpdateDoctor invokes `banger doctor` on the JUST-INSTALLED
+// CLI (not d.doctor — that's the in-process implementation; we want
+// to exercise the new binary end-to-end).
+func runPostUpdateDoctor(ctx context.Context, d *deps, cmd *cobra.Command) error {
+ // Run the CLI from the same install path the swap wrote to,
+ // rather than hardcoding /usr/local/bin here a second time.
+ out, err := exec.CommandContext(ctx, updater.DefaultInstallTargets().Banger, "doctor").CombinedOutput()
+ if err != nil {
+ return fmt.Errorf("doctor: %w\n%s", err, string(out))
+ }
+ // banger doctor prints to stdout regardless of pass/fail; print
+ // it through so the operator can see the new install's check
+ // result. (Doctor's exit code is what we trust; printing is
+ // just operator UX.) 
+ fmt.Fprintln(cmd.OutOrStdout(), strings.TrimSpace(string(out))) + return nil +} + +// rollbackAndWrap is for failures BEFORE we restarted services. The +// previous binaries are still on disk under .previous; restoring them +// is an atomic-rename, no service involvement needed (the OLD daemon +// is still running because the restart never happened). +func rollbackAndWrap(swap updater.SwapResult, stage string, err error) error { + if rbErr := updater.Rollback(swap); rbErr != nil { + return fmt.Errorf("%s failed: %w (rollback also failed: %v; install is broken)", stage, err, rbErr) + } + return fmt.Errorf("%s failed: %w (rolled back to previous install)", stage, err) +} + +// rollbackAndRestart is for failures AFTER the service restart. We +// roll back binaries AND re-restart so the OLD versions take over +// again. If even that fails, the install is broken; surface +// everything we know. +func rollbackAndRestart(ctx context.Context, d *deps, swap updater.SwapResult, stage string, err error) error { + if rbErr := updater.Rollback(swap); rbErr != nil { + return fmt.Errorf("%s failed: %w (rollback also failed: %v; install is broken)", stage, err, rbErr) + } + if rsErr := d.runSystemctl(ctx, "restart", installmeta.DefaultRootHelperService); rsErr != nil { + return fmt.Errorf("%s failed: %w (restored binaries but failed to restart helper: %v)", stage, err, rsErr) + } + if rsErr := d.runSystemctl(ctx, "restart", installmeta.DefaultService); rsErr != nil { + return fmt.Errorf("%s failed: %w (restored binaries but failed to restart daemon: %v)", stage, err, rsErr) + } + return fmt.Errorf("%s failed: %w (rolled back to previous install)", stage, err) +} diff --git a/internal/cli/commands_update_test.go b/internal/cli/commands_update_test.go new file mode 100644 index 0000000..7207008 --- /dev/null +++ b/internal/cli/commands_update_test.go @@ -0,0 +1,79 @@ +package cli + +import "testing" + +func TestParseVersionOutput(t *testing.T) { + cases := []struct { + name string 
+ in string + wantVersion string + wantCommit string + wantBuilt string + wantErr bool + }{ + { + name: "happy path — three-line shape from banger version", + in: `version: v0.1.2 +commit: a0b5c7fa3ca95a37ba99b35280fc75e5647b59e8 +built_at: 2026-04-29T17:34:45Z +`, + wantVersion: "v0.1.2", + wantCommit: "a0b5c7fa3ca95a37ba99b35280fc75e5647b59e8", + wantBuilt: "2026-04-29T17:34:45Z", + }, + { + name: "tolerates extra whitespace around the values", + in: ` version : v0.1.2 + commit : abc123 + built_at : 2026-01-01T00:00:00Z`, + wantVersion: "v0.1.2", + wantCommit: "abc123", + wantBuilt: "2026-01-01T00:00:00Z", + }, + { + name: "missing commit field is rejected", + in: "version: v0.1.2\nbuilt_at: 2026-01-01T00:00:00Z\n", + wantErr: true, + }, + { + name: "empty input is rejected", + in: "", + wantErr: true, + }, + { + name: "unrelated lines are ignored", + in: `banger v0.1.2 +some other diagnostic line: with a colon +version: v0.1.2 +commit: abc +built_at: 2026-01-01T00:00:00Z +`, + wantVersion: "v0.1.2", + wantCommit: "abc", + wantBuilt: "2026-01-01T00:00:00Z", + }, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + got, err := parseVersionOutput(tc.in) + if tc.wantErr { + if err == nil { + t.Fatalf("want error, got nil; parsed=%+v", got) + } + return + } + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + if got.Version != tc.wantVersion { + t.Errorf("Version: got %q, want %q", got.Version, tc.wantVersion) + } + if got.Commit != tc.wantCommit { + t.Errorf("Commit: got %q, want %q", got.Commit, tc.wantCommit) + } + if got.BuiltAt != tc.wantBuilt { + t.Errorf("BuiltAt: got %q, want %q", got.BuiltAt, tc.wantBuilt) + } + }) + } +} diff --git a/internal/cli/commands_vm.go b/internal/cli/commands_vm.go new file mode 100644 index 0000000..d30dfb2 --- /dev/null +++ b/internal/cli/commands_vm.go @@ -0,0 +1,1144 @@ +package cli + +import ( + "bufio" + "context" + "errors" + "fmt" + "io" + "os" + "strings" + "sync" + "text/tabwriter" + + 
"banger/internal/api"
+ "banger/internal/config"
+ "banger/internal/daemon/workspace"
+ "banger/internal/model"
+ "banger/internal/paths"
+ "banger/internal/rpc"
+ "banger/internal/system"
+
+ "github.com/spf13/cobra"
+)
+
+func (d *deps) newVMCommand() *cobra.Command {
+ cmd := &cobra.Command{
+ Use: "vm",
+ Short: "Manage Firecracker microVM sandboxes",
+ Long: strings.TrimSpace(`
+Lifecycle commands for banger's microVMs.
+
+For most cases you want 'banger vm run' — it creates, starts,
+provisions ssh, and drops you into the guest in one command. Use
+'vm create' / 'vm start' / 'vm ssh' separately when you want a
+longer-lived VM you'll come back to.
+
+Quick reference:
+ banger vm run interactive sandbox (stays alive on disconnect)
+ banger vm run --rm -- script.sh ephemeral: VM auto-deletes on exit
+ banger vm run ./repo -- make test ship a repo, run a command, exit with its status
+ banger vm run --nat ./repo --nat: outbound internet (required for mise bootstrap)
+ banger vm run -d ./repo --nat -d/--detach: prep + bootstrap, exit (no ssh attach)
+ banger vm create --name dev persistent VM; pair with 'vm ssh'
+ banger vm ssh <vm> open a shell in a running VM
+ banger vm exec <vm> -- make test run a command in the workspace with mise toolchain
+ banger vm stop <vm> | vm restart <vm> graceful lifecycle
+ banger vm kill <vm> force-kill if stop hangs
+ banger vm delete <vm> stop + remove disks
+ banger ps / banger vm list running / all VMs (use --all)
+ banger vm logs <vm> guest console + daemon log
+ banger vm set --nat <vm> toggle NAT on an existing VM (--no-nat to remove)
+ banger vm workspace prepare/export ship a repo in, pull diffs back
+`),
+ Example: strings.TrimSpace(`
+ banger vm run -- uname -a
+ banger vm run ./project -- npm test
+ banger vm create --name dev && banger vm workspace prepare dev . 
&& banger vm exec dev -- make test +`), + RunE: helpNoArgs, + } + cmd.AddCommand( + d.newVMCreateCommand(), + d.newVMRunCommand(), + d.newVMListCommand(), + d.newVMShowCommand(), + d.newVMActionCommand("start", "Start a stopped VM", "vm.start"), + d.newVMActionCommand("stop", "Stop a running VM gracefully", "vm.stop"), + d.newVMKillCommand(), + d.newVMActionCommand("restart", "Stop then start a VM", "vm.restart"), + d.newVMDeleteCommand(), + d.newVMPruneCommand(), + d.newVMSetCommand(), + d.newVMSSHCommand(), + d.newVMExecCommand(), + d.newVMWorkspaceCommand(), + d.newVMLogsCommand(), + d.newVMStatsCommand(), + d.newVMPortsCommand(), + ) + return cmd +} + +func (d *deps) newVMRunCommand() *cobra.Command { + defaults := effectiveVMDefaults() + var ( + name string + imageName string + vcpu = defaults.VCPUCount + memory = defaults.MemoryMiB + systemOverlaySize = model.FormatSizeBytes(defaults.SystemOverlaySizeByte) + workDiskSize = model.FormatSizeBytes(defaults.WorkDiskSizeBytes) + natEnabled bool + branchName string + fromRef = "HEAD" + removeOnExit bool + includeUntracked bool + dryRun bool + detach bool + skipBootstrap bool + verbose bool + ) + cmd := &cobra.Command{ + Use: "run [path] [-- command args...]", + Short: "Create and enter a sandbox VM", + Long: strings.TrimSpace(` +Create a sandbox VM and either drop into an interactive shell or run a command. + +Modes: + banger vm run bare sandbox, drops into ssh + banger vm run ./repo workspace sandbox, drops into ssh at /root/repo + banger vm run ./repo -- make test workspace + run command, exit with its status + banger vm run --rm -- script.sh ephemeral: VM auto-deletes when the session/command exits + banger vm run -d ./repo workspace + bootstrap, exit (reconnect with 'vm ssh') + +Workspace mode (path argument): + Passing a path copies the repo's git-tracked files into /root/repo + inside the guest. 
Untracked files are skipped by default — pass
+ --include-untracked to ship them too, or --dry-run to preview the
+ file list without creating a VM.
+
+Outbound internet (--nat):
+ Guests have no internet access by default. Pass --nat to enable
+ host-side MASQUERADE so the VM can reach the public network. NAT is
+ required whenever the workspace declares mise tooling (see below).
+ Toggle on an existing VM with 'banger vm set --nat <vm>'.
+
+Tooling bootstrap (workspace mode):
+ When the workspace contains a .mise.toml or .tool-versions, vm run
+ installs the listed tools via mise on first boot. The bootstrap
+ needs internet, so --nat must be set. Pass --no-bootstrap to skip
+ it entirely (no NAT requirement).
+
+Exit behaviour:
+ In command mode (-- <cmd>), the guest command's exit code propagates
+ through banger. Without --rm, the VM stays alive after the session
+ or command exits — reconnect with 'banger vm ssh <vm>'. With --rm,
+ the VM is deleted on exit (stdout/stderr are preserved).
+`),
+ Args: cobra.ArbitraryArgs,
+ Example: strings.TrimSpace(`
+ banger vm run
+ banger vm run ../repo --name agent-box --branch feature/demo
+ banger vm run ../repo -- make test
+ banger vm run -d ../repo --nat
+ banger vm run -- uname -a
+`),
+ RunE: func(cmd *cobra.Command, args []string) error {
+ if cmd.Flags().Changed("branch") && strings.TrimSpace(branchName) == "" {
+ return errors.New("--branch requires a branch name")
+ }
+ if cmd.Flags().Changed("from") && strings.TrimSpace(branchName) == "" {
+ return errors.New("--from requires --branch")
+ }
+
+ pathArgs, commandArgs := splitVMRunArgs(cmd, args)
+ if len(pathArgs) > 1 {
+ return errors.New("usage: banger vm run [path] [-- command args...]")
+ }
+ sourcePath := ""
+ if len(pathArgs) == 1 {
+ sourcePath = pathArgs[0]
+ }
+ if sourcePath == "" && strings.TrimSpace(branchName) != "" {
+ return errors.New("--branch requires a path argument")
+ }
+ if detach && removeOnExit {
+ return errors.New("cannot combine --detach with 
--rm") + } + if detach && len(commandArgs) > 0 { + return errors.New("cannot combine --detach with a guest command") + } + + var repoPtr *vmRunRepo + if sourcePath != "" { + resolved, err := d.vmRunPreflightRepo(cmd.Context(), sourcePath) + if err != nil { + return err + } + repoPtr = &vmRunRepo{sourcePath: resolved, branchName: branchName, fromRef: fromRef, includeUntracked: includeUntracked} + } + if dryRun { + if repoPtr == nil { + return errors.New("--dry-run requires a workspace path") + } + dryFromRef := "" + if strings.TrimSpace(repoPtr.branchName) != "" { + dryFromRef = repoPtr.fromRef + } + return d.runWorkspaceDryRun(cmd.Context(), cmd.OutOrStdout(), repoPtr.sourcePath, repoPtr.branchName, dryFromRef, repoPtr.includeUntracked) + } + + layout, err := paths.Resolve() + if err != nil { + return err + } + cfg, err := config.Load(layout) + if err != nil { + return err + } + if repoPtr != nil { + if err := validateVMRunPrereqs(cfg); err != nil { + return err + } + } else { + if err := validateSSHPrereqs(cfg); err != nil { + return err + } + } + params, err := vmCreateParamsFromFlags(cmd, name, imageName, vcpu, memory, systemOverlaySize, workDiskSize, natEnabled, false) + if err != nil { + return err + } + layout, cfg, err = d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + return d.runVMRun(cmd.Context(), layout.SocketPath, cfg, cmd.InOrStdin(), cmd.OutOrStdout(), cmd.ErrOrStderr(), params, repoPtr, commandArgs, removeOnExit, detach, skipBootstrap, verbose) + }, + } + cmd.Flags().StringVar(&name, "name", "", "vm name") + cmd.Flags().StringVar(&imageName, "image", "", "image name or id (defaults to config's default_image_name; auto-pulled from imagecat if missing)") + cmd.Flags().IntVar(&vcpu, "vcpu", defaults.VCPUCount, "vcpu count") + cmd.Flags().IntVar(&memory, "memory", defaults.MemoryMiB, "memory in MiB") + cmd.Flags().StringVar(&systemOverlaySize, "system-overlay-size", model.FormatSizeBytes(defaults.SystemOverlaySizeByte), "system overlay 
size")
+ cmd.Flags().StringVar(&workDiskSize, "disk-size", model.FormatSizeBytes(defaults.WorkDiskSizeBytes), "work disk size")
+ cmd.Flags().BoolVar(&natEnabled, "nat", false, "enable outbound internet from the guest (host-side MASQUERADE; required when the workspace declares mise tooling)")
+ cmd.Flags().StringVar(&branchName, "branch", "", "create and switch to a new guest branch")
+ cmd.Flags().StringVar(&fromRef, "from", "HEAD", "git ref to branch from when --branch is set (default: HEAD)")
+ cmd.Flags().BoolVar(&removeOnExit, "rm", false, "ephemeral mode: delete the VM (and its disks) after the ssh session / command exits")
+ cmd.Flags().BoolVar(&includeUntracked, "include-untracked", false, "also copy untracked non-ignored files into the guest workspace (default: tracked files only)")
+ cmd.Flags().BoolVar(&dryRun, "dry-run", false, "list the files that would be copied into the guest workspace and exit without creating a VM")
+ cmd.Flags().BoolVarP(&detach, "detach", "d", false, "detached mode: create the VM, run workspace prep + bootstrap synchronously, exit without ssh attach (reconnect with 'vm ssh')")
+ cmd.Flags().BoolVar(&skipBootstrap, "no-bootstrap", false, "skip the mise tooling bootstrap (no --nat requirement)")
+ cmd.Flags().BoolVarP(&verbose, "verbose", "v", false, "show every progress line instead of a single rewriting status line")
+ _ = cmd.RegisterFlagCompletionFunc("image", d.completeImageNames)
+ return cmd
+}
+
+func (d *deps) newVMKillCommand() *cobra.Command {
+ var signal string
+ cmd := &cobra.Command{
+ Use: "kill <vm>...",
+ Short: "Force-kill a VM (use when 'vm stop' hangs)",
+ Long: strings.TrimSpace(`
+Send a signal directly to the firecracker process. Default is
+SIGTERM; pass --signal SIGKILL when the VM is stuck and a graceful
+'vm stop' has already failed.
+
+This skips the normal stop sequence (no flush, no clean shutdown).
+Prefer 'banger vm stop' for routine teardown.
+`),
+ Args: minArgsUsage(1, "usage: banger vm kill [--signal SIGTERM|SIGKILL|...] <vm>..."),
+ ValidArgsFunction: d.completeVMNames,
+ RunE: func(cmd *cobra.Command, args []string) error {
+ layout, _, err := d.ensureDaemon(cmd.Context())
+ if err != nil {
+ return err
+ }
+ if len(args) > 1 {
+ return runVMBatchAction(cmd, layout.SocketPath, args, func(ctx context.Context, id string) (model.VMRecord, error) {
+ result, err := rpc.Call[api.VMShowResult](
+ ctx,
+ layout.SocketPath,
+ "vm.kill",
+ api.VMKillParams{IDOrName: id, Signal: signal},
+ )
+ if err != nil {
+ return model.VMRecord{}, err
+ }
+ return result.VM, nil
+ })
+ }
+ result, err := rpc.Call[api.VMShowResult](
+ cmd.Context(),
+ layout.SocketPath,
+ "vm.kill",
+ api.VMKillParams{IDOrName: args[0], Signal: signal},
+ )
+ if err != nil {
+ return err
+ }
+ return printVMSummary(cmd.OutOrStdout(), result.VM)
+ },
+ }
+ cmd.Flags().StringVar(&signal, "signal", "TERM", "signal name to send")
+ return cmd
+}
+
+func (d *deps) newVMPruneCommand() *cobra.Command {
+ var force bool
+ cmd := &cobra.Command{
+ Use: "prune",
+ Short: "Delete every VM that isn't running",
+ Long: "Scan for VMs in state other than 'running' (stopped, created, error) and delete them after confirmation. 
Use -f to skip the prompt.", + Args: noArgsUsage("usage: banger vm prune [-f|--force]"), + RunE: func(cmd *cobra.Command, args []string) error { + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + return d.runVMPrune(cmd, layout.SocketPath, force) + }, + } + cmd.Flags().BoolVarP(&force, "force", "f", false, "skip the confirmation prompt") + return cmd +} + +func (d *deps) runVMPrune(cmd *cobra.Command, socketPath string, force bool) error { + ctx := cmd.Context() + stdout := cmd.OutOrStdout() + stderr := cmd.ErrOrStderr() + + list, err := d.vmList(ctx, socketPath) + if err != nil { + return err + } + var victims []model.VMRecord + for _, vm := range list.VMs { + if vm.State != model.VMStateRunning { + victims = append(victims, vm) + } + } + if len(victims) == 0 { + _, err := fmt.Fprintln(stdout, "no non-running VMs to prune") + return err + } + + fmt.Fprintf(stdout, "The following %d VM(s) will be deleted:\n", len(victims)) + w := tabwriter.NewWriter(stdout, 0, 8, 2, ' ', 0) + fmt.Fprintln(w, " ID\tNAME\tSTATE") + for _, vm := range victims { + fmt.Fprintf(w, " %s\t%s\t%s\n", shortID(vm.ID), vm.Name, vm.State) + } + if err := w.Flush(); err != nil { + return err + } + + if !force { + ok, err := promptYesNo(cmd.InOrStdin(), stdout, "Delete these VMs? 
[y/N] ") + if err != nil { + return err + } + if !ok { + _, err := fmt.Fprintln(stdout, "aborted") + return err + } + } + + var failed int + for _, vm := range victims { + ref := vm.Name + if ref == "" { + ref = shortID(vm.ID) + } + if err := d.vmDelete(ctx, socketPath, vm.ID); err != nil { + fmt.Fprintf(stderr, "delete %s: %v\n", ref, err) + failed++ + continue + } + if err := removeUserKnownHosts(vm); err != nil { + fmt.Fprintf(stderr, "known_hosts cleanup %s: %v\n", ref, err) + } + fmt.Fprintln(stdout, "deleted", ref) + } + if failed > 0 { + return fmt.Errorf("%d VM(s) failed to delete", failed) + } + return nil +} + +// promptYesNo reads a line from in and returns true iff the trimmed +// lowercase answer is "y" or "yes". EOF is "no"; other read errors +// surface to the caller. +func promptYesNo(in io.Reader, out io.Writer, prompt string) (bool, error) { + if _, err := fmt.Fprint(out, prompt); err != nil { + return false, err + } + reader := bufio.NewReader(in) + line, err := reader.ReadString('\n') + if err != nil && err != io.EOF { + return false, err + } + answer := strings.ToLower(strings.TrimSpace(line)) + return answer == "y" || answer == "yes", nil +} + +func (d *deps) newVMCreateCommand() *cobra.Command { + defaults := effectiveVMDefaults() + var ( + name string + imageName string + vcpu = defaults.VCPUCount + memory = defaults.MemoryMiB + systemOverlaySize = model.FormatSizeBytes(defaults.SystemOverlaySizeByte) + workDiskSize = model.FormatSizeBytes(defaults.WorkDiskSizeBytes) + natEnabled bool + noStart bool + verbose bool + ) + cmd := &cobra.Command{ + Use: "create", + Short: "Create a VM (without entering it)", + Long: strings.TrimSpace(` +Create a microVM in the 'running' state and return its summary. +Unlike 'banger vm run', this does NOT open an ssh session — pair it +with 'banger vm ssh ' when you want to attach. + +Use 'vm create' for a longer-lived VM you'll come back to. Use +'vm run' for one-shot sandboxes (especially with --rm). 
+`), + Example: strings.TrimSpace(` + banger vm create --name agent + banger vm create --name big --vcpu 8 --memory 16384 + banger vm create --no-start --name spare # provision but leave stopped +`), + Args: noArgsUsage("usage: banger vm create"), + RunE: func(cmd *cobra.Command, args []string) error { + params, err := vmCreateParamsFromFlags(cmd, name, imageName, vcpu, memory, systemOverlaySize, workDiskSize, natEnabled, noStart) + if err != nil { + return err + } + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + vm, err := d.runVMCreate(cmd.Context(), layout.SocketPath, cmd.ErrOrStderr(), params, verbose) + if err != nil { + return err + } + return printVMSummary(cmd.OutOrStdout(), vm) + }, + } + cmd.Flags().StringVar(&name, "name", "", "vm name") + cmd.Flags().StringVar(&imageName, "image", "", "image name or id (defaults to config's default_image_name; auto-pulled from imagecat if missing)") + cmd.Flags().IntVar(&vcpu, "vcpu", defaults.VCPUCount, "vcpu count") + cmd.Flags().IntVar(&memory, "memory", defaults.MemoryMiB, "memory in MiB") + cmd.Flags().StringVar(&systemOverlaySize, "system-overlay-size", model.FormatSizeBytes(defaults.SystemOverlaySizeByte), "system overlay size") + cmd.Flags().StringVar(&workDiskSize, "disk-size", model.FormatSizeBytes(defaults.WorkDiskSizeBytes), "work disk size") + cmd.Flags().BoolVar(&natEnabled, "nat", false, "enable outbound internet from the guest (host-side MASQUERADE)") + cmd.Flags().BoolVar(&noStart, "no-start", false, "create without starting") + cmd.Flags().BoolVarP(&verbose, "verbose", "v", false, "show every progress line instead of a single rewriting status line") + _ = cmd.RegisterFlagCompletionFunc("image", d.completeImageNames) + return cmd +} + +type vmListOptions struct { + showAll bool + latest bool + quiet bool +} + +func (d *deps) newPSCommand() *cobra.Command { + return d.newVMListLikeCommand("ps", nil, "usage: banger ps") +} + +func (d *deps) newVMListCommand() 
*cobra.Command { + return d.newVMListLikeCommand("list", []string{"ls", "ps"}, "usage: banger vm list") +} + +func (d *deps) newVMListLikeCommand(use string, aliases []string, usage string) *cobra.Command { + var opts vmListOptions + cmd := &cobra.Command{ + Use: use, + Aliases: aliases, + Short: "List VMs", + Args: noArgsUsage(usage), + RunE: func(cmd *cobra.Command, args []string) error { + return d.runVMList(cmd, opts) + }, + } + cmd.Flags().BoolVarP(&opts.showAll, "all", "a", false, "show all VMs") + cmd.Flags().BoolVarP(&opts.latest, "latest", "l", false, "show only the latest VM") + cmd.Flags().BoolVarP(&opts.quiet, "quiet", "q", false, "only show VM IDs") + return cmd +} + +func (d *deps) runVMList(cmd *cobra.Command, opts vmListOptions) error { + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + result, err := rpc.Call[api.VMListResult](cmd.Context(), layout.SocketPath, "vm.list", api.Empty{}) + if err != nil { + return err + } + vms := selectVMListVMs(result.VMs, opts.showAll, opts.latest) + if opts.quiet { + return printVMIDList(cmd.OutOrStdout(), vms) + } + images, err := rpc.Call[api.ImageListResult](cmd.Context(), layout.SocketPath, "image.list", api.Empty{}) + if err != nil { + return err + } + return printVMListTable(cmd.OutOrStdout(), vms, imageNameIndex(images.Images)) +} + +func selectVMListVMs(vms []model.VMRecord, showAll, latest bool) []model.VMRecord { + filtered := make([]model.VMRecord, 0, len(vms)) + for _, vm := range vms { + if !showAll && vm.State != model.VMStateRunning { + continue + } + filtered = append(filtered, vm) + } + if !latest || len(filtered) <= 1 { + return filtered + } + latestVM := filtered[0] + for _, vm := range filtered[1:] { + if vm.CreatedAt.After(latestVM.CreatedAt) { + latestVM = vm + continue + } + if vm.CreatedAt.Equal(latestVM.CreatedAt) && vm.UpdatedAt.After(latestVM.UpdatedAt) { + latestVM = vm + } + } + return []model.VMRecord{latestVM} +} + +func (d *deps) newVMShowCommand() 
*cobra.Command {
+ return &cobra.Command{
+ Use: "show <vm>",
+ Short: "Print full VM record as JSON",
+ Long: strings.TrimSpace(`
+Emit the complete VM record (spec, runtime state, image reference,
+created/updated timestamps) as a single JSON object. Suitable for
+piping into 'jq' or feeding into automation.
+
+For human-readable summaries use 'banger ps' or 'banger vm stats'.
+`),
+ Args: exactArgsUsage(1, "usage: banger vm show <vm>"),
+ ValidArgsFunction: d.completeVMNameOnlyAtPos0,
+ RunE: func(cmd *cobra.Command, args []string) error {
+ layout, _, err := d.ensureDaemon(cmd.Context())
+ if err != nil {
+ return err
+ }
+ result, err := rpc.Call[api.VMShowResult](cmd.Context(), layout.SocketPath, "vm.show", api.VMRefParams{IDOrName: args[0]})
+ if err != nil {
+ return err
+ }
+ return printJSON(cmd.OutOrStdout(), result.VM)
+ },
+ }
+}
+
+func (d *deps) newVMActionCommand(use, short, method string, aliases ...string) *cobra.Command {
+ return &cobra.Command{
+ Use: use + " <vm>...",
+ Aliases: aliases,
+ Short: short,
+ Args: minArgsUsage(1, fmt.Sprintf("usage: banger vm %s <vm>...", use)),
+ ValidArgsFunction: d.completeVMNames,
+ RunE: func(cmd *cobra.Command, args []string) error {
+ layout, _, err := d.ensureDaemon(cmd.Context())
+ if err != nil {
+ return err
+ }
+ if len(args) > 1 {
+ return runVMBatchAction(cmd, layout.SocketPath, args, func(ctx context.Context, id string) (model.VMRecord, error) {
+ result, err := rpc.Call[api.VMShowResult](ctx, layout.SocketPath, method, api.VMRefParams{IDOrName: id})
+ if err != nil {
+ return model.VMRecord{}, err
+ }
+ return result.VM, nil
+ })
+ }
+ result, err := rpc.Call[api.VMShowResult](cmd.Context(), layout.SocketPath, method, api.VMRefParams{IDOrName: args[0]})
+ if err != nil {
+ return err
+ }
+ return printVMSummary(cmd.OutOrStdout(), result.VM)
+ },
+ }
+}
+
+func (d *deps) newVMDeleteCommand() *cobra.Command {
+ return &cobra.Command{
+ Use: "delete <vm>...",
+ Aliases: []string{"rm"},
+ Short: "Stop a VM and remove its 
disks (irreversible)", + Long: strings.TrimSpace(` +Stop the VM if it's running, then remove its work disk, system +overlay, snapshot, and metadata. Frees host disk space. The +operation is irreversible — anything written inside the guest is +lost. + +Use 'banger vm prune' to bulk-delete every VM that isn't running. +`), + Args: minArgsUsage(1, "usage: banger vm delete ..."), + ValidArgsFunction: d.completeVMNames, + RunE: func(cmd *cobra.Command, args []string) error { + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + deleteOne := func(ctx context.Context, id string) (model.VMRecord, error) { + result, err := rpc.Call[api.VMShowResult](ctx, layout.SocketPath, "vm.delete", api.VMRefParams{IDOrName: id}) + if err != nil { + return model.VMRecord{}, err + } + if err := removeUserKnownHosts(result.VM); err != nil { + _, _ = fmt.Fprintf(cmd.ErrOrStderr(), "known_hosts cleanup for %s: %v\n", id, err) + } + return result.VM, nil + } + if len(args) > 1 { + return runVMBatchAction(cmd, layout.SocketPath, args, deleteOne) + } + vm, err := deleteOne(cmd.Context(), args[0]) + if err != nil { + return err + } + return printVMSummary(cmd.OutOrStdout(), vm) + }, + } +} + +func (d *deps) newVMSetCommand() *cobra.Command { + var ( + vcpu int + memory int + diskSize string + nat bool + noNat bool + ) + cmd := &cobra.Command{ + Use: "set ...", + Short: "Update stopped VM settings", + Long: strings.TrimSpace(` +Reconfigure one or more stopped VMs. The VM must be stopped before +reconfiguring — start it again with 'banger vm start' to apply the new settings. 
+`), + Example: strings.TrimSpace(` + banger vm set dev --vcpu 4 --memory 8192 +`), + Args: minArgsUsage(1, "usage: banger vm set [--vcpu N] [--memory MiB] [--disk-size SIZE] [--nat|--no-nat] ..."), + ValidArgsFunction: d.completeVMNames, + RunE: func(cmd *cobra.Command, args []string) error { + params, err := vmSetParamsFromFlags(args[0], vcpu, memory, diskSize, nat, noNat) + if err != nil { + return err + } + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + if len(args) > 1 { + return runVMBatchAction(cmd, layout.SocketPath, args, func(ctx context.Context, id string) (model.VMRecord, error) { + batchParams := params + batchParams.IDOrName = id + result, err := rpc.Call[api.VMShowResult](ctx, layout.SocketPath, "vm.set", batchParams) + if err != nil { + return model.VMRecord{}, err + } + return result.VM, nil + }) + } + result, err := rpc.Call[api.VMShowResult](cmd.Context(), layout.SocketPath, "vm.set", params) + if err != nil { + return err + } + return printVMSummary(cmd.OutOrStdout(), result.VM) + }, + } + cmd.Flags().IntVar(&vcpu, "vcpu", -1, "vcpu count") + cmd.Flags().IntVar(&memory, "memory", -1, "memory in MiB") + cmd.Flags().StringVar(&diskSize, "disk-size", "", "new work disk size") + cmd.Flags().BoolVar(&nat, "nat", false, "enable NAT") + cmd.Flags().BoolVar(&noNat, "no-nat", false, "disable NAT") + return cmd +} + +func (d *deps) newVMSSHCommand() *cobra.Command { + return &cobra.Command{ + Use: "ssh [ssh args...]", + Short: "Open an interactive ssh session to a running VM", + Long: strings.TrimSpace(` +Connect to a running VM as root over the host bridge. Trailing +arguments are passed through to the underlying 'ssh' command, so +'-- -L 8080:localhost:8080' forwards a port and '-- echo hi' runs +a single command and exits. + +To run a one-shot command without holding a session, prefer +'banger vm run --rm -- ' over 'vm ssh -- '. 
+`), + Example: strings.TrimSpace(` + banger vm ssh agent + banger vm ssh agent -- uname -a + banger vm ssh agent -- -L 8080:localhost:8080 -N +`), + Args: minArgsUsage(1, "usage: banger vm ssh [ssh args...]"), + ValidArgsFunction: d.completeVMNameOnlyAtPos0, + RunE: func(cmd *cobra.Command, args []string) error { + layout, cfg, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + if err := validateSSHPrereqs(cfg); err != nil { + return err + } + result, err := d.vmSSH(cmd.Context(), layout.SocketPath, args[0]) + if err != nil { + return err + } + sshArgs, err := sshCommandArgs(cfg, result.GuestIP, args[1:]) + if err != nil { + return err + } + return d.runSSHSession(cmd.Context(), layout.SocketPath, result.Name, cmd.InOrStdin(), cmd.OutOrStdout(), cmd.ErrOrStderr(), sshArgs, false) + }, + } +} + +func (d *deps) newVMWorkspaceCommand() *cobra.Command { + cmd := &cobra.Command{ + Use: "workspace", + Short: "Ship a host repo into a guest, pull diffs back", + Long: strings.TrimSpace(` +Two-step pattern for round-tripping a working tree through a guest +VM: + + prepare Copy a local git repo into the guest at /root/repo + (or any path you choose). Default ships tracked files + only; pass --include-untracked to ship the rest. + export Capture every change inside the guest workspace as a + host-readable patch. Non-mutating: the guest's working + tree is left untouched. + +This is the supported flow for AI agents and CI runners that want +to evaluate code changes inside a sandbox without touching the +host checkout. 'banger vm run ./repo -- ' is shorthand for +prepare + run + delete. 
+`), + Example: strings.TrimSpace(` + banger vm workspace prepare agent ../repo + banger vm ssh agent -- bash -lc 'cd /root/repo && make test' + banger vm workspace export agent --base-commit > out.patch +`), + RunE: helpNoArgs, + } + cmd.AddCommand( + d.newVMWorkspacePrepareCommand(), + d.newVMWorkspaceExportCommand(), + ) + return cmd +} + +func (d *deps) newVMWorkspacePrepareCommand() *cobra.Command { + var guestPath string + var branchName string + var fromRef string + var mode string + var includeUntracked bool + var dryRun bool + cmd := &cobra.Command{ + Use: "prepare [path]", + Short: "Copy a local repo into a running VM", + Long: "Prepare a repository workspace from a local git checkout into a running VM. The default guest path is /root/repo and the default mode is shallow_overlay. Repositories with git submodules must use --mode full_copy.", + Args: minArgsUsage(1, "usage: banger vm workspace prepare [path]"), + ValidArgsFunction: d.completeVMNameOnlyAtPos0, + Example: strings.TrimSpace(` + banger vm workspace prepare devbox + banger vm workspace prepare devbox ../repo --guest-path /root/repo + banger vm workspace prepare devbox ../repo --mode full_copy +`), + RunE: func(cmd *cobra.Command, args []string) error { + sourcePath := "" + if len(args) > 1 { + sourcePath = args[1] + } + if strings.TrimSpace(sourcePath) == "" { + wd, err := d.cwd() + if err != nil { + return err + } + sourcePath = wd + } + resolvedPath, err := workspace.ResolveSourcePath(sourcePath) + if err != nil { + return err + } + prepareFrom := "" + if strings.TrimSpace(branchName) != "" { + prepareFrom = fromRef + } + if dryRun { + return d.runWorkspaceDryRun(cmd.Context(), cmd.OutOrStdout(), resolvedPath, branchName, prepareFrom, includeUntracked) + } + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + if !includeUntracked { + d.noteUntrackedSkipped(cmd.Context(), cmd.ErrOrStderr(), resolvedPath) + } + result, err := d.vmWorkspacePrepare(cmd.Context(), 
layout.SocketPath, api.VMWorkspacePrepareParams{ + IDOrName: args[0], + SourcePath: resolvedPath, + GuestPath: guestPath, + Branch: branchName, + From: prepareFrom, + Mode: mode, + IncludeUntracked: includeUntracked, + }) + if err != nil { + return err + } + return printJSON(cmd.OutOrStdout(), result.Workspace) + }, + } + cmd.Flags().StringVar(&guestPath, "guest-path", "/root/repo", "guest workspace path") + cmd.Flags().StringVar(&branchName, "branch", "", "create and switch to a new guest branch") + cmd.Flags().StringVar(&fromRef, "from", "HEAD", "git ref to branch from when --branch is set (default: HEAD)") + cmd.Flags().StringVar(&mode, "mode", string(model.WorkspacePrepareModeShallowOverlay), "workspace mode: shallow_overlay, full_copy, metadata_only") + cmd.Flags().BoolVar(&includeUntracked, "include-untracked", false, "also copy untracked non-ignored files into the guest workspace (default: tracked files only)") + cmd.Flags().BoolVar(&dryRun, "dry-run", false, "list the files that would be copied and exit without touching the guest") + return cmd +} + +func (d *deps) newVMWorkspaceExportCommand() *cobra.Command { + var guestPath string + var outputPath string + var baseCommit string + cmd := &cobra.Command{ + Use: "export ", + Short: "Pull changes from a guest workspace back to the host as a patch", + Long: "Emit a binary-safe unified diff of every change inside the guest workspace (committed since base + uncommitted + untracked, minus .gitignore). Non-mutating — the guest's index and working tree are untouched. Pass --base-commit with the head_commit from workspace prepare to capture changes even when the worker ran git commit inside the VM. 
Without --base-commit the diff is against the current guest HEAD, which misses committed changes.", + Args: exactArgsUsage(1, "usage: banger vm workspace export "), + ValidArgsFunction: d.completeVMNameOnlyAtPos0, + Example: strings.TrimSpace(` + banger vm workspace export devbox | git apply + banger vm workspace export devbox --base-commit abc1234 | git apply + banger vm workspace export devbox --output worker.diff + banger vm workspace export devbox --guest-path /root/project --output changes.diff +`), + RunE: func(cmd *cobra.Command, args []string) error { + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + result, err := d.vmWorkspaceExport(cmd.Context(), layout.SocketPath, api.WorkspaceExportParams{ + IDOrName: args[0], + GuestPath: guestPath, + BaseCommit: baseCommit, + }) + if err != nil { + return err + } + if !result.HasChanges { + _, _ = fmt.Fprintln(cmd.ErrOrStderr(), "no changes") + return nil + } + if outputPath != "" { + if err := os.WriteFile(outputPath, result.Patch, 0o644); err != nil { + return fmt.Errorf("write patch: %w", err) + } + _, err = fmt.Fprintf(cmd.ErrOrStderr(), "patch written to %s (%d bytes, %d files)\n", + outputPath, len(result.Patch), len(result.ChangedFiles)) + return err + } + _, err = cmd.OutOrStdout().Write(result.Patch) + return err + }, + } + cmd.Flags().StringVar(&guestPath, "guest-path", "/root/repo", "guest workspace path") + cmd.Flags().StringVar(&outputPath, "output", "", "write patch to this file instead of stdout") + cmd.Flags().StringVar(&baseCommit, "base-commit", "", "diff from this commit (use head_commit from workspace prepare to capture worker git commits)") + return cmd +} + +func (d *deps) newVMLogsCommand() *cobra.Command { + var follow bool + cmd := &cobra.Command{ + Use: "logs ", + Short: "Show guest console + per-VM daemon log", + Long: strings.TrimSpace(` +Print the firecracker console log (kernel + early init output) and +the per-VM daemon log (lifecycle stages, 
errors). Pass -f to follow +new lines as they arrive — useful while a VM is starting up or +hanging on boot. +`), + Example: strings.TrimSpace(` + banger vm logs agent + banger vm logs agent -f +`), + Args: exactArgsUsage(1, "usage: banger vm logs [-f] "), + ValidArgsFunction: d.completeVMNameOnlyAtPos0, + RunE: func(cmd *cobra.Command, args []string) error { + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + result, err := rpc.Call[api.VMLogsResult](cmd.Context(), layout.SocketPath, "vm.logs", api.VMRefParams{IDOrName: args[0]}) + if err != nil { + return err + } + if result.LogPath == "" { + return errors.New("vm has no log path") + } + return system.CopyStream(cmd.OutOrStdout(), system.TailCommand(result.LogPath, follow)) + }, + } + cmd.Flags().BoolVarP(&follow, "follow", "f", false, "follow logs") + return cmd +} + +func (d *deps) newVMStatsCommand() *cobra.Command { + return &cobra.Command{ + Use: "stats ", + Short: "Show VM stats", + Long: strings.TrimSpace(` +Print real-time resource statistics for a running VM as a JSON object, +including CPU usage, memory balloon metrics, and disk I/O counters. +Pipe into 'jq' for quick field extraction, e.g. banger vm stats dev | jq .mem. +`), + Example: strings.TrimSpace(` + banger vm stats dev + banger vm stats dev | jq . 
+`), + Args: exactArgsUsage(1, "usage: banger vm stats "), + ValidArgsFunction: d.completeVMNameOnlyAtPos0, + RunE: func(cmd *cobra.Command, args []string) error { + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + result, err := rpc.Call[api.VMStatsResult](cmd.Context(), layout.SocketPath, "vm.stats", api.VMRefParams{IDOrName: args[0]}) + if err != nil { + return err + } + return printJSON(cmd.OutOrStdout(), result) + }, + } +} + +func (d *deps) newVMPortsCommand() *cobra.Command { + var jsonOut bool + cmd := &cobra.Command{ + Use: "ports ", + Short: "Show host-reachable listening guest ports", + Args: exactArgsUsage(1, "usage: banger vm ports "), + ValidArgsFunction: d.completeVMNameOnlyAtPos0, + RunE: func(cmd *cobra.Command, args []string) error { + layout, _, err := d.ensureDaemon(cmd.Context()) + if err != nil { + return err + } + result, err := d.vmPorts(cmd.Context(), layout.SocketPath, args[0]) + if err != nil { + return err + } + if jsonOut { + return printJSON(cmd.OutOrStdout(), result) + } + return printVMPortsTable(cmd.OutOrStdout(), result) + }, + } + cmd.Flags().BoolVar(&jsonOut, "json", false, "print ports as JSON instead of a table") + return cmd +} + +type resolvedVMTarget struct { + Index int + Ref string + VM model.VMRecord +} + +type vmRefResolutionError struct { + Index int + Ref string + Err error +} + +type vmBatchActionResult struct { + Target resolvedVMTarget + VM model.VMRecord + Err error +} + +func runVMBatchAction(cmd *cobra.Command, socketPath string, refs []string, action func(context.Context, string) (model.VMRecord, error)) error { + listResult, err := rpc.Call[api.VMListResult](cmd.Context(), socketPath, "vm.list", api.Empty{}) + if err != nil { + return err + } + targets, resolutionErrs := resolveVMTargets(listResult.VMs, refs) + results := executeVMActionBatch(cmd.Context(), targets, action) + + failed := false + for _, resolutionErr := range resolutionErrs { + if _, err := 
fmt.Fprintf(cmd.ErrOrStderr(), "%s: %v\n", resolutionErr.Ref, resolutionErr.Err); err != nil { + return err + } + failed = true + } + for _, result := range results { + if result.Err != nil { + if _, err := fmt.Fprintf(cmd.ErrOrStderr(), "%s: %v\n", result.Target.Ref, result.Err); err != nil { + return err + } + failed = true + continue + } + if err := printVMSummary(cmd.OutOrStdout(), result.VM); err != nil { + return err + } + } + if failed { + return errors.New("one or more VM operations failed") + } + return nil +} + +func resolveVMTargets(vms []model.VMRecord, refs []string) ([]resolvedVMTarget, []vmRefResolutionError) { + targets := make([]resolvedVMTarget, 0, len(refs)) + resolutionErrs := make([]vmRefResolutionError, 0) + seen := make(map[string]struct{}, len(refs)) + for index, ref := range refs { + vm, err := resolveVMRef(vms, ref) + if err != nil { + resolutionErrs = append(resolutionErrs, vmRefResolutionError{Index: index, Ref: ref, Err: err}) + continue + } + if _, ok := seen[vm.ID]; ok { + continue + } + seen[vm.ID] = struct{}{} + targets = append(targets, resolvedVMTarget{Index: index, Ref: ref, VM: vm}) + } + return targets, resolutionErrs +} + +func resolveVMRef(vms []model.VMRecord, ref string) (model.VMRecord, error) { + ref = strings.TrimSpace(ref) + if ref == "" { + return model.VMRecord{}, errors.New("vm id or name is required") + } + exactMatches := make([]model.VMRecord, 0, 1) + for _, vm := range vms { + if vm.ID == ref || vm.Name == ref { + exactMatches = append(exactMatches, vm) + } + } + switch len(exactMatches) { + case 1: + return exactMatches[0], nil + case 0: + default: + return model.VMRecord{}, fmt.Errorf("multiple VMs match %q", ref) + } + + prefixMatches := make([]model.VMRecord, 0, 1) + for _, vm := range vms { + if strings.HasPrefix(vm.ID, ref) || strings.HasPrefix(vm.Name, ref) { + prefixMatches = append(prefixMatches, vm) + } + } + switch len(prefixMatches) { + case 1: + return prefixMatches[0], nil + case 0: + return 
model.VMRecord{}, fmt.Errorf("vm %q not found", ref) + default: + return model.VMRecord{}, fmt.Errorf("multiple VMs match %q", ref) + } +} + +func executeVMActionBatch(ctx context.Context, targets []resolvedVMTarget, action func(context.Context, string) (model.VMRecord, error)) []vmBatchActionResult { + results := make([]vmBatchActionResult, len(targets)) + var wg sync.WaitGroup + wg.Add(len(targets)) + for index, target := range targets { + index := index + target := target + go func() { + defer wg.Done() + vm, err := action(ctx, target.VM.ID) + results[index] = vmBatchActionResult{ + Target: target, + VM: vm, + Err: err, + } + }() + } + wg.Wait() + return results +} + +func vmSetParamsFromFlags(idOrName string, vcpu, memory int, diskSize string, nat, noNat bool) (api.VMSetParams, error) { + if nat && noNat { + return api.VMSetParams{}, errors.New("use only one of --nat or --no-nat") + } + params := api.VMSetParams{IDOrName: idOrName, WorkDiskSize: diskSize} + if vcpu >= 0 { + if err := validatePositiveSetting("vcpu", vcpu); err != nil { + return api.VMSetParams{}, err + } + params.VCPUCount = &vcpu + } + if memory >= 0 { + if err := validatePositiveSetting("memory", memory); err != nil { + return api.VMSetParams{}, err + } + params.MemoryMiB = &memory + } + if nat || noNat { + value := nat && !noNat + params.NATEnabled = &value + } + if params.VCPUCount == nil && params.MemoryMiB == nil && params.WorkDiskSize == "" && params.NATEnabled == nil { + return api.VMSetParams{}, errors.New("no VM settings changed") + } + return params, nil +} + +func vmCreateParamsFromFlags(cmd *cobra.Command, name, imageName string, vcpu, memory int, systemOverlaySize, workDiskSize string, natEnabled, noStart bool) (api.VMCreateParams, error) { + // Flag defaults were resolved from config + host heuristics at + // command-build time, so we always forward the flag values. 
The CLI + // becomes the single source of truth for effective defaults and the + // progress renderer shows the exact sizing. + if strings.TrimSpace(name) != "" { + if err := model.ValidateVMName(name); err != nil { + return api.VMCreateParams{}, err + } + } + if err := validatePositiveSetting("vcpu", vcpu); err != nil { + return api.VMCreateParams{}, err + } + if err := validatePositiveSetting("memory", memory); err != nil { + return api.VMCreateParams{}, err + } + params := api.VMCreateParams{ + Name: name, + ImageName: imageName, + NATEnabled: natEnabled, + NoStart: noStart, + VCPUCount: &vcpu, + MemoryMiB: &memory, + SystemOverlaySize: systemOverlaySize, + WorkDiskSize: workDiskSize, + } + return params, nil +} diff --git a/internal/cli/completion.go b/internal/cli/completion.go new file mode 100644 index 0000000..8bb4f8b --- /dev/null +++ b/internal/cli/completion.go @@ -0,0 +1,191 @@ +package cli + +import ( + "context" + + "banger/internal/api" + "banger/internal/paths" + "banger/internal/rpc" + + "github.com/spf13/cobra" +) + +// Completion helpers. Design notes: +// +// - Never auto-start the daemon. If it isn't running, return no +// suggestions + NoFileComp so the shell doesn't fall back to file +// completion (there are no local files that would plausibly match a +// VM or image name). +// - Filter out names already in args — avoids suggesting the same VM +// twice on variadic commands like `vm stop a b `. +// - Fail silently. Completion is advisory; any error path returns an +// empty suggestion list rather than propagating to the user. + +// defaultCompletionLister backs the *deps.completionLister field; +// tests inject their own fake via the struct instead of mutating +// package-level vars. 
+func defaultCompletionLister(ctx context.Context, socketPath, method string) ([]string, error) { + switch method { + case "vm.list": + result, err := rpc.Call[api.VMListResult](ctx, socketPath, method, api.Empty{}) + if err != nil { + return nil, err + } + names := make([]string, 0, len(result.VMs)) + for _, vm := range result.VMs { + if vm.Name != "" { + names = append(names, vm.Name) + } + } + return names, nil + case "image.list": + result, err := rpc.Call[api.ImageListResult](ctx, socketPath, method, api.Empty{}) + if err != nil { + return nil, err + } + names := make([]string, 0, len(result.Images)) + for _, image := range result.Images { + if image.Name != "" { + names = append(names, image.Name) + } + } + return names, nil + case "kernel.list": + result, err := rpc.Call[api.KernelListResult](ctx, socketPath, method, api.Empty{}) + if err != nil { + return nil, err + } + names := make([]string, 0, len(result.Entries)) + for _, entry := range result.Entries { + if entry.Name != "" { + names = append(names, entry.Name) + } + } + return names, nil + } + return nil, nil +} + +// daemonSocketForCompletion returns the socket path IFF the daemon is +// already running. Returns "", false when no daemon is up — completion +// callers use this as the bail signal. +func (d *deps) daemonSocketForCompletion(ctx context.Context) (string, bool) { + layout := paths.ResolveSystem() + if _, err := d.daemonPing(ctx, layout.SocketPath); err != nil { + return "", false + } + return layout.SocketPath, true +} + +// filterPrefix returns the subset of candidates starting with toComplete +// that aren't in exclude. Comparison is case-sensitive because VM/image +// names preserve case. 
+func filterPrefix(candidates, exclude []string, toComplete string) []string { + excludeSet := make(map[string]struct{}, len(exclude)) + for _, e := range exclude { + excludeSet[e] = struct{}{} + } + out := make([]string, 0, len(candidates)) + for _, c := range candidates { + if _, skip := excludeSet[c]; skip { + continue + } + if toComplete == "" || hasPrefix(c, toComplete) { + out = append(out, c) + } + } + return out +} + +func hasPrefix(s, prefix string) bool { + return len(s) >= len(prefix) && s[:len(prefix)] == prefix +} + +func (d *deps) completeVMNames(cmd *cobra.Command, args []string, toComplete string) ([]string, cobra.ShellCompDirective) { + socket, ok := d.daemonSocketForCompletion(cmd.Context()) + if !ok { + return nil, cobra.ShellCompDirectiveNoFileComp + } + names, err := d.completionLister(cmd.Context(), socket, "vm.list") + if err != nil { + return nil, cobra.ShellCompDirectiveNoFileComp + } + return filterPrefix(names, args, toComplete), cobra.ShellCompDirectiveNoFileComp +} + +// completeVMNameOnlyAtPos0 restricts VM-name completion to the first +// positional argument. Used by commands like `vm ssh [ssh args...]` +// where args after pos 0 are free-form. 
+func (d *deps) completeVMNameOnlyAtPos0(cmd *cobra.Command, args []string, toComplete string) ([]string, cobra.ShellCompDirective) { + if len(args) > 0 { + return nil, cobra.ShellCompDirectiveNoFileComp + } + return d.completeVMNames(cmd, args, toComplete) +} + +func (d *deps) completeImageNameOnlyAtPos0(cmd *cobra.Command, args []string, toComplete string) ([]string, cobra.ShellCompDirective) { + if len(args) > 0 { + return nil, cobra.ShellCompDirectiveNoFileComp + } + return d.completeImageNames(cmd, args, toComplete) +} + +func (d *deps) completeKernelNameOnlyAtPos0(cmd *cobra.Command, args []string, toComplete string) ([]string, cobra.ShellCompDirective) { + if len(args) > 0 { + return nil, cobra.ShellCompDirectiveNoFileComp + } + return d.completeKernelNames(cmd, args, toComplete) +} + +func (d *deps) completeImageNames(cmd *cobra.Command, args []string, toComplete string) ([]string, cobra.ShellCompDirective) { + socket, ok := d.daemonSocketForCompletion(cmd.Context()) + if !ok { + return nil, cobra.ShellCompDirectiveNoFileComp + } + names, err := d.completionLister(cmd.Context(), socket, "image.list") + if err != nil { + return nil, cobra.ShellCompDirectiveNoFileComp + } + return filterPrefix(names, args, toComplete), cobra.ShellCompDirectiveNoFileComp +} + +func (d *deps) completeKernelNames(cmd *cobra.Command, args []string, toComplete string) ([]string, cobra.ShellCompDirective) { + socket, ok := d.daemonSocketForCompletion(cmd.Context()) + if !ok { + return nil, cobra.ShellCompDirectiveNoFileComp + } + names, err := d.completionLister(cmd.Context(), socket, "kernel.list") + if err != nil { + return nil, cobra.ShellCompDirectiveNoFileComp + } + return filterPrefix(names, args, toComplete), cobra.ShellCompDirectiveNoFileComp +} + +// completeKernelCatalogNameOnlyAtPos0 completes kernel names from the +// remote catalog (pulled + available) at position 0 only. 
+func (d *deps) completeKernelCatalogNameOnlyAtPos0(cmd *cobra.Command, args []string, toComplete string) ([]string, cobra.ShellCompDirective) { + if len(args) > 0 { + return nil, cobra.ShellCompDirectiveNoFileComp + } + socket, ok := d.daemonSocketForCompletion(cmd.Context()) + if !ok { + return nil, cobra.ShellCompDirectiveNoFileComp + } + result, err := rpc.Call[api.KernelCatalogResult](cmd.Context(), socket, "kernel.catalog", api.Empty{}) + if err != nil { + return nil, cobra.ShellCompDirectiveNoFileComp + } + names := make([]string, 0, len(result.Entries)) + for _, entry := range result.Entries { + if entry.Name != "" { + names = append(names, entry.Name) + } + } + return filterPrefix(names, args, toComplete), cobra.ShellCompDirectiveNoFileComp +} + +// completeImageCatalogNameOnlyAtPos0 falls back to the locally-installed +// image list (there is no remote image catalog RPC today). +func (d *deps) completeImageCatalogNameOnlyAtPos0(cmd *cobra.Command, args []string, toComplete string) ([]string, cobra.ShellCompDirective) { + return d.completeImageNameOnlyAtPos0(cmd, args, toComplete) +} diff --git a/internal/cli/completion_test.go b/internal/cli/completion_test.go new file mode 100644 index 0000000..4c542c4 --- /dev/null +++ b/internal/cli/completion_test.go @@ -0,0 +1,175 @@ +package cli + +import ( + "context" + "errors" + "reflect" + "testing" + + "banger/internal/api" + + "github.com/spf13/cobra" +) + +// stubCompletionSeams installs test doubles for the daemon ping + lister +// seams on the caller's *deps. Tests opt into the sub-functions they +// actually need. 
+func stubCompletionSeams( + t *testing.T, + d *deps, + pingErr error, + names map[string][]string, + listErr error) { + t.Helper() + + d.daemonPing = func(ctx context.Context, socketPath string) (api.PingResult, error) { + if pingErr != nil { + return api.PingResult{}, pingErr + } + return api.PingResult{}, nil + } + d.completionLister = func(ctx context.Context, socketPath, method string) ([]string, error) { + if listErr != nil { + return nil, listErr + } + return names[method], nil + } +} + +func TestFilterPrefix(t *testing.T) { + cases := []struct { + name string + candidates []string + exclude []string + prefix string + want []string + }{ + {"no filter", []string{"a", "b"}, nil, "", []string{"a", "b"}}, + {"prefix match", []string{"apple", "banana", "apricot"}, nil, "ap", []string{"apple", "apricot"}}, + {"exclude already entered", []string{"a", "b", "c"}, []string{"b"}, "", []string{"a", "c"}}, + {"prefix + exclude", []string{"alpha", "avocado", "banana"}, []string{"alpha"}, "a", []string{"avocado"}}, + {"exact case sensitive", []string{"VM", "vm"}, nil, "v", []string{"vm"}}, + {"empty candidates", nil, nil, "any", nil}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + got := filterPrefix(tc.candidates, tc.exclude, tc.prefix) + if !reflect.DeepEqual(got, tc.want) { + // Allow nil == empty + if len(got) == 0 && len(tc.want) == 0 { + return + } + t.Errorf("got %v, want %v", got, tc.want) + } + }) + } +} + +func testCmdWithCtx() *cobra.Command { + cmd := &cobra.Command{Use: "test"} + cmd.SetContext(context.Background()) + return cmd +} + +func TestCompleteVMNamesHappyPath(t *testing.T) { + d := defaultDeps() + stubCompletionSeams(t, d, nil, map[string][]string{"vm.list": {"alpha", "beta", "gamma"}}, nil) + + got, directive := d.completeVMNames(testCmdWithCtx(), nil, "") + if directive != cobra.ShellCompDirectiveNoFileComp { + t.Errorf("directive = %d, want NoFileComp", directive) + } + if !reflect.DeepEqual(got, []string{"alpha", "beta", 
"gamma"}) { + t.Errorf("got %v", got) + } +} + +func TestCompleteVMNamesDaemonDown(t *testing.T) { + d := defaultDeps() + stubCompletionSeams(t, d, errors.New("connection refused"), nil, nil) + + got, directive := d.completeVMNames(testCmdWithCtx(), nil, "") + if len(got) != 0 { + t.Errorf("daemon-down should return no suggestions, got %v", got) + } + if directive != cobra.ShellCompDirectiveNoFileComp { + t.Errorf("directive = %d, want NoFileComp", directive) + } +} + +func TestCompleteVMNamesRPCError(t *testing.T) { + d := defaultDeps() + stubCompletionSeams(t, d, nil, nil, errors.New("rpc failed")) + + got, _ := d.completeVMNames(testCmdWithCtx(), nil, "") + if len(got) != 0 { + t.Errorf("rpc error should return no suggestions, got %v", got) + } +} + +func TestCompleteVMNamesExcludesAlreadyEntered(t *testing.T) { + d := defaultDeps() + stubCompletionSeams(t, d, nil, map[string][]string{"vm.list": {"alpha", "beta", "gamma"}}, nil) + + got, _ := d.completeVMNames(testCmdWithCtx(), []string{"alpha"}, "") + want := []string{"beta", "gamma"} + if !reflect.DeepEqual(got, want) { + t.Errorf("got %v, want %v", got, want) + } +} + +func TestCompleteVMNamesPrefixFilter(t *testing.T) { + d := defaultDeps() + stubCompletionSeams(t, d, nil, map[string][]string{"vm.list": {"alpha", "beta", "alphabet"}}, nil) + + got, _ := d.completeVMNames(testCmdWithCtx(), nil, "alp") + want := []string{"alpha", "alphabet"} + if !reflect.DeepEqual(got, want) { + t.Errorf("got %v, want %v", got, want) + } +} + +func TestCompleteVMNameOnlyAtPos0(t *testing.T) { + d := defaultDeps() + stubCompletionSeams(t, d, nil, map[string][]string{"vm.list": {"alpha"}}, nil) + + atPos0, _ := d.completeVMNameOnlyAtPos0(testCmdWithCtx(), nil, "") + if len(atPos0) != 1 || atPos0[0] != "alpha" { + t.Errorf("pos 0: got %v", atPos0) + } + + atPos1, _ := d.completeVMNameOnlyAtPos0(testCmdWithCtx(), []string{"alpha"}, "") + if len(atPos1) != 0 { + t.Errorf("pos 1+ should be silent, got %v", atPos1) + } +} + +func 
TestCompleteImageNames(t *testing.T) { + d := defaultDeps() + stubCompletionSeams(t, d, nil, map[string][]string{"image.list": {"debian-bookworm", "alpine"}}, nil) + + got, _ := d.completeImageNames(testCmdWithCtx(), nil, "") + if !reflect.DeepEqual(got, []string{"debian-bookworm", "alpine"}) { + t.Errorf("got %v", got) + } +} + +func TestCompleteKernelNames(t *testing.T) { + d := defaultDeps() + stubCompletionSeams(t, d, nil, map[string][]string{"kernel.list": {"generic-6.12"}}, nil) + + got, _ := d.completeKernelNames(testCmdWithCtx(), nil, "") + if len(got) != 1 || got[0] != "generic-6.12" { + t.Errorf("got %v", got) + } +} + +func TestCompleteImageNameOnlyAtPos0SilentAfterFirst(t *testing.T) { + d := defaultDeps() + stubCompletionSeams(t, d, nil, map[string][]string{"image.list": {"alpine"}}, nil) + + after, _ := d.completeImageNameOnlyAtPos0(testCmdWithCtx(), []string{"alpine"}, "") + if len(after) != 0 { + t.Errorf("expected silence at pos 1+, got %v", after) + } +} diff --git a/internal/cli/daemon_lifecycle.go b/internal/cli/daemon_lifecycle.go new file mode 100644 index 0000000..4c9f8c1 --- /dev/null +++ b/internal/cli/daemon_lifecycle.go @@ -0,0 +1,93 @@ +package cli + +import ( + "context" + "errors" + "fmt" + "os" + "strings" + "time" + + "banger/internal/config" + "banger/internal/installmeta" + "banger/internal/model" + "banger/internal/paths" +) + +const ( + daemonReadyTimeout = 15 * time.Second + daemonReadyPollInterval = 100 * time.Millisecond +) + +// waitForDaemonReady blocks until the daemon at socketPath answers +// ping, the context is cancelled, or daemonReadyTimeout elapses. +// Used by `system install` and `system restart` so they don't return +// before the daemon has actually finished binding its socket — the +// systemd Type=simple unit reports "active" the moment the binary +// is exec()'d, well before bangerd has read its config and listened +// on the unix socket. 
+func (d *deps) waitForDaemonReady(ctx context.Context, socketPath string) error {
+	deadline := time.Now().Add(daemonReadyTimeout)
+	pingCtx, cancel := context.WithDeadline(ctx, deadline)
+	defer cancel()
+	for {
+		if _, err := d.daemonPing(pingCtx, socketPath); err == nil {
+			return nil
+		}
+		if time.Now().After(deadline) {
+			return fmt.Errorf("daemon did not become ready at %s within %s", socketPath, daemonReadyTimeout)
+		}
+		select {
+		case <-pingCtx.Done():
+			return fmt.Errorf("daemon did not become ready at %s: %w", socketPath, pingCtx.Err())
+		case <-time.After(daemonReadyPollInterval):
+		}
+	}
+}
+
+var (
+	loadInstallMetadata = func() (installmeta.Metadata, error) {
+		return installmeta.Load(installmeta.DefaultPath)
+	}
+	currentUID = os.Getuid
+)
+
+// ensureDaemon validates that the current CLI user matches the
+// installed banger owner, then pings the system socket. Every CLI
+// command that needs to talk to the daemon routes through here.
+func (d *deps) ensureDaemon(ctx context.Context) (paths.Layout, model.DaemonConfig, error) {
+	meta, metaErr := loadInstallMetadata()
+	if metaErr == nil && currentUID() != meta.OwnerUID {
+		return paths.Layout{}, model.DaemonConfig{}, fmt.Errorf("banger is installed for %s; switch to that user or reinstall with `sudo banger system install --owner %s`", meta.OwnerUser, userHint())
+	}
+	if metaErr != nil && !errors.Is(metaErr, os.ErrNotExist) {
+		return paths.Layout{}, model.DaemonConfig{}, fmt.Errorf("load %s: %w", installmeta.DefaultPath, metaErr)
+	}
+
+	userLayout, err := paths.Resolve()
+	if err != nil {
+		return paths.Layout{}, model.DaemonConfig{}, err
+	}
+	cfg, err := config.Load(userLayout)
+	if err != nil {
+		return paths.Layout{}, model.DaemonConfig{}, err
+	}
+	layout := paths.ResolveSystem()
+	if _, err := d.daemonPing(ctx, layout.SocketPath); err == nil {
+		return layout, cfg, nil
+	}
+	if metaErr == nil {
+		return paths.Layout{}, model.DaemonConfig{}, fmt.Errorf("banger service not reachable at %s; run `sudo banger system restart`", layout.SocketPath)
+	}
+	return paths.Layout{}, model.DaemonConfig{}, fmt.Errorf("banger service not running at %s; run `sudo banger system install`", layout.SocketPath)
+}
+
+func userHint() string {
+	if sudoUser := strings.TrimSpace(os.Getenv("SUDO_USER")); sudoUser != "" {
+		return sudoUser
+	}
+	if user := strings.TrimSpace(os.Getenv("USER")); user != "" {
+		return user
+	}
+	return ""
+}
diff --git a/internal/cli/daemon_lifecycle_test.go b/internal/cli/daemon_lifecycle_test.go
new file mode 100644
index 0000000..f4c7779
--- /dev/null
+++ b/internal/cli/daemon_lifecycle_test.go
@@ -0,0 +1,227 @@
+package cli
+
+import (
+	"context"
+	"errors"
+	"os"
+	"path/filepath"
+	"strings"
+	"testing"
+
+	"banger/internal/api"
+	"banger/internal/installmeta"
+)
+
+func TestEnsureDaemonRequiresSystemInstallWhenMetadataMissing(t *testing.T) {
+	t.Setenv("XDG_CONFIG_HOME", filepath.Join(t.TempDir(), "config"))
+	t.Setenv("XDG_STATE_HOME", filepath.Join(t.TempDir(), "state"))
+	t.Setenv("XDG_CACHE_HOME", filepath.Join(t.TempDir(), "cache"))
+	t.Setenv("XDG_RUNTIME_DIR", filepath.Join(t.TempDir(), "run"))
+
+	restoreLoad := loadInstallMetadata
+	restoreUID := currentUID
+	t.Cleanup(func() {
+		loadInstallMetadata = restoreLoad
+		currentUID = restoreUID
+	})
+
+	loadInstallMetadata = func() (installmeta.Metadata, error) {
+		return installmeta.Metadata{}, os.ErrNotExist
+	}
+	currentUID = os.Getuid
+
+	d := defaultDeps()
+	d.daemonPing = func(context.Context, string) (api.PingResult, error) {
+		return api.PingResult{}, errors.New("dial unix /run/banger/bangerd.sock: no such file")
+	}
+
+	_, _, err := d.ensureDaemon(context.Background())
+	if err == nil || !strings.Contains(err.Error(), "sudo banger system install") {
+		t.Fatalf("ensureDaemon error = %v, want install guidance", err)
+	}
+}
+
+func TestEnsureDaemonSuggestsRestartWhenInstalledButUnavailable(t *testing.T) {
+	t.Setenv("XDG_CONFIG_HOME", filepath.Join(t.TempDir(), "config"))
t.Setenv("XDG_STATE_HOME", filepath.Join(t.TempDir(), "state")) + t.Setenv("XDG_CACHE_HOME", filepath.Join(t.TempDir(), "cache")) + t.Setenv("XDG_RUNTIME_DIR", filepath.Join(t.TempDir(), "run")) + + restoreLoad := loadInstallMetadata + restoreUID := currentUID + t.Cleanup(func() { + loadInstallMetadata = restoreLoad + currentUID = restoreUID + }) + + loadInstallMetadata = func() (installmeta.Metadata, error) { + return installmeta.Metadata{ + OwnerUser: "tester", + OwnerUID: os.Getuid(), + OwnerGID: os.Getgid(), + OwnerHome: t.TempDir(), + }, nil + } + currentUID = os.Getuid + + d := defaultDeps() + d.daemonPing = func(context.Context, string) (api.PingResult, error) { + return api.PingResult{}, errors.New("dial unix /run/banger/bangerd.sock: connection refused") + } + + _, _, err := d.ensureDaemon(context.Background()) + if err == nil || !strings.Contains(err.Error(), "sudo banger system restart") { + t.Fatalf("ensureDaemon error = %v, want restart guidance", err) + } +} + +func TestEnsureDaemonRejectsNonOwnerUser(t *testing.T) { + restoreLoad := loadInstallMetadata + restoreUID := currentUID + t.Cleanup(func() { + loadInstallMetadata = restoreLoad + currentUID = restoreUID + }) + + loadInstallMetadata = func() (installmeta.Metadata, error) { + return installmeta.Metadata{ + OwnerUser: "alice", + OwnerUID: os.Getuid() + 1, + OwnerGID: os.Getgid(), + OwnerHome: t.TempDir(), + }, nil + } + currentUID = os.Getuid + + d := defaultDeps() + d.daemonPing = func(context.Context, string) (api.PingResult, error) { + t.Fatal("daemonPing should not be called for a non-owner user") + return api.PingResult{}, nil + } + + _, _, err := d.ensureDaemon(context.Background()) + if err == nil || !strings.Contains(err.Error(), "installed for alice") { + t.Fatalf("ensureDaemon error = %v, want owner mismatch guidance", err) + } +} + +func TestSystemSubcommandFlagsAreScoped(t *testing.T) { + root := NewBangerCommand() + + systemCmd, _, err := root.Find([]string{"system"}) + if err != nil 
{ + t.Fatalf("find system: %v", err) + } + installCmd, _, err := systemCmd.Find([]string{"install"}) + if err != nil { + t.Fatalf("find system install: %v", err) + } + uninstallCmd, _, err := systemCmd.Find([]string{"uninstall"}) + if err != nil { + t.Fatalf("find system uninstall: %v", err) + } + if installCmd.Flags().Lookup("owner") == nil { + t.Fatal("system install is missing --owner") + } + if uninstallCmd.Flags().Lookup("purge") == nil { + t.Fatal("system uninstall is missing --purge") + } +} + +func TestRenderSystemdUnitIncludesHardeningDirectives(t *testing.T) { + unit := renderSystemdUnit(installmeta.Metadata{ + OwnerUser: "alice", + OwnerUID: 1000, + OwnerGID: 1000, + OwnerHome: "/home/alice/dev home", + }) + + for _, want := range []string{ + "ExecStart=/usr/local/bin/bangerd --system", + "User=alice", + "Wants=network-online.target bangerd-root.service", + "After=bangerd-root.service", + "Requires=bangerd-root.service", + "KillMode=process", + "UMask=0077", + "Environment=TMPDIR=/run/banger", + "NoNewPrivileges=yes", + "PrivateMounts=yes", + "ProtectSystem=strict", + "ProtectHome=read-only", + "ProtectControlGroups=yes", + "ProtectKernelLogs=yes", + "ProtectKernelModules=yes", + "ProtectClock=yes", + "ProtectHostname=yes", + "RestrictSUIDSGID=yes", + "LockPersonality=yes", + "SystemCallArchitectures=native", + "RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_NETLINK AF_VSOCK", + "StateDirectory=banger", + "StateDirectoryMode=0700", + "CacheDirectory=banger", + "CacheDirectoryMode=0700", + "RuntimeDirectory=banger", + "RuntimeDirectoryMode=0700", + "RuntimeDirectoryPreserve=yes", + `ReadOnlyPaths="/home/alice/dev home"`, + } { + if !strings.Contains(unit, want) { + t.Fatalf("unit = %q, want %q", unit, want) + } + } +} + +func TestRenderRootHelperSystemdUnitIncludesRequiredCapabilities(t *testing.T) { + unit := renderRootHelperSystemdUnit() + + for _, want := range []string{ + "ExecStart=/usr/local/bin/bangerd --root-helper", + // Both directives are 
load-bearing for "VM survives helper + // restart": KillMode=process limits the initial SIGTERM to + // the helper main, SendSIGKILL=no disables the SIGKILL + // escalation. The helper itself does the cgroup reparent + // (see roothelper.reparentToBangerFCCgroup) — without + // that, even these directives leave firecracker exposed to + // systemd's stop-time cleanup. + "KillMode=process", + "SendSIGKILL=no", + "Environment=TMPDIR=/run/banger-root", + "NoNewPrivileges=yes", + "PrivateTmp=yes", + "PrivateMounts=yes", + "ProtectSystem=strict", + "ProtectHome=yes", + "RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_NETLINK AF_VSOCK", + "CapabilityBoundingSet=CAP_CHOWN CAP_DAC_OVERRIDE CAP_FOWNER CAP_KILL CAP_MKNOD CAP_NET_ADMIN CAP_NET_RAW CAP_SETGID CAP_SETUID CAP_SYS_ADMIN CAP_SYS_CHROOT", + "ReadWritePaths=/var/lib/banger", + "RuntimeDirectory=banger-root", + "RuntimeDirectoryMode=0711", + "RuntimeDirectoryPreserve=yes", + } { + if !strings.Contains(unit, want) { + t.Fatalf("unit = %q, want %q", unit, want) + } + } +} + +func TestRenderSystemdUnitsIncludeOptionalCoverageEnv(t *testing.T) { + t.Setenv(systemCoverDirEnv, "/var/lib/banger") + t.Setenv(rootCoverDirEnv, "/var/lib/banger") + + userUnit := renderSystemdUnit(installmeta.Metadata{ + OwnerUser: "alice", + OwnerUID: 1000, + OwnerGID: 1000, + OwnerHome: "/home/alice", + }) + if !strings.Contains(userUnit, `Environment=GOCOVERDIR="/var/lib/banger"`) { + t.Fatalf("user unit = %q, want GOCOVERDIR env", userUnit) + } + + rootUnit := renderRootHelperSystemdUnit() + if !strings.Contains(rootUnit, `Environment=GOCOVERDIR="/var/lib/banger"`) { + t.Fatalf("root unit = %q, want GOCOVERDIR env", rootUnit) + } +} diff --git a/internal/cli/deps.go b/internal/cli/deps.go new file mode 100644 index 0000000..e2665ff --- /dev/null +++ b/internal/cli/deps.go @@ -0,0 +1,139 @@ +package cli + +import ( + "context" + "fmt" + "io" + "os" + "os/exec" + "path/filepath" + "strings" + "time" + + "banger/internal/api" + 
"banger/internal/daemon" + "banger/internal/daemon/workspace" + "banger/internal/guest" + "banger/internal/paths" + "banger/internal/rpc" + "banger/internal/system" + "banger/internal/toolingplan" +) + +// deps holds the function seams production code dispatches through and +// tests replace with fakes. Keeping these on a per-invocation struct +// (instead of package-level mutable vars) makes the CLI's external +// surface explicit and lets tests run in parallel without leaking fakes +// across test cases. +// +// Every command builder, orchestrator, and helper that touches the RPC +// socket, spawns a subprocess, or reads host state hangs off a *deps +// receiver. Pure helpers (formatters, path resolvers, arg-count +// validators) stay package-level because they hold no references to +// external systems. +type deps struct { + bangerdPath func() (string, error) + daemonExePath func(pid int) string + doctor func(ctx context.Context) (system.Report, error) + sshExec func(ctx context.Context, stdin io.Reader, stdout, stderr io.Writer, args []string) error + hostCommandOutput func(ctx context.Context, name string, args ...string) ([]byte, error) + vmHealth func(ctx context.Context, socketPath, idOrName string) (api.VMHealthResult, error) + vmSSH func(ctx context.Context, socketPath, idOrName string) (api.VMSSHResult, error) + vmDelete func(ctx context.Context, socketPath, idOrName string) error + vmList func(ctx context.Context, socketPath string) (api.VMListResult, error) + daemonPing func(ctx context.Context, socketPath string) (api.PingResult, error) + vmCreateBegin func(ctx context.Context, socketPath string, params api.VMCreateParams) (api.VMCreateBeginResult, error) + vmCreateStatus func(ctx context.Context, socketPath, operationID string) (api.VMCreateStatusResult, error) + vmCreateCancel func(ctx context.Context, socketPath, operationID string) error + vmPorts func(ctx context.Context, socketPath, idOrName string) (api.VMPortsResult, error) + 
+	vmWorkspacePrepare    func(ctx context.Context, socketPath string, params api.VMWorkspacePrepareParams) (api.VMWorkspacePrepareResult, error)
+	vmWorkspaceExport     func(ctx context.Context, socketPath string, params api.WorkspaceExportParams) (api.WorkspaceExportResult, error)
+	guestWaitForSSH       func(ctx context.Context, address, privateKeyPath string, interval time.Duration) error
+	guestDial             func(ctx context.Context, address, privateKeyPath string) (vmRunGuestClient, error)
+	buildVMRunToolingPlan func(ctx context.Context, repoRoot string) toolingplan.Plan
+	cwd                   func() (string, error)
+	completionLister      func(ctx context.Context, socketPath, method string) ([]string, error)
+	// repoInspector is the CLI's single workspace-package Inspector.
+	// Every code path that needs to shell out to git on the host
+	// (preflight, dry-run, untracked-count note) goes through it, so
+	// tests inject a stub Runner via this field instead of mutating a
+	// package global.
+	repoInspector *workspace.Inspector
+}
+
+func defaultDeps() *deps {
+	return &deps{
+		bangerdPath: paths.BangerdPath,
+		daemonExePath: func(pid int) string {
+			return filepath.Join("/proc", fmt.Sprintf("%d", pid), "exe")
+		},
+		doctor: daemon.Doctor,
+		sshExec: func(ctx context.Context, stdin io.Reader, stdout, stderr io.Writer, args []string) error {
+			sshCmd := exec.CommandContext(ctx, "ssh", args...)
+			sshCmd.Stdout = stdout
+			sshCmd.Stderr = stderr
+			sshCmd.Stdin = stdin
+			return sshCmd.Run()
+		},
+		hostCommandOutput: func(ctx context.Context, name string, args ...string) ([]byte, error) {
+			cmd := exec.CommandContext(ctx, name, args...)
+			output, err := cmd.CombinedOutput()
+			if err == nil {
+				return output, nil
+			}
+			command := strings.TrimSpace(strings.Join(append([]string{name}, args...), " "))
+			detail := strings.TrimSpace(string(output))
+			if detail == "" {
+				return output, fmt.Errorf("%s: %w", command, err)
+			}
+			return output, fmt.Errorf("%s: %w: %s", command, err, detail)
+		},
+		vmHealth: func(ctx context.Context, socketPath, idOrName string) (api.VMHealthResult, error) {
+			return rpc.Call[api.VMHealthResult](ctx, socketPath, "vm.health", api.VMRefParams{IDOrName: idOrName})
+		},
+		vmSSH: func(ctx context.Context, socketPath, idOrName string) (api.VMSSHResult, error) {
+			return rpc.Call[api.VMSSHResult](ctx, socketPath, "vm.ssh", api.VMRefParams{IDOrName: idOrName})
+		},
+		vmDelete: func(ctx context.Context, socketPath, idOrName string) error {
+			_, err := rpc.Call[api.VMShowResult](ctx, socketPath, "vm.delete", api.VMRefParams{IDOrName: idOrName})
+			return err
+		},
+		vmList: func(ctx context.Context, socketPath string) (api.VMListResult, error) {
+			return rpc.Call[api.VMListResult](ctx, socketPath, "vm.list", api.Empty{})
+		},
+		daemonPing: func(ctx context.Context, socketPath string) (api.PingResult, error) {
+			return rpc.Call[api.PingResult](ctx, socketPath, "ping", api.Empty{})
+		},
+		vmCreateBegin: func(ctx context.Context, socketPath string, params api.VMCreateParams) (api.VMCreateBeginResult, error) {
+			return rpc.Call[api.VMCreateBeginResult](ctx, socketPath, "vm.create.begin", params)
+		},
+		vmCreateStatus: func(ctx context.Context, socketPath, operationID string) (api.VMCreateStatusResult, error) {
+			return rpc.Call[api.VMCreateStatusResult](ctx, socketPath, "vm.create.status", api.VMCreateStatusParams{ID: operationID})
+		},
+		vmCreateCancel: func(ctx context.Context, socketPath, operationID string) error {
+			_, err := rpc.Call[api.Empty](ctx, socketPath, "vm.create.cancel", api.VMCreateStatusParams{ID: operationID})
+			return err
+		},
+		vmPorts: func(ctx context.Context, socketPath, idOrName string) (api.VMPortsResult, error) {
+			return rpc.Call[api.VMPortsResult](ctx, socketPath, "vm.ports", api.VMRefParams{IDOrName: idOrName})
+		},
+		vmWorkspacePrepare: func(ctx context.Context, socketPath string, params api.VMWorkspacePrepareParams) (api.VMWorkspacePrepareResult, error) {
+			return rpc.Call[api.VMWorkspacePrepareResult](ctx, socketPath, "vm.workspace.prepare", params)
+		},
+		vmWorkspaceExport: func(ctx context.Context, socketPath string, params api.WorkspaceExportParams) (api.WorkspaceExportResult, error) {
+			return rpc.Call[api.WorkspaceExportResult](ctx, socketPath, "vm.workspace.export", params)
+		},
+		guestWaitForSSH: func(ctx context.Context, address, privateKeyPath string, interval time.Duration) error {
+			knownHosts, _ := bangerKnownHostsPath()
+			return guest.WaitForSSH(ctx, address, privateKeyPath, knownHosts, interval)
+		},
+		guestDial: func(ctx context.Context, address, privateKeyPath string) (vmRunGuestClient, error) {
+			knownHosts, _ := bangerKnownHostsPath()
+			return guest.Dial(ctx, address, privateKeyPath, knownHosts)
+		},
+		buildVMRunToolingPlan: toolingplan.Build,
+		cwd:                   os.Getwd,
+		completionLister:      defaultCompletionLister,
+		repoInspector:         workspace.NewInspector(),
+	}
+}
diff --git a/internal/cli/errors.go b/internal/cli/errors.go
new file mode 100644
index 0000000..29355c1
--- /dev/null
+++ b/internal/cli/errors.go
@@ -0,0 +1,90 @@
+package cli
+
+import (
+	"errors"
+	"io"
+	"strings"
+
+	"banger/internal/cli/style"
+	"banger/internal/rpc"
+)
+
+// TranslateError is the public entry point used by cmd/banger/main.go
+// to render any error reaching the top of the cobra tree. Forwards
+// to the package-internal helper so tests can reach it directly.
+func TranslateError(w io.Writer, err error) string {
+	return translateRPCError(w, err)
+}
+
+// translateRPCError turns an error returned by rpc.Call into a
+// user-facing string. Known codes get short, friendly prefixes;
+// unknown codes pass through verbatim so debuggability is preserved.
+// When the daemon attached an op_id the helper appends it in parens
+// so an operator can paste it into journalctl --grep.
+//
+// Color is applied only when w is a TTY (and NO_COLOR is unset).
+// The returned string never includes a trailing newline — caller
+// chooses where it goes.
+func translateRPCError(w io.Writer, err error) string {
+	if err == nil {
+		return ""
+	}
+	var rpcErr *rpc.ErrorResponse
+	if !errors.As(err, &rpcErr) || rpcErr == nil {
+		// Non-RPC failures (dialing the socket, decode errors,
+		// context cancellation, ...) come through as plain Go
+		// errors. Surface them verbatim — they already mention
+		// the underlying cause clearly enough.
+		return err.Error()
+	}
+	prefix := errorCodePrefix(rpcErr.Code)
+	body := rpcErr.Message
+	if prefix != "" {
+		body = prefix + ": " + rpcErr.Message
+	} else if rpcErr.Message == "" {
+		// Defensive: a server that returned a code with no
+		// message still has SOMETHING to report; default to the
+		// raw code so we never print an empty error.
+		body = rpcErr.Code
+	}
+	if rpcErr.OpID != "" {
+		body = body + " (" + style.Dim(w, rpcErr.OpID) + ")"
+	}
+	return body
+}
+
+// errorCodePrefix maps the small set of codes the daemon emits to
+// short user-facing labels. Unrecognised codes are returned verbatim
+// so the raw code still prefixes the message — keeps the door open
+// for future codes the CLI hasn't been updated to recognise.
+//
+// "operation_failed" is the catch-all the generic dispatcher uses
+// when a service returned an error; the message is already self-
+// explanatory, so we strip the code entirely. Specialised codes
+// (not_found, already_exists, ...) keep a label because the
+// message body alone may not say what kind of failure it is.
+func errorCodePrefix(code string) string {
+	switch strings.TrimSpace(code) {
+	case "", "operation_failed":
+		return ""
+	case "not_found":
+		return "not found"
+	case "not_running":
+		return "not running"
+	case "already_exists":
+		return "already exists"
+	case "bad_request", "bad_params":
+		return "bad request"
+	case "bad_version":
+		return "version mismatch"
+	case "unauthorized":
+		return "unauthorized"
+	case "unknown_method":
+		return "unknown method"
+	default:
+		// Surface the raw code so an operator filing a bug has
+		// something concrete to grep for. The boilerplate
+		// "operation_failed" was stripped above; anything novel
+		// is kept verbatim.
+		return code
+	}
+}
diff --git a/internal/cli/errors_test.go b/internal/cli/errors_test.go
new file mode 100644
index 0000000..bdf7de1
--- /dev/null
+++ b/internal/cli/errors_test.go
@@ -0,0 +1,60 @@
+package cli
+
+import (
+	"bytes"
+	"errors"
+	"strings"
+	"testing"
+
+	"banger/internal/rpc"
+)
+
+// TestTranslateRPCError pins the user-facing error rendering for
+// every code the daemon emits today plus the catch-all unknown-code
+// path. Buffer is non-TTY so style helpers no-op and assertions
+// stay readable.
+func TestTranslateRPCError(t *testing.T) {
+	var buf bytes.Buffer
+	cases := []struct {
+		name   string
+		code   string
+		msg    string
+		opID   string
+		expect string
+	}{
+		{"operation_failed strips code", "operation_failed", "vm running", "", "vm running"},
+		{"empty code drops prefix", "", "raw boom", "", "raw boom"},
+		{"not_found", "not_found", `vm "x" not found`, "", `not found: vm "x" not found`},
+		{"not_running", "not_running", "vm is not running", "", "not running: vm is not running"},
+		{"already_exists", "already_exists", "image foo", "", "already exists: image foo"},
+		{"bad_request", "bad_request", "missing rootfs", "", "bad request: missing rootfs"},
+		{"bad_params", "bad_params", "invalid tap name", "", "bad request: invalid tap name"},
+		{"bad_version", "bad_version", "unsupported version 99", "", "version mismatch: unsupported version 99"},
+		{"unauthorized", "unauthorized", "uid 1000 not allowed", "", "unauthorized: uid 1000 not allowed"},
+		{"unknown_method", "unknown_method", "no.such.method", "", "unknown method: no.such.method"},
+		{"unknown code falls through", "weird_new_code", "boom", "", "weird_new_code: boom"},
+		{"op_id appended in parens", "operation_failed", "boom", "op-deadbeef00ff", "boom (op-deadbeef00ff)"},
+	}
+	for _, tc := range cases {
+		t.Run(tc.name, func(t *testing.T) {
+			err := &rpc.ErrorResponse{Code: tc.code, Message: tc.msg, OpID: tc.opID}
+			got := translateRPCError(&buf, err)
+			if got != tc.expect {
+				t.Errorf("got %q, want %q", got, tc.expect)
+			}
+		})
+	}
+}
+
+// TestTranslateRPCErrorPassesThroughNonRPCErrors covers the dial
+// failure / decode failure paths where rpc.Call returns a plain Go
+// error rather than *rpc.ErrorResponse. The translator must not
+// hide the original message — that's the only signal an operator
+// has when the daemon is down.
+func TestTranslateRPCErrorPassesThroughNonRPCErrors(t *testing.T) {
+	var buf bytes.Buffer
+	got := translateRPCError(&buf, errors.New("dial unix /run/banger/bangerd.sock: connect: no such file or directory"))
+	if !strings.Contains(got, "no such file or directory") {
+		t.Fatalf("plain error lost: got %q", got)
+	}
+}
diff --git a/internal/cli/formatters_test.go b/internal/cli/formatters_test.go
new file mode 100644
index 0000000..f712266
--- /dev/null
+++ b/internal/cli/formatters_test.go
@@ -0,0 +1,287 @@
+package cli
+
+import (
+	"bytes"
+	"errors"
+	"fmt"
+	"strings"
+	"testing"
+
+	"banger/internal/api"
+	"banger/internal/model"
+
+	"github.com/spf13/cobra"
+)
+
+func TestHumanSize(t *testing.T) {
+	cases := []struct {
+		bytes int64
+		want  string
+	}{
+		{-1, "-"},
+		{0, "-"},
+		{1, "1 B"},
+		{1023, "1023 B"},
+		{1024, "1.0 KiB"},
+		{2048, "2.0 KiB"},
+		{1024 * 1024, "1.0 MiB"},
+		{5 * 1024 * 1024, "5.0 MiB"},
+		{1024 * 1024 * 1024, "1.0 GiB"},
+		{3 * 1024 * 1024 * 1024, "3.0 GiB"},
+	}
+	for _, tc := range cases {
+		if got := humanSize(tc.bytes); got != tc.want {
+			t.Errorf("humanSize(%d) = %q, want %q", tc.bytes, got, tc.want)
+		}
+	}
+}
+
+func TestDashIfEmpty(t *testing.T) {
+	cases := map[string]string{
+		"":        "-",
+		" ":       "-",
+		"\t\n":    "-",
+		"value":   "value",
+		" hello ": " hello ",
+	}
+	for in, want := range cases {
+		if got := dashIfEmpty(in); got != want {
+			t.Errorf("dashIfEmpty(%q) = %q, want %q", in, got, want)
+		}
+	}
+}
+
+func TestExitCodeErrorError(t *testing.T) {
+	e := ExitCodeError{Code: 42}
+	got := e.Error()
+	if !strings.Contains(got, "42") {
+		t.Fatalf("error %q missing code", got)
+	}
+
+	var target ExitCodeError
+	if !errors.As(error(e), &target) {
+		t.Fatal("errors.As failed to match ExitCodeError")
+	}
+	if target.Code != 42 {
+		t.Fatalf("target.Code = %d, want 42", target.Code)
+	}
+}
+
+func TestShortID(t *testing.T) {
+	cases := map[string]string{
+		"":                     "",
+		"abc":                  "abc",
+		"0123456789ab":         "0123456789ab",
+		"0123456789abcd":       "0123456789ab",
+		"0123456789abcdefghij": "0123456789ab",
+	}
+	for in, want := range cases {
+		if got := shortID(in); got != want {
+			t.Errorf("shortID(%q) = %q, want %q", in, got, want)
+		}
+	}
+}
+
+func TestImageNameIndex(t *testing.T) {
+	images := []model.Image{
+		{ID: "id-a", Name: "alpha"},
+		{ID: "id-b", Name: "beta"},
+	}
+	idx := imageNameIndex(images)
+	if len(idx) != 2 {
+		t.Fatalf("len = %d, want 2", len(idx))
+	}
+	if idx["id-a"] != "alpha" || idx["id-b"] != "beta" {
+		t.Fatalf("unexpected index %v", idx)
+	}
+
+	empty := imageNameIndex(nil)
+	if empty == nil || len(empty) != 0 {
+		t.Fatalf("expected empty non-nil map, got %v", empty)
+	}
+}
+
+func TestHelpNoArgs(t *testing.T) {
+	called := false
+	cmd := &cobra.Command{
+		Use: "x",
+		RunE: func(cmd *cobra.Command, args []string) error {
+			called = true
+			return nil
+		},
+	}
+	cmd.SetOut(&bytes.Buffer{})
+	cmd.SetErr(&bytes.Buffer{})
+
+	if err := helpNoArgs(cmd, nil); err != nil {
+		t.Fatalf("helpNoArgs(nil): %v", err)
+	}
+	if called {
+		t.Fatal("helpNoArgs should not invoke Run")
+	}
+
+	if err := helpNoArgs(cmd, []string{"bogus"}); err == nil {
+		t.Fatal("expected error for unexpected args")
+	}
+}
+
+func TestArgsValidators(t *testing.T) {
+	cmd := &cobra.Command{Use: "x"}
+
+	exact := exactArgsUsage(2, "need exactly two")
+	if err := exact(cmd, []string{"a", "b"}); err != nil {
+		t.Fatalf("exact(2 args): %v", err)
+	}
+	if err := exact(cmd, []string{"a"}); err == nil {
+		t.Fatal("expected error for 1 arg with exactArgsUsage(2)")
+	}
+
+	minArgs := minArgsUsage(1, "need at least one")
+	if err := minArgs(cmd, []string{"a"}); err != nil {
+		t.Fatalf("min(1 arg): %v", err)
+	}
+	if err := minArgs(cmd, nil); err == nil {
+		t.Fatal("expected error for 0 args with minArgsUsage(1)")
+	}
+
+	maxArgs := maxArgsUsage(1, "at most one")
+	if err := maxArgs(cmd, []string{"a"}); err != nil {
+		t.Fatalf("max(1 arg): %v", err)
+	}
+	if err := maxArgs(cmd, []string{"a", "b"}); err == nil {
+		t.Fatal("expected error for 2 args with maxArgsUsage(1)")
+	}
+
+	noArgs := noArgsUsage("none allowed")
+	if err := noArgs(cmd, nil); err != nil {
+		t.Fatalf("no args: %v", err)
+	}
+	if err := noArgs(cmd, []string{"a"}); err == nil {
+		t.Fatal("expected error for args with noArgsUsage")
+	}
+}
+
+func TestPrintKernelListTable(t *testing.T) {
+	var buf bytes.Buffer
+	entries := []api.KernelEntry{
+		{Name: "generic-6.12", Distro: "debian", Arch: "x86_64", KernelVersion: "6.12", ImportedAt: "2026-01-01"},
+		{Name: "bare"},
+	}
+	if err := printKernelListTable(&buf, entries); err != nil {
+		t.Fatalf("printKernelListTable: %v", err)
+	}
+	got := buf.String()
+	for _, want := range []string{"NAME", "DISTRO", "generic-6.12", "bare"} {
+		if !strings.Contains(got, want) {
+			t.Errorf("output missing %q:\n%s", want, got)
+		}
+	}
+	// Empty fields render as "-".
+	if !strings.Contains(got, "-") {
+		t.Errorf("expected dash for empty fields, got:\n%s", got)
+	}
+}
+
+func TestPrintKernelCatalogTable(t *testing.T) {
+	var buf bytes.Buffer
+	entries := []api.KernelCatalogEntry{
+		{Name: "generic-6.12", Arch: "x86_64", KernelVersion: "6.12", SizeBytes: 2 * 1024 * 1024, Pulled: true},
+		{Name: "new-kernel", SizeBytes: 0, Pulled: false},
+	}
+	if err := printKernelCatalogTable(&buf, entries); err != nil {
+		t.Fatalf("printKernelCatalogTable: %v", err)
+	}
+	got := buf.String()
+	for _, want := range []string{"generic-6.12", "pulled", "available", "new-kernel"} {
+		if !strings.Contains(got, want) {
+			t.Errorf("output missing %q:\n%s", want, got)
+		}
+	}
+	if !strings.Contains(got, "2.0 MiB") {
+		t.Errorf("expected humanSize(2 MiB), got:\n%s", got)
+	}
+}
+
+func TestPrintJSON(t *testing.T) {
+	var buf bytes.Buffer
+	if err := printJSON(&buf, map[string]int{"a": 1, "b": 2}); err != nil {
+		t.Fatalf("printJSON: %v", err)
+	}
+	got := buf.String()
+	if !strings.Contains(got, `"a": 1`) || !strings.Contains(got, `"b": 2`) {
+		t.Errorf("unexpected JSON output:\n%s", got)
+	}
+	if !strings.HasSuffix(got, "\n") {
+		t.Error("printJSON should terminate with newline")
+	}
+}
+
+func TestPrintJSONUnmarshalableValue(t *testing.T) {
+	var buf bytes.Buffer
+	// Channels are not JSON-marshalable.
+	err := printJSON(&buf, make(chan int))
+	if err == nil {
+		t.Fatal("expected error for unmarshalable value")
+	}
+}
+
+func TestPrintVMSummary(t *testing.T) {
+	var buf bytes.Buffer
+	vm := model.VMRecord{
+		ID:    "0123456789abcdef",
+		Name:  "demo",
+		State: model.VMStateRunning,
+	}
+	vm.Runtime.GuestIP = "172.16.0.5"
+	vm.Runtime.DNSName = "demo.vm"
+	vm.Spec.WorkDiskSizeBytes = 0
+	if err := printVMSummary(&buf, vm); err != nil {
+		t.Fatalf("printVMSummary: %v", err)
+	}
+	got := buf.String()
+	for _, want := range []string{"0123456789ab", "demo", "172.16.0.5", "demo.vm"} {
+		if !strings.Contains(got, want) {
+			t.Errorf("summary missing %q:\n%s", want, got)
+		}
+	}
+}
+
+func TestPrintImageSummary(t *testing.T) {
+	var buf bytes.Buffer
+	img := model.Image{ID: "img-id", Name: "debian-bookworm", Managed: true, RootfsPath: "/var/rootfs.ext4"}
+	if err := printImageSummary(&buf, img); err != nil {
+		t.Fatalf("printImageSummary: %v", err)
+	}
+	got := buf.String()
+	for _, want := range []string{"debian-bookworm", "true", "/var/rootfs.ext4"} {
+		if !strings.Contains(got, want) {
+			t.Errorf("summary missing %q:\n%s", want, got)
+		}
+	}
+}
+
+func TestVMImageLabel(t *testing.T) {
+	names := map[string]string{"img-1": "debian"}
+	if got := vmImageLabel("img-1", names); got != "debian" {
+		t.Errorf("got %q, want debian", got)
+	}
+	if got := vmImageLabel("img-2", names); got != "img-2" {
+		t.Errorf("fallback: got %q, want img-2", got)
+	}
+}
+
+// failWriter lets us exercise io-error branches of the printers.
+type failWriter struct{}
+
+func (failWriter) Write([]byte) (int, error) { return 0, fmt.Errorf("boom") }
+
+func TestPrintersPropagateWriteErrors(t *testing.T) {
+	kernels := []api.KernelEntry{{Name: "k"}}
+	if err := printKernelListTable(failWriter{}, kernels); err == nil {
+		t.Error("expected write error from printKernelListTable")
+	}
+	catalog := []api.KernelCatalogEntry{{Name: "k"}}
+	if err := printKernelCatalogTable(failWriter{}, catalog); err == nil {
+		t.Error("expected write error from printKernelCatalogTable")
+	}
+}
diff --git a/internal/cli/known_hosts.go b/internal/cli/known_hosts.go
new file mode 100644
index 0000000..806e3ad
--- /dev/null
+++ b/internal/cli/known_hosts.go
@@ -0,0 +1,26 @@
+package cli
+
+import (
+	"strings"
+
+	"banger/internal/guest"
+	"banger/internal/model"
+)
+
+func removeUserKnownHosts(vm model.VMRecord) error {
+	knownHostsPath, err := bangerKnownHostsPath()
+	if err != nil {
+		return err
+	}
+	var hosts []string
+	if ip := strings.TrimSpace(vm.Runtime.GuestIP); ip != "" {
+		hosts = append(hosts, ip)
+	}
+	if dns := strings.TrimSpace(vm.Runtime.DNSName); dns != "" {
+		hosts = append(hosts, dns)
+	}
+	if len(hosts) == 0 {
+		return nil
+	}
+	return guest.RemoveKnownHosts(knownHostsPath, hosts...)
+}
diff --git a/internal/cli/make_bundle_test.go b/internal/cli/make_bundle_test.go
new file mode 100644
index 0000000..fdce359
--- /dev/null
+++ b/internal/cli/make_bundle_test.go
@@ -0,0 +1,320 @@
+package cli
+
+import (
+	"archive/tar"
+	"bytes"
+	"context"
+	"crypto/sha256"
+	"encoding/hex"
+	"encoding/json"
+	"io"
+	"os"
+	"os/exec"
+	"path/filepath"
+	"strings"
+	"testing"
+
+	"banger/internal/imagecat"
+
+	"github.com/klauspost/compress/zstd"
+)
+
+func TestInternalMakeBundleFlagsExist(t *testing.T) {
+	root := NewBangerCommand()
+	internal, _, err := root.Find([]string{"internal"})
+	if err != nil {
+		t.Fatalf("find internal: %v", err)
+	}
+	mk, _, err := internal.Find([]string{"make-bundle"})
+	if err != nil {
+		t.Fatalf("find make-bundle: %v", err)
+	}
+	for _, name := range []string{"rootfs-tar", "name", "distro", "arch", "kernel-ref", "description", "size", "out"} {
+		if mk.Flags().Lookup(name) == nil {
+			t.Errorf("missing flag %q", name)
+		}
+	}
+}
+
+func TestMakeBundleRequiresName(t *testing.T) {
+	cmd := NewBangerCommand()
+	cmd.SetArgs([]string{"internal", "make-bundle", "--rootfs-tar", "some.tar", "--out", "out.tar.zst"})
+	cmd.SetOut(&bytes.Buffer{})
+	cmd.SetErr(&bytes.Buffer{})
+	err := cmd.Execute()
+	if err == nil || !strings.Contains(err.Error(), "image name is required") {
+		t.Fatalf("execute error = %v, want image-name-required", err)
+	}
+}
+
+func TestMakeBundleRequiresRootfsTar(t *testing.T) {
+	cmd := NewBangerCommand()
+	cmd.SetArgs([]string{"internal", "make-bundle", "--name", "x", "--out", "out.tar.zst"})
+	cmd.SetOut(&bytes.Buffer{})
+	cmd.SetErr(&bytes.Buffer{})
+	err := cmd.Execute()
+	if err == nil || !strings.Contains(err.Error(), "--rootfs-tar is required") {
+		t.Fatalf("execute error = %v, want --rootfs-tar required", err)
+	}
+}
+
+func TestMakeBundleRequiresOut(t *testing.T) {
+	cmd := NewBangerCommand()
+	cmd.SetArgs([]string{"internal", "make-bundle", "--name", "x", "--rootfs-tar", "-"})
+	cmd.SetOut(&bytes.Buffer{})
cmd.SetErr(&bytes.Buffer{}) + err := cmd.Execute() + if err == nil || !strings.Contains(err.Error(), "--out is required") { + t.Fatalf("execute error = %v, want --out required", err) + } +} + +func TestWriteBundleTarZstRoundTrip(t *testing.T) { + stage := t.TempDir() + rootfsContent := []byte("fake-rootfs-bytes") + rootfsPath := filepath.Join(stage, "rootfs.ext4") + if err := os.WriteFile(rootfsPath, rootfsContent, 0o644); err != nil { + t.Fatal(err) + } + manifest := imagecat.Manifest{Name: "debian-bookworm", Distro: "debian"} + manifestJSON, _ := json.Marshal(manifest) + manifestPath := filepath.Join(stage, "manifest.json") + if err := os.WriteFile(manifestPath, manifestJSON, 0o644); err != nil { + t.Fatal(err) + } + + bundlePath := filepath.Join(stage, "bundle.tar.zst") + if err := writeBundleTarZst(bundlePath, rootfsPath, manifestPath); err != nil { + t.Fatalf("writeBundleTarZst: %v", err) + } + + // Decode and verify. + raw, err := os.Open(bundlePath) + if err != nil { + t.Fatal(err) + } + t.Cleanup(func() { raw.Close() }) + zr, err := zstd.NewReader(raw) + if err != nil { + t.Fatal(err) + } + tr := tar.NewReader(zr) + got := map[string][]byte{} + for { + hdr, err := tr.Next() + if err == io.EOF { + break + } + if err != nil { + t.Fatal(err) + } + b, _ := io.ReadAll(tr) + got[hdr.Name] = b + } + if !bytes.Equal(got[imagecat.RootfsFilename], rootfsContent) { + t.Errorf("rootfs mismatch: got %q want %q", got[imagecat.RootfsFilename], rootfsContent) + } + if !bytes.Equal(got[imagecat.ManifestFilename], manifestJSON) { + t.Errorf("manifest mismatch: got %q want %q", got[imagecat.ManifestFilename], manifestJSON) + } +} + +func TestSha256HexFile(t *testing.T) { + dir := t.TempDir() + content := []byte("hello world") + p := filepath.Join(dir, "f") + if err := os.WriteFile(p, content, 0o644); err != nil { + t.Fatal(err) + } + got, err := sha256HexFile(p) + if err != nil { + t.Fatal(err) + } + expected := sha256.Sum256(content) + if got != 
hex.EncodeToString(expected[:]) { + t.Fatalf("sha256 = %q, want %q", got, hex.EncodeToString(expected[:])) + } +} + +func TestDirSize(t *testing.T) { + dir := t.TempDir() + _ = os.MkdirAll(filepath.Join(dir, "sub"), 0o755) + _ = os.WriteFile(filepath.Join(dir, "a"), []byte("abc"), 0o644) // 3 + _ = os.WriteFile(filepath.Join(dir, "sub", "b"), []byte("defgh"), 0o644) // 5 + // Symlinks must not be counted. + _ = os.Symlink(filepath.Join(dir, "a"), filepath.Join(dir, "link")) + n, err := dirSize(dir) + if err != nil { + t.Fatal(err) + } + if n != 8 { + t.Fatalf("dirSize = %d, want 8", n) + } +} + +// TestMakeBundleEndToEnd exercises the full pipeline against a tiny +// synthesized rootfs tar. Skips if any external tool (mkfs.ext4 / +// debugfs) or the companion banger-vsock-agent binary is unavailable. +func TestMakeBundleEndToEnd(t *testing.T) { + if _, err := exec.LookPath("mkfs.ext4"); err != nil { + t.Skip("mkfs.ext4 not installed") + } + if _, err := exec.LookPath("debugfs"); err != nil { + t.Skip("debugfs not installed") + } + // Locate the prebuilt companion binary; skip if the build tree doesn't have one. + buildDir := findBuildBinDir(t) + if buildDir == "" { + t.Skip("build/bin not found; run `make build` to enable this test") + } + if _, err := os.Stat(filepath.Join(buildDir, "banger-vsock-agent")); err != nil { + t.Skip("banger-vsock-agent not in build/bin; run `make build`") + } + // Ensure the banger binary also exists so CompanionBinaryPath + // resolves (it looks alongside the banger binary). + if _, err := os.Stat(filepath.Join(buildDir, "banger")); err != nil { + t.Skip("banger not in build/bin; run `make build`") + } + + // Build a minimal rootfs tar: just /etc/os-release and /tmp (a dir). + dir := t.TempDir() + tarPath := filepath.Join(dir, "rootfs.tar") + if err := writeMinimalTar(tarPath); err != nil { + t.Fatal(err) + } + outPath := filepath.Join(dir, "bundle.tar.zst") + + // Invoke via the cobra command to cover arg handling too. 
+ cmd := NewBangerCommand() + cmd.SetArgs([]string{ + "internal", "make-bundle", + "--rootfs-tar", tarPath, + "--name", "test-bundle", + "--distro", "debian", + "--arch", "x86_64", + "--kernel-ref", "generic-6.12", + "--size", "64M", + "--out", outPath, + }) + var stderr bytes.Buffer + cmd.SetOut(&bytes.Buffer{}) + cmd.SetErr(&stderr) + // paths.CompanionBinaryPath looks alongside the banger binary, but + // the test binary lives elsewhere. Use the env override instead. + t.Setenv("BANGER_VSOCK_AGENT_BIN", filepath.Join(buildDir, "banger-vsock-agent")) + cmd.SetContext(context.Background()) + if err := cmd.Execute(); err != nil { + t.Fatalf("execute: %v\nstderr:\n%s", err, stderr.String()) + } + + if stat, err := os.Stat(outPath); err != nil { + t.Fatalf("output not written: %v", err) + } else if stat.Size() < 1024 { + t.Fatalf("output suspiciously small: %d bytes", stat.Size()) + } + + // Verify we can decode and re-parse it (mirroring imagecat.Fetch + // logic, but reading straight from disk instead of over HTTP). + extractDir := t.TempDir() + verifyBundle(t, outPath, extractDir) +} + +// findBuildBinDir returns the absolute path to the project's build/bin, +// or "" if it can't be located. Walks up from CWD to find go.mod. 
+func findBuildBinDir(t *testing.T) string { + t.Helper() + cwd, err := os.Getwd() + if err != nil { + return "" + } + for d := cwd; d != "/" && d != "."; d = filepath.Dir(d) { + if _, err := os.Stat(filepath.Join(d, "go.mod")); err == nil { + return filepath.Join(d, "build", "bin") + } + } + return "" +} + +func writeMinimalTar(path string) error { + f, err := os.Create(path) + if err != nil { + return err + } + defer f.Close() + tw := tar.NewWriter(f) + defer tw.Close() + + // /etc dir + if err := tw.WriteHeader(&tar.Header{ + Name: "etc/", Typeflag: tar.TypeDir, Mode: 0o755, Uid: 0, Gid: 0, + }); err != nil { + return err + } + // /etc/os-release + body := []byte(`ID=debian` + "\n" + `PRETTY_NAME="banger test"` + "\n") + if err := tw.WriteHeader(&tar.Header{ + Name: "etc/os-release", Typeflag: tar.TypeReg, Mode: 0o644, + Size: int64(len(body)), Uid: 0, Gid: 0, + }); err != nil { + return err + } + if _, err := tw.Write(body); err != nil { + return err + } + // /tmp dir + return tw.WriteHeader(&tar.Header{ + Name: "tmp/", Typeflag: tar.TypeDir, Mode: 0o1777, Uid: 0, Gid: 0, + }) +} + +func verifyBundle(t *testing.T, bundlePath, extractDir string) { + t.Helper() + f, err := os.Open(bundlePath) + if err != nil { + t.Fatal(err) + } + defer f.Close() + zr, err := zstd.NewReader(f) + if err != nil { + t.Fatal(err) + } + defer zr.Close() + tr := tar.NewReader(zr) + seen := map[string]bool{} + for { + hdr, err := tr.Next() + if err == io.EOF { + break + } + if err != nil { + t.Fatal(err) + } + dst := filepath.Join(extractDir, hdr.Name) + if err := os.MkdirAll(filepath.Dir(dst), 0o755); err != nil { + t.Fatal(err) + } + out, err := os.Create(dst) + if err != nil { + t.Fatal(err) + } + if _, err := io.Copy(out, tr); err != nil { + t.Fatal(err) + } + out.Close() + seen[hdr.Name] = true + } + if !seen[imagecat.RootfsFilename] || !seen[imagecat.ManifestFilename] { + t.Fatalf("bundle missing expected files: seen=%v", seen) + } + manifestData, err := 
os.ReadFile(filepath.Join(extractDir, imagecat.ManifestFilename)) + if err != nil { + t.Fatal(err) + } + var m imagecat.Manifest + if err := json.Unmarshal(manifestData, &m); err != nil { + t.Fatal(err) + } + if m.Name != "test-bundle" || m.KernelRef != "generic-6.12" || m.Distro != "debian" { + t.Fatalf("manifest = %+v", m) + } +} diff --git a/internal/cli/printers.go b/internal/cli/printers.go new file mode 100644 index 0000000..afedbc8 --- /dev/null +++ b/internal/cli/printers.go @@ -0,0 +1,338 @@ +package cli + +import ( + "encoding/json" + "fmt" + "io" + "os" + "sort" + "strings" + "text/tabwriter" + + "banger/internal/api" + "banger/internal/cli/style" + "banger/internal/model" + "banger/internal/system" +) + +// anyWriter is the minimal writer surface every printer needs. Split +// out from io.Writer because some of our callers already hold a +// tabwriter/bytes.Buffer by value. +type anyWriter interface { + Write(p []byte) (n int, err error) +} + +// -- small helpers -------------------------------------------------- + +func humanSize(bytes int64) string { + if bytes <= 0 { + return "-" + } + const ( + kib = 1024 + mib = 1024 * kib + gib = 1024 * mib + ) + switch { + case bytes >= gib: + return fmt.Sprintf("%.1f GiB", float64(bytes)/float64(gib)) + case bytes >= mib: + return fmt.Sprintf("%.1f MiB", float64(bytes)/float64(mib)) + case bytes >= kib: + return fmt.Sprintf("%.1f KiB", float64(bytes)/float64(kib)) + default: + return fmt.Sprintf("%d B", bytes) + } +} + +func dashIfEmpty(s string) string { + if strings.TrimSpace(s) == "" { + return "-" + } + return s +} + +// -- generic printers ----------------------------------------------- + +func printJSON(out anyWriter, v any) error { + data, err := json.MarshalIndent(v, "", " ") + if err != nil { + return err + } + _, err = fmt.Fprintln(out, string(data)) + return err +} + +// -- VM printers ---------------------------------------------------- + +func printVMSummary(out anyWriter, vm model.VMRecord) error { 
+ _, err := fmt.Fprintf( + out, + "%s\t%s\t%s\t%s\t%s\t%s\n", + shortID(vm.ID), + vm.Name, + vm.State, + vm.Runtime.GuestIP, + model.FormatSizeBytes(vm.Spec.WorkDiskSizeBytes), + vm.Runtime.DNSName, + ) + return err +} + +func printVMIDList(out anyWriter, vms []model.VMRecord) error { + for _, vm := range vms { + if _, err := fmt.Fprintln(out, vm.ID); err != nil { + return err + } + } + return nil +} + +func printVMListTable(out anyWriter, vms []model.VMRecord, imageNames map[string]string) error { + w := tabwriter.NewWriter(out, 0, 8, 2, ' ', 0) + if _, err := fmt.Fprintln(w, "ID\tNAME\tSTATE\tIMAGE\tIP\tVCPU\tMEM\tDISK\tWORKSPACE\tCREATED"); err != nil { + return err + } + for _, vm := range vms { + if _, err := fmt.Fprintf( + w, + "%s\t%s\t%s\t%s\t%s\t%d\t%d MiB\t%s\t%s\t%s\n", + shortID(vm.ID), + vm.Name, + vm.State, + vmImageLabel(vm.ImageID, imageNames), + vm.Runtime.GuestIP, + vm.Spec.VCPUCount, + vm.Spec.MemoryMiB, + model.FormatSizeBytes(vm.Spec.WorkDiskSizeBytes), + dashIfEmpty(vm.Workspace.GuestPath), + relativeTime(vm.CreatedAt), + ); err != nil { + return err + } + } + return w.Flush() +} + +func printVMPortsTable(out anyWriter, result api.VMPortsResult) error { + type portRow struct { + Proto string + Endpoint string + Process string + Command string + Port int + } + rows := make([]portRow, 0, len(result.Ports)) + for _, port := range result.Ports { + rows = append(rows, portRow{ + Proto: port.Proto, + Endpoint: port.Endpoint, + Process: port.Process, + Command: port.Command, + Port: port.Port, + }) + } + sort.Slice(rows, func(i, j int) bool { + if rows[i].Proto != rows[j].Proto { + return rows[i].Proto < rows[j].Proto + } + if rows[i].Port != rows[j].Port { + return rows[i].Port < rows[j].Port + } + if rows[i].Process != rows[j].Process { + return rows[i].Process < rows[j].Process + } + return rows[i].Command < rows[j].Command + }) + if len(rows) == 0 { + return nil + } + + w := tabwriter.NewWriter(out, 0, 8, 2, ' ', 0) + if _, err := fmt.Fprintln(w, 
"PROTO\tENDPOINT\tPROCESS\tCOMMAND"); err != nil { + return err + } + for _, row := range rows { + if _, err := fmt.Fprintf( + w, + "%s\t%s\t%s\t%s\n", + row.Proto, + dashIfEmpty(row.Endpoint), + dashIfEmpty(row.Process), + dashIfEmpty(row.Command), + ); err != nil { + return err + } + } + return w.Flush() +} + +// -- image printers ------------------------------------------------- + +func printImageSummary(out anyWriter, image model.Image) error { + _, err := fmt.Fprintf(out, "%s\t%s\t%t\t%s\n", shortID(image.ID), image.Name, image.Managed, image.RootfsPath) + return err +} + +func imageNameIndex(images []model.Image) map[string]string { + index := make(map[string]string, len(images)) + for _, image := range images { + index[image.ID] = image.Name + } + return index +} + +func vmImageLabel(imageID string, imageNames map[string]string) string { + if name := strings.TrimSpace(imageNames[imageID]); name != "" { + return name + } + return shortID(imageID) +} + +func printImageListTable(out anyWriter, images []model.Image) error { + w := tabwriter.NewWriter(out, 0, 8, 2, ' ', 0) + if _, err := fmt.Fprintln(w, "ID\tNAME\tMANAGED\tROOTFS SIZE\tCREATED"); err != nil { + return err + } + for _, image := range images { + if _, err := fmt.Fprintf( + w, + "%s\t%s\t%t\t%s\t%s\n", + shortID(image.ID), + image.Name, + image.Managed, + rootfsSizeLabel(image.RootfsPath), + relativeTime(image.CreatedAt), + ); err != nil { + return err + } + } + return w.Flush() +} + +func rootfsSizeLabel(path string) string { + info, err := os.Stat(path) + if err != nil { + return "-" + } + if info.Size() <= 0 { + return "0" + } + return model.FormatSizeBytes(info.Size()) +} + +// -- kernel printers ------------------------------------------------ + +func printKernelListTable(out anyWriter, entries []api.KernelEntry) error { + w := tabwriter.NewWriter(out, 0, 8, 2, ' ', 0) + if _, err := fmt.Fprintln(w, "NAME\tDISTRO\tARCH\tKERNEL\tIMPORTED"); err != nil { + return err + } + for _, entry := range 
entries { + if _, err := fmt.Fprintf( + w, + "%s\t%s\t%s\t%s\t%s\n", + entry.Name, + dashIfEmpty(entry.Distro), + dashIfEmpty(entry.Arch), + dashIfEmpty(entry.KernelVersion), + dashIfEmpty(entry.ImportedAt), + ); err != nil { + return err + } + } + return w.Flush() +} + +func printKernelCatalogTable(out anyWriter, entries []api.KernelCatalogEntry) error { + w := tabwriter.NewWriter(out, 0, 8, 2, ' ', 0) + if _, err := fmt.Fprintln(w, "NAME\tDISTRO\tARCH\tKERNEL\tSIZE\tSTATE"); err != nil { + return err + } + for _, entry := range entries { + state := "available" + if entry.Pulled { + state = "pulled" + } + if _, err := fmt.Fprintf( + w, + "%s\t%s\t%s\t%s\t%s\t%s\n", + entry.Name, + dashIfEmpty(entry.Distro), + dashIfEmpty(entry.Arch), + dashIfEmpty(entry.KernelVersion), + humanSize(entry.SizeBytes), + state, + ); err != nil { + return err + } + } + return w.Flush() +} + +// -- doctor printer ------------------------------------------------- + +func printDoctorReport(out anyWriter, report system.Report, verbose bool) error { + colorWriter, _ := out.(io.Writer) + + var passes, warns, fails int + for _, c := range report.Checks { + switch c.Status { + case system.CheckStatusPass: + passes++ + case system.CheckStatusWarn: + warns++ + case system.CheckStatusFail: + fails++ + } + } + + if !verbose && warns == 0 && fails == 0 { + msg := fmt.Sprintf("all %d checks passed", passes) + if colorWriter != nil { + msg = style.Pass(colorWriter, msg) + } + _, err := fmt.Fprintln(out, msg) + return err + } + + for _, check := range report.Checks { + if !verbose && check.Status == system.CheckStatusPass { + continue + } + status := strings.ToUpper(string(check.Status)) + if colorWriter != nil { + switch check.Status { + case system.CheckStatusPass: + status = style.Pass(colorWriter, status) + case system.CheckStatusFail: + status = style.Fail(colorWriter, status) + case system.CheckStatusWarn: + status = style.Warn(colorWriter, status) + } + } + if _, err := fmt.Fprintf(out, 
"%s\t%s\n", status, check.Name); err != nil { + return err + } + for _, detail := range check.Details { + if _, err := fmt.Fprintf(out, " - %s\n", detail); err != nil { + return err + } + } + } + + if !verbose { + if _, err := fmt.Fprintf(out, "\n%d passed, %s, %s\n", passes, pluralCount(warns, "warning"), pluralCount(fails, "failure")); err != nil { + return err + } + } + + return nil +} + +func pluralCount(n int, word string) string { + if n == 1 { + return fmt.Sprintf("%d %s", n, word) + } + return fmt.Sprintf("%d %ss", n, word) +} diff --git a/internal/cli/printers_test.go b/internal/cli/printers_test.go new file mode 100644 index 0000000..3018ca8 --- /dev/null +++ b/internal/cli/printers_test.go @@ -0,0 +1,88 @@ +package cli + +import ( + "bytes" + "strings" + "testing" + + "banger/internal/system" +) + +func TestPrintDoctorReport_BriefAllPass(t *testing.T) { + report := system.Report{} + report.AddPass("first", "detail one") + report.AddPass("second", "detail two") + report.AddPass("third") + + var buf bytes.Buffer + if err := printDoctorReport(&buf, report, false); err != nil { + t.Fatalf("printDoctorReport: %v", err) + } + + got := buf.String() + want := "all 3 checks passed\n" + if got != want { + t.Fatalf("brief all-pass output\n got: %q\nwant: %q", got, want) + } +} + +func TestPrintDoctorReport_BriefHidesPassDetails(t *testing.T) { + report := system.Report{} + report.AddPass("first", "detail one") + report.AddWarn("second", "warn detail") + report.AddPass("third", "detail three") + report.AddFail("fourth", "fail detail") + + var buf bytes.Buffer + if err := printDoctorReport(&buf, report, false); err != nil { + t.Fatalf("printDoctorReport: %v", err) + } + + got := buf.String() + if strings.Contains(got, "PASS") || strings.Contains(got, "first") || strings.Contains(got, "third") { + t.Fatalf("brief mode leaked PASS rows: %q", got) + } + for _, want := range []string{"WARN\tsecond", "warn detail", "FAIL\tfourth", "fail detail"} { + if 
!strings.Contains(got, want) { + t.Fatalf("missing %q in brief output: %q", want, got) + } + } + if !strings.Contains(got, "2 passed, 1 warning, 1 failure") { + t.Fatalf("missing summary footer in: %q", got) + } +} + +func TestPrintDoctorReport_BriefSummaryPlurals(t *testing.T) { + report := system.Report{} + report.AddPass("a") + report.AddWarn("b") + report.AddWarn("c") + + var buf bytes.Buffer + if err := printDoctorReport(&buf, report, false); err != nil { + t.Fatalf("printDoctorReport: %v", err) + } + if !strings.Contains(buf.String(), "1 passed, 2 warnings, 0 failures") { + t.Fatalf("plural counts wrong: %q", buf.String()) + } +} + +func TestPrintDoctorReport_VerboseShowsEverything(t *testing.T) { + report := system.Report{} + report.AddPass("first", "detail one") + report.AddWarn("second", "warn detail") + + var buf bytes.Buffer + if err := printDoctorReport(&buf, report, true); err != nil { + t.Fatalf("printDoctorReport: %v", err) + } + got := buf.String() + for _, want := range []string{"PASS\tfirst", "detail one", "WARN\tsecond", "warn detail"} { + if !strings.Contains(got, want) { + t.Fatalf("verbose mode missing %q: %q", want, got) + } + } + if strings.Contains(got, "passed,") { + t.Fatalf("verbose mode should not print summary footer: %q", got) + } +} diff --git a/internal/cli/prune_test.go b/internal/cli/prune_test.go new file mode 100644 index 0000000..cdf86c8 --- /dev/null +++ b/internal/cli/prune_test.go @@ -0,0 +1,205 @@ +package cli + +import ( + "bytes" + "context" + "errors" + "fmt" + "strings" + "testing" + + "banger/internal/api" + "banger/internal/model" + + "github.com/spf13/cobra" +) + +// stubPruneSeams installs list + delete fakes onto the caller's *deps +// and returns a pointer to a slice that records every ID passed to the +// delete fake. 
+func stubPruneSeams(t *testing.T, d *deps, vms []model.VMRecord, listErr error, deleteErr map[string]error) *[]string { + t.Helper() + + var deleted []string + d.vmList = func(ctx context.Context, socketPath string) (api.VMListResult, error) { + return api.VMListResult{VMs: vms}, listErr + } + d.vmDelete = func(ctx context.Context, socketPath, idOrName string) error { + if err, ok := deleteErr[idOrName]; ok { + return err + } + deleted = append(deleted, idOrName) + return nil + } + return &deleted +} + +func newPruneTestCmd(stdin string) (*cobra.Command, *bytes.Buffer, *bytes.Buffer) { + cmd := &cobra.Command{Use: "prune"} + cmd.SetContext(context.Background()) + stdout := &bytes.Buffer{} + stderr := &bytes.Buffer{} + cmd.SetIn(strings.NewReader(stdin)) + cmd.SetOut(stdout) + cmd.SetErr(stderr) + return cmd, stdout, stderr +} + +func TestPromptYesNo(t *testing.T) { + cases := map[string]bool{ + "y\n": true, + "Y\n": true, + "yes\n": true, + "YES\n": true, + " y \n": true, + "n\n": false, + "no\n": false, + "\n": false, + "anything\n": false, + } + for input, want := range cases { + out := &bytes.Buffer{} + got, err := promptYesNo(strings.NewReader(input), out, "go? ") + if err != nil { + t.Errorf("input %q: error %v", input, err) + continue + } + if got != want { + t.Errorf("input %q: got %v, want %v", input, got, want) + } + if !strings.Contains(out.String(), "go?") { + t.Errorf("input %q: prompt not written; got %q", input, out.String()) + } + } +} + +func TestPromptYesNoEOF(t *testing.T) { + got, err := promptYesNo(strings.NewReader(""), &bytes.Buffer{}, "? 
") + if err != nil { + t.Fatalf("EOF should not error: %v", err) + } + if got { + t.Fatal("EOF should be treated as no") + } +} + +func TestRunVMPruneNoVictims(t *testing.T) { + d := defaultDeps() + stubPruneSeams(t, d, []model.VMRecord{ + {ID: "id-1", Name: "running-vm", State: model.VMStateRunning}, + }, nil, nil) + + cmd, stdout, _ := newPruneTestCmd("") + if err := d.runVMPrune(cmd, "sock", false); err != nil { + t.Fatalf("d.runVMPrune: %v", err) + } + if !strings.Contains(stdout.String(), "no non-running VMs") { + t.Errorf("expected no-op message, got %q", stdout.String()) + } +} + +func TestRunVMPruneAbortedByUser(t *testing.T) { + d := defaultDeps() + deleted := stubPruneSeams(t, d, []model.VMRecord{ + {ID: "id-1", Name: "stale", State: model.VMStateStopped}, + }, nil, nil) + + cmd, stdout, _ := newPruneTestCmd("n\n") + if err := d.runVMPrune(cmd, "sock", false); err != nil { + t.Fatalf("d.runVMPrune: %v", err) + } + if !strings.Contains(stdout.String(), "aborted") { + t.Errorf("expected 'aborted' output, got %q", stdout.String()) + } + if len(*deleted) != 0 { + t.Errorf("should not have deleted anything, got %v", *deleted) + } +} + +func TestRunVMPruneConfirmedDeletesNonRunning(t *testing.T) { + d := defaultDeps() + deleted := stubPruneSeams(t, d, []model.VMRecord{ + {ID: "id-run", Name: "keeper", State: model.VMStateRunning}, + {ID: "id-stop", Name: "stale", State: model.VMStateStopped}, + {ID: "id-err", Name: "broken", State: model.VMStateError}, + {ID: "id-created", Name: "fresh", State: model.VMStateCreated}, + }, nil, nil) + + cmd, stdout, _ := newPruneTestCmd("y\n") + if err := d.runVMPrune(cmd, "sock", false); err != nil { + t.Fatalf("d.runVMPrune: %v", err) + } + // Deleted must be exactly the three non-running IDs, in list order. 
+ want := []string{"id-stop", "id-err", "id-created"} + if len(*deleted) != len(want) { + t.Fatalf("deleted = %v, want %v", *deleted, want) + } + for i, id := range want { + if (*deleted)[i] != id { + t.Errorf("deleted[%d] = %q, want %q", i, (*deleted)[i], id) + } + } + for _, want := range []string{"stale", "broken", "fresh"} { + if !strings.Contains(stdout.String(), "deleted "+want) { + t.Errorf("output missing 'deleted %s':\n%s", want, stdout.String()) + } + } + if strings.Contains(stdout.String(), "deleted keeper") { + t.Errorf("running VM should not be deleted:\n%s", stdout.String()) + } +} + +func TestRunVMPruneForceSkipsPrompt(t *testing.T) { + d := defaultDeps() + deleted := stubPruneSeams(t, d, []model.VMRecord{ + {ID: "id-1", Name: "stale", State: model.VMStateStopped}, + }, nil, nil) + + // Empty stdin + force=true: must not block on prompt. + cmd, stdout, _ := newPruneTestCmd("") + if err := d.runVMPrune(cmd, "sock", true); err != nil { + t.Fatalf("d.runVMPrune: %v", err) + } + if len(*deleted) != 1 || (*deleted)[0] != "id-1" { + t.Errorf("deleted = %v, want [id-1]", *deleted) + } + // Prompt should not appear in output. 
+ if strings.Contains(stdout.String(), "Delete these VMs?") { + t.Errorf("force=true should skip prompt:\n%s", stdout.String()) + } +} + +func TestRunVMPruneReportsPartialFailure(t *testing.T) { + d := defaultDeps() + stubPruneSeams(t, d, + []model.VMRecord{ + {ID: "id-a", Name: "a", State: model.VMStateStopped}, + {ID: "id-b", Name: "b", State: model.VMStateStopped}, + }, + nil, + map[string]error{"id-a": errors.New("simulated")}, + ) + + cmd, _, stderr := newPruneTestCmd("") + err := d.runVMPrune(cmd, "sock", true) + if err == nil { + t.Fatal("expected non-zero exit when any delete fails") + } + if !strings.Contains(err.Error(), "1 VM(s) failed") { + t.Errorf("unexpected error: %v", err) + } + if !strings.Contains(stderr.String(), "delete a:") { + t.Errorf("stderr missing failure log: %q", stderr.String()) + } +} + +func TestRunVMPruneListErrorPropagates(t *testing.T) { + d := defaultDeps() + stubPruneSeams(t, d, nil, fmt.Errorf("rpc failed"), nil) + + cmd, _, _ := newPruneTestCmd("") + err := d.runVMPrune(cmd, "sock", true) + if err == nil || !strings.Contains(err.Error(), "rpc failed") { + t.Fatalf("expected rpc error to propagate, got %v", err) + } +} diff --git a/internal/cli/ssh.go b/internal/cli/ssh.go new file mode 100644 index 0000000..eab58ce --- /dev/null +++ b/internal/cli/ssh.go @@ -0,0 +1,138 @@ +package cli + +import ( + "context" + "errors" + "fmt" + "io" + "os/exec" + "strings" + "time" + + "banger/internal/model" + "banger/internal/paths" + "banger/internal/system" + "banger/internal/vsockagent" +) + +// runSSHSession executes ssh with the given args. On exit it decides +// whether to print the "vm is still running" reminder: we skip it if +// the caller asked (e.g. --rm is about to delete the VM), if the +// ctx is already done, or if the ssh error isn't the one that +// typically means "user disconnected cleanly". 
+func (d *deps) runSSHSession(ctx context.Context, socketPath, vmRef string, stdin io.Reader, stdout, stderr io.Writer, sshArgs []string, skipReminder bool) error { + sshErr := d.sshExec(ctx, stdin, stdout, stderr, sshArgs) + if skipReminder || !shouldCheckSSHReminder(sshErr) || ctx.Err() != nil { + return sshErr + } + pingCtx, cancel := context.WithTimeout(context.Background(), 3*time.Second) + defer cancel() + health, err := d.vmHealth(pingCtx, socketPath, vmRef) + if err != nil { + _, _ = fmt.Fprintln(stderr, vsockagent.WarningMessage(vmRef, err)) + return sshErr + } + if health.Healthy { + name := health.Name + if strings.TrimSpace(name) == "" { + name = vmRef + } + _, _ = fmt.Fprintln(stderr, vsockagent.ReminderMessage(name)) + } + return sshErr +} + +func shouldCheckSSHReminder(err error) bool { + if err == nil { + return true + } + var exitErr *exec.ExitError + if !errors.As(err, &exitErr) { + return false + } + return exitErr.ExitCode() != 255 +} + +// sshCommandArgs builds the argv for `ssh` invocations against a VM. +// Host-key verification uses a banger-owned known_hosts file +// populated by the daemon's first successful Go-SSH dial to each VM +// (trust-on-first-use). `accept-new` means: accept-and-pin on first +// contact; strict-verify afterwards. The user's own +// ~/.ssh/known_hosts is never touched. 
+func sshCommandArgs(cfg model.DaemonConfig, guestIP string, extra []string) ([]string, error) { + if guestIP == "" { + return nil, errors.New("vm has no guest IP") + } + args := []string{} + args = append(args, "-F", "/dev/null") + if cfg.SSHKeyPath != "" { + args = append(args, "-i", cfg.SSHKeyPath) + } + knownHosts, khErr := bangerKnownHostsPath() + args = append( + args, + "-o", "IdentitiesOnly=yes", + "-o", "BatchMode=yes", + "-o", "PreferredAuthentications=publickey", + "-o", "PasswordAuthentication=no", + "-o", "KbdInteractiveAuthentication=no", + ) + if khErr == nil { + args = append(args, + "-o", "UserKnownHostsFile="+knownHosts, + "-o", "StrictHostKeyChecking=accept-new", + ) + } else { + // If we can't resolve the banger path (unusual — paths.Resolve + // basically can't fail), fall through to a hard-fail posture + // rather than silently disabling verification. + args = append(args, + "-o", "StrictHostKeyChecking=yes", + ) + } + args = append(args, "root@"+guestIP) + // ssh(1) concatenates every argument after the host with spaces + // before sending to the remote shell. That means passing extra + // args raw — `ssh host sh -c 'exit 42'` — re-tokenises on the + // remote side to `sh -c exit 42`, where `42` is $0 for the + // already-completed `exit`, and the rc the user asked for is + // lost. Shell-quote each element and join them ourselves so the + // remote shell sees exactly the argv the user typed locally. + if len(extra) > 0 { + quoted := make([]string, len(extra)) + for i, a := range extra { + quoted[i] = shellQuote(a) + } + args = append(args, strings.Join(quoted, " ")) + } + return args, nil +} + +// bangerKnownHostsPath resolves the TOFU file the daemon writes into +// and the CLI reads back. Both sides must agree on the path or the +// pin doesn't round-trip. 
+func bangerKnownHostsPath() (string, error) { + layout, err := paths.Resolve() + if err != nil { + return "", err + } + return layout.KnownHostsPath, nil +} + +func validateSSHPrereqs(cfg model.DaemonConfig) error { + checks := system.NewPreflight() + checks.RequireCommand("ssh", "install openssh-client") + if strings.TrimSpace(cfg.SSHKeyPath) != "" { + checks.RequireFile(cfg.SSHKeyPath, "ssh private key", `set "ssh_key_path" or let banger create its default key`) + } + return checks.Err("ssh preflight failed") +} + +func validateVMRunPrereqs(cfg model.DaemonConfig) error { + checks := system.NewPreflight() + checks.RequireCommand("git", "install git") + if strings.TrimSpace(cfg.SSHKeyPath) != "" { + checks.RequireFile(cfg.SSHKeyPath, "ssh private key", `set "ssh_key_path" or let banger create its default key`) + } + return checks.Err("vm run preflight failed") +} diff --git a/internal/cli/style/style.go b/internal/cli/style/style.go new file mode 100644 index 0000000..8753335 --- /dev/null +++ b/internal/cli/style/style.go @@ -0,0 +1,70 @@ +// Package style provides a tiny, conservative ANSI-color helper for +// banger's CLI. The contract: +// +// - Each helper takes the writer the styled string is going to and +// returns either the wrapped string or the plain one. +// - "Wrapped" only happens when the writer is a TTY AND the +// NO_COLOR environment variable is unset. +// - No 256-color or truecolor; no theme system; no external dep. +// +// Banger's CLI uses these for status (pass/fail/warn), error +// prefixes, and dim secondary text. Anything richer belongs in a +// dedicated TUI layer that this package isn't. +package style + +import ( + "io" + "os" + "strings" +) + +// ANSI escape sequences. Kept private — callers compose meaning via +// the named helpers (Pass/Fail/Warn/...), not raw codes. 
+const ( + ansiReset = "\x1b[0m" + ansiBold = "\x1b[1m" + ansiDim = "\x1b[2m" + ansiRed = "\x1b[31m" + ansiGreen = "\x1b[32m" + ansiYel = "\x1b[33m" +) + +// Pass wraps s in green when w is a TTY and NO_COLOR is unset. +func Pass(w io.Writer, s string) string { return wrap(w, ansiGreen, s) } + +// Fail wraps s in red. +func Fail(w io.Writer, s string) string { return wrap(w, ansiRed, s) } + +// Warn wraps s in yellow. +func Warn(w io.Writer, s string) string { return wrap(w, ansiYel, s) } + +// Dim wraps s in dim. +func Dim(w io.Writer, s string) string { return wrap(w, ansiDim, s) } + +// Bold wraps s in bold. +func Bold(w io.Writer, s string) string { return wrap(w, ansiBold, s) } + +// SupportsColor reports whether colored output should be emitted to +// w. Exposed so callers that build multi-segment strings can avoid +// duplicating the gate per call. +func SupportsColor(w io.Writer) bool { + if strings.TrimSpace(os.Getenv("NO_COLOR")) != "" { + return false + } + file, ok := w.(*os.File) + if !ok { + return false + } + info, err := file.Stat() + if err != nil { + return false + } + return info.Mode()&os.ModeCharDevice != 0 +} + +func wrap(w io.Writer, code, s string) string { + if !SupportsColor(w) { + return s + } + return code + s + ansiReset +} diff --git a/internal/cli/style/style_test.go b/internal/cli/style/style_test.go new file mode 100644 index 0000000..b51e6ed --- /dev/null +++ b/internal/cli/style/style_test.go @@ -0,0 +1,64 @@ +package style + +import ( + "bytes" + "os" + "strings" + "testing" +) + +// TestStyleNoOpsForNonTTYWriter pins that styled helpers don't emit +// ANSI escapes when the destination isn't a terminal. Buffers stand +// in for any non-TTY writer (CI, redirected stdout, log files). 
+func TestStyleNoOpsForNonTTYWriter(t *testing.T) {
+	var buf bytes.Buffer
+	cases := map[string]string{
+		"pass": Pass(&buf, "ok"),
+		"fail": Fail(&buf, "boom"),
+		"warn": Warn(&buf, "huh"),
+		"dim":  Dim(&buf, "sub"),
+		"bold": Bold(&buf, "bold"),
+	}
+	for label, got := range cases {
+		if strings.Contains(got, "\x1b[") {
+			t.Errorf("%s: contains ANSI escape on non-TTY writer: %q", label, got)
+		}
+	}
+}
+
+// TestStyleSuppressedByNoColor pins https://no-color.org compliance:
+// even on a "real" TTY, NO_COLOR forces plain output.
+func TestStyleSuppressedByNoColor(t *testing.T) {
+	t.Setenv("NO_COLOR", "1")
+	r, w, err := os.Pipe()
+	if err != nil {
+		t.Fatalf("Pipe: %v", err)
+	}
+	defer r.Close()
+	defer w.Close()
+	// w is a pipe end, not a char device, so the TTY gate would
+	// suppress color here anyway; NO_COLOR is the dominant gate.
+	// Verifying that the helpers still emit plain output guards
+	// against a future TTY-detection regression that would
+	// otherwise need a pty harness to surface.
+	if got := Pass(w, "ok"); strings.Contains(got, "\x1b[") {
+		t.Errorf("NO_COLOR set but Pass() emitted ANSI: %q", got)
+	}
+	if got := Fail(w, "boom"); strings.Contains(got, "\x1b[") {
+		t.Errorf("NO_COLOR set but Fail() emitted ANSI: %q", got)
+	}
+}
+
+// TestSupportsColorRespectsNoColor confirms the gate function used
+// by the helpers. Required for callers that compose multi-segment
+// strings and want to ask once.
+func TestSupportsColorRespectsNoColor(t *testing.T) { + t.Setenv("NO_COLOR", "1") + tmp, err := os.CreateTemp(t.TempDir(), "style-*") + if err != nil { + t.Fatalf("CreateTemp: %v", err) + } + defer tmp.Close() + if SupportsColor(tmp) { + t.Fatal("SupportsColor returned true with NO_COLOR set") + } +} diff --git a/internal/cli/vm_create.go b/internal/cli/vm_create.go new file mode 100644 index 0000000..144050f --- /dev/null +++ b/internal/cli/vm_create.go @@ -0,0 +1,330 @@ +package cli + +import ( + "context" + "errors" + "fmt" + "io" + "os" + "strings" + "time" + + "banger/internal/api" + "banger/internal/cli/style" + "banger/internal/config" + "banger/internal/model" + "banger/internal/paths" + "banger/internal/system" +) + +// effectiveVMDefaults resolves the default VM sizing applied when +// --vcpu/--memory/--disk-size aren't given: config overrides win +// over host-derived heuristics, both fall back to baked-in +// constants. Called at command-build time so the cobra flag defaults +// reflect the resolved values. +func effectiveVMDefaults() model.VMDefaults { + var override model.VMDefaultsOverride + if layout, err := paths.Resolve(); err == nil { + if cfg, err := config.Load(layout); err == nil { + override = cfg.VMDefaults + } + } + host, err := system.ReadHostResources() + if err != nil { + return model.ResolveVMDefaults(override, 0, 0) + } + return model.ResolveVMDefaults(override, host.CPUCount, host.TotalMemoryBytes) +} + +// printVMSpecLine writes a one-line sizing summary to out. Always +// emitted (even non-TTY) so logs and CI output carry the numbers. 
+func printVMSpecLine(out io.Writer, params api.VMCreateParams) { + vcpu := model.DefaultVCPUCount + if params.VCPUCount != nil { + vcpu = *params.VCPUCount + } + memory := model.DefaultMemoryMiB + if params.MemoryMiB != nil { + memory = *params.MemoryMiB + } + diskBytes := int64(model.DefaultWorkDiskSize) + if strings.TrimSpace(params.WorkDiskSize) != "" { + if parsed, err := model.ParseSize(params.WorkDiskSize); err == nil { + diskBytes = parsed + } + } + _, _ = fmt.Fprintf(out, "spec: %d vcpu | %d MiB | %s disk\n", + vcpu, memory, model.FormatSizeBytes(diskBytes)) +} + +// runVMCreate drives the create RPC + polls for progress. stderr +// gets the spec line up front and the progress renderer thereafter. +// On context cancel we cooperate with the daemon to cancel the +// in-flight op so it doesn't leak partially-created VM state. +func (d *deps) runVMCreate(ctx context.Context, socketPath string, stderr io.Writer, params api.VMCreateParams, verbose bool) (model.VMRecord, error) { + start := time.Now() + printVMSpecLine(stderr, params) + begin, err := d.vmCreateBegin(ctx, socketPath, params) + if err != nil { + return model.VMRecord{}, err + } + renderer := newVMCreateProgressRenderer(stderr, verbose) + renderer.render(begin.Operation) + + op := begin.Operation + for { + if op.Done { + renderer.render(op) + if op.Success && op.VM != nil { + renderer.clear() + elapsed := formatVMCreateElapsed(time.Since(start)) + _, _ = fmt.Fprintf(stderr, "[vm create] ready in %s\n", style.Dim(stderr, elapsed)) + return *op.VM, nil + } + if strings.TrimSpace(op.Error) == "" { + return model.VMRecord{}, errors.New("vm create failed") + } + return model.VMRecord{}, errors.New(op.Error) + } + + select { + case <-ctx.Done(): + cancelCtx, cancel := context.WithTimeout(context.Background(), time.Second) + defer cancel() + _ = d.vmCreateCancel(cancelCtx, socketPath, op.ID) + return model.VMRecord{}, ctx.Err() + case <-time.After(200 * time.Millisecond): + } + + status, err := 
d.vmCreateStatus(ctx, socketPath, op.ID) + if err != nil { + if ctx.Err() != nil { + cancelCtx, cancel := context.WithTimeout(context.Background(), time.Second) + defer cancel() + _ = d.vmCreateCancel(cancelCtx, socketPath, op.ID) + return model.VMRecord{}, ctx.Err() + } + return model.VMRecord{}, err + } + op = status.Operation + renderer.render(op) + } +} + +type vmCreateProgressRenderer struct { + out io.Writer + enabled bool + inline bool + active bool + lastLine string +} + +// newVMCreateProgressRenderer wires up progress for `vm create`. On +// non-TTY writers it stays disabled (CI/test logs already capture the +// spec + ready lines); on TTY it rewrites a single line via \r unless +// verbose is set or BANGER_NO_PROGRESS is exported, in which case it +// falls back to one line per stage. +func newVMCreateProgressRenderer(out io.Writer, verbose bool) *vmCreateProgressRenderer { + tty := writerSupportsProgress(out) + return &vmCreateProgressRenderer{ + out: out, + enabled: tty, + inline: tty && !verbose && !progressDisabledByEnv(), + } +} + +func (r *vmCreateProgressRenderer) render(op api.VMCreateOperation) { + if r == nil || !r.enabled { + return + } + line := formatVMCreateProgress(op) + if line == "" || line == r.lastLine { + return + } + r.lastLine = line + if r.inline { + _, _ = fmt.Fprint(r.out, "\r\x1b[K", line) + r.active = true + return + } + _, _ = fmt.Fprintln(r.out, line) +} + +// clear resets the live inline line so the caller can write a clean +// terminating message. No-op outside inline mode. +func (r *vmCreateProgressRenderer) clear() { + if r == nil || !r.enabled || !r.inline || !r.active { + return + } + _, _ = fmt.Fprint(r.out, "\r\x1b[K") + r.active = false + r.lastLine = "" +} + +// progressDisabledByEnv is the BANGER_NO_PROGRESS escape hatch — a +// non-empty value forces line-per-stage output even on a TTY, so users +// can pipe `script(1)` / tmux capture without \r artifacts. 
+func progressDisabledByEnv() bool { + return strings.TrimSpace(os.Getenv("BANGER_NO_PROGRESS")) != "" +} + +// writerSupportsProgress returns true only when out is a terminal. +// Keeps stage lines + heartbeat dots out of piped / logged output +// where they'd just be noise. +func writerSupportsProgress(out io.Writer) bool { + file, ok := out.(*os.File) + if !ok { + return false + } + info, err := file.Stat() + if err != nil { + return false + } + return info.Mode()&os.ModeCharDevice != 0 +} + +// withHeartbeat runs fn while emitting a dot to stderr every 2 +// seconds so the user sees long-running RPCs (bundle downloads, etc.) +// aren't wedged. No-op when stderr isn't a terminal, so piped or +// logged output stays clean. +func withHeartbeat(stderr io.Writer, label string, fn func() error) error { + if !writerSupportsProgress(stderr) { + return fn() + } + fmt.Fprintf(stderr, "[%s] ", label) + stop := make(chan struct{}) + done := make(chan struct{}) + go func() { + defer close(done) + ticker := time.NewTicker(2 * time.Second) + defer ticker.Stop() + for { + select { + case <-stop: + return + case <-ticker.C: + fmt.Fprint(stderr, ".") + } + } + }() + err := fn() + close(stop) + <-done + fmt.Fprintln(stderr) + return err +} + +func formatVMCreateProgress(op api.VMCreateOperation) string { + stage := strings.TrimSpace(op.Stage) + detail := strings.TrimSpace(op.Detail) + label := vmCreateStageLabel(stage) + if label == "" && detail == "" { + return "" + } + if label == "" { + return "[vm create] " + detail + } + if detail == "" { + return "[vm create] " + label + } + return "[vm create] " + label + ": " + detail +} + +// vmCreateStageLabel humanises the daemon-side stage IDs. Anything +// unknown falls through to `strings.ReplaceAll(_, "_", " ")` so new +// stages still render meaningfully without a code change. 
+func vmCreateStageLabel(stage string) string {
+	switch strings.TrimSpace(stage) {
+	case "queued":
+		return "queued"
+	case "resolve_image":
+		return "resolving image"
+	case "reserve_vm":
+		return "allocating vm"
+	case "preflight":
+		return "checking host prerequisites"
+	case "prepare_rootfs":
+		return "preparing root filesystem"
+	case "prepare_host_features":
+		return "preparing host features"
+	case "prepare_work_disk":
+		return "preparing work disk"
+	case "boot_firecracker":
+		return "starting firecracker"
+	case "wait_vsock_agent":
+		return "waiting for vsock agent"
+	case "wait_guest_ready":
+		return "waiting for guest services"
+	case "apply_dns":
+		return "publishing dns"
+	case "apply_nat":
+		return "configuring nat"
+	case "finalize":
+		return "finalizing"
+	case "ready":
+		return "ready"
+	default:
+		return strings.ReplaceAll(stage, "_", " ")
+	}
+}
+
+// formatVMCreateElapsed renders a wall-clock duration as a friendly
+// "ready in 4.7s" / "ready in 1m02s" string. Sub-second durations
+// render as whole milliseconds and sub-minute ones keep one decimal,
+// so quick smoke runs don't print "0s".
+func formatVMCreateElapsed(d time.Duration) string {
+	if d < time.Second {
+		return fmt.Sprintf("%dms", d.Milliseconds())
+	}
+	if d < time.Minute {
+		return fmt.Sprintf("%.1fs", d.Seconds())
+	}
+	d = d.Round(time.Second)
+	minutes := int(d / time.Minute)
+	seconds := int((d % time.Minute) / time.Second)
+	return fmt.Sprintf("%dm%02ds", minutes, seconds)
+}
+
+func validatePositiveSetting(label string, value int) error {
+	if value <= 0 {
+		return fmt.Errorf("%s must be a positive integer", label)
+	}
+	return nil
+}
+
+// shortID and relativeTime are small display helpers used across
+// every printer; kept here alongside the other render-time helpers.
+func shortID(id string) string {
+	if len(id) <= 12 {
+		return id
+	}
+	return id[:12]
+}
+
+func relativeTime(t time.Time) string {
+	if t.IsZero() {
+		return "-"
+	}
+	delta := time.Since(t)
+	switch {
+	case delta < 30*time.Second:
+		return "moments ago"
+	case delta < time.Minute:
+		return fmt.Sprintf("%d seconds ago", int(delta.Seconds()))
+	case delta < 2*time.Minute:
+		return "1 minute ago"
+	case delta < time.Hour:
+		return fmt.Sprintf("%d minutes ago", int(delta.Minutes()))
+	case delta < 2*time.Hour:
+		return "1 hour ago"
+	case delta < 24*time.Hour:
+		return fmt.Sprintf("%d hours ago", int(delta.Hours()))
+	case delta < 48*time.Hour:
+		return "1 day ago"
+	case delta < 7*24*time.Hour:
+		return fmt.Sprintf("%d days ago", int(delta.Hours()/24))
+	case delta < 14*24*time.Hour:
+		return "1 week ago"
+	default:
+		return fmt.Sprintf("%d weeks ago", int(delta.Hours()/(24*7)))
+	}
+}
diff --git a/internal/cli/vm_exec.go b/internal/cli/vm_exec.go
new file mode 100644
index 0000000..2ec862a
--- /dev/null
+++ b/internal/cli/vm_exec.go
@@ -0,0 +1,192 @@
+package cli
+
+import (
+	"context"
+	"errors"
+	"fmt"
+	"os/exec"
+	"strings"
+
+	"banger/internal/api"
+	"banger/internal/model"
+	"banger/internal/rpc"
+
+	"github.com/spf13/cobra"
+)
+
+func (d *deps) newVMExecCommand() *cobra.Command {
+	var guestPath string
+	var autoPrepare bool
+	cmd := &cobra.Command{
+		Use:   "exec <vm> -- <command> [args...]",
+		Short: "Run a command in the VM workspace with the repo toolchain",
+		Long: strings.TrimSpace(`
+Run a command inside a persistent VM, wrapping it with 'mise exec' so
+all mise-managed tools (Go, Node, Python, etc.) are on PATH.
+
+If the VM has a prepared workspace (from 'vm workspace prepare' or
+'vm run ./repo'), the command runs from that directory and a stale-
+workspace warning is printed when the host repo has advanced since the
+last prepare; pass --auto-prepare to re-sync first. Otherwise the
+command runs from root's home directory. --guest-path overrides both.
+
+Exit code of the guest command is propagated verbatim.
+`),
+		Example: strings.TrimSpace(`
+  banger vm exec dev -- make test
+  banger vm exec dev -- go build ./...
+  banger vm exec dev --auto-prepare -- bash -lc 'npm ci && npm test'
+  banger vm exec dev --guest-path /root/other -- make lint
+`),
+		Args: cobra.ArbitraryArgs,
+		RunE: func(cmd *cobra.Command, args []string) error {
+			// Split on -- : everything before is [vm-name], everything after is the command.
+			dash := cmd.ArgsLenAtDash()
+			var vmRef string
+			var command []string
+			switch {
+			case dash < 0:
+				// No -- separator: first arg is VM, rest is command.
+				if len(args) < 2 {
+					return errors.New("usage: banger vm exec <vm> -- <command> [args...]")
+				}
+				vmRef = args[0]
+				command = args[1:]
+			case dash == 0 || len(args[dash:]) == 0:
+				return errors.New("usage: banger vm exec <vm> -- <command> [args...]")
+			default:
+				vmRef = args[:dash][0]
+				command = args[dash:]
+			}
+
+			layout, cfg, err := d.ensureDaemon(cmd.Context())
+			if err != nil {
+				return err
+			}
+			if err := validateSSHPrereqs(cfg); err != nil {
+				return err
+			}
+
+			// Fetch the full VM record — we need Workspace and GuestIP.
+			result, err := rpc.Call[api.VMShowResult](cmd.Context(), layout.SocketPath, "vm.show", api.VMRefParams{IDOrName: vmRef})
+			if err != nil {
+				return err
+			}
+			vm := result.VM
+			if vm.State != model.VMStateRunning {
+				return fmt.Errorf("vm %q is not running (state: %s)", vm.Name, vm.State)
+			}
+
+			// Resolve effective guest workspace path. Empty means "no
+			// cd": run from the SSH session's default cwd ($HOME). We
+			// only auto-cd when the user explicitly passed --guest-path
+			// or the VM actually has a recorded workspace — otherwise
+			// arbitrary VMs (no repo) would fail with cd errors.
+			execGuestPath := strings.TrimSpace(guestPath)
+			if execGuestPath == "" {
+				execGuestPath = strings.TrimSpace(vm.Workspace.GuestPath)
+			}
+
+			// Dirty-workspace check: compare stored HEAD with current host HEAD.
+ isDirty, currentHead, _ := d.vmExecDirtyCheck(cmd.Context(), vm.Workspace) + if isDirty { + storedShort := shortRef(vm.Workspace.HeadCommit) + currentShort := shortRef(currentHead) + preparedLabel := relativeTime(vm.Workspace.PreparedAt) + + if autoPrepare && vm.Workspace.SourcePath != "" { + _, _ = fmt.Fprintf(cmd.ErrOrStderr(), + "[vm exec] workspace stale (prepared %s from %s, host HEAD now %s) — re-preparing\n", + preparedLabel, storedShort, currentShort) + if err := validateVMRunPrereqs(cfg); err != nil { + return err + } + if _, err := d.vmWorkspacePrepare(cmd.Context(), layout.SocketPath, api.VMWorkspacePrepareParams{ + IDOrName: vmRef, + SourcePath: vm.Workspace.SourcePath, + GuestPath: execGuestPath, + Mode: string(model.WorkspacePrepareModeShallowOverlay), + }); err != nil { + return fmt.Errorf("auto-prepare workspace: %w", err) + } + } else { + _, _ = fmt.Fprintf(cmd.ErrOrStderr(), + "[vm exec] warning: workspace stale (prepared %s from %s, host HEAD now %s) — use --auto-prepare to re-sync\n", + preparedLabel, storedShort, currentShort) + } + } + + // Build and run the exec script. 
+ script := buildVMExecScript(execGuestPath, command) + sshArgs, err := sshCommandArgs(cfg, vm.Runtime.GuestIP, []string{"bash", "-lc", script}) + if err != nil { + return fmt.Errorf("vm %q: build ssh args: %w", vm.Name, err) + } + if err := d.sshExec(cmd.Context(), cmd.InOrStdin(), cmd.OutOrStdout(), cmd.ErrOrStderr(), sshArgs); err != nil { + var exitErr *exec.ExitError + if errors.As(err, &exitErr) { + return ExitCodeError{Code: exitErr.ExitCode()} + } + return err + } + return nil + }, + } + cmd.Flags().StringVar(&guestPath, "guest-path", "", "workspace directory in the guest (default: from last workspace prepare; otherwise root's home)") + cmd.Flags().BoolVar(&autoPrepare, "auto-prepare", false, "re-sync the workspace from the host repo before running if it's stale") + _ = cmd.RegisterFlagCompletionFunc("guest-path", cobra.NoFileCompletions) + return cmd +} + +// buildVMExecScript returns the bash -lc argument that runs the +// command through mise exec when mise is available, falling back to a +// plain exec if it's not. When guestPath is non-empty, the script +// cd's into it first (workspace mode); when empty, the command runs +// from the SSH session's default cwd so VMs without a prepared +// workspace don't blow up on a non-existent /root/repo. Each command +// argument is shell-quoted so spaces and special characters survive +// the bash re-parse inside the -lc string. +func buildVMExecScript(guestPath string, command []string) string { + parts := make([]string, len(command)) + for i, a := range command { + parts[i] = shellQuote(a) + } + quotedCmd := strings.Join(parts, " ") + body := fmt.Sprintf( + "if command -v mise >/dev/null 2>&1; then mise exec -- %s; else %s; fi", + quotedCmd, + quotedCmd, + ) + if guestPath == "" { + return body + } + return fmt.Sprintf("cd %s && %s", shellQuote(guestPath), body) +} + +// vmExecDirtyCheck compares the HEAD commit stored in the VM's +// workspace record against the current HEAD of the host repo. 
Returns +// (false, "", nil) when the check can't be performed (no workspace +// recorded, path gone, not a repo, git not installed) so callers +// treat unknown as "not dirty" rather than blocking the exec. +func (d *deps) vmExecDirtyCheck(ctx context.Context, ws model.VMWorkspace) (isDirty bool, currentHead string, err error) { + if ws.SourcePath == "" || ws.HeadCommit == "" { + return false, "", nil + } + out, err := d.hostCommandOutput(ctx, "git", "-C", ws.SourcePath, "rev-parse", "HEAD") + if err != nil { + // Source path gone, not a git repo, or git not installed — + // treat as unknown rather than blocking. + return false, "", nil + } + currentHead = strings.TrimSpace(string(out)) + return currentHead != ws.HeadCommit, currentHead, nil +} + +// shortRef returns the first 8 characters of a git ref / commit SHA +// for display. Returns the full string if it's already short. +func shortRef(ref string) string { + if len(ref) > 8 { + return ref[:8] + } + return ref +} diff --git a/internal/cli/vm_exec_test.go b/internal/cli/vm_exec_test.go new file mode 100644 index 0000000..e57f5af --- /dev/null +++ b/internal/cli/vm_exec_test.go @@ -0,0 +1,35 @@ +package cli + +import ( + "strings" + "testing" +) + +func TestBuildVMExecScriptWithGuestPath(t *testing.T) { + got := buildVMExecScript("/root/repo", []string{"make", "test"}) + want := "cd '/root/repo' && if command -v mise >/dev/null 2>&1; then mise exec -- 'make' 'test'; else 'make' 'test'; fi" + if got != want { + t.Fatalf("buildVMExecScript with path:\n got: %q\n want: %q", got, want) + } +} + +func TestBuildVMExecScriptWithoutGuestPath(t *testing.T) { + got := buildVMExecScript("", []string{"whoami"}) + want := "if command -v mise >/dev/null 2>&1; then mise exec -- 'whoami'; else 'whoami'; fi" + if got != want { + t.Fatalf("buildVMExecScript without path:\n got: %q\n want: %q", got, want) + } + if strings.Contains(got, "cd ") { + t.Fatalf("expected no cd when guestPath is empty, got: %q", got) + } +} + +func 
TestBuildVMExecScriptShellQuotesPathWithSpaces(t *testing.T) { + got := buildVMExecScript("/tmp/with space", []string{"echo", "a b"}) + if !strings.Contains(got, "cd '/tmp/with space'") { + t.Fatalf("expected guest path to be shell-quoted, got: %q", got) + } + if !strings.Contains(got, "mise exec -- 'echo' 'a b'") { + t.Fatalf("expected command args to be shell-quoted, got: %q", got) + } +} diff --git a/internal/cli/vm_run.go b/internal/cli/vm_run.go new file mode 100644 index 0000000..2a8f60b --- /dev/null +++ b/internal/cli/vm_run.go @@ -0,0 +1,540 @@ +package cli + +import ( + "bytes" + "context" + "errors" + "fmt" + "io" + "net" + "os" + "os/exec" + "path/filepath" + "strings" + "time" + + "banger/internal/api" + "banger/internal/daemon/workspace" + "banger/internal/model" + "banger/internal/toolingplan" + + "github.com/spf13/cobra" +) + +// vmRunGuestClient is the narrow guest-SSH surface vm run needs. The +// daemon's guest-SSH package returns a value that satisfies this +// interface directly; we restate it here so tests can plug in fakes +// without pulling the full daemon in. +type vmRunGuestClient interface { + Close() error + UploadFile(ctx context.Context, remotePath string, mode os.FileMode, data []byte, logWriter io.Writer) error + RunScript(ctx context.Context, script string, logWriter io.Writer) error + StreamTar(ctx context.Context, sourceDir, remoteCommand string, logWriter io.Writer) error + StreamTarEntries(ctx context.Context, sourceDir string, entries []string, remoteCommand string, logWriter io.Writer) error +} + +// vmRunRepo is the CLI-local view of the workspace argument to +// `vm run`: an absolute source path that passed preflight, plus the +// two branch flags. Everything else the flow needs (RepoRoot, +// RepoName, HEAD commit, etc.) comes back from the workspace.prepare +// RPC, which does the full git inspection daemon-side. 
+type vmRunRepo struct { + sourcePath string + branchName string + fromRef string + includeUntracked bool +} + +const vmRunToolingInstallTimeoutSeconds = 120 + +// vmRunSSHTimeout bounds how long `vm run` waits for guest ssh after +// the vsock agent is ready. vsock readiness already means systemd +// should be up within seconds; a minute plus change is generous +// headroom for a slow first boot while still short enough that a +// wedged sshd surfaces promptly instead of hanging forever. Var, not +// const, so tests can shrink it. +var vmRunSSHTimeout = 90 * time.Second + +// ExitCodeError wraps a remote command's exit status so the CLI's main() +// can propagate it verbatim. Only errors explicitly wrapped in this +// type get forwarded as process exit codes — plain *exec.ExitError +// values (from unrelated subprocesses like mkfs.ext4) must still +// surface as regular errors so the user sees a message. +type ExitCodeError struct { + Code int +} + +func (e ExitCodeError) Error() string { + return fmt.Sprintf("exit status %d", e.Code) +} + +// vmRunPreflightRepo validates a vm run workspace path BEFORE the VM +// is created, so bad paths fail fast instead of leaving the user +// with an orphaned VM. The check is intentionally minimal: the +// daemon's PrepareVMWorkspace does a full git inspection (branch, +// HEAD, identity, overlay) and returns everything the tooling +// harness needs, so duplicating the heavy lifting here just doubles +// the I/O. We only enforce what the user can fix locally before +// banger commits to creating a VM: +// +// - the path exists and is a directory, +// - it sits inside a non-bare git repository, +// - the repository has no submodules (unsupported in the shallow +// overlay mode vm run uses). 
+func (d *deps) vmRunPreflightRepo(ctx context.Context, rawPath string) (string, error) { + if strings.TrimSpace(rawPath) == "" { + wd, err := d.cwd() + if err != nil { + return "", err + } + rawPath = wd + } + sourcePath, err := workspace.ResolveSourcePath(rawPath) + if err != nil { + return "", err + } + repoRoot, err := d.repoInspector.GitTrimmedOutput(ctx, sourcePath, "rev-parse", "--show-toplevel") + if err != nil { + return "", fmt.Errorf("%s is not inside a git repository", sourcePath) + } + isBare, err := d.repoInspector.GitTrimmedOutput(ctx, repoRoot, "rev-parse", "--is-bare-repository") + if err != nil { + return "", fmt.Errorf("inspect git repository %s: %w", repoRoot, err) + } + if isBare == "true" { + return "", fmt.Errorf("vm run requires a non-bare git repository: %s", repoRoot) + } + submodules, err := d.repoInspector.ListSubmodules(ctx, repoRoot) + if err != nil { + return "", err + } + if len(submodules) > 0 { + return "", fmt.Errorf("vm run does not support git submodules in %s (%s); use `vm create` + `vm workspace prepare --mode full_copy`", repoRoot, strings.Join(submodules, ", ")) + } + return sourcePath, nil +} + +// repoHasMiseFiles reports whether the repo at sourcePath contains a +// mise tooling manifest. Used as a host-side preflight: when --nat is +// off and a manifest is present, vm run refuses early instead of +// committing to a VM that will silently fail to install tools. +func repoHasMiseFiles(sourcePath string) (bool, error) { + for _, name := range []string{".mise.toml", ".tool-versions"} { + info, err := os.Stat(filepath.Join(sourcePath, name)) + if err == nil && !info.IsDir() { + return true, nil + } + if err != nil && !errors.Is(err, os.ErrNotExist) { + return false, fmt.Errorf("inspect %s: %w", name, err) + } + } + return false, nil +} + +// splitVMRunArgs partitions cobra positional args into the optional path +// argument and the trailing command (everything after a `--` separator). 
+// The path slice may contain 0..1 entries; the command slice may be empty. +func splitVMRunArgs(cmd *cobra.Command, args []string) (pathArgs, commandArgs []string) { + dash := cmd.ArgsLenAtDash() + if dash < 0 { + return args, nil + } + if dash > len(args) { + dash = len(args) + } + return args[:dash], args[dash:] +} + +// runVMRun orchestrates the full `vm run` flow: create the VM, wait +// for guest ssh, optionally materialise a workspace and kick off the +// tooling bootstrap, then either attach interactively or run the +// user's command and propagate its exit status. +func (d *deps) runVMRun(ctx context.Context, socketPath string, cfg model.DaemonConfig, stdin io.Reader, stdout, stderr io.Writer, params api.VMCreateParams, repo *vmRunRepo, command []string, removeOnExit, detach, skipBootstrap, verbose bool) error { + if repo != nil && !skipBootstrap && !params.NATEnabled { + hasMise, err := repoHasMiseFiles(repo.sourcePath) + if err != nil { + return err + } + if hasMise { + return errors.New("tooling bootstrap requires --nat (or pass --no-bootstrap to skip)") + } + } + progress := newVMRunProgressRenderer(stderr, verbose) + defer progress.clear() + vm, err := d.runVMCreate(ctx, socketPath, stderr, params, verbose) + if err != nil { + return err + } + vmRef := strings.TrimSpace(vm.Name) + if vmRef == "" { + vmRef = shortID(vm.ID) + } + // --rm cleanup is wired AFTER ssh is confirmed. An ssh-wait + // timeout leaves the VM alive for `vm logs` inspection (our + // error message tells the user that); the cleanup only fires + // once the session phase runs. + shouldRemove := false + if removeOnExit { + defer func() { + if !shouldRemove { + return + } + // Use a fresh context so Ctrl-C during the session + // doesn't abort the delete RPC. 
+ cleanupCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second) + defer cancel() + if err := d.vmDelete(cleanupCtx, socketPath, vmRef); err != nil { + progress.clear() + printVMRunWarning(stderr, fmt.Sprintf("--rm cleanup failed: %v (leaked vm %q; delete manually)", err, vmRef)) + } else if err := removeUserKnownHosts(vm); err != nil { + progress.clear() + printVMRunWarning(stderr, fmt.Sprintf("known_hosts cleanup failed: %v", err)) + } + }() + } + sshAddress := net.JoinHostPort(vm.Runtime.GuestIP, "22") + progress.render("waiting for guest ssh") + sshCtx, cancelSSH := context.WithTimeout(ctx, vmRunSSHTimeout) + if err := d.guestWaitForSSH(sshCtx, sshAddress, cfg.SSHKeyPath, 250*time.Millisecond); err != nil { + cancelSSH() + // Surface parent-context cancellation (Ctrl-C, caller + // timeout) as-is. Only the guest-side timeout needs the + // actionable hint. + if errors.Is(ctx.Err(), context.Canceled) || errors.Is(ctx.Err(), context.DeadlineExceeded) { + return fmt.Errorf("vm %q: %w", vmRef, ctx.Err()) + } + return fmt.Errorf( + "vm %q is running but guest ssh did not come up within %s. "+ + "sshd is the likely suspect — inspect the guest console with "+ + "`banger vm logs %s` (look for `Failed to start ssh.service`). "+ + "The VM is still alive; leave it for inspection or remove with `banger vm delete %s`. "+ + "underlying error: %w", + vmRef, vmRunSSHTimeout, vmRef, vmRef, err, + ) + } + cancelSSH() + shouldRemove = removeOnExit + if repo != nil { + progress.render("preparing guest workspace") + // --from is only meaningful paired with --branch; the daemon + // rejects "from without branch" outright. Our flag default is + // "HEAD" (useful only when --branch is set), so scrub it when + // branch is empty to avoid a false "workspace from requires + // branch" error. 
+ fromRef := "" + if strings.TrimSpace(repo.branchName) != "" { + fromRef = repo.fromRef + } + if !repo.includeUntracked { + progress.clear() + d.noteUntrackedSkipped(ctx, stderr, repo.sourcePath) + } + prepared, err := d.vmWorkspacePrepare(ctx, socketPath, api.VMWorkspacePrepareParams{ + IDOrName: vmRef, + SourcePath: repo.sourcePath, + GuestPath: vmRunGuestDir(), + Branch: repo.branchName, + From: fromRef, + Mode: string(model.WorkspacePrepareModeShallowOverlay), + IncludeUntracked: repo.includeUntracked, + }) + if err != nil { + return fmt.Errorf("vm %q is running but workspace prepare failed: %w", vmRef, err) + } + // The prepare RPC already did the full git inspection on the + // daemon side; grab what the tooling harness needs from its + // result instead of re-inspecting here. + if len(command) == 0 && !skipBootstrap { + client, err := d.guestDial(ctx, sshAddress, cfg.SSHKeyPath) + if err != nil { + return fmt.Errorf("vm %q is running but guest ssh is unavailable: %w", vmRef, err) + } + if err := d.startVMRunToolingHarness(ctx, client, prepared.Workspace.RepoRoot, prepared.Workspace.RepoName, progress, detach, stderr); err != nil { + progress.clear() + printVMRunWarning(stderr, fmt.Sprintf("guest tooling bootstrap start failed: %v", err)) + } + _ = client.Close() + } + } + if detach { + progress.commitLine(fmt.Sprintf("vm %s running; reconnect with: banger vm ssh %s", vmRef, vmRef)) + return nil + } + sshArgs, err := sshCommandArgs(cfg, vm.Runtime.GuestIP, command) + if err != nil { + return fmt.Errorf("vm %q is running but ssh args could not be built: %w", vmRef, err) + } + if len(command) > 0 { + progress.render("running command in guest") + progress.clear() + if err := d.sshExec(ctx, stdin, stdout, stderr, sshArgs); err != nil { + var exitErr *exec.ExitError + if errors.As(err, &exitErr) { + return ExitCodeError{Code: exitErr.ExitCode()} + } + return err + } + return nil + } + progress.render("attaching to guest") + progress.clear() + return 
d.runSSHSession(ctx, socketPath, vmRef, stdin, stdout, stderr, sshArgs, removeOnExit) +} + +func vmRunGuestDir() string { + return "/root/repo" +} + +func vmRunToolingHarnessPath(repoName string) string { + return filepath.ToSlash(filepath.Join("/tmp", "banger-vm-run-tooling-"+repoName+".sh")) +} + +func vmRunToolingHarnessLogPath(repoName string) string { + return filepath.ToSlash(filepath.Join("/root/.cache/banger", "vm-run-tooling-"+repoName+".log")) +} + +// startVMRunToolingHarness uploads + launches the mise bootstrap +// script inside the guest. repoRoot / repoName both come from the +// daemon's workspace.prepare RPC response so the CLI doesn't have +// to re-inspect the git tree. +// +// When wait is true (used by --detach), the harness runs in the +// foreground so the CLI can return only after bootstrap finishes; +// the harness's stdout is streamed to syncOut for live visibility. +// When wait is false (interactive mode), the harness is nohup'd so +// the user's ssh session can start while bootstrap continues. 
+func (d *deps) startVMRunToolingHarness(ctx context.Context, client vmRunGuestClient, repoRoot, repoName string, progress *vmRunProgressRenderer, wait bool, syncOut io.Writer) error { + if progress != nil { + progress.render("starting guest tooling bootstrap") + } + plan := d.buildVMRunToolingPlan(ctx, repoRoot) + var uploadLog bytes.Buffer + if err := client.UploadFile(ctx, vmRunToolingHarnessPath(repoName), 0o755, []byte(vmRunToolingHarnessScript(plan)), &uploadLog); err != nil { + return formatVMRunStepError("upload guest tooling bootstrap", err, uploadLog.String()) + } + if wait { + var launchLog bytes.Buffer + out := io.Writer(&launchLog) + if syncOut != nil { + out = io.MultiWriter(syncOut, &launchLog) + } + if err := client.RunScript(ctx, vmRunToolingHarnessSyncScript(repoName), out); err != nil { + return formatVMRunStepError("run guest tooling bootstrap", err, launchLog.String()) + } + if progress != nil { + progress.render("guest tooling bootstrap done (log: " + vmRunToolingHarnessLogPath(repoName) + ")") + } + return nil + } + var launchLog bytes.Buffer + if err := client.RunScript(ctx, vmRunToolingHarnessLaunchScript(repoName), &launchLog); err != nil { + return formatVMRunStepError("launch guest tooling bootstrap", err, launchLog.String()) + } + if progress != nil { + progress.render("guest tooling log: " + vmRunToolingHarnessLogPath(repoName)) + } + return nil +} + +func vmRunToolingHarnessScript(plan toolingplan.Plan) string { + var script strings.Builder + script.WriteString("set -uo pipefail\n") + fmt.Fprintf(&script, "DIR=%s\n", shellQuote(vmRunGuestDir())) + script.WriteString("export PATH=/usr/local/bin:/root/.local/share/mise/shims:$PATH\n") + script.WriteString("if [ -f /etc/profile.d/mise.sh ]; then . 
/etc/profile.d/mise.sh || true; fi\n") + script.WriteString("log() { printf '%s\\n' \"$*\"; }\n") + script.WriteString("run_best_effort() {\n") + script.WriteString(" \"$@\"\n") + script.WriteString(" rc=$?\n") + script.WriteString(" if [ \"$rc\" -ne 0 ]; then\n") + script.WriteString(" log \"command failed ($rc): $*\"\n") + script.WriteString(" fi\n") + script.WriteString(" return 0\n") + script.WriteString("}\n") + script.WriteString("run_bounded_best_effort() {\n") + script.WriteString(" timeout_secs=\"$1\"\n") + script.WriteString(" shift\n") + script.WriteString(" timeout_marker=\"$(mktemp)\"\n") + script.WriteString(" rm -f \"$timeout_marker\"\n") + script.WriteString(" \"$@\" &\n") + script.WriteString(" cmd_pid=$!\n") + script.WriteString(" (\n") + script.WriteString(" sleep \"$timeout_secs\"\n") + script.WriteString(" if kill -0 \"$cmd_pid\" 2>/dev/null; then\n") + script.WriteString(" : >\"$timeout_marker\"\n") + script.WriteString(" log \"command timed out after ${timeout_secs}s: $*\"\n") + script.WriteString(" kill -TERM \"$cmd_pid\" 2>/dev/null || true\n") + script.WriteString(" if command -v pkill >/dev/null 2>&1; then pkill -TERM -P \"$cmd_pid\" 2>/dev/null || true; fi\n") + script.WriteString(" sleep 2\n") + script.WriteString(" kill -KILL \"$cmd_pid\" 2>/dev/null || true\n") + script.WriteString(" if command -v pkill >/dev/null 2>&1; then pkill -KILL -P \"$cmd_pid\" 2>/dev/null || true; fi\n") + script.WriteString(" fi\n") + script.WriteString(" ) &\n") + script.WriteString(" watchdog_pid=$!\n") + script.WriteString(" wait \"$cmd_pid\"\n") + script.WriteString(" rc=$?\n") + script.WriteString(" kill \"$watchdog_pid\" 2>/dev/null || true\n") + script.WriteString(" wait \"$watchdog_pid\" 2>/dev/null || true\n") + script.WriteString(" if [ -f \"$timeout_marker\" ]; then\n") + script.WriteString(" rm -f \"$timeout_marker\"\n") + script.WriteString(" return 0\n") + script.WriteString(" fi\n") + script.WriteString(" rm -f \"$timeout_marker\"\n") + 
script.WriteString(" if [ \"$rc\" -ne 0 ]; then\n") + script.WriteString(" log \"command failed ($rc): $*\"\n") + script.WriteString(" fi\n") + script.WriteString(" return 0\n") + script.WriteString("}\n") + script.WriteString("cd \"$DIR\" || { log \"missing repo directory: $DIR\"; exit 0; }\n") + script.WriteString("MISE_BIN=\"$(command -v mise || true)\"\n") + script.WriteString("if [ -z \"$MISE_BIN\" ]; then log \"mise not found; skipping guest tooling bootstrap\"; exit 0; fi\n") + script.WriteString("log \"starting guest tooling bootstrap in $DIR\"\n") + if len(plan.RepoManagedTools) > 0 { + fmt.Fprintf(&script, "log %s\n", shellQuote("repo-managed mise tools: "+strings.Join(plan.RepoManagedTools, ", "))) + } + script.WriteString("if [ -f .mise.toml ] || [ -f .tool-versions ]; then\n") + script.WriteString(" log \"running mise install from repo declarations\"\n") + script.WriteString(" run_best_effort \"$MISE_BIN\" install\n") + script.WriteString("fi\n") + fmt.Fprintf(&script, "INSTALL_TIMEOUT_SECS=%d\n", vmRunToolingInstallTimeoutSeconds) + for _, step := range plan.Steps { + stepLabel := fmt.Sprintf("deterministic install: %s@%s (%s)", step.Tool, step.Version, step.Source) + fmt.Fprintf(&script, "log %s\n", shellQuote(stepLabel)) + fmt.Fprintf(&script, "run_bounded_best_effort \"$INSTALL_TIMEOUT_SECS\" \"$MISE_BIN\" use -g --pin %s\n", shellQuote(step.Tool+"@"+step.Version)) + } + for _, skip := range plan.Skips { + skipLabel := fmt.Sprintf("deterministic skip: %s (%s)", skip.Target, skip.Reason) + fmt.Fprintf(&script, "log %s\n", shellQuote(skipLabel)) + } + if len(plan.Steps) > 0 { + script.WriteString("run_best_effort \"$MISE_BIN\" reshim\n") + } + script.WriteString("log \"guest tooling bootstrap finished\"\n") + return script.String() +} + +func vmRunToolingHarnessLaunchScript(repoName string) string { + var script strings.Builder + script.WriteString("set -euo pipefail\n") + fmt.Fprintf(&script, "HELPER=%s\n", 
shellQuote(vmRunToolingHarnessPath(repoName))) + fmt.Fprintf(&script, "LOG=%s\n", shellQuote(vmRunToolingHarnessLogPath(repoName))) + script.WriteString("mkdir -p \"$(dirname \"$LOG\")\"\n") + script.WriteString("nohup bash \"$HELPER\" >\"$LOG\" 2>&1 &\n") + return script.String() +} + +func formatVMRunStepError(action string, err error, log string) error { + log = strings.TrimSpace(log) + if log == "" { + return fmt.Errorf("%s: %w", action, err) + } + return fmt.Errorf("%s: %w: %s", action, err, log) +} + +type vmRunProgressRenderer struct { + out io.Writer + enabled bool + inline bool + active bool + lastLine string +} + +// newVMRunProgressRenderer wires up progress for `vm run`. Unlike the +// vm_create renderer, this one emits in line mode even on non-TTY +// writers (covers tests and piped output that the existing tooling +// already parses); inline mode kicks in only when stderr is a TTY, +// verbose is unset, and BANGER_NO_PROGRESS is unset. +func newVMRunProgressRenderer(out io.Writer, verbose bool) *vmRunProgressRenderer { + if out == nil { + return &vmRunProgressRenderer{} + } + return &vmRunProgressRenderer{ + out: out, + enabled: true, + inline: writerSupportsProgress(out) && !verbose && !progressDisabledByEnv(), + } +} + +func (r *vmRunProgressRenderer) render(detail string) { + if r == nil || !r.enabled { + return + } + line := formatVMRunProgress(detail) + if line == "" || line == r.lastLine { + return + } + r.lastLine = line + if r.inline { + _, _ = fmt.Fprint(r.out, "\r\x1b[K", line) + r.active = true + return + } + _, _ = fmt.Fprintln(r.out, line) +} + +// clear erases the live inline line so the caller can write a clean +// terminating message (warning, ssh attach, command output). No-op +// outside inline mode.
+func (r *vmRunProgressRenderer) clear() { + if r == nil || !r.enabled || !r.inline || !r.active { + return + } + _, _ = fmt.Fprint(r.out, "\r\x1b[K") + r.active = false + r.lastLine = "" +} + +// commitLine prints detail as a final, persistent line. In inline +// mode it overwrites the live status; in line mode it just appends. +// Used for terminal messages like the --detach hand-off summary. +func (r *vmRunProgressRenderer) commitLine(detail string) { + if r == nil || !r.enabled { + return + } + line := formatVMRunProgress(detail) + if line == "" { + return + } + if r.inline { + _, _ = fmt.Fprint(r.out, "\r\x1b[K", line, "\n") + r.active = false + r.lastLine = "" + return + } + if line == r.lastLine { + return + } + r.lastLine = line + _, _ = fmt.Fprintln(r.out, line) +} + +func formatVMRunProgress(detail string) string { + detail = strings.TrimSpace(detail) + if detail == "" { + return "" + } + return "[vm run] " + detail +} + +func printVMRunWarning(out io.Writer, detail string) { + detail = strings.TrimSpace(detail) + if out == nil || detail == "" { + return + } + _, _ = fmt.Fprintln(out, "[vm run] warning: "+detail) +} diff --git a/internal/cli/vm_run_test.go b/internal/cli/vm_run_test.go new file mode 100644 index 0000000..cab4f5d --- /dev/null +++ b/internal/cli/vm_run_test.go @@ -0,0 +1,278 @@ +package cli + +import ( + "bytes" + "context" + "io" + "os" + "path/filepath" + "strings" + "testing" + "time" + + "banger/internal/api" + "banger/internal/model" + "banger/internal/toolingplan" +) + +func TestVMRunRejectsDetachWithRm(t *testing.T) { + cmd := NewBangerCommand() + cmd.SetArgs([]string{"vm", "run", "-d", "--rm"}) + + err := cmd.Execute() + if err == nil || !strings.Contains(err.Error(), "cannot combine --detach with --rm") { + t.Fatalf("Execute() error = %v, want --detach + --rm rejection", err) + } +} + +func TestVMRunRejectsDetachWithCommand(t *testing.T) { + cmd := NewBangerCommand() + cmd.SetArgs([]string{"vm", "run", "-d", "--", "whoami"}) + + 
err := cmd.Execute() + if err == nil || !strings.Contains(err.Error(), "cannot combine --detach with a guest command") { + t.Fatalf("Execute() error = %v, want --detach + command rejection", err) + } +} + +func TestRepoHasMiseFiles(t *testing.T) { + dir := t.TempDir() + got, err := repoHasMiseFiles(dir) + if err != nil { + t.Fatalf("repoHasMiseFiles(empty): %v", err) + } + if got { + t.Fatalf("repoHasMiseFiles(empty) = true, want false") + } + + if err := os.WriteFile(filepath.Join(dir, ".mise.toml"), []byte(""), 0o600); err != nil { + t.Fatalf("write .mise.toml: %v", err) + } + got, err = repoHasMiseFiles(dir) + if err != nil { + t.Fatalf("repoHasMiseFiles(.mise.toml): %v", err) + } + if !got { + t.Fatalf("repoHasMiseFiles(.mise.toml) = false, want true") + } + + dir2 := t.TempDir() + if err := os.WriteFile(filepath.Join(dir2, ".tool-versions"), []byte(""), 0o600); err != nil { + t.Fatalf("write .tool-versions: %v", err) + } + got, err = repoHasMiseFiles(dir2) + if err != nil { + t.Fatalf("repoHasMiseFiles(.tool-versions): %v", err) + } + if !got { + t.Fatalf("repoHasMiseFiles(.tool-versions) = false, want true") + } +} + +// runVMRunDepsRunningVM returns a deps wired so runVMRun reaches a +// point where it would create a VM and proceed — used by precondition +// tests that should refuse before any of these fakes get called. 
+func runVMRunDepsRunningVM(t *testing.T) (*deps, *model.VMRecord) { + t.Helper() + d := defaultDeps() + vm := &model.VMRecord{ + ID: "vm-id", + Name: "devbox", + Runtime: model.VMRuntime{ + State: model.VMStateRunning, + GuestIP: "172.16.0.2", + DNSName: "devbox.vm", + }, + } + d.vmCreateBegin = func(context.Context, string, api.VMCreateParams) (api.VMCreateBeginResult, error) { + return api.VMCreateBeginResult{Operation: api.VMCreateOperation{ID: "op-1", Stage: "ready", Done: true, Success: true, VM: vm}}, nil + } + d.guestWaitForSSH = func(context.Context, string, string, time.Duration) error { return nil } + d.vmWorkspacePrepare = func(context.Context, string, api.VMWorkspacePrepareParams) (api.VMWorkspacePrepareResult, error) { + return api.VMWorkspacePrepareResult{Workspace: model.WorkspacePrepareResult{VMID: vm.ID, GuestPath: "/root/repo", RepoName: "repo", RepoRoot: "/tmp/repo"}}, nil + } + d.buildVMRunToolingPlan = func(context.Context, string) toolingplan.Plan { + return toolingplan.Plan{} + } + d.vmHealth = func(context.Context, string, string) (api.VMHealthResult, error) { + return api.VMHealthResult{Healthy: true}, nil + } + d.sshExec = func(context.Context, io.Reader, io.Writer, io.Writer, []string) error { return nil } + return d, vm +} + +func TestRunVMRunRefusesBootstrapWithoutNAT(t *testing.T) { + repoRoot := t.TempDir() + if err := os.WriteFile(filepath.Join(repoRoot, ".mise.toml"), []byte(""), 0o600); err != nil { + t.Fatalf("write .mise.toml: %v", err) + } + + d := defaultDeps() + d.vmCreateBegin = func(context.Context, string, api.VMCreateParams) (api.VMCreateBeginResult, error) { + t.Fatal("vmCreateBegin should not be called when NAT precondition refuses") + return api.VMCreateBeginResult{}, nil + } + + repo := vmRunRepo{sourcePath: repoRoot} + var stdout, stderr bytes.Buffer + err := d.runVMRun( + context.Background(), + "/tmp/bangerd.sock", + model.DaemonConfig{SSHKeyPath: "/tmp/id_ed25519"}, + strings.NewReader(""), + &stdout, &stderr, + 
api.VMCreateParams{Name: "devbox", NATEnabled: false}, + &repo, + nil, + false, false, false, false, + ) + if err == nil || !strings.Contains(err.Error(), "tooling bootstrap requires --nat") { + t.Fatalf("runVMRun = %v, want NAT precondition refusal", err) + } +} + +func TestRunVMRunBootstrapPreconditionRespectsNoBootstrap(t *testing.T) { + repoRoot := t.TempDir() + if err := os.WriteFile(filepath.Join(repoRoot, ".mise.toml"), []byte(""), 0o600); err != nil { + t.Fatalf("write .mise.toml: %v", err) + } + + d, _ := runVMRunDepsRunningVM(t) + dialed := false + d.guestDial = func(context.Context, string, string) (vmRunGuestClient, error) { + dialed = true + return &testVMRunGuestClient{}, nil + } + + repo := vmRunRepo{sourcePath: repoRoot} + var stdout, stderr bytes.Buffer + err := d.runVMRun( + context.Background(), + "/tmp/bangerd.sock", + model.DaemonConfig{SSHKeyPath: "/tmp/id_ed25519"}, + strings.NewReader(""), + &stdout, &stderr, + api.VMCreateParams{Name: "devbox", NATEnabled: false}, + &repo, + nil, + false, false, true, false, // skipBootstrap = true + ) + if err != nil { + t.Fatalf("runVMRun: %v", err) + } + if dialed { + t.Fatal("guestDial should not be called when --no-bootstrap is set") + } +} + +func TestRunVMRunBootstrapPreconditionPassesWithoutMiseFiles(t *testing.T) { + repoRoot := t.TempDir() // empty repo, no mise files + + d, _ := runVMRunDepsRunningVM(t) + dialed := false + d.guestDial = func(context.Context, string, string) (vmRunGuestClient, error) { + dialed = true + return &testVMRunGuestClient{}, nil + } + + repo := vmRunRepo{sourcePath: repoRoot} + var stdout, stderr bytes.Buffer + err := d.runVMRun( + context.Background(), + "/tmp/bangerd.sock", + model.DaemonConfig{SSHKeyPath: "/tmp/id_ed25519"}, + strings.NewReader(""), + &stdout, &stderr, + api.VMCreateParams{Name: "devbox", NATEnabled: false}, + &repo, + nil, + false, false, false, false, + ) + if err != nil { + t.Fatalf("runVMRun: %v", err) + } + // Bootstrap dispatch happens (no mise 
file gating) but dial still + // gets called because the harness pipeline runs. + if !dialed { + t.Fatal("guestDial should be called for bootstrap dispatch") + } +} + +func TestRunVMRunDetachSkipsSshAttach(t *testing.T) { + d, _ := runVMRunDepsRunningVM(t) + d.guestDial = func(context.Context, string, string) (vmRunGuestClient, error) { + return &testVMRunGuestClient{}, nil + } + sshExecCalls := 0 + d.sshExec = func(context.Context, io.Reader, io.Writer, io.Writer, []string) error { + sshExecCalls++ + return nil + } + + var stdout, stderr bytes.Buffer + err := d.runVMRun( + context.Background(), + "/tmp/bangerd.sock", + model.DaemonConfig{SSHKeyPath: "/tmp/id_ed25519"}, + strings.NewReader(""), + &stdout, &stderr, + api.VMCreateParams{Name: "devbox"}, + nil, // bare mode + nil, // no command + false, true, false, false, // detach = true + ) + if err != nil { + t.Fatalf("runVMRun: %v", err) + } + if sshExecCalls != 0 { + t.Fatalf("sshExec called %d times, want 0 in detach mode", sshExecCalls) + } + if !strings.Contains(stderr.String(), "reconnect with: banger vm ssh devbox") { + t.Fatalf("stderr = %q, want reconnect hint", stderr.String()) + } +} + +func TestRunVMRunDetachUsesSyncBootstrapPath(t *testing.T) { + repoRoot := t.TempDir() + + d, _ := runVMRunDepsRunningVM(t) + fakeClient := &testVMRunGuestClient{} + d.guestDial = func(context.Context, string, string) (vmRunGuestClient, error) { + return fakeClient, nil + } + sshExecCalls := 0 + d.sshExec = func(context.Context, io.Reader, io.Writer, io.Writer, []string) error { + sshExecCalls++ + return nil + } + + repo := vmRunRepo{sourcePath: repoRoot} + var stdout, stderr bytes.Buffer + err := d.runVMRun( + context.Background(), + "/tmp/bangerd.sock", + model.DaemonConfig{SSHKeyPath: "/tmp/id_ed25519"}, + strings.NewReader(""), + &stdout, &stderr, + api.VMCreateParams{Name: "devbox", NATEnabled: true}, + &repo, + nil, + false, true, false, false, // detach = true + ) + if err != nil { + t.Fatalf("runVMRun: %v", err) 
+ } + if sshExecCalls != 0 { + t.Fatalf("sshExec called %d times, want 0 in detach mode", sshExecCalls) + } + if len(fakeClient.uploads) != 1 { + t.Fatalf("uploads = %d, want 1 (harness upload)", len(fakeClient.uploads)) + } + // Sync mode should invoke the tee'd wrapper, not the nohup launcher. + if strings.Contains(fakeClient.launchScript, "nohup") { + t.Fatalf("detach mode should not use nohup launcher; got: %q", fakeClient.launchScript) + } + if !strings.Contains(fakeClient.launchScript, "tee") { + t.Fatalf("detach mode should tee output to log; got: %q", fakeClient.launchScript) + } +} diff --git a/internal/cli/vm_spec_test.go b/internal/cli/vm_spec_test.go new file mode 100644 index 0000000..50614fd --- /dev/null +++ b/internal/cli/vm_spec_test.go @@ -0,0 +1,53 @@ +package cli + +import ( + "bytes" + "strings" + "testing" + + "banger/internal/api" +) + +func TestPrintVMSpecLineWithAllFields(t *testing.T) { + vcpu, mem := 2, 2048 + params := api.VMCreateParams{ + VCPUCount: &vcpu, + MemoryMiB: &mem, + WorkDiskSize: "8G", + } + var buf bytes.Buffer + printVMSpecLine(&buf, params) + got := buf.String() + for _, want := range []string{"spec:", "2 vcpu", "2048 MiB", "8G"} { + if !strings.Contains(got, want) { + t.Errorf("output missing %q:\n%s", want, got) + } + } + if !strings.HasSuffix(got, "\n") { + t.Error("spec line should terminate with newline") + } +} + +func TestPrintVMSpecLineFallsBackToBuiltinsOnNilFields(t *testing.T) { + // Empty params — the printer reaches for DefaultVCPUCount / + // DefaultMemoryMiB / DefaultWorkDiskSize so output is still sane. + var buf bytes.Buffer + printVMSpecLine(&buf, api.VMCreateParams{}) + got := buf.String() + // Not asserting exact values — just that it produced a plausible + // line with the three labels. 
+ for _, want := range []string{"spec:", "vcpu", "MiB", "disk"} { + if !strings.Contains(got, want) { + t.Errorf("output missing %q:\n%s", want, got) + } + } +} + +func TestPrintVMSpecLineIgnoresUnparseableDiskSize(t *testing.T) { + // Falls back to builtin default; must not panic or print garbage. + var buf bytes.Buffer + printVMSpecLine(&buf, api.VMCreateParams{WorkDiskSize: "not-a-size"}) + if !strings.Contains(buf.String(), "spec:") { + t.Errorf("expected spec line even with bad input, got %q", buf.String()) + } +} diff --git a/internal/cli/workspace_preview.go b/internal/cli/workspace_preview.go new file mode 100644 index 0000000..956d6ea --- /dev/null +++ b/internal/cli/workspace_preview.go @@ -0,0 +1,61 @@ +package cli + +import ( + "context" + "fmt" + "io" +) + +// runWorkspaceDryRun inspects the local repo at resolvedPath and +// prints the file list that `vm run` / `workspace prepare` would ship +// into the guest. Runs on the CLI side (no daemon RPC needed) since +// the daemon is always local and the workspace inspection is a pure +// git read. Git calls go through d.repoInspector so tests inject a +// stub Runner via the deps struct instead of touching package globals. 
+func (d *deps) runWorkspaceDryRun(ctx context.Context, out io.Writer, resolvedPath, branchName, fromRef string, includeUntracked bool) error { + spec, err := d.repoInspector.InspectRepo(ctx, resolvedPath, branchName, fromRef, includeUntracked) + if err != nil { + return err + } + fmt.Fprintf(out, "dry-run: %d file(s) would be copied to guest\n", len(spec.OverlayPaths)) + fmt.Fprintf(out, "repo: %s\n", spec.RepoRoot) + if includeUntracked { + fmt.Fprintln(out, "mode: tracked + untracked non-ignored (--include-untracked)") + } else { + fmt.Fprintln(out, "mode: tracked only (re-run with --include-untracked to also copy untracked non-ignored files)") + } + fmt.Fprintln(out, "---") + for _, path := range spec.OverlayPaths { + fmt.Fprintln(out, path) + } + if !includeUntracked { + d.noteUntrackedSkipped(ctx, out, spec.RepoRoot) + } + return nil +} + +// noteUntrackedSkipped prints a one-line notice when the repo holds +// untracked non-ignored files that will NOT be copied because +// --include-untracked was not passed. +// +// Best-effort: if sourcePath isn't inside a git repo, or git errors, +// or there are no untracked files, the helper stays silent. The +// notice is a courtesy — failing the whole operation over a courtesy +// would be worse than the notice being missing. +// +// Resolves sourcePath to the repo root internally via `git rev-parse +// --show-toplevel` so callers can pass whatever path the user typed. +// Before this helper normalised, subdir inputs ran `ls-files +// --others` scoped to the subdir, which silently underreported the +// skipped files the user needed to know about. 
+func (d *deps) noteUntrackedSkipped(ctx context.Context, out io.Writer, sourcePath string) { + repoRoot, err := d.repoInspector.GitTrimmedOutput(ctx, sourcePath, "rev-parse", "--show-toplevel") + if err != nil || repoRoot == "" { + return + } + count, err := d.repoInspector.CountUntrackedPaths(ctx, repoRoot) + if err != nil || count == 0 { + return + } + fmt.Fprintf(out, "---\nnote: %d untracked non-ignored file(s) were NOT copied (git-tracked files only by default — pass --include-untracked to include them)\n", count) +} diff --git a/internal/cli/workspace_preview_test.go b/internal/cli/workspace_preview_test.go new file mode 100644 index 0000000..74cac66 --- /dev/null +++ b/internal/cli/workspace_preview_test.go @@ -0,0 +1,120 @@ +package cli + +import ( + "bytes" + "context" + "os" + "os/exec" + "path/filepath" + "strings" + "testing" + + "banger/internal/daemon/workspace" +) + +// seedRepoWithSubdir creates a git repo with one tracked file, and an +// untracked non-ignored file at the repo root (not under the subdir). +// Returns the repo root and the subdir path. +func seedRepoWithSubdir(t *testing.T) (repoRoot, subDir string) { + t.Helper() + if _, err := exec.LookPath("git"); err != nil { + t.Skipf("git not on PATH: %v", err) + } + repoRoot = t.TempDir() + run := func(args ...string) { + t.Helper() + cmd := exec.Command(args[0], args[1:]...) 
+ cmd.Dir = repoRoot + cmd.Env = append(os.Environ(), + "GIT_AUTHOR_NAME=t", "GIT_AUTHOR_EMAIL=t@t", + "GIT_COMMITTER_NAME=t", "GIT_COMMITTER_EMAIL=t@t", + "GIT_CONFIG_GLOBAL=/dev/null", + ) + if out, err := cmd.CombinedOutput(); err != nil { + t.Fatalf("%v: %v\n%s", args, err, out) + } + } + writeFile := func(relPath, content string) { + t.Helper() + full := filepath.Join(repoRoot, relPath) + if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil { + t.Fatal(err) + } + if err := os.WriteFile(full, []byte(content), 0o644); err != nil { + t.Fatal(err) + } + } + run("git", "init", "-q", "-b", "main") + run("git", "config", "commit.gpgsign", "false") + writeFile("tracked.md", "hello\n") + writeFile("sub/kept.txt", "kept\n") + run("git", "add", ".") + run("git", "commit", "-q", "-m", "init") + // Untracked non-ignored file at the ROOT — not under sub/. This is + // what the pre-fix noteUntrackedSkipped would miss when the user + // passed sub/ as the workspace source. + writeFile("ROOT-SECRET.env", "TOKEN=abc\n") + subDir = filepath.Join(repoRoot, "sub") + return repoRoot, subDir +} + +// TestNoteUntrackedSkippedCountsRepoWideEvenFromSubdir pins the bug +// fix: when the user passes a subdirectory of a repo as the workspace +// source, the untracked-files notice must still reflect what will +// actually be skipped at the guest-shipping layer — which is a +// repo-wide concern. Before the fix the helper ran `git -C +// ls-files --others --exclude-standard`, which only sees files under +// the subdir, silently underreporting the real skip count. 
+func TestNoteUntrackedSkippedCountsRepoWideEvenFromSubdir(t *testing.T) { + _, subDir := seedRepoWithSubdir(t) + + d := defaultDeps() + d.repoInspector = workspace.NewInspector() + + var out bytes.Buffer + d.noteUntrackedSkipped(context.Background(), &out, subDir) + + got := out.String() + if !strings.Contains(got, "1 untracked") { + t.Fatalf("note = %q, want mention of 1 untracked file (the root-level ROOT-SECRET.env)", got) + } +} + +// TestNoteUntrackedSkippedSilentOutsideRepo verifies the best-effort +// contract: when sourcePath is not inside any git repo, the helper +// prints nothing and does not error. Callers rely on this so a user +// who points vm run at an ad-hoc directory (or an export tarball +// that's been unpacked) doesn't get the whole operation aborted +// over a courtesy notice. +func TestNoteUntrackedSkippedSilentOutsideRepo(t *testing.T) { + d := defaultDeps() + d.repoInspector = workspace.NewInspector() + + nonRepo := t.TempDir() + var out bytes.Buffer + d.noteUntrackedSkipped(context.Background(), &out, nonRepo) + + if got := out.String(); got != "" { + t.Fatalf("note = %q, want no output outside a git repo", got) + } +} + +// TestNoteUntrackedSkippedSwallowsInspectorErrors verifies that a +// runner that errors on every call produces no output and no panic. +// This is the other half of best-effort: even if git-the-binary is +// somehow broken or missing, the live flow keeps running.
+func TestNoteUntrackedSkippedSwallowsInspectorErrors(t *testing.T) { + d := defaultDeps() + d.repoInspector = &workspace.Inspector{ + Runner: func(context.Context, string, ...string) ([]byte, error) { + return nil, &exec.Error{Name: "git", Err: exec.ErrNotFound} + }, + } + + var out bytes.Buffer + d.noteUntrackedSkipped(context.Background(), &out, t.TempDir()) + if got := out.String(); got != "" { + t.Fatalf("note = %q, want silence when inspector runner errors", got) + } +} diff --git a/internal/config/config.go b/internal/config/config.go index bfaf926..48670cd 100644 --- a/internal/config/config.go +++ b/internal/config/config.go @@ -5,6 +5,7 @@ import ( "crypto/rand" "crypto/x509" "encoding/pem" + "fmt" "os" "path/filepath" "strings" @@ -19,34 +20,71 @@ import ( ) type fileConfig struct { - LogLevel string `toml:"log_level"` - WebListenAddr *string `toml:"web_listen_addr"` - FirecrackerBin string `toml:"firecracker_bin"` - SSHKeyPath string `toml:"ssh_key_path"` - DefaultImageName string `toml:"default_image_name"` - AutoStopStaleAfter string `toml:"auto_stop_stale_after"` - StatsPollInterval string `toml:"stats_poll_interval"` - MetricsPoll string `toml:"metrics_poll_interval"` - BridgeName string `toml:"bridge_name"` - BridgeIP string `toml:"bridge_ip"` - CIDR string `toml:"cidr"` - TapPoolSize int `toml:"tap_pool_size"` - DefaultDNS string `toml:"default_dns"` + LogLevel string `toml:"log_level"` + FirecrackerBin string `toml:"firecracker_bin"` + JailerBin string `toml:"jailer_bin"` + JailerEnabled *bool `toml:"jailer_enabled"` + JailerChrootBase string `toml:"jailer_chroot_base"` + SSHKeyPath string `toml:"ssh_key_path"` + DefaultImageName string `toml:"default_image_name"` + AutoStopStaleAfter string `toml:"auto_stop_stale_after"` + StatsPollInterval string `toml:"stats_poll_interval"` + BridgeName string `toml:"bridge_name"` + BridgeIP string `toml:"bridge_ip"` + CIDR string `toml:"cidr"` + TapPoolSize int `toml:"tap_pool_size"` + DefaultDNS string 
`toml:"default_dns"` + FileSync []fileSyncEntryFile `toml:"file_sync"` + VMDefaults *vmDefaultsFile `toml:"vm_defaults"` +} + +type fileSyncEntryFile struct { + Host string `toml:"host"` + Guest string `toml:"guest"` + Mode string `toml:"mode"` +} + +// vmDefaultsFile mirrors the optional `[vm_defaults]` block. All +// fields are zero-valued when omitted; the resolver treats zero as +// "not set, compute from host or fall back to builtin constants." +type vmDefaultsFile struct { + VCPUCount int `toml:"vcpu"` + MemoryMiB int `toml:"memory_mib"` + DiskSize string `toml:"disk_size"` + SystemOverlaySize string `toml:"system_overlay_size"` } func Load(layout paths.Layout) (model.DaemonConfig, error) { + home, err := os.UserHomeDir() + if err != nil { + return model.DaemonConfig{}, err + } + return load(layout, home, true) +} + +func LoadDaemon(layout paths.Layout, ownerHome string) (model.DaemonConfig, error) { + return load(layout, ownerHome, false) +} + +func load(layout paths.Layout, home string, ensureDefaultSSHKey bool) (model.DaemonConfig, error) { cfg := model.DaemonConfig{ - LogLevel: "info", - WebListenAddr: "127.0.0.1:7777", - AutoStopStaleAfter: 0, - StatsPollInterval: model.DefaultStatsPollInterval, - MetricsPollInterval: model.DefaultMetricsPollInterval, - BridgeName: model.DefaultBridgeName, - BridgeIP: model.DefaultBridgeIP, - CIDR: model.DefaultCIDR, - TapPoolSize: 4, - DefaultDNS: model.DefaultDNS, - DefaultImageName: "default", + LogLevel: "info", + AutoStopStaleAfter: 0, + StatsPollInterval: model.DefaultStatsPollInterval, + BridgeName: model.DefaultBridgeName, + BridgeIP: model.DefaultBridgeIP, + CIDR: model.DefaultCIDR, + TapPoolSize: 4, + DefaultDNS: model.DefaultDNS, + DefaultImageName: "debian-bookworm", + HostHomeDir: home, + JailerBin: model.DefaultJailerBinary, + JailerEnabled: true, + // Chroot lives under StateDir (ext4) — not RuntimeDir (tmpfs). 
+ // Hard-linking the kernel and any file-backed drives into the + // chroot requires same-filesystem; images already live under + // StateDir, so colocating the chroot avoids EXDEV. + JailerChrootBase: filepath.Join(layout.StateDir, "jail"), } var file fileConfig @@ -66,14 +104,20 @@ func Load(layout paths.Layout) (model.DaemonConfig, error) { if value := strings.TrimSpace(file.LogLevel); value != "" { cfg.LogLevel = value } - if file.WebListenAddr != nil { - cfg.WebListenAddr = strings.TrimSpace(*file.WebListenAddr) - } if value := strings.TrimSpace(file.FirecrackerBin); value != "" { cfg.FirecrackerBin = value } else if path, err := system.LookupExecutable("firecracker"); err == nil { cfg.FirecrackerBin = path } + if value := strings.TrimSpace(file.JailerBin); value != "" { + cfg.JailerBin = value + } + if file.JailerEnabled != nil { + cfg.JailerEnabled = *file.JailerEnabled + } + if value := strings.TrimSpace(file.JailerChrootBase); value != "" { + cfg.JailerChrootBase = value + } if value := strings.TrimSpace(file.DefaultImageName); value != "" { cfg.DefaultImageName = value } @@ -106,31 +150,267 @@ func Load(layout paths.Layout) (model.DaemonConfig, error) { } cfg.StatsPollInterval = duration } - if value := strings.TrimSpace(file.MetricsPoll); value != "" { - duration, err := time.ParseDuration(value) - if err != nil { - return cfg, err - } - cfg.MetricsPollInterval = duration - } if value := strings.TrimSpace(os.Getenv("BANGER_LOG_LEVEL")); value != "" { cfg.LogLevel = value } - sshKeyPath, err := resolveSSHKeyPath(layout, file.SSHKeyPath) + sshKeyPath, err := resolveSSHKeyPath(layout, file.SSHKeyPath, home, ensureDefaultSSHKey) if err != nil { return cfg, err } cfg.SSHKeyPath = sshKeyPath + + for i, entry := range file.FileSync { + validated, err := validateFileSyncEntry(entry, home) + if err != nil { + return cfg, fmt.Errorf("file_sync[%d]: %w", i, err) + } + cfg.FileSync = append(cfg.FileSync, validated) + } + + if file.VMDefaults != nil { + override, 
err := parseVMDefaults(*file.VMDefaults) + if err != nil { + return cfg, fmt.Errorf("vm_defaults: %w", err) + } + cfg.VMDefaults = override + } return cfg, nil } -func resolveSSHKeyPath(layout paths.Layout, configured string) (string, error) { +// parseVMDefaults validates and translates the TOML block into the +// model-level override struct. Negative values are rejected outright; +// zero means "not set." +func parseVMDefaults(file vmDefaultsFile) (model.VMDefaultsOverride, error) { + override := model.VMDefaultsOverride{ + VCPUCount: file.VCPUCount, + MemoryMiB: file.MemoryMiB, + } + if override.VCPUCount < 0 { + return model.VMDefaultsOverride{}, fmt.Errorf("vcpu must be >= 0 (got %d)", override.VCPUCount) + } + if override.MemoryMiB < 0 { + return model.VMDefaultsOverride{}, fmt.Errorf("memory_mib must be >= 0 (got %d)", override.MemoryMiB) + } + if value := strings.TrimSpace(file.DiskSize); value != "" { + bytes, err := model.ParseSize(value) + if err != nil { + return model.VMDefaultsOverride{}, fmt.Errorf("disk_size: %w", err) + } + override.WorkDiskSizeBytes = bytes + } + if value := strings.TrimSpace(file.SystemOverlaySize); value != "" { + bytes, err := model.ParseSize(value) + if err != nil { + return model.VMDefaultsOverride{}, fmt.Errorf("system_overlay_size: %w", err) + } + override.SystemOverlaySizeByte = bytes + } + return override, nil +} + +// validateFileSyncEntry normalises a single `[[file_sync]]` entry +// and rejects anything the operator would regret later: empty +// paths, unsupported leading characters, path traversal, host paths +// outside the owner home, or non-absolute guest targets. 
+func validateFileSyncEntry(entry fileSyncEntryFile, home string) (model.FileSyncEntry, error) { + host := strings.TrimSpace(entry.Host) + guest := strings.TrimSpace(entry.Guest) + if host == "" { + return model.FileSyncEntry{}, fmt.Errorf("host path is required") + } + if guest == "" { + return model.FileSyncEntry{}, fmt.Errorf("guest path is required") + } + if _, err := ResolveFileSyncHostPath(host, home); err != nil { + return model.FileSyncEntry{}, err + } + if err := validateFileSyncPath("guest", guest, true); err != nil { + return model.FileSyncEntry{}, err + } + // Guest paths must resolve under /root — that's where banger mounts + // the work disk. Syncing to /etc, /var, etc. would require writing + // to the rootfs snapshot, which file_sync deliberately doesn't do. + if !strings.HasPrefix(guest, "~/") && !strings.HasPrefix(guest, "/root/") && guest != "~" && guest != "/root" { + return model.FileSyncEntry{}, fmt.Errorf("guest path %q: must be under /root or ~/ (the work disk is mounted at /root)", guest) + } + mode := strings.TrimSpace(entry.Mode) + if mode != "" { + if err := validateFileSyncMode(mode); err != nil { + return model.FileSyncEntry{}, err + } + } + return model.FileSyncEntry{Host: host, Guest: guest, Mode: mode}, nil +} + +// ResolveFileSyncHostPath expands a configured [[file_sync]].host path +// against the owner home and rejects anything that lands outside that +// home. Both config.Load and the root daemon use this so policy cannot +// drift between startup-time validation and runtime file reads. 
+func ResolveFileSyncHostPath(raw, home string) (string, error) { + raw = strings.TrimSpace(raw) + if err := validateFileSyncPath("host", raw, true); err != nil { + return "", err + } + home = strings.TrimSpace(home) + if home == "" { + return "", fmt.Errorf("host path %q: owner home is required", raw) + } + if !filepath.IsAbs(home) { + return "", fmt.Errorf("host path %q: owner home %q must be absolute", raw, home) + } + candidate := raw + if strings.HasPrefix(raw, "~/") { + candidate = filepath.Join(home, strings.TrimPrefix(raw, "~/")) + } + candidate = filepath.Clean(candidate) + if !filepath.IsAbs(candidate) { + return "", fmt.Errorf("host path %q: resolved path %q must be absolute", raw, candidate) + } + if err := ensurePathWithinRoot(candidate, home); err != nil { + return "", fmt.Errorf("host path %q: %w", raw, err) + } + return candidate, nil +} + +// ResolveExistingFileSyncHostPath resolves a configured +// [[file_sync]].host path to its real on-disk target. This is the +// runtime companion to ResolveFileSyncHostPath: once os.Stat succeeds, +// the daemon uses this to ensure a top-level symlink still points +// inside the owner home before it reads from the path as root. +func ResolveExistingFileSyncHostPath(raw, home string) (string, error) { + candidate, err := ResolveFileSyncHostPath(raw, home) + if err != nil { + return "", err + } + resolved, err := filepath.EvalSymlinks(candidate) + if err != nil { + return "", fmt.Errorf("host path %q: resolve symlinks: %w", raw, err) + } + resolved = filepath.Clean(resolved) + if err := ensurePathWithinRoot(resolved, home); err != nil { + return "", fmt.Errorf("host path %q: resolved symlink target %q: %w", raw, resolved, err) + } + return resolved, nil +} + +// validateFileSyncPath rejects relative paths (other than a leading +// "~/"), ".." segments, and "~user/..." forms banger doesn't +// expand. Absolute paths and home-anchored paths pass through — the +// actual expansion happens at sync time. 
+func validateFileSyncPath(label, raw string, allowHome bool) error { + if raw == "~" { + return fmt.Errorf("%s path %q: bare '~' is not supported, point at a file or directory under it", label, raw) + } + // "~user/..." must be rejected specifically — catch it before the + // generic "must be absolute" message so the error names the real + // problem. + if strings.HasPrefix(raw, "~") && !strings.HasPrefix(raw, "~/") { + return fmt.Errorf("%s path %q: only '~/' is expanded, not '~user/'", label, raw) + } + if strings.HasPrefix(raw, "~/") { + if !allowHome { + return fmt.Errorf("%s path %q: home-relative paths are not supported here", label, raw) + } + } else if !strings.HasPrefix(raw, "/") { + return fmt.Errorf("%s path %q: must be absolute (start with '/') or home-anchored (start with '~/')", label, raw) + } + for _, segment := range strings.Split(raw, "/") { + if segment == ".." { + return fmt.Errorf("%s path %q: '..' segments are not allowed", label, raw) + } + } + return nil +} + +func ensurePathWithinRoot(candidate, root string) error { + root = filepath.Clean(strings.TrimSpace(root)) + candidate = filepath.Clean(strings.TrimSpace(candidate)) + rel, err := filepath.Rel(root, candidate) + if err != nil { + return fmt.Errorf("compare against owner home %q: %w", root, err) + } + if rel == ".." || strings.HasPrefix(rel, ".."+string(os.PathSeparator)) { + return fmt.Errorf("must stay under owner home %q", root) + } + return nil +} + +// validateFileSyncMode accepts three- or four-digit octal strings. +// Three-digit modes like "600" are auto-prefixed with a leading 0 +// when parsed by the consumer. 
+func validateFileSyncMode(mode string) error { + if len(mode) < 3 || len(mode) > 4 { + return fmt.Errorf("mode %q: must be a 3- or 4-digit octal string", mode) + } + for _, r := range mode { + if r < '0' || r > '7' { + return fmt.Errorf("mode %q: must be octal (digits 0-7)", mode) + } + } + return nil +} + +func resolveSSHKeyPath(layout paths.Layout, configured, home string, ensureDefault bool) (string, error) { configured = strings.TrimSpace(configured) if configured != "" { - return configured, nil + return normalizeSSHKeyPath(configured, home) } - return ensureDefaultSSHKey(filepath.Join(layout.ConfigDir, "ssh", "id_ed25519")) + // Key lives under the state dir, not the config dir. The daemon's + // ensureVMSSHClientConfig scrubs ConfigDir/ssh on every Open as + // part of migrating off the pre-state-dir layout — putting the + // default key there would race with that cleanup (create → delete + // → next VM create fails to read the key). + sshDir := strings.TrimSpace(layout.SSHDir) + if sshDir == "" { + sshDir = filepath.Join(layout.StateDir, "ssh") + } + if !filepath.IsAbs(sshDir) { + return "", fmt.Errorf("ssh key dir must be absolute; got %q (check paths.Resolve populated SSHDir / StateDir)", sshDir) + } + defaultPath := filepath.Join(sshDir, "id_ed25519") + if ensureDefault { + return ensureDefaultSSHKey(defaultPath) + } + return defaultPath, nil +} + +// normalizeSSHKeyPath validates and canonicalises a user-configured +// ssh_key_path. Accepts: +// +// - absolute paths ("/home/me/keys/id_ed25519") +// - home-anchored paths ("~/keys/id_ed25519") — expanded against $HOME +// +// Rejects: +// +// - bare "~" (ambiguous — expand to what?) 
+// - "~other/foo" (we only expand the current user's home) +// - relative paths ("id_ed25519", "./keys/id_ed25519") — these are +// ambiguous because the daemon's cwd isn't the user's shell cwd, +// and readers in internal/guest + internal/cli do raw os.ReadFile +// on the path without re-resolving against a known anchor +func normalizeSSHKeyPath(raw, home string) (string, error) { + raw = strings.TrimSpace(raw) + if raw == "" { + return "", nil + } + if raw == "~" { + return "", fmt.Errorf("ssh_key_path %q: bare '~' is not supported, point at a specific key file", raw) + } + if strings.HasPrefix(raw, "~") && !strings.HasPrefix(raw, "~/") { + return "", fmt.Errorf("ssh_key_path %q: only '~/' is expanded, not '~user/'", raw) + } + if strings.HasPrefix(raw, "~/") { + home = strings.TrimSpace(home) + if home == "" { + return "", fmt.Errorf("ssh_key_path %q: no home directory available for ~ expansion", raw) + } + raw = filepath.Join(home, strings.TrimPrefix(raw, "~/")) + } + if !filepath.IsAbs(raw) { + return "", fmt.Errorf("ssh_key_path %q: must be absolute (start with '/') or home-anchored (start with '~/')", raw) + } + return filepath.Clean(raw), nil } func ensureDefaultSSHKey(path string) (string, error) { diff --git a/internal/config/config_test.go b/internal/config/config_test.go index 1934c6a..2a38fb6 100644 --- a/internal/config/config_test.go +++ b/internal/config/config_test.go @@ -3,6 +3,7 @@ package config import ( "os" "path/filepath" + "strings" "testing" "time" @@ -11,6 +12,7 @@ import ( func TestLoadDefaultsResolveFirecrackerAndGenerateSSHKey(t *testing.T) { configDir := t.TempDir() + sshDir := t.TempDir() binDir := t.TempDir() firecrackerPath := filepath.Join(binDir, "firecracker") if err := os.WriteFile(firecrackerPath, []byte("#!/bin/sh\nexit 0\n"), 0o755); err != nil { @@ -18,7 +20,7 @@ func TestLoadDefaultsResolveFirecrackerAndGenerateSSHKey(t *testing.T) { } t.Setenv("PATH", binDir) - cfg, err := Load(paths.Layout{ConfigDir: configDir}) + cfg, 
err := Load(paths.Layout{ConfigDir: configDir, SSHDir: sshDir}) if err != nil { t.Fatalf("Load: %v", err) } @@ -26,7 +28,11 @@ func TestLoadDefaultsResolveFirecrackerAndGenerateSSHKey(t *testing.T) { if cfg.FirecrackerBin != firecrackerPath { t.Fatalf("FirecrackerBin = %q, want %q", cfg.FirecrackerBin, firecrackerPath) } - wantKey := filepath.Join(configDir, "ssh", "id_ed25519") + // Default key lives under SSHDir (state dir), NOT ConfigDir/ssh. + // ConfigDir/ssh gets scrubbed by ensureVMSSHClientConfig on every + // daemon Open, so regression-guard that the generator never picks + // that path again. + wantKey := filepath.Join(sshDir, "id_ed25519") if cfg.SSHKeyPath != wantKey { t.Fatalf("SSHKeyPath = %q, want %q", cfg.SSHKeyPath, wantKey) } @@ -35,11 +41,160 @@ func TestLoadDefaultsResolveFirecrackerAndGenerateSSHKey(t *testing.T) { t.Fatalf("stat %s: %v", path, err) } } - if cfg.DefaultImageName != "default" { - t.Fatalf("DefaultImageName = %q, want default", cfg.DefaultImageName) + forbiddenKey := filepath.Join(configDir, "ssh", "id_ed25519") + if _, err := os.Stat(forbiddenKey); err == nil { + t.Fatalf("key was also generated at %s; config.Load must not write under ConfigDir/ssh", forbiddenKey) } - if cfg.WebListenAddr != "127.0.0.1:7777" { - t.Fatalf("WebListenAddr = %q", cfg.WebListenAddr) + if cfg.DefaultImageName != "debian-bookworm" { + t.Fatalf("DefaultImageName = %q, want debian-bookworm", cfg.DefaultImageName) + } +} + +func TestLoadSSHKeyPathExpandsHomeAnchored(t *testing.T) { + homeDir := t.TempDir() + t.Setenv("HOME", homeDir) + + configDir := t.TempDir() + data := []byte("ssh_key_path = \"~/mykeys/id_ed25519\"\n") + if err := os.WriteFile(filepath.Join(configDir, "config.toml"), data, 0o644); err != nil { + t.Fatalf("write config.toml: %v", err) + } + + cfg, err := Load(paths.Layout{ConfigDir: configDir, SSHDir: t.TempDir()}) + if err != nil { + t.Fatalf("Load: %v", err) + } + want := filepath.Join(homeDir, "mykeys", "id_ed25519") + if 
cfg.SSHKeyPath != want { + t.Fatalf("SSHKeyPath = %q, want %q", cfg.SSHKeyPath, want) + } +} + +func TestLoadDaemonDoesNotGenerateDefaultSSHKey(t *testing.T) { + ownerHome := t.TempDir() + sshDir := filepath.Join(t.TempDir(), "daemon-ssh") + cfg, err := LoadDaemon(paths.Layout{ConfigDir: t.TempDir(), SSHDir: sshDir}, ownerHome) + if err != nil { + t.Fatalf("LoadDaemon: %v", err) + } + wantKey := filepath.Join(sshDir, "id_ed25519") + if cfg.SSHKeyPath != wantKey { + t.Fatalf("SSHKeyPath = %q, want %q", cfg.SSHKeyPath, wantKey) + } + if cfg.HostHomeDir != ownerHome { + t.Fatalf("HostHomeDir = %q, want %q", cfg.HostHomeDir, ownerHome) + } + if _, err := os.Stat(wantKey); !os.IsNotExist(err) { + t.Fatalf("LoadDaemon created %s, want no key material on daemon config load", wantKey) + } +} + +// TestLoadNormalizesAbsoluteSSHKeyPath pins filepath.Clean behaviour +// for configured paths: trailing slashes and duplicate slashes are +// flattened so downstream path comparisons don't see two spellings +// for the same path. 
+func TestLoadNormalizesAbsoluteSSHKeyPath(t *testing.T) { + cases := []struct { + name string + raw string + want string + }{ + {"trailing slash collapsed", "/tmp/keys/id_ed25519/", "/tmp/keys/id_ed25519"}, + {"duplicate slashes collapsed", "/tmp//keys///id_ed25519", "/tmp/keys/id_ed25519"}, + {"dot segments resolved", "/tmp/keys/./id_ed25519", "/tmp/keys/id_ed25519"}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + configDir := t.TempDir() + data := []byte("ssh_key_path = \"" + tc.raw + "\"\n") + if err := os.WriteFile(filepath.Join(configDir, "config.toml"), data, 0o644); err != nil { + t.Fatalf("write config.toml: %v", err) + } + cfg, err := Load(paths.Layout{ConfigDir: configDir, SSHDir: t.TempDir()}) + if err != nil { + t.Fatalf("Load %q: %v", tc.raw, err) + } + if cfg.SSHKeyPath != tc.want { + t.Fatalf("SSHKeyPath = %q, want %q", cfg.SSHKeyPath, tc.want) + } + }) + } +} + +// TestEnsureDefaultSSHKeyRejectsCorruptExistingFile pins the +// "don't silently overwrite" contract: if someone wrote garbage to +// the default key path (or the key was truncated mid-write by a +// previous crash), config.Load must surface the parse error instead +// of pretending the file is usable. The regression we care about is +// a future refactor that adds "regenerate if invalid" silently — +// that would nuke a real user key on every daemon Open. +func TestEnsureDefaultSSHKeyRejectsCorruptExistingFile(t *testing.T) { + sshDir := t.TempDir() + corruptKey := filepath.Join(sshDir, "id_ed25519") + if err := os.WriteFile(corruptKey, []byte("not a pem private key"), 0o600); err != nil { + t.Fatalf("write corrupt key: %v", err) + } + + _, err := Load(paths.Layout{ConfigDir: t.TempDir(), SSHDir: sshDir}) + if err == nil { + t.Fatal("Load: want error when existing key file is not a valid private key") + } + // The error should mention the parse failure, not "regenerated". 
+ if strings.Contains(err.Error(), "regenerat") { + t.Fatalf("Load silently regenerated: %v", err) + } + // Original garbage must still be there — the invariant is "don't + // touch files you can't parse". + data, readErr := os.ReadFile(corruptKey) + if readErr != nil { + t.Fatalf("ReadFile: %v", readErr) + } + if string(data) != "not a pem private key" { + t.Fatalf("key content = %q, want the original garbage", string(data)) + } +} + +// TestResolveSSHKeyPathRejectsEmptySSHDirAndStateDir pins the +// guard in resolveSSHKeyPath: if a caller builds a layout without +// SSHDir and StateDir, they shouldn't get a key generated in cwd. +// The guard existed before (added after a test scribbled into +// internal/config/ssh/); this test prevents it from going away. +func TestResolveSSHKeyPathRejectsEmptySSHDirAndStateDir(t *testing.T) { + _, err := Load(paths.Layout{ConfigDir: t.TempDir()}) + if err == nil { + t.Fatal("Load: want error when neither SSHDir nor StateDir is set") + } + if !strings.Contains(err.Error(), "must be absolute") { + t.Fatalf("Load error = %v, want 'must be absolute' diagnostic", err) + } +} + +func TestLoadRejectsInvalidSSHKeyPath(t *testing.T) { + cases := []struct { + name string + raw string + want string + }{ + {"relative bare", "id_ed25519", "must be absolute"}, + {"relative with dot", "./keys/id_ed25519", "must be absolute"}, + {"bare tilde", "~", "bare '~' is not supported"}, + {"user-tilde", "~other/id_ed25519", "only '~/' is expanded"}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + configDir := t.TempDir() + data := []byte("ssh_key_path = \"" + tc.raw + "\"\n") + if err := os.WriteFile(filepath.Join(configDir, "config.toml"), data, 0o644); err != nil { + t.Fatalf("write config.toml: %v", err) + } + _, err := Load(paths.Layout{ConfigDir: configDir, SSHDir: t.TempDir()}) + if err == nil { + t.Fatalf("Load %q: want error containing %q", tc.raw, tc.want) + } + if !strings.Contains(err.Error(), tc.want) { + 
t.Fatalf("Load %q: error = %v, want contains %q", tc.raw, err, tc.want) + } + }) } } @@ -47,13 +202,11 @@ func TestLoadAppliesConfigOverrides(t *testing.T) { configDir := t.TempDir() data := []byte(` log_level = "debug" -web_listen_addr = "" firecracker_bin = "/opt/firecracker" ssh_key_path = "/tmp/custom-key" -default_image_name = "void-exp" +default_image_name = "void" auto_stop_stale_after = "1h" stats_poll_interval = "15s" -metrics_poll_interval = "30s" bridge_name = "br-test" bridge_ip = "10.0.0.1" cidr = "25" @@ -64,7 +217,7 @@ default_dns = "9.9.9.9" t.Fatalf("write config.toml: %v", err) } - cfg, err := Load(paths.Layout{ConfigDir: configDir}) + cfg, err := Load(paths.Layout{ConfigDir: configDir, SSHDir: t.TempDir()}) if err != nil { t.Fatalf("Load: %v", err) } @@ -72,16 +225,13 @@ default_dns = "9.9.9.9" if cfg.LogLevel != "debug" { t.Fatalf("LogLevel = %q", cfg.LogLevel) } - if cfg.WebListenAddr != "" { - t.Fatalf("WebListenAddr = %q, want empty", cfg.WebListenAddr) - } if cfg.FirecrackerBin != "/opt/firecracker" { t.Fatalf("FirecrackerBin = %q", cfg.FirecrackerBin) } if cfg.SSHKeyPath != "/tmp/custom-key" { t.Fatalf("SSHKeyPath = %q", cfg.SSHKeyPath) } - if cfg.DefaultImageName != "void-exp" { + if cfg.DefaultImageName != "void" { t.Fatalf("DefaultImageName = %q", cfg.DefaultImageName) } if cfg.AutoStopStaleAfter != time.Hour { @@ -90,9 +240,6 @@ default_dns = "9.9.9.9" if cfg.StatsPollInterval != 15*time.Second { t.Fatalf("StatsPollInterval = %s", cfg.StatsPollInterval) } - if cfg.MetricsPollInterval != 30*time.Second { - t.Fatalf("MetricsPollInterval = %s", cfg.MetricsPollInterval) - } if cfg.BridgeName != "br-test" || cfg.BridgeIP != "10.0.0.1" || cfg.CIDR != "25" { t.Fatalf("bridge config = %+v", cfg) } @@ -107,7 +254,7 @@ default_dns = "9.9.9.9" func TestLoadAppliesLogLevelEnvOverride(t *testing.T) { t.Setenv("BANGER_LOG_LEVEL", "warn") - cfg, err := Load(paths.Layout{ConfigDir: t.TempDir()}) + cfg, err := Load(paths.Layout{ConfigDir: t.TempDir(), 
SSHDir: t.TempDir()}) if err != nil { t.Fatalf("Load: %v", err) } @@ -115,3 +262,234 @@ func TestLoadAppliesLogLevelEnvOverride(t *testing.T) { t.Fatalf("LogLevel = %q, want warn", cfg.LogLevel) } } + +func TestLoadAcceptsFileSyncEntries(t *testing.T) { + homeDir := t.TempDir() + t.Setenv("HOME", homeDir) + + configDir := t.TempDir() + hostsFile := filepath.Join(homeDir, ".config", "gh", "hosts.yml") + data := []byte(` +[[file_sync]] +host = "~/.aws" +guest = "~/.aws" + +[[file_sync]] +host = "` + hostsFile + `" +guest = "/root/.config/gh/hosts.yml" +mode = "0644" +`) + if err := os.WriteFile(filepath.Join(configDir, "config.toml"), data, 0o644); err != nil { + t.Fatal(err) + } + cfg, err := Load(paths.Layout{ConfigDir: configDir, SSHDir: t.TempDir()}) + if err != nil { + t.Fatalf("Load: %v", err) + } + if len(cfg.FileSync) != 2 { + t.Fatalf("FileSync = %+v", cfg.FileSync) + } + if cfg.FileSync[0].Host != "~/.aws" || cfg.FileSync[0].Guest != "~/.aws" { + t.Fatalf("entry[0] = %+v", cfg.FileSync[0]) + } + if cfg.FileSync[1].Host != hostsFile || cfg.FileSync[1].Guest != "/root/.config/gh/hosts.yml" { + t.Fatalf("entry[1] = %+v", cfg.FileSync[1]) + } + if cfg.FileSync[1].Mode != "0644" { + t.Fatalf("entry[1] mode = %q", cfg.FileSync[1].Mode) + } +} + +func TestLoadDaemonAcceptsFileSyncPathUnderOwnerHome(t *testing.T) { + ownerHome := t.TempDir() + t.Setenv("HOME", t.TempDir()) + + configDir := t.TempDir() + allowed := filepath.Join(ownerHome, ".config", "gh", "hosts.yml") + data := []byte(` +[[file_sync]] +host = "` + allowed + `" +guest = "~/.config/gh/hosts.yml" +`) + if err := os.WriteFile(filepath.Join(configDir, "config.toml"), data, 0o644); err != nil { + t.Fatal(err) + } + + cfg, err := LoadDaemon(paths.Layout{ConfigDir: configDir, SSHDir: t.TempDir()}, ownerHome) + if err != nil { + t.Fatalf("LoadDaemon: %v", err) + } + got, err := ResolveFileSyncHostPath(cfg.FileSync[0].Host, cfg.HostHomeDir) + if err != nil { + t.Fatalf("ResolveFileSyncHostPath: %v", err) + } 
+ if got != allowed { + t.Fatalf("resolved host path = %q, want %q", got, allowed) + } +} + +func TestLoadRejectsInvalidFileSyncEntries(t *testing.T) { + cases := []struct { + name string + toml string + want string + }{ + { + "empty host", + `[[file_sync]]` + "\n" + `host = ""` + "\n" + `guest = "~/foo"`, + "host path is required", + }, + { + "empty guest", + `[[file_sync]]` + "\n" + `host = "~/foo"` + "\n" + `guest = ""`, + "guest path is required", + }, + { + "relative host", + `[[file_sync]]` + "\n" + `host = "foo/bar"` + "\n" + `guest = "~/foo"`, + "must be absolute", + }, + { + "guest outside /root", + `[[file_sync]]` + "\n" + `host = "~/x"` + "\n" + `guest = "/etc/resolv.conf"`, + "must be under /root or ~/", + }, + { + "path traversal", + `[[file_sync]]` + "\n" + `host = "~/../secrets"` + "\n" + `guest = "~/secrets"`, + "'..' segments", + }, + { + "tilde user", + `[[file_sync]]` + "\n" + `host = "~other/foo"` + "\n" + `guest = "~/foo"`, + "only '~/' is expanded", + }, + { + "invalid mode", + `[[file_sync]]` + "\n" + `host = "~/x"` + "\n" + `guest = "~/x"` + "\n" + `mode = "rwx"`, + "must be octal", + }, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + configDir := t.TempDir() + if err := os.WriteFile(filepath.Join(configDir, "config.toml"), []byte(tc.toml+"\n"), 0o644); err != nil { + t.Fatal(err) + } + _, err := Load(paths.Layout{ConfigDir: configDir, SSHDir: t.TempDir()}) + if err == nil { + t.Fatalf("Load: want error containing %q", tc.want) + } + if !strings.Contains(err.Error(), tc.want) { + t.Fatalf("Load error = %v, want contains %q", err, tc.want) + } + }) + } +} + +func TestLoadRejectsFileSyncHostOutsideHome(t *testing.T) { + homeDir := t.TempDir() + t.Setenv("HOME", homeDir) + + configDir := t.TempDir() + data := []byte(` +[[file_sync]] +host = "/etc/resolv.conf" +guest = "~/resolv.conf" +`) + if err := os.WriteFile(filepath.Join(configDir, "config.toml"), data, 0o644); err != nil { + t.Fatal(err) + } + _, err := 
Load(paths.Layout{ConfigDir: configDir, SSHDir: t.TempDir()}) + if err == nil { + t.Fatal("Load: want error for host path outside home") + } + if !strings.Contains(err.Error(), "owner home") { + t.Fatalf("Load error = %v, want owner-home diagnostic", err) + } +} + +func TestLoadDaemonRejectsFileSyncHostOutsideOwnerHome(t *testing.T) { + ownerHome := t.TempDir() + t.Setenv("HOME", t.TempDir()) + + configDir := t.TempDir() + outside := filepath.Join(t.TempDir(), "secret.txt") + data := []byte(` +[[file_sync]] +host = "` + outside + `" +guest = "~/secret.txt" +`) + if err := os.WriteFile(filepath.Join(configDir, "config.toml"), data, 0o644); err != nil { + t.Fatal(err) + } + _, err := LoadDaemon(paths.Layout{ConfigDir: configDir, SSHDir: t.TempDir()}, ownerHome) + if err == nil { + t.Fatal("LoadDaemon: want error for host path outside owner home") + } + if !strings.Contains(err.Error(), "owner home") { + t.Fatalf("LoadDaemon error = %v, want owner-home diagnostic", err) + } +} + +func TestLoadAcceptsVMDefaults(t *testing.T) { + configDir := t.TempDir() + data := []byte(` +[vm_defaults] +vcpu = 4 +memory_mib = 4096 +disk_size = "16G" +system_overlay_size = "12G" +`) + if err := os.WriteFile(filepath.Join(configDir, "config.toml"), data, 0o644); err != nil { + t.Fatal(err) + } + cfg, err := Load(paths.Layout{ConfigDir: configDir, SSHDir: t.TempDir()}) + if err != nil { + t.Fatalf("Load: %v", err) + } + if cfg.VMDefaults.VCPUCount != 4 { + t.Errorf("VCPUCount = %d, want 4", cfg.VMDefaults.VCPUCount) + } + if cfg.VMDefaults.MemoryMiB != 4096 { + t.Errorf("MemoryMiB = %d, want 4096", cfg.VMDefaults.MemoryMiB) + } + if cfg.VMDefaults.WorkDiskSizeBytes != 16*1024*1024*1024 { + t.Errorf("WorkDiskSizeBytes = %d, want 16 GiB", cfg.VMDefaults.WorkDiskSizeBytes) + } + if cfg.VMDefaults.SystemOverlaySizeByte != 12*1024*1024*1024 { + t.Errorf("SystemOverlaySizeByte = %d, want 12 GiB", cfg.VMDefaults.SystemOverlaySizeByte) + } +} + +func TestLoadEmptyVMDefaultsLeavesZeros(t 
*testing.T) { + // No [vm_defaults] block → cfg.VMDefaults is the zero value, + // which the resolver will map to auto or builtin. + cfg, err := Load(paths.Layout{ConfigDir: t.TempDir(), SSHDir: t.TempDir()}) + if err != nil { + t.Fatalf("Load: %v", err) + } + if cfg.VMDefaults.VCPUCount != 0 || cfg.VMDefaults.MemoryMiB != 0 { + t.Errorf("VMDefaults = %+v, want zeroed", cfg.VMDefaults) + } +} + +func TestLoadRejectsNegativeVMDefaults(t *testing.T) { + cases := map[string]string{ + "vcpu": `[vm_defaults]` + "\n" + `vcpu = -1`, + "memory": `[vm_defaults]` + "\n" + `memory_mib = -1`, + "disk_size": `[vm_defaults]` + "\n" + `disk_size = "banana"`, + "overlay": `[vm_defaults]` + "\n" + `system_overlay_size = "banana"`, + } + for name, body := range cases { + t.Run(name, func(t *testing.T) { + configDir := t.TempDir() + if err := os.WriteFile(filepath.Join(configDir, "config.toml"), []byte(body+"\n"), 0o644); err != nil { + t.Fatal(err) + } + if _, err := Load(paths.Layout{ConfigDir: configDir, SSHDir: t.TempDir()}); err == nil { + t.Fatal("expected error") + } + }) + } +} diff --git a/internal/daemon/ARCHITECTURE.md b/internal/daemon/ARCHITECTURE.md new file mode 100644 index 0000000..623849c --- /dev/null +++ b/internal/daemon/ARCHITECTURE.md @@ -0,0 +1,217 @@ +# `internal/daemon` architecture + +This document describes the current daemon package layout: the `Daemon` +composition root, the four services it wires together, the subpackages +that own stateless helpers, the privileged-ops seam used by the +supported system install, and the lock ordering every caller must +respect. + +## Supported service topology + +On the supported host path (`banger system install` on a `systemd` +host), banger runs as two cooperating services: + +- `bangerd.service` runs as the configured owner user. It owns the + public RPC socket, store, image state, workspace prep, and the + lifecycle state machine. +- `bangerd-root.service` runs as root. 
It owns only the privileged + host-kernel operations: bridge/tap, NAT/resolver routing, dm/loop + snapshot plumbing, privileged ext4 mutation on dm devices, and + firecracker process/socket ownership. + +The owner daemon talks to the root helper through the `privilegedOps` +seam. Non-system/dev paths still use the same seam, but it is backed +by an in-process adapter instead of the helper RPC client. + +## Composition + +`Daemon` is a thin composition root. It holds shared infrastructure +(store, runner, logger, layout, config, listener, privileged-ops +adapter) plus pointers to four focused services. RPC dispatch is a +pure forwarder into those services; no lifecycle / image / workspace / +networking behaviour lives on `*Daemon` itself. + +``` +Daemon +├── *HostNetwork — bridge, tap pool, NAT, DNS, firecracker process, +│ DM snapshots, vsock readiness +├── *ImageService — register, promote, delete, pull (bundle + OCI), +│ kernel catalog, managed-seed refresh +├── *WorkspaceService — workspace.prepare / workspace.export, auth-key +│ + git-identity sync onto the work disk +└── *VMService — VM lifecycle (create/start/stop/restart/kill/ + delete/set), stats polling, ports query, + handle cache, per-VM lock set, create-op + registry, preflight validation +``` + +Each service owns its own state. Cross-service calls go through narrow +consumer-defined seams: + +- `WorkspaceService` does not hold a `*VMService` pointer. It takes + function-typed deps (`vmResolver`, `aliveChecker`, `withVMLockByRef`, + `imageResolver`, `imageWorkSeed`) so it sees exactly the operations + it needs and nothing more. Those deps are captured as closures so + construction-order cycles don't recur. +- `VMService` holds direct pointers to `*HostNetwork`, `*ImageService`, + and `*WorkspaceService`. 
Orchestrating a VM start really does compose + all three (bridge + tap + image resolution + work-disk sync), and + declaring a function-typed interface for every call would balloon + the surface for no win — services are unexported, so package-external + code can never reach them. +- Capability hooks do not take `*Daemon`. Each capability is a struct + with explicit service-pointer fields (`workDiskCapability{vm, ws, + store, defaultImageName}`, `dnsCapability{net}`, `natCapability{vm, + net, logger}`) populated at wiring time. `VMService` invokes them + through a `capabilityHooks` struct (function-typed bag) populated at + construction; neither the service nor any capability has a `*Daemon` + pointer. + +Services + capabilities are built eagerly by `wireServices(d)`, called +once from `Daemon.Open` after the composition root's infrastructure is +populated, and once per test that constructs a `&Daemon{...}` literal. +Tests that want to stub a particular service or the capability list +assign the field before calling `wireServices` — the helper is +idempotent and skips anything already set. + +## Service state + +### `HostNetwork` (`host_network.go`, `nat.go`, `dns_routing.go`, `tap_pool.go`, `snapshot.go`) + +- `tapPool` — TAP interface pool, owns its own lock. +- `vmDNS *vmdns.Server` — in-process DNS server for `.vm` names. +- `privilegedOps` — the host-kernel seam used for bridge/tap/NAT, + resolver routing, dm snapshots, privileged ext4 mutation, and + firecracker ownership/kill flows. +- No direct VM-state access. Where an operation needs a VM's tap name + (e.g. `ensureNAT`), the signature takes `guestIP` + `tap` string so + the caller (VMService) resolves them first. + +### `ImageService` (`image_service.go`, `images.go`, `images_pull.go`, `image_seed.go`, `kernels.go`) + +- `imageOpsMu sync.Mutex` — the publication-window lock. Held only + across the recheck-name + atomic-rename + UpsertImage commit atom. 
+ Slow work (network fetch, ext4 build, SSH-key seeding) runs unlocked. +- Test seams `pullAndFlatten`, `finalizePulledRootfs`, `bundleFetch` + are struct fields (not package globals), so tests inject per-instance + fakes. + +### `WorkspaceService` (`workspace_service.go`, `workspace.go`, `vm_authsync.go`) + +- `workspaceLocks vmLockSet` — per-VM mutex scoped to + `workspace.prepare` / `workspace.export`. These ops acquire + `vmLocks[id]` (on VMService) only long enough to validate VM state + and snapshot the fields they need, then release it and acquire + `workspaceLocks[id]` for the slow guest I/O phase. That keeps + `vm stop` / `delete` / `restart` from queueing behind a running tar + import. +- Test seams `workspaceInspectRepo`, `workspaceImport` are per-instance + fields. + +### `VMService` (`vm_service.go`, `vm_lifecycle.go`, `vm_create.go`, `vm_create_ops.go`, `vm_stats.go`, `vm_set.go`, `vm_disk.go`, `vm_handles.go`, `vm_authsync.go` (via WorkspaceService), `preflight.go`, `ports.go`, `vm.go`) + +- `vmLocks vmLockSet` — per-VM `*sync.Mutex`, one per VM ID. Held for + the **entire lifecycle op** on that VM: `start` holds it across + preflight, bridge setup, firecracker spawn, and post-boot wiring + (seconds to tens of seconds). Two `start`/`stop`/`delete`/`set` + calls against the same VM therefore serialise; calls against + different VMs run independently. +- `createVMMu sync.Mutex` — narrow **reservation** mutex. `CreateVM` + resolves the image (possibly auto-pulling, which self-locks on + `imageOpsMu`) and parses sizing flags outside this lock, then holds + `createVMMu` only to re-check that the requested VM name is still + free, allocate the next guest IP, and insert the initial "created" + row. The subsequent boot flow runs under the per-VM lock only. +- `createOps opstate.Registry[*vmCreateOperationState]` — in-flight + async create operations; owns its own lock. 
+- `handles *handleCache` — in-memory map of per-VM transient kernel/ + process handles (PID, tap device, loop devices, DM target). Each + VM directory holds a small `handles.json` scratch file so the + cache can be rebuilt at daemon startup. +- `vsockHostDevice` — path to `/dev/vhost-vsock` the preflight and + doctor checks RequireFile against. Defaulted in wireServices; + tests point at a tempfile to make the check pass without the + kernel module loaded. Guest-SSH test seams live on `*Daemon` + (`d.guestWaitForSSH`, `d.guestDial`), not VMService — workspace + prepare is the only path that reaches guest SSH, and it gets + there through closures WorkspaceService captured at wiring time. + +## Subpackages + +Stateless helpers with no need for a service pointer live in +subpackages. Each takes explicit dependencies (typically a +`system.Runner`-compatible interface) and holds no global state beyond +small test seams. + +| Subpackage | Purpose | +| ---------------------------- | ---------------------------------------------------------------------- | +| `internal/daemon/opstate` | Generic `Registry[T AsyncOp]` for async-operation bookkeeping. | +| `internal/daemon/dmsnap` | Device-mapper COW snapshot create/cleanup/remove. | +| `internal/daemon/fcproc` | Firecracker process primitives (bridge, tap, binary, PID, kill, wait). | +| `internal/daemon/imagemgr` | Image subsystem pure helpers: validators, staging, build script gen. | +| `internal/daemon/workspace` | Workspace helpers: git inspection, copy prep, guest import script. | + +All subpackages are leaves — no intra-daemon subpackage imports another. + +## Lock ordering + +Acquire in this order, release in reverse. Never acquire in the +opposite direction. + +``` +VMService.vmLocks[id] → WorkspaceService.workspaceLocks[id] + → {VMService.createVMMu, ImageService.imageOpsMu} + → subsystem-local locks +``` + +`vmLocks[id]` and `workspaceLocks[id]` are NEVER held at the same +time. 
`workspace.prepare` acquires `vmLocks[id]` just long enough to +validate VM state, releases it, then acquires `workspaceLocks[id]` +for the guest I/O phase. Regular lifecycle ops (`start`, `stop`, +`delete`, `set`) do NOT do this split — they hold `vmLocks[id]` +across the whole flow. + +Subsystem-local locks (`tapPool.mu`, `opstate.Registry` mu, +`handleCache.mu`) are leaves. They do not contend with each other. + +Notes: + +- `vmLocks[id]` is the outer lock for any operation scoped to a single + VM. Acquired via `VMService.withVMLockByID` / `withVMLockByRef`. The + callback runs under the lock — treat the whole function body as + critical section. +- `createVMMu` is held only across the VM-name reservation + IP + allocation + initial UpsertVM. Image resolution and the full boot + flow happen outside it. +- `imageOpsMu` is held only across the publication atom (recheck name + + atomic rename + UpsertImage, or the equivalent for Register / + Promote / Delete). Network fetch, ext4 build, and file copies run + unlocked. +- Holding a subsystem-local lock while calling into guest SSH is + discouraged; copy needed state out under the lock and release before + blocking I/O. + +## Reconcile and background work + +`Daemon.reconcile(ctx)` is the orchestrator run at startup. It +rehydrates the handle cache, reaps stale VMs, and republishes DNS +records. `Daemon.backgroundLoop()` is the ticker fan-out — +`VMService.pollStats`, `VMService.stopStaleVMs`, and +`VMService.pruneVMCreateOperations` run on independent tickers. On the +supported system path, any reconcile-time host cleanup that needs +privilege goes through `privilegedOps`, not directly through the owner +daemon process. + +## External API + +Only `internal/cli` imports this package. The surface is: + +- `daemon.Open(ctx) (*Daemon, error)` +- `daemon.OpenSystem(ctx) (*Daemon, error)` +- `(*Daemon).Serve(ctx) error` +- `(*Daemon).Close() error` +- `daemon.Doctor(...)` — host diagnostics (no receiver). 
+ +All other methods live on the four services and are reached only +through the RPC `dispatch` switch in `daemon.go`. They are free to +move/rename during refactoring. diff --git a/internal/daemon/autopull_test.go b/internal/daemon/autopull_test.go new file mode 100644 index 0000000..6907eff --- /dev/null +++ b/internal/daemon/autopull_test.go @@ -0,0 +1,153 @@ +package daemon + +import ( + "context" + "errors" + "os" + "path/filepath" + "strings" + "testing" + + "banger/internal/imagecat" + "banger/internal/model" + "banger/internal/paths" + "banger/internal/system" +) + +func TestFindOrAutoPullImageReturnsLocalWithoutPulling(t *testing.T) { + d := &Daemon{ + layout: paths.Layout{ImagesDir: t.TempDir()}, + store: openDaemonStore(t), + runner: system.NewRunner(), + } + d.img = &ImageService{ + layout: d.layout, + store: d.store, + runner: d.runner, + bundleFetch: func(context.Context, string, imagecat.CatEntry) (imagecat.Manifest, error) { + t.Fatal("bundleFetch should not be called when image is local") + return imagecat.Manifest{}, nil + }, + } + wireServices(d) + id, _ := model.NewID() + if err := d.store.UpsertImage(context.Background(), model.Image{ + ID: id, + Name: "my-local-image", + CreatedAt: model.Now(), + UpdatedAt: model.Now(), + }); err != nil { + t.Fatal(err) + } + image, err := d.vm.findOrAutoPullImage(context.Background(), "my-local-image") + if err != nil { + t.Fatalf("findOrAutoPullImage: %v", err) + } + if image.Name != "my-local-image" { + t.Fatalf("Name = %q, want my-local-image", image.Name) + } +} + +func TestFindOrAutoPullImagePullsFromCatalog(t *testing.T) { + imagesDir := t.TempDir() + kernelsDir := t.TempDir() + seedKernel(t, kernelsDir, "generic-6.12") + + pullCalls := 0 + d := &Daemon{ + layout: paths.Layout{ImagesDir: imagesDir, KernelsDir: kernelsDir}, + store: openDaemonStore(t), + runner: system.NewRunner(), + } + d.img = &ImageService{ + layout: d.layout, + store: d.store, + runner: d.runner, + bundleFetch: func(ctx 
context.Context, destDir string, entry imagecat.CatEntry) (imagecat.Manifest, error) { + pullCalls++ + return stubBundleFetch(imagecat.Manifest{KernelRef: "generic-6.12"})(ctx, destDir, entry) + }, + workSeedBuilder: stubWorkSeedBuilder, + } + wireServices(d) + // "debian-bookworm" is in the embedded imagecat catalog. + image, err := d.vm.findOrAutoPullImage(context.Background(), "debian-bookworm") + if err != nil { + t.Fatalf("findOrAutoPullImage: %v", err) + } + if image.Name != "debian-bookworm" { + t.Fatalf("Name = %q, want debian-bookworm", image.Name) + } + if pullCalls != 1 { + t.Fatalf("bundleFetch calls = %d, want 1", pullCalls) + } +} + +func TestFindOrAutoPullImageReturnsOriginalErrorWhenNotInCatalog(t *testing.T) { + d := &Daemon{ + layout: paths.Layout{ImagesDir: t.TempDir()}, + store: openDaemonStore(t), + runner: system.NewRunner(), + } + wireServices(d) + _, err := d.vm.findOrAutoPullImage(context.Background(), "not-in-catalog-or-store") + if err == nil || !strings.Contains(err.Error(), "not found") { + t.Fatalf("err = %v, want not-found", err) + } +} + +func TestReadOrAutoPullKernelReturnsLocalWithoutPulling(t *testing.T) { + kernelsDir := t.TempDir() + seedKernel(t, kernelsDir, "generic-6.12") + d := &Daemon{layout: paths.Layout{KernelsDir: kernelsDir}} + wireServices(d) + + entry, err := d.img.readOrAutoPullKernel(context.Background(), "generic-6.12") + if err != nil { + t.Fatalf("readOrAutoPullKernel: %v", err) + } + if entry.Name != "generic-6.12" { + t.Fatalf("Name = %q", entry.Name) + } +} + +func TestReadOrAutoPullKernelErrorsWhenNotInCatalog(t *testing.T) { + d := &Daemon{layout: paths.Layout{KernelsDir: t.TempDir()}} + wireServices(d) + _, err := d.img.readOrAutoPullKernel(context.Background(), "nonexistent-kernel") + if err == nil || !strings.Contains(err.Error(), "not found") { + t.Fatalf("err = %v, want not-found", err) + } +} + +// TestReadOrAutoPullKernelSurfacesNonNotExistError covers the path where +// kernelcat.ReadLocal fails for 
a reason other than missing entry (e.g. +// corrupt manifest); the autopull logic should NOT try to fetch in that +// case since the entry clearly exists in some broken form. +func TestReadOrAutoPullKernelSurfacesNonNotExistError(t *testing.T) { + kernelsDir := t.TempDir() + // Seed a manifest that doesn't match the entry's own Name field — + // kernelcat.ReadLocal returns an error, not os.ErrNotExist. + dir := filepath.Join(kernelsDir, "broken-kernel") + if err := os.MkdirAll(dir, 0o755); err != nil { + t.Fatal(err) + } + if err := os.WriteFile(filepath.Join(dir, "manifest.json"), []byte(`{"name":"different-name"}`), 0o644); err != nil { + t.Fatal(err) + } + d := &Daemon{layout: paths.Layout{KernelsDir: kernelsDir}} + wireServices(d) + _, err := d.img.readOrAutoPullKernel(context.Background(), "broken-kernel") + if err == nil { + t.Fatal("want error") + } + // Must not be wrapped in an "auto-pull" message — the corrupt-manifest + // failure should surface as the primary cause. + if strings.Contains(err.Error(), "not found in catalog") { + t.Fatalf("err = %v, should not claim 'not in catalog'", err) + } + // Sanity: ensure it's not os.ErrNotExist-compatible. + if errors.Is(err, os.ErrNotExist) { + t.Fatalf("err = %v, should not be os.ErrNotExist", err) + } +} diff --git a/internal/daemon/capabilities.go b/internal/daemon/capabilities.go index 78031e3..b99ba4a 100644 --- a/internal/daemon/capabilities.go +++ b/internal/daemon/capabilities.go @@ -3,23 +3,34 @@ package daemon import ( "context" "errors" + "log/slog" "net" "os" "strings" + "time" + + "github.com/miekg/dns" "banger/internal/firecracker" "banger/internal/guestconfig" "banger/internal/model" + "banger/internal/store" "banger/internal/system" "banger/internal/vmdns" ) +// vmCapability is the base capability tag. 
Actual behaviour lives on +// optional sub-interfaces (startPreflight / guestConfig / machineConfig +// / prepareHost / postStart / cleanup / configChange / doctor); a +// capability implements whichever subset it cares about. None of them +// take *Daemon — each capability is a struct constructed with its +// explicit service-pointer dependencies at wireServices time. type vmCapability interface { Name() string } type startPreflightCapability interface { - AddStartPreflight(context.Context, *Daemon, *system.Preflight, model.VMRecord, model.Image) + AddStartPreflight(context.Context, *system.Preflight, model.VMRecord, model.Image) } type guestConfigCapability interface { @@ -31,47 +42,48 @@ type machineConfigCapability interface { } type prepareHostCapability interface { - PrepareHost(context.Context, *Daemon, *model.VMRecord, model.Image) error + PrepareHost(context.Context, *model.VMRecord, model.Image) error } type postStartCapability interface { - PostStart(context.Context, *Daemon, model.VMRecord, model.Image) error + PostStart(context.Context, model.VMRecord, model.Image) error } type cleanupCapability interface { - Cleanup(context.Context, *Daemon, model.VMRecord) error + Cleanup(context.Context, model.VMRecord) error } type configChangeCapability interface { - ApplyConfigChange(context.Context, *Daemon, model.VMRecord, model.VMRecord) error + ApplyConfigChange(context.Context, model.VMRecord, model.VMRecord) error } type doctorCapability interface { - AddDoctorChecks(context.Context, *Daemon, *system.Report) + AddDoctorChecks(context.Context, *system.Report) } -func (d *Daemon) registeredCapabilities() []vmCapability { - if len(d.vmCaps) > 0 { - return d.vmCaps - } +// defaultCapabilities builds the production capability list from +// already-constructed services. 
Called from wireServices once d.vm / +// d.ws / d.net are populated, so every capability ships with the +// concrete service pointers it needs and none of them reach through +// *Daemon at dispatch time. +func (d *Daemon) defaultCapabilities() []vmCapability { return []vmCapability{ - workDiskCapability{}, - opencodeCapability{}, - dnsCapability{}, - natCapability{}, + newWorkDiskCapability(d.vm, d.ws, d.store, d.config.DefaultImageName), + newDNSCapability(d.net), + newNATCapability(d.vm, d.net, d.logger), } } func (d *Daemon) addCapabilityStartPrereqs(ctx context.Context, checks *system.Preflight, vm model.VMRecord, image model.Image) { - for _, capability := range d.registeredCapabilities() { + for _, capability := range d.vmCaps { if hook, ok := capability.(startPreflightCapability); ok { - hook.AddStartPreflight(ctx, d, checks, vm, image) + hook.AddStartPreflight(ctx, checks, vm, image) } } } func (d *Daemon) contributeGuestConfig(builder *guestconfig.Builder, vm model.VMRecord, image model.Image) { - for _, capability := range d.registeredCapabilities() { + for _, capability := range d.vmCaps { if hook, ok := capability.(guestConfigCapability); ok { hook.ContributeGuest(builder, vm, image) } @@ -79,7 +91,7 @@ func (d *Daemon) contributeGuestConfig(builder *guestconfig.Builder, vm model.VM } func (d *Daemon) contributeMachineConfig(cfg *firecracker.MachineConfig, vm model.VMRecord, image model.Image) { - for _, capability := range d.registeredCapabilities() { + for _, capability := range d.vmCaps { if hook, ok := capability.(machineConfigCapability); ok { hook.ContributeMachine(cfg, vm, image) } @@ -87,13 +99,13 @@ func (d *Daemon) contributeMachineConfig(cfg *firecracker.MachineConfig, vm mode } func (d *Daemon) prepareCapabilityHosts(ctx context.Context, vm *model.VMRecord, image model.Image) error { - prepared := make([]vmCapability, 0, len(d.registeredCapabilities())) - for _, capability := range d.registeredCapabilities() { + prepared := 
make([]vmCapability, 0, len(d.vmCaps)) + for _, capability := range d.vmCaps { hook, ok := capability.(prepareHostCapability) if !ok { continue } - if err := hook.PrepareHost(ctx, d, vm, image); err != nil { + if err := hook.PrepareHost(ctx, vm, image); err != nil { d.cleanupPreparedCapabilities(context.Background(), vm, prepared) return err } @@ -103,7 +115,7 @@ func (d *Daemon) prepareCapabilityHosts(ctx context.Context, vm *model.VMRecord, } func (d *Daemon) postStartCapabilities(ctx context.Context, vm model.VMRecord, image model.Image) error { - for _, capability := range d.registeredCapabilities() { + for _, capability := range d.vmCaps { switch capability.Name() { case "dns": vmCreateStage(ctx, "apply_dns", "publishing vm dns record") @@ -113,7 +125,7 @@ func (d *Daemon) postStartCapabilities(ctx context.Context, vm model.VMRecord, i } } if hook, ok := capability.(postStartCapability); ok { - if err := hook.PostStart(ctx, d, vm, image); err != nil { + if err := hook.PostStart(ctx, vm, image); err != nil { return err } } @@ -122,7 +134,7 @@ func (d *Daemon) postStartCapabilities(ctx context.Context, vm model.VMRecord, i } func (d *Daemon) cleanupCapabilityState(ctx context.Context, vm model.VMRecord) error { - return d.cleanupPreparedCapabilities(ctx, &vm, d.registeredCapabilities()) + return d.cleanupPreparedCapabilities(ctx, &vm, d.vmCaps) } func (d *Daemon) cleanupPreparedCapabilities(ctx context.Context, vm *model.VMRecord, capabilities []vmCapability) error { @@ -132,15 +144,24 @@ func (d *Daemon) cleanupPreparedCapabilities(ctx context.Context, vm *model.VMRe if !ok { continue } - err = joinErr(err, hook.Cleanup(ctx, d, *vm)) + cleanupErr := hook.Cleanup(ctx, *vm) + if cleanupErr != nil && d.logger != nil { + // Log per-capability cleanup failures. The aggregate + // errors.Join return value is still the contract for + // callers, but a multi-failure cleanup hides which + // capability misbehaved unless we surface each one + // individually here. 
+ d.logger.Warn("capability cleanup failed", append(vmLogAttrs(*vm), "capability", capabilities[index].Name(), "error", cleanupErr.Error())...) + } + err = joinErr(err, cleanupErr) } return err } func (d *Daemon) applyCapabilityConfigChanges(ctx context.Context, before, after model.VMRecord) error { - for _, capability := range d.registeredCapabilities() { + for _, capability := range d.vmCaps { if hook, ok := capability.(configChangeCapability); ok { - if err := hook.ApplyConfigChange(ctx, d, before, after); err != nil { + if err := hook.ApplyConfigChange(ctx, before, after); err != nil { return err } } @@ -149,18 +170,37 @@ func (d *Daemon) applyCapabilityConfigChanges(ctx context.Context, before, after } func (d *Daemon) addCapabilityDoctorChecks(ctx context.Context, report *system.Report) { - for _, capability := range d.registeredCapabilities() { + for _, capability := range d.vmCaps { if hook, ok := capability.(doctorCapability); ok { - hook.AddDoctorChecks(ctx, d, report) + hook.AddDoctorChecks(ctx, report) } } } -type workDiskCapability struct{} +// workDiskCapability provisions a per-VM work disk (image-seeded or +// freshly formatted) and syncs host-side authorised keys + git +// identity + file_sync entries onto it. Holds pointers to the VM and +// workspace services because PrepareHost orchestrates across both, +// plus the store + default image name for its doctor check. 
+type workDiskCapability struct { + vm *VMService + ws *WorkspaceService + store *store.Store + defaultImageName string +} + +func newWorkDiskCapability(vm *VMService, ws *WorkspaceService, st *store.Store, defaultImageName string) workDiskCapability { + return workDiskCapability{ + vm: vm, + ws: ws, + store: st, + defaultImageName: defaultImageName, + } +} func (workDiskCapability) Name() string { return "work-disk" } -func (workDiskCapability) AddStartPreflight(_ context.Context, _ *Daemon, checks *system.Preflight, vm model.VMRecord, image model.Image) { +func (workDiskCapability) AddStartPreflight(_ context.Context, checks *system.Preflight, vm model.VMRecord, image model.Image) { if exists(vm.Runtime.WorkDiskPath) { return } @@ -199,20 +239,26 @@ func (workDiskCapability) ContributeMachine(cfg *firecracker.MachineConfig, vm m }) } -func (workDiskCapability) PrepareHost(ctx context.Context, d *Daemon, vm *model.VMRecord, image model.Image) error { - prep, err := d.ensureWorkDisk(ctx, vm, image) +func (c workDiskCapability) PrepareHost(ctx context.Context, vm *model.VMRecord, image model.Image) error { + prep, err := c.vm.ensureWorkDisk(ctx, vm, image) if err != nil { return err } - if err := d.ensureAuthorizedKeyOnWorkDisk(ctx, vm, image, prep); err != nil { + if err := c.ws.ensureAuthorizedKeyOnWorkDisk(ctx, vm, image, prep); err != nil { return err } - return d.ensureOpencodeAuthOnWorkDisk(ctx, vm) + if err := c.ws.ensureHushLoginOnWorkDisk(ctx, vm); err != nil { + return err + } + if err := c.ws.ensureGitIdentityOnWorkDisk(ctx, vm); err != nil { + return err + } + return c.ws.runFileSync(ctx, vm) } -func (workDiskCapability) AddDoctorChecks(_ context.Context, d *Daemon, report *system.Report) { - if d.store != nil && strings.TrimSpace(d.config.DefaultImageName) != "" { - if image, err := d.store.GetImageByName(context.Background(), d.config.DefaultImageName); err == nil && strings.TrimSpace(image.WorkSeedPath) != "" && exists(image.WorkSeedPath) { +func (c 
workDiskCapability) AddDoctorChecks(_ context.Context, report *system.Report) { + if c.store != nil && strings.TrimSpace(c.defaultImageName) != "" { + if image, err := c.store.GetImageByName(context.Background(), c.defaultImageName); err == nil && strings.TrimSpace(image.WorkSeedPath) != "" && exists(image.WorkSeedPath) { checks := system.NewPreflight() checks.RequireFile(image.WorkSeedPath, "default image work-seed", `rebuild the default image to regenerate the /root seed`) report.AddPreflight("feature /root work disk", checks, "seeded /root work disk artifact available") @@ -220,30 +266,46 @@ func (workDiskCapability) AddDoctorChecks(_ context.Context, d *Daemon, report * } } checks := system.NewPreflight() - for _, command := range []string{"mkfs.ext4", "mount", "umount", "cp"} { + for _, command := range []string{"truncate", "mkfs.ext4"} { checks.RequireCommand(command, toolHint(command)) } report.AddPreflight("feature /root work disk", checks, "fallback /root work disk tooling available") - report.AddWarn("feature /root work disk", "default image has no work-seed artifact; new VM creates will be slower until the image is rebuilt") + report.AddWarn("feature /root work disk", "default image has no work-seed artifact; guest /root will be empty until the image is rebuilt") } -type dnsCapability struct{} +// dnsCapability publishes + removes .vm records on the in-process +// DNS server. Only needs HostNetwork. 
+type dnsCapability struct { + net *HostNetwork +} + +func newDNSCapability(net *HostNetwork) dnsCapability { + return dnsCapability{net: net} +} func (dnsCapability) Name() string { return "dns" } -func (dnsCapability) PostStart(ctx context.Context, d *Daemon, vm model.VMRecord, _ model.Image) error { - return d.setDNS(ctx, vm.Name, vm.Runtime.GuestIP) +func (c dnsCapability) PostStart(ctx context.Context, vm model.VMRecord, _ model.Image) error { + return c.net.setDNS(ctx, vm.Name, vm.Runtime.GuestIP) } -func (dnsCapability) Cleanup(ctx context.Context, d *Daemon, vm model.VMRecord) error { - return d.removeDNS(ctx, vm.Runtime.DNSName) +func (c dnsCapability) Cleanup(_ context.Context, vm model.VMRecord) error { + return c.net.removeDNS(vm.Runtime.DNSName) } -func (dnsCapability) AddDoctorChecks(_ context.Context, _ *Daemon, report *system.Report) { +func (dnsCapability) AddDoctorChecks(_ context.Context, report *system.Report) { conn, err := net.ListenPacket("udp", vmdns.DefaultListenAddr) if err != nil { if strings.Contains(strings.ToLower(err.Error()), "address already in use") { - report.AddWarn("feature vm dns", "listener address "+vmdns.DefaultListenAddr+" is already in use") + // "Already in use" is the expected state when banger's own + // daemon is running. Probe the listener with a *.vm query + // the banger DNS server is the only thing on the host + // authoritative for, and pass if the response shape matches. 
+ if probeBangerDNS(vmdns.DefaultListenAddr) { + report.AddPass("feature vm dns", "banger DNS server is already serving "+vmdns.DefaultListenAddr) + return + } + report.AddWarn("feature vm dns", "listener address "+vmdns.DefaultListenAddr+" is held by another process") return } report.AddFail("feature vm dns", "cannot bind "+vmdns.DefaultListenAddr+": "+err.Error()) @@ -253,56 +315,91 @@ func (dnsCapability) AddDoctorChecks(_ context.Context, _ *Daemon, report *syste report.AddPass("feature vm dns", "listener can bind "+vmdns.DefaultListenAddr) } -type natCapability struct{} +// probeBangerDNS returns true iff a UDP DNS query to addr is answered +// by something that behaves like banger's vmdns server: a *.vm name +// produces an authoritative NXDOMAIN. Any other listener (a stub +// resolver, a different DNS server) either refuses, recurses, or +// returns non-authoritative — all distinguishable from this probe. +func probeBangerDNS(addr string) bool { + client := &dns.Client{Net: "udp", Timeout: 500 * time.Millisecond} + req := new(dns.Msg) + req.SetQuestion("doctor-probe-not-a-real-vm.vm.", dns.TypeA) + resp, _, err := client.Exchange(req, addr) + if err != nil || resp == nil { + return false + } + return resp.Authoritative && resp.Rcode == dns.RcodeNameError +} + +// natCapability sets up host-side NAT so guest traffic can reach the +// outside world. Needs VMService (tap lookup + aliveness) and +// HostNetwork (NAT rules), plus the daemon logger for the cleanup +// short-circuit note. 
+type natCapability struct { + vm *VMService + net *HostNetwork + logger *slog.Logger +} + +func newNATCapability(vm *VMService, net *HostNetwork, logger *slog.Logger) natCapability { + return natCapability{vm: vm, net: net, logger: logger} +} func (natCapability) Name() string { return "nat" } -func (natCapability) AddStartPreflight(ctx context.Context, d *Daemon, checks *system.Preflight, vm model.VMRecord, _ model.Image) { +func (c natCapability) AddStartPreflight(ctx context.Context, checks *system.Preflight, vm model.VMRecord, _ model.Image) { if !vm.Spec.NATEnabled { return } - d.addNATPrereqs(ctx, checks) + c.net.addNATPrereqs(ctx, checks) } -func (natCapability) PostStart(ctx context.Context, d *Daemon, vm model.VMRecord, _ model.Image) error { +func (c natCapability) PostStart(ctx context.Context, vm model.VMRecord, _ model.Image) error { if !vm.Spec.NATEnabled { return nil } - return d.ensureNAT(ctx, vm, true) + return c.net.ensureNAT(ctx, vm.Runtime.GuestIP, c.vm.vmHandles(vm.ID).TapDevice, true) } -func (natCapability) Cleanup(ctx context.Context, d *Daemon, vm model.VMRecord) error { +func (c natCapability) Cleanup(ctx context.Context, vm model.VMRecord) error { if !vm.Spec.NATEnabled { return nil } - if strings.TrimSpace(vm.Runtime.GuestIP) == "" || strings.TrimSpace(vm.Runtime.TapDevice) == "" { - if d.logger != nil { - d.logger.Debug("skipping nat cleanup without runtime network handles", append(vmLogAttrs(vm), "guest_ip", vm.Runtime.GuestIP, "tap_device", vm.Runtime.TapDevice)...) + // Handle cache is volatile across daemon restarts; Runtime is + // the persisted DB-backed copy. Fall back so a crash / corrupt + // handles.json doesn't leak iptables rules keyed off the tap. 
+ tap := strings.TrimSpace(c.vm.vmHandles(vm.ID).TapDevice) + if tap == "" { + tap = strings.TrimSpace(vm.Runtime.TapDevice) + } + if strings.TrimSpace(vm.Runtime.GuestIP) == "" || tap == "" { + if c.logger != nil { + c.logger.Debug("skipping nat cleanup without runtime network handles", append(vmLogAttrs(vm), "guest_ip", vm.Runtime.GuestIP, "tap_device", tap)...) } return nil } - return d.ensureNAT(ctx, vm, false) + return c.net.ensureNAT(ctx, vm.Runtime.GuestIP, tap, false) } -func (natCapability) ApplyConfigChange(ctx context.Context, d *Daemon, before, after model.VMRecord) error { +func (c natCapability) ApplyConfigChange(ctx context.Context, before, after model.VMRecord) error { if before.Spec.NATEnabled == after.Spec.NATEnabled { return nil } - if after.State != model.VMStateRunning || !system.ProcessRunning(after.Runtime.PID, after.Runtime.APISockPath) { + if !c.vm.vmAlive(after) { return nil } - return d.ensureNAT(ctx, after, after.Spec.NATEnabled) + return c.net.ensureNAT(ctx, after.Runtime.GuestIP, c.vm.vmHandles(after.ID).TapDevice, after.Spec.NATEnabled) } -func (natCapability) AddDoctorChecks(ctx context.Context, d *Daemon, report *system.Report) { +func (c natCapability) AddDoctorChecks(ctx context.Context, report *system.Report) { checks := system.NewPreflight() checks.RequireCommand("ip", toolHint("ip")) - d.addNATPrereqs(ctx, checks) + c.net.addNATPrereqs(ctx, checks) if len(checks.Problems()) > 0 { report.Add(system.CheckStatusFail, "feature nat", checks.Problems()...) 
return } - uplink, err := d.defaultUplink(ctx) + uplink, err := c.net.defaultUplink(ctx) if err != nil { report.AddFail("feature nat", err.Error()) return diff --git a/internal/daemon/capabilities_test.go b/internal/daemon/capabilities_test.go index 13a6350..e1376a1 100644 --- a/internal/daemon/capabilities_test.go +++ b/internal/daemon/capabilities_test.go @@ -3,6 +3,7 @@ package daemon import ( "context" "errors" + "net" "reflect" "testing" @@ -10,31 +11,32 @@ import ( "banger/internal/guestconfig" "banger/internal/model" "banger/internal/system" + "banger/internal/vmdns" ) type testCapability struct { name string - prepare func(context.Context, *Daemon, *model.VMRecord, model.Image) error - cleanup func(context.Context, *Daemon, model.VMRecord) error + prepare func(context.Context, *model.VMRecord, model.Image) error + cleanup func(context.Context, model.VMRecord) error contribute func(*guestconfig.Builder, model.VMRecord, model.Image) contributeFC func(*firecracker.MachineConfig, model.VMRecord, model.Image) - configChange func(context.Context, *Daemon, model.VMRecord, model.VMRecord) error - doctor func(context.Context, *Daemon, *system.Report) - startPreflight func(context.Context, *Daemon, *system.Preflight, model.VMRecord, model.Image) + configChange func(context.Context, model.VMRecord, model.VMRecord) error + doctor func(context.Context, *system.Report) + startPreflight func(context.Context, *system.Preflight, model.VMRecord, model.Image) } func (c testCapability) Name() string { return c.name } -func (c testCapability) PrepareHost(ctx context.Context, d *Daemon, vm *model.VMRecord, image model.Image) error { +func (c testCapability) PrepareHost(ctx context.Context, vm *model.VMRecord, image model.Image) error { if c.prepare != nil { - return c.prepare(ctx, d, vm, image) + return c.prepare(ctx, vm, image) } return nil } -func (c testCapability) Cleanup(ctx context.Context, d *Daemon, vm model.VMRecord) error { +func (c testCapability) Cleanup(ctx 
context.Context, vm model.VMRecord) error { if c.cleanup != nil { - return c.cleanup(ctx, d, vm) + return c.cleanup(ctx, vm) } return nil } @@ -51,22 +53,22 @@ func (c testCapability) ContributeMachine(cfg *firecracker.MachineConfig, vm mod } } -func (c testCapability) ApplyConfigChange(ctx context.Context, d *Daemon, before, after model.VMRecord) error { +func (c testCapability) ApplyConfigChange(ctx context.Context, before, after model.VMRecord) error { if c.configChange != nil { - return c.configChange(ctx, d, before, after) + return c.configChange(ctx, before, after) } return nil } -func (c testCapability) AddDoctorChecks(ctx context.Context, d *Daemon, report *system.Report) { +func (c testCapability) AddDoctorChecks(ctx context.Context, report *system.Report) { if c.doctor != nil { - c.doctor(ctx, d, report) + c.doctor(ctx, report) } } -func (c testCapability) AddStartPreflight(ctx context.Context, d *Daemon, checks *system.Preflight, vm model.VMRecord, image model.Image) { +func (c testCapability) AddStartPreflight(ctx context.Context, checks *system.Preflight, vm model.VMRecord, image model.Image) { if c.startPreflight != nil { - c.startPreflight(ctx, d, checks, vm, image) + c.startPreflight(ctx, checks, vm, image) } } @@ -78,32 +80,33 @@ func TestPrepareCapabilityHostsRollsBackPreparedCapabilitiesInReverseOrder(t *te vmCaps: []vmCapability{ testCapability{ name: "first", - prepare: func(context.Context, *Daemon, *model.VMRecord, model.Image) error { + prepare: func(context.Context, *model.VMRecord, model.Image) error { return nil }, - cleanup: func(context.Context, *Daemon, model.VMRecord) error { + cleanup: func(context.Context, model.VMRecord) error { cleanupOrder = append(cleanupOrder, "first") return nil }, }, testCapability{ name: "second", - prepare: func(context.Context, *Daemon, *model.VMRecord, model.Image) error { + prepare: func(context.Context, *model.VMRecord, model.Image) error { return nil }, - cleanup: func(context.Context, *Daemon, 
model.VMRecord) error { + cleanup: func(context.Context, model.VMRecord) error { cleanupOrder = append(cleanupOrder, "second") return nil }, }, testCapability{ name: "broken", - prepare: func(context.Context, *Daemon, *model.VMRecord, model.Image) error { + prepare: func(context.Context, *model.VMRecord, model.Image) error { return errors.New("boom") }, }, }, } + wireServices(d) err := d.prepareCapabilityHosts(context.Background(), &vm, model.Image{}) if err == nil || err.Error() != "boom" { @@ -128,6 +131,7 @@ func TestContributeHooksPopulateGuestAndMachineConfig(t *testing.T) { }, }, } + wireServices(d) builder := guestconfig.NewBuilder() d.contributeGuestConfig(builder, model.VMRecord{}, model.Image{}) @@ -144,13 +148,40 @@ func TestContributeHooksPopulateGuestAndMachineConfig(t *testing.T) { } } -func TestRegisteredCapabilitiesIncludeOpencode(t *testing.T) { +func TestProbeBangerDNSAcceptsRealServer(t *testing.T) { + server, err := vmdns.New("127.0.0.1:0", nil) + if err != nil { + t.Fatalf("vmdns.New: %v", err) + } + t.Cleanup(func() { _ = server.Close() }) + + if !probeBangerDNS(server.Addr()) { + t.Fatal("probeBangerDNS rejected the real banger DNS server") + } +} + +func TestProbeBangerDNSRejectsSilentListener(t *testing.T) { + // A UDP listener that drops every datagram. The probe should + // time out and return false — i.e. "this is not banger". 
+	conn, err := net.ListenPacket("udp", "127.0.0.1:0")
+	if err != nil {
+		t.Fatalf("ListenPacket: %v", err)
+	}
+	t.Cleanup(func() { _ = conn.Close() })
+
+	if probeBangerDNS(conn.LocalAddr().String()) {
+		t.Fatal("probeBangerDNS accepted a silent non-DNS listener")
+	}
+}
+
+func TestDefaultCapabilitiesInOrder(t *testing.T) {
 	d := &Daemon{}
+	wireServices(d)
 	var names []string
-	for _, capability := range d.registeredCapabilities() {
+	for _, capability := range d.vmCaps {
 		names = append(names, capability.Name())
 	}
-	want := []string{"work-disk", "opencode", "dns", "nat"}
+	want := []string{"work-disk", "dns", "nat"}
 	if !reflect.DeepEqual(names, want) {
 		t.Fatalf("capabilities = %v, want %v", names, want)
 	}
diff --git a/internal/daemon/concurrency_test.go b/internal/daemon/concurrency_test.go
new file mode 100644
index 0000000..ed0d59a
--- /dev/null
+++ b/internal/daemon/concurrency_test.go
@@ -0,0 +1,210 @@
+package daemon
+
+import (
+	"context"
+	"os"
+	"path/filepath"
+	"sync"
+	"sync/atomic"
+	"testing"
+	"time"
+
+	"banger/internal/api"
+	"banger/internal/imagepull"
+	"banger/internal/paths"
+	"banger/internal/system"
+)
+
+// TestPullImageDoesNotSerialiseOnDifferentNames confirms the refactor
+// actually releases imageOpsMu during the slow staging phase: two
+// PullImage calls for distinct names run concurrently, with the
+// "pull" half overlapping in time. Before the fix the two would have
+// run strictly sequentially (one blocking the other inside
+// imageOpsMu across the full OCI pull), and the maxActive >= 2
+// assertion would have failed.
+func TestPullImageDoesNotSerialiseOnDifferentNames(t *testing.T) { + if _, err := os.Stat("/usr/bin/mkfs.ext4"); err != nil { + if _, err := os.Stat("/sbin/mkfs.ext4"); err != nil { + t.Skip("mkfs.ext4 not available; skipping") + } + } + imagesDir := t.TempDir() + cacheDir := t.TempDir() + kernel, initrd, modules := writeFakeKernelTriple(t) + + var ( + active atomic.Int32 + maxActive atomic.Int32 + enterPull = make(chan struct{}) + startRelease = make(chan struct{}) + ) + + slowPullAndFlatten := func(_ context.Context, _ string, _ string, destDir string) (imagepull.Metadata, error) { + // Record that we entered the pull body. + enterPull <- struct{}{} + // Track concurrent overlap. + n := active.Add(1) + for { + cur := maxActive.Load() + if n <= cur || maxActive.CompareAndSwap(cur, n) { + break + } + } + // Wait for the test to unblock us AFTER both pulls have + // entered the body. + <-startRelease + active.Add(-1) + // Produce the minimal synthetic tree stubPullAndFlatten does. + if err := os.MkdirAll(filepath.Join(destDir, "etc"), 0o755); err != nil { + return imagepull.Metadata{}, err + } + if err := os.WriteFile(filepath.Join(destDir, "etc", "hello"), []byte("world"), 0o644); err != nil { + return imagepull.Metadata{}, err + } + return imagepull.Metadata{Entries: map[string]imagepull.FileMeta{}}, nil + } + + d := &Daemon{ + layout: paths.Layout{ImagesDir: imagesDir, OCICacheDir: cacheDir}, + store: openDaemonStore(t), + runner: system.NewRunner(), + } + d.img = &ImageService{ + layout: d.layout, + store: d.store, + runner: d.runner, + pullAndFlatten: slowPullAndFlatten, + finalizePulledRootfs: stubFinalizePulledRootfs, + workSeedBuilder: stubWorkSeedBuilder, + } + wireServices(d) + + mkParams := func(name string) api.ImagePullParams { + return api.ImagePullParams{ + Ref: "example.invalid/" + name + ":latest", + Name: name, + KernelPath: kernel, + InitrdPath: initrd, + ModulesDir: modules, + } + } + + var wg sync.WaitGroup + errs := make([]error, 2) + for i, 
name := range []string{"alpha", "beta"} { + wg.Add(1) + go func(i int, name string) { + defer wg.Done() + _, err := d.img.PullImage(context.Background(), mkParams(name)) + errs[i] = err + }(i, name) + } + + // Wait for BOTH pulls to enter the slow body before we release + // them. If imageOpsMu still wrapped the full flow, the second + // pull would block on the mutex and never reach the enterPull + // send — the timeout below would fire. + for i := 0; i < 2; i++ { + select { + case <-enterPull: + case <-time.After(3 * time.Second): + t.Fatalf("pull %d never entered the slow body — imageOpsMu still serialises distinct pulls", i+1) + } + } + close(startRelease) + wg.Wait() + + for i, err := range errs { + if err != nil { + t.Fatalf("pull %d failed: %v", i+1, err) + } + } + if maxActive.Load() < 2 { + t.Fatalf("maxActive = %d, want >= 2 (pulls did not overlap)", maxActive.Load()) + } +} + +// TestPullImageRejectsNameClashAtPublish confirms the publish-window +// recheck is what actually enforces name uniqueness now that the slow +// body runs unlocked. Two pulls race to the same name; one wins and +// the other errors. 
+func TestPullImageRejectsNameClashAtPublish(t *testing.T) {
+	if _, err := os.Stat("/usr/bin/mkfs.ext4"); err != nil {
+		if _, err := os.Stat("/sbin/mkfs.ext4"); err != nil {
+			t.Skip("mkfs.ext4 not available; skipping")
+		}
+	}
+	imagesDir := t.TempDir()
+	cacheDir := t.TempDir()
+	kernel, initrd, modules := writeFakeKernelTriple(t)
+
+	release := make(chan struct{})
+	synchronised := make(chan struct{}, 2)
+	pullAndFlatten := func(_ context.Context, _ string, _ string, destDir string) (imagepull.Metadata, error) {
+		synchronised <- struct{}{}
+		<-release
+		if err := os.MkdirAll(filepath.Join(destDir, "etc"), 0o755); err != nil {
+			return imagepull.Metadata{}, err
+		}
+		if err := os.WriteFile(filepath.Join(destDir, "marker"), []byte("ok"), 0o644); err != nil {
+			return imagepull.Metadata{}, err
+		}
+		return imagepull.Metadata{Entries: map[string]imagepull.FileMeta{}}, nil
+	}
+
+	d := &Daemon{
+		layout: paths.Layout{ImagesDir: imagesDir, OCICacheDir: cacheDir},
+		store:  openDaemonStore(t),
+		runner: system.NewRunner(),
+	}
+	d.img = &ImageService{
+		layout:               d.layout,
+		store:                d.store,
+		runner:               d.runner,
+		pullAndFlatten:       pullAndFlatten,
+		finalizePulledRootfs: stubFinalizePulledRootfs,
+		workSeedBuilder:      stubWorkSeedBuilder,
+	}
+	wireServices(d)
+
+	params := api.ImagePullParams{
+		Ref:        "example.invalid/contender:latest",
+		Name:       "contender",
+		KernelPath: kernel,
+		InitrdPath: initrd,
+		ModulesDir: modules,
+	}
+
+	var wg sync.WaitGroup
+	errs := make([]error, 2)
+	for i := 0; i < 2; i++ {
+		wg.Add(1)
+		go func(i int) {
+			defer wg.Done()
+			_, err := d.img.PullImage(context.Background(), params)
+			errs[i] = err
+		}(i)
+	}
+	// Both workers must enter the pull body before either publishes.
+	for i := 0; i < 2; i++ {
+		select {
+		case <-synchronised:
+		case <-time.After(3 * time.Second):
+			t.Fatalf("pull %d never entered the slow body", i+1)
+		}
+	}
+	close(release)
+	wg.Wait()
+
+	wins, losses := 0, 0
+	for _, err := range errs {
+		if err == nil {
+			wins++
+		} else {
+			losses++
+		}
+	}
+	if wins != 1 || losses != 1 {
+		t.Fatalf("wins=%d losses=%d, want exactly one of each (errs=%v)", wins, losses, errs)
+	}
+}
diff --git a/internal/daemon/daemon.go b/internal/daemon/daemon.go
index e142fd3..174b53f 100644
--- a/internal/daemon/daemon.go
+++ b/internal/daemon/daemon.go
@@ -3,55 +3,61 @@ package daemon
 
 import (
 	"bufio"
 	"context"
-	"database/sql"
 	"encoding/json"
 	"errors"
 	"fmt"
 	"log/slog"
 	"net"
-	"net/http"
 	"os"
+	"path/filepath"
 	"strings"
 	"sync"
 	"time"
 
-	"banger/internal/api"
+	"golang.org/x/sys/unix"
+
+	"banger/internal/config"
+	ws "banger/internal/daemon/workspace"
+	"banger/internal/installmeta"
 	"banger/internal/model"
 	"banger/internal/paths"
+	"banger/internal/roothelper"
 	"banger/internal/rpc"
 	"banger/internal/store"
 	"banger/internal/system"
 	"banger/internal/vmdns"
 )
 
+// Daemon is the composition root: shared infrastructure (store,
+// runner, logger, layout, config) plus pointers to the four focused
+// services that own behavior. Open wires the services; the dispatch
+// loop forwards RPCs to them. No lifecycle / image / workspace /
+// networking behavior lives on *Daemon itself — it's wiring.
 type Daemon struct {
-	layout          paths.Layout
-	config          model.DaemonConfig
-	store           *store.Store
-	runner          system.CommandRunner
-	logger          *slog.Logger
-	mu              sync.Mutex
-	createOpsMu     sync.Mutex
-	createOps       map[string]*vmCreateOperationState
-	imageBuildOpsMu sync.Mutex
-	imageBuildOps   map[string]*imageBuildOperationState
-	vmLocksMu       sync.Mutex
-	vmLocks         map[string]*sync.Mutex
-	tapPoolMu       sync.Mutex
-	tapPool         []string
-	tapPoolNext     int
+	layout     paths.Layout
+	userLayout paths.Layout
+	config     model.DaemonConfig
+	store      *store.Store
+	runner     system.CommandRunner
+	logger     *slog.Logger
+	priv       privilegedOps
+
+	net   *HostNetwork
+	img   *ImageService
+	ws    *WorkspaceService
+	vm    *VMService
+	stats *StatsService
+
 	closing  chan struct{}
 	once     sync.Once
 	pid      int
 	listener net.Listener
-	webListener net.Listener
-	webServer   *http.Server
-	webURL      string
-	vmDNS       *vmdns.Server
 	vmCaps []vmCapability
-	imageBuild func(context.Context, imageBuildSpec) error
 	requestHandler func(context.Context, rpc.Request) rpc.Response
+	guestWaitForSSH func(context.Context, string, string, time.Duration) error
+	guestDial       func(context.Context, string, string) (guestSSHClient, error)
+	clientUID int
+	clientGID int
 }
 
 func Open(ctx context.Context) (d *Daemon, err error) {
@@ -66,6 +72,39 @@ func Open(ctx context.Context) (d *Daemon, err error) {
 	if err != nil {
 		return nil, err
 	}
+	return openWithConfig(ctx, layout, layout, cfg, os.Getuid(), os.Getgid(), true, nil)
+}
+
+func OpenSystem(ctx context.Context) (*Daemon, error) {
+	meta, err := installmeta.Load(installmeta.DefaultPath)
+	if err != nil {
+		return nil, err
+	}
+	layout := paths.ResolveSystem()
+	if err := paths.EnsureSystemOwned(layout); err != nil {
+		return nil, err
+	}
+	ownerLayout, err := paths.ResolveUserForHome(meta.OwnerHome)
+	if err != nil {
+		return nil, err
+	}
+	cfg, err := config.LoadDaemon(ownerLayout, meta.OwnerHome)
+	if err != nil {
+		return nil, err
+	}
+	// config.LoadDaemon fills JailerChrootBase from the layout it sees. In
+	// system mode that's the owner's layout (no privileged StateDir) so
+	// the value lands under the owner home — wrong for the helper, which
+	// validates paths against the system StateDir. Override empty,
+	// relative, or owner-rooted values so both see /var/lib/banger/jail.
+	if strings.TrimSpace(cfg.JailerChrootBase) == "" || !filepath.IsAbs(cfg.JailerChrootBase) || strings.HasPrefix(cfg.JailerChrootBase, ownerLayout.StateDir) {
+		cfg.JailerChrootBase = filepath.Join(layout.StateDir, "jail")
+	}
+	helper := newHelperPrivilegedOps(roothelper.NewClient(installmeta.DefaultRootHelperSocketPath), cfg, layout)
+	return openWithConfig(ctx, layout, ownerLayout, cfg, -1, -1, false, helper)
+}
+
+func openWithConfig(ctx context.Context, layout, userLayout paths.Layout, cfg model.DaemonConfig, clientUID, clientGID int, syncSSHConfig bool, priv privilegedOps) (d *Daemon, err error) {
 	logger, normalizedLevel, err := newDaemonLogger(os.Stderr, cfg.LogLevel)
 	if err != nil {
 		return nil, err
 	}
@@ -75,34 +114,64 @@ func Open(ctx context.Context) (d *Daemon, err error) {
 	if err != nil {
 		return nil, err
 	}
+	closing := make(chan struct{})
+	runner := system.NewRunner()
 	d = &Daemon{
-		layout:  layout,
-		config:  cfg,
-		store:   db,
-		runner:  system.NewRunner(),
-		logger:  logger,
-		closing: make(chan struct{}),
-		pid:     os.Getpid(),
+		layout:     layout,
+		userLayout: userLayout,
+		config:     cfg,
+		store:      db,
+		runner:     runner,
+		logger:     logger,
+		closing:    closing,
+		pid:        os.Getpid(),
+		clientUID:  clientUID,
+		clientGID:  clientGID,
+		priv:       priv,
+	}
+	wireServices(d)
+	// From here on, every failure path must run Close() so the host
+	// state we touched (DNS listener goroutine, resolvectl routing,
+	// SQLite handle, future side effects) gets unwound. Close is
+	// idempotent + nil-guarded so it's safe to call on a partially
+	// initialised daemon — `d.vmDNS == nil` and friends short-circuit
+	// the teardown of components we never set up.
+	defer func() {
+		if err != nil {
+			_ = d.Close()
+		}
+	}()
+
+	if syncSSHConfig {
+		d.ensureVMSSHClientConfig()
+	}
 	d.logger.Info("daemon opened", "socket", layout.SocketPath, "state_dir", layout.StateDir, "log_level", cfg.LogLevel)
-	if err = d.startVMDNS(vmdns.DefaultListenAddr); err != nil {
+	if err = d.net.startVMDNS(vmdns.DefaultListenAddr); err != nil {
 		d.logger.Error("daemon open failed", "stage", "start_vm_dns", "error", err.Error())
 		return nil, err
 	}
-	defer func() {
-		if err != nil {
-			_ = d.stopVMDNS()
-		}
-	}()
 	if err = d.reconcile(ctx); err != nil {
 		d.logger.Error("daemon open failed", "stage", "reconcile", "error", err.Error())
 		return nil, err
 	}
-	if err = d.initializeTapPool(ctx); err != nil {
-		d.logger.Error("daemon open failed", "stage", "initialize_tap_pool", "error", err.Error())
-		return nil, err
+	d.net.ensureVMDNSResolverRouting(ctx)
+	// Seed HostNetwork's pool index from taps already claimed by VMs
+	// on disk so newly warmed pool entries don't collide with them.
+	if d.config.TapPoolSize > 0 && d.store != nil {
+		vms, listErr := d.store.ListVMs(ctx)
+		if listErr != nil {
+			d.logger.Error("daemon open failed", "stage", "initialize_tap_pool", "error", listErr.Error())
+			return nil, listErr
+		}
+		used := make([]string, 0, len(vms))
+		for _, vm := range vms {
+			if tap := d.vm.vmHandles(vm.ID).TapDevice; tap != "" {
+				used = append(used, tap)
+			}
+		}
+		d.net.initializeTapPool(used)
 	}
-	go d.ensureTapPool(context.Background())
+	go d.net.ensureTapPool(context.Background())
 	return d, nil
 }
 
@@ -116,13 +185,11 @@ func (d *Daemon) Close() error {
 		if d.listener != nil {
 			_ = d.listener.Close()
 		}
-		if d.webServer != nil {
-			_ = d.webServer.Close()
+		var closeErr error
+		if d.store != nil {
+			closeErr = d.store.Close()
 		}
-		if d.webListener != nil {
-			_ = d.webListener.Close()
-		}
-		err = errors.Join(d.stopVMDNS(), d.store.Close())
+		err = errors.Join(d.net.clearVMDNSResolverRouting(context.Background()), d.net.stopVMDNS(), closeErr)
 	})
 	return err
 }
@@ -139,16 +206,31 @@ func (d *Daemon) Serve(ctx context.Context) error {
 	d.listener = listener
 	defer listener.Close()
 	defer os.Remove(d.layout.SocketPath)
+	serveDone := make(chan struct{})
+	defer close(serveDone)
+	go func() {
+		select {
+		case <-ctx.Done():
+			_ = listener.Close()
+		case <-d.closing:
+		case <-serveDone:
+		}
+	}()
+	// Tighten the socket mode while root still owns it, then hand the
+	// socket to the configured client uid/gid. In the hardened systemd
+	// unit we keep CAP_CHOWN but intentionally drop CAP_FOWNER, which
+	// would be needed to chmod the socket after the chown.
 	if err := os.Chmod(d.layout.SocketPath, 0o600); err != nil {
 		return err
 	}
+	if d.clientUID >= 0 && d.clientGID >= 0 {
+		if err := os.Chown(d.layout.SocketPath, d.clientUID, d.clientGID); err != nil {
+			return err
+		}
+	}
 	if d.logger != nil {
 		d.logger.Info("daemon serving", "socket", d.layout.SocketPath, "pid", d.pid)
 	}
-	if err := d.startWebServer(); err != nil {
-		return err
-	}
-
 	go d.backgroundLoop()
 	for {
@@ -161,7 +243,7 @@ func (d *Daemon) Serve(ctx context.Context) error {
 			return nil
 		default:
 		}
-		if ne, ok := err.(net.Error); ok && ne.Temporary() {
+		if _, ok := err.(net.Error); ok {
 			if d.logger != nil {
 				d.logger.Warn("daemon accept temporary failure", "error", err.Error())
 			}
@@ -179,6 +261,13 @@ func (d *Daemon) handleConn(conn net.Conn) {
 	defer conn.Close()
+	if err := d.authorizeConn(conn); err != nil {
+		if d.logger != nil {
+			d.logger.Warn("daemon connection rejected", "remote", conn.RemoteAddr().String(), "error", err.Error())
+		}
+		_ = json.NewEncoder(conn).Encode(rpc.NewError("unauthorized", err.Error()))
+		return
+	}
 	reader := bufio.NewReader(conn)
 	var req rpc.Request
 	if err := json.NewDecoder(reader).Decode(&req); err != nil {
@@ -201,6 +290,44 @@ func (d *Daemon) handleConn(conn net.Conn) {
 	}
 }
 
+// authorizeConn enforces SO_PEERCRED on the daemon socket as a
+// belt-and-braces check on top of filesystem perms (0600 + chowned to
+// the owner). Filesystem perms already prevent other host users from
+// connecting; the peer-cred read closes the door on any path that
+// might leak the socket FD to a non-owner process. Mirrors the
+// equivalent check in roothelper.authorizeConn.
+func (d *Daemon) authorizeConn(conn net.Conn) error {
+	unixConn, ok := conn.(*net.UnixConn)
+	if !ok {
+		return errors.New("daemon requires unix connections")
+	}
+	rawConn, err := unixConn.SyscallConn()
+	if err != nil {
+		return err
+	}
+	var cred *unix.Ucred
+	var controlErr error
+	if err := rawConn.Control(func(fd uintptr) {
+		cred, controlErr = unix.GetsockoptUcred(int(fd), unix.SOL_SOCKET, unix.SO_PEERCRED)
+	}); err != nil {
+		return err
+	}
+	if controlErr != nil {
+		return controlErr
+	}
+	if cred == nil {
+		return errors.New("missing peer credentials")
+	}
+	expected := d.clientUID
+	if expected < 0 {
+		expected = os.Getuid()
+	}
+	if int(cred.Uid) == 0 || int(cred.Uid) == expected {
+		return nil
+	}
+	return fmt.Errorf("uid %d is not allowed to use the daemon", cred.Uid)
+}
+
 func (d *Daemon) watchRequestDisconnect(conn net.Conn, reader *bufio.Reader, method string, cancel context.CancelFunc) func() {
 	if conn == nil || reader == nil {
 		return func() {}
@@ -226,7 +353,7 @@ func (d *Daemon) watchRequestDisconnect(conn net.Conn, reader *bufio.Reader, met
 		default:
 		}
 		if d.logger != nil {
-			d.logger.Info("daemon request canceled", "method", method, "remote", conn.RemoteAddr().String(), "error", err.Error())
+			d.logger.Debug("daemon request canceled", "method", method, "remote", conn.RemoteAddr().String(), "error", err.Error())
 		}
 		cancel()
 		return
@@ -240,213 +367,34 @@ func (d *Daemon) watchRequestDisconnect(conn net.Conn, reader *bufio.Reader, met
 }
 
 func (d *Daemon) dispatch(ctx context.Context, req rpc.Request) rpc.Response {
+	// Per-RPC correlation id is generated unconditionally — even
+	// errors that short-circuit before reaching a handler get one
+	// so the operator has a handle for every CLI failure.
+	// Generation can fail in theory (crypto/rand IO error) —
+	// degrade gracefully to a blank id rather than tearing down
+	// the request.
+	opID, _ := model.NewOpID()
+	if opID != "" {
+		ctx = WithOpID(ctx, opID)
+	}
+	stampOpID := func(resp rpc.Response) rpc.Response {
+		if !resp.OK && resp.Error != nil && resp.Error.OpID == "" && opID != "" {
+			resp.Error.OpID = opID
+		}
+		return resp
+	}
 	if req.Version != rpc.Version {
-		return rpc.NewError("bad_version", fmt.Sprintf("unsupported version %d", req.Version))
+		return stampOpID(rpc.NewError("bad_version", fmt.Sprintf("unsupported version %d", req.Version)))
 	}
 	if d.requestHandler != nil {
-		return d.requestHandler(ctx, req)
+		return stampOpID(d.requestHandler(ctx, req))
 	}
-	switch req.Method {
-	case "ping":
-		result, _ := rpc.NewResult(api.PingResult{Status: "ok", PID: d.pid, WebURL: d.webURL})
-		return result
-	case "shutdown":
-		go d.Close()
-		result, _ := rpc.NewResult(api.ShutdownResult{Status: "stopping"})
-		return result
-	case "vm.create":
-		params, err := rpc.DecodeParams[api.VMCreateParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		vm, err := d.CreateVM(ctx, params)
-		return marshalResultOrError(api.VMShowResult{VM: vm}, err)
-	case "vm.create.begin":
-		params, err := rpc.DecodeParams[api.VMCreateParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		op, err := d.BeginVMCreate(ctx, params)
-		return marshalResultOrError(api.VMCreateBeginResult{Operation: op}, err)
-	case "vm.create.status":
-		params, err := rpc.DecodeParams[api.VMCreateStatusParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		op, err := d.VMCreateStatus(ctx, params.ID)
-		return marshalResultOrError(api.VMCreateStatusResult{Operation: op}, err)
-	case "vm.create.cancel":
-		params, err := rpc.DecodeParams[api.VMCreateStatusParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		err = d.CancelVMCreate(ctx, params.ID)
-		return marshalResultOrError(api.Empty{}, err)
-	case "vm.list":
-		vms, err := d.store.ListVMs(ctx)
-		return marshalResultOrError(api.VMListResult{VMs: vms}, err)
-	case "vm.show":
-		params, err := rpc.DecodeParams[api.VMRefParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		vm, err := d.FindVM(ctx, params.IDOrName)
-		return marshalResultOrError(api.VMShowResult{VM: vm}, err)
-	case "vm.start":
-		params, err := rpc.DecodeParams[api.VMRefParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		vm, err := d.StartVM(ctx, params.IDOrName)
-		return marshalResultOrError(api.VMShowResult{VM: vm}, err)
-	case "vm.stop":
-		params, err := rpc.DecodeParams[api.VMRefParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		vm, err := d.StopVM(ctx, params.IDOrName)
-		return marshalResultOrError(api.VMShowResult{VM: vm}, err)
-	case "vm.kill":
-		params, err := rpc.DecodeParams[api.VMKillParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		vm, err := d.KillVM(ctx, params)
-		return marshalResultOrError(api.VMShowResult{VM: vm}, err)
-	case "vm.restart":
-		params, err := rpc.DecodeParams[api.VMRefParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		vm, err := d.RestartVM(ctx, params.IDOrName)
-		return marshalResultOrError(api.VMShowResult{VM: vm}, err)
-	case "vm.delete":
-		params, err := rpc.DecodeParams[api.VMRefParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		vm, err := d.DeleteVM(ctx, params.IDOrName)
-		return marshalResultOrError(api.VMShowResult{VM: vm}, err)
-	case "vm.set":
-		params, err := rpc.DecodeParams[api.VMSetParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		vm, err := d.SetVM(ctx, params)
-		return marshalResultOrError(api.VMShowResult{VM: vm}, err)
-	case "vm.stats":
-		params, err := rpc.DecodeParams[api.VMRefParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		vm, stats, err := d.GetVMStats(ctx, params.IDOrName)
-		return marshalResultOrError(api.VMStatsResult{VM: vm, Stats: stats}, err)
-	case "vm.logs":
-		params, err := rpc.DecodeParams[api.VMRefParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		vm, err := d.FindVM(ctx, params.IDOrName)
-		if err != nil {
-			return rpc.NewError("not_found", err.Error())
-		}
-		return marshalResultOrError(api.VMLogsResult{LogPath: vm.Runtime.LogPath}, nil)
-	case "vm.ssh":
-		params, err := rpc.DecodeParams[api.VMRefParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		vm, err := d.TouchVM(ctx, params.IDOrName)
-		if err != nil {
-			return rpc.NewError("not_found", err.Error())
-		}
-		if vm.State != model.VMStateRunning || !system.ProcessRunning(vm.Runtime.PID, vm.Runtime.APISockPath) {
-			return rpc.NewError("not_running", fmt.Sprintf("vm %s is not running", vm.Name))
-		}
-		return marshalResultOrError(api.VMSSHResult{Name: vm.Name, GuestIP: vm.Runtime.GuestIP}, nil)
-	case "vm.health":
-		params, err := rpc.DecodeParams[api.VMRefParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		result, err := d.HealthVM(ctx, params.IDOrName)
-		return marshalResultOrError(result, err)
-	case "vm.ping":
-		params, err := rpc.DecodeParams[api.VMRefParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		result, err := d.PingVM(ctx, params.IDOrName)
-		return marshalResultOrError(result, err)
-	case "vm.ports":
-		params, err := rpc.DecodeParams[api.VMRefParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		result, err := d.PortsVM(ctx, params.IDOrName)
-		return marshalResultOrError(result, err)
-	case "image.list":
-		images, err := d.store.ListImages(ctx)
-		return marshalResultOrError(api.ImageListResult{Images: images}, err)
-	case "image.show":
-		params, err := rpc.DecodeParams[api.ImageRefParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		image, err := d.FindImage(ctx, params.IDOrName)
-		return marshalResultOrError(api.ImageShowResult{Image: image}, err)
-	case "image.build":
-		params, err := rpc.DecodeParams[api.ImageBuildParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		image, err := d.BuildImage(ctx, params)
-		return marshalResultOrError(api.ImageShowResult{Image: image}, err)
-	case "image.build.begin":
-		params, err := rpc.DecodeParams[api.ImageBuildParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		op, err := d.BeginImageBuild(ctx, params)
-		return marshalResultOrError(api.ImageBuildBeginResult{Operation: op}, err)
-	case "image.build.status":
-		params, err := rpc.DecodeParams[api.ImageBuildStatusParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		op, err := d.ImageBuildStatus(ctx, params.ID)
-		return marshalResultOrError(api.ImageBuildStatusResult{Operation: op}, err)
-	case "image.build.cancel":
-		params, err := rpc.DecodeParams[api.ImageBuildStatusParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		err = d.CancelImageBuild(ctx, params.ID)
-		return marshalResultOrError(api.Empty{}, err)
-	case "image.register":
-		params, err := rpc.DecodeParams[api.ImageRegisterParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		image, err := d.RegisterImage(ctx, params)
-		return marshalResultOrError(api.ImageShowResult{Image: image}, err)
-	case "image.promote":
-		params, err := rpc.DecodeParams[api.ImageRefParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		image, err := d.PromoteImage(ctx, params.IDOrName)
-		return marshalResultOrError(api.ImageShowResult{Image: image}, err)
-	case "image.delete":
-		params, err := rpc.DecodeParams[api.ImageRefParams](req)
-		if err != nil {
-			return rpc.NewError("bad_request", err.Error())
-		}
-		image, err := d.DeleteImage(ctx, params.IDOrName)
-		return marshalResultOrError(api.ImageShowResult{Image: image}, err)
-	default:
-		return rpc.NewError("unknown_method", req.Method)
+	h, ok := rpcHandlers[req.Method]
+	if !ok {
+		return stampOpID(rpc.NewError("unknown_method", req.Method))
 	}
+	return stampOpID(h(ctx, d, req))
 }
 
 func (d *Daemon) backgroundLoop() {
@@ -459,195 +407,194 @@ func (d *Daemon) backgroundLoop() {
 		case <-d.closing:
 			return
 		case <-statsTicker.C:
-			if err := d.pollStats(context.Background()); err != nil && d.logger != nil {
+			if err := d.stats.pollStats(context.Background()); err != nil && d.logger != nil {
 				d.logger.Error("background stats poll failed", "error", err.Error())
 			}
 		case <-staleTicker.C:
-			if err := d.stopStaleVMs(context.Background()); err != nil && d.logger != nil {
+			if err := d.stats.stopStaleVMs(context.Background()); err != nil && d.logger != nil {
 				d.logger.Error("background stale sweep failed", "error", err.Error())
 			}
-			d.pruneVMCreateOperations(time.Now().Add(-10 * time.Minute))
-			d.pruneImageBuildOperations(time.Now().Add(-10 * time.Minute))
+			d.vm.pruneVMCreateOperations(time.Now().Add(-10 * time.Minute))
 		}
 	}
 }
 
-func (d *Daemon) startVMDNS(addr string) error {
-	server, err := vmdns.New(addr, d.logger)
-	if err != nil {
-		return err
-	}
-	d.vmDNS = server
-	if d.logger != nil {
-		d.logger.Info("vm dns serving", "dns_addr", server.Addr())
-	}
-	return nil
-}
-
-func (d *Daemon) stopVMDNS() error {
-	if d.vmDNS == nil {
-		return nil
-	}
-	err := d.vmDNS.Close()
-	d.vmDNS = nil
-	return err
-}
-
-func (d *Daemon) ensureDefaultImage(ctx context.Context) error {
-	_ = ctx
-	return nil
-}
-
 func (d *Daemon) reconcile(ctx context.Context) error {
-	op := d.beginOperation("daemon.reconcile")
+	op := d.beginOperation(ctx, "daemon.reconcile")
 	vms, err := d.store.ListVMs(ctx)
 	if err != nil {
 		return op.fail(err)
 	}
 	for _, vm := range vms {
-		if err := d.withVMLockByIDErr(ctx, vm.ID, func(vm model.VMRecord) error {
+		if err := d.vm.withVMLockByIDErr(ctx, vm.ID, func(vm model.VMRecord) error {
 			if vm.State != model.VMStateRunning {
+				// Belt-and-braces: a stopped VM should never have a
+				// scratch file or a cache entry. Clean up anything
+				// left by an ungraceful previous daemon crash.
+				d.vm.clearVMHandles(vm)
 				return nil
 			}
-			if system.ProcessRunning(vm.Runtime.PID, vm.Runtime.APISockPath) {
+			// Rebuild the in-memory handle cache by loading the per-VM
+			// scratch file and verifying the firecracker process is
+			// still alive.
+			h, alive, err := d.vm.rediscoverHandles(ctx, vm)
+			if err != nil && d.logger != nil {
+				d.logger.Warn("rediscover handles failed", "vm_id", vm.ID, "error", err.Error())
+			}
+			// Either way, seed the cache with what the scratch file
+			// claimed. If alive, subsequent vmAlive() calls pass; if
+			// not, cleanupRuntime needs these handles to know which
+			// kernel resources (DM / loops / tap) to tear down.
+			d.vm.setVMHandlesInMemory(vm.ID, h)
+			if alive {
 				return nil
 			}
 			op.stage("stale_vm", vmLogAttrs(vm)...)
-			_ = d.cleanupRuntime(ctx, vm, true)
+			_ = d.vm.cleanupRuntime(ctx, vm, true)
 			vm.State = model.VMStateStopped
 			vm.Runtime.State = model.VMStateStopped
-			clearRuntimeHandles(&vm)
+			clearRuntimeTeardownState(&vm)
+			d.vm.clearVMHandles(vm)
 			vm.UpdatedAt = model.Now()
 			return d.store.UpsertVM(ctx, vm)
 		}); err != nil {
 			return op.fail(err, "vm_id", vm.ID)
 		}
 	}
-	if err := d.rebuildDNS(ctx); err != nil {
+	if err := d.vm.rebuildDNS(ctx); err != nil {
 		return op.fail(err)
 	}
 	op.done()
 	return nil
 }
 
+// FindVM stays on Daemon as a thin forwarder to the VM service lookup.
+// Dispatch code reads the facade directly; tests that pre-date the
+// service split keep compiling.
 func (d *Daemon) FindVM(ctx context.Context, idOrName string) (model.VMRecord, error) {
-	if idOrName == "" {
-		return model.VMRecord{}, errors.New("vm id or name is required")
-	}
-	if vm, err := d.store.GetVM(ctx, idOrName); err == nil {
-		return vm, nil
-	}
-	vms, err := d.store.ListVMs(ctx)
-	if err != nil {
-		return model.VMRecord{}, err
-	}
-	matchCount := 0
-	var match model.VMRecord
-	for _, vm := range vms {
-		if strings.HasPrefix(vm.ID, idOrName) || strings.HasPrefix(vm.Name, idOrName) {
-			match = vm
-			matchCount++
-		}
-	}
-	if matchCount == 1 {
-		return match, nil
-	}
-	if matchCount > 1 {
-		return model.VMRecord{}, fmt.Errorf("multiple VMs match %q", idOrName)
-	}
-	return model.VMRecord{}, fmt.Errorf("vm %q not found", idOrName)
+	return d.vm.FindVM(ctx, idOrName)
 }
 
+// FindImage stays on Daemon as a thin forwarder to the image service
+// lookup so callers reading dispatch code see the obvious facade, and
+// tests that pre-date the service split still compile.
 func (d *Daemon) FindImage(ctx context.Context, idOrName string) (model.Image, error) {
-	if idOrName == "" {
-		return model.Image{}, errors.New("image id or name is required")
-	}
-	if image, err := d.store.GetImageByName(ctx, idOrName); err == nil {
-		return image, nil
-	}
-	if image, err := d.store.GetImageByID(ctx, idOrName); err == nil {
-		return image, nil
-	}
-	images, err := d.store.ListImages(ctx)
-	if err != nil {
-		return model.Image{}, err
-	}
-	matchCount := 0
-	var match model.Image
-	for _, image := range images {
-		if strings.HasPrefix(image.ID, idOrName) || strings.HasPrefix(image.Name, idOrName) {
-			match = image
-			matchCount++
-		}
-	}
-	if matchCount == 1 {
-		return match, nil
-	}
-	if matchCount > 1 {
-		return model.Image{}, fmt.Errorf("multiple images match %q", idOrName)
-	}
-	return model.Image{}, fmt.Errorf("image %q not found", idOrName)
+	return d.img.FindImage(ctx, idOrName)
 }
 
 func (d *Daemon) TouchVM(ctx context.Context, idOrName string) (model.VMRecord, error) {
-	return d.withVMLockByRef(ctx, idOrName, func(vm model.VMRecord) (model.VMRecord, error) {
-		system.TouchNow(&vm)
-		if err := d.store.UpsertVM(ctx, vm); err != nil {
-			return model.VMRecord{}, err
+	return d.vm.TouchVM(ctx, idOrName)
+}
+
+// wireServices populates the four focused services and their peer
+// references from the infrastructure already on d (runner, logger,
+// config, layout, store, closing, plus the SSH-client test seams).
+// Idempotent: each service is skipped if the field is already non-nil,
+// so tests can preinstall stubs for the services they want to fake and
+// let wireServices fill the rest. The peer-service closures on
+// WorkspaceService capture d rather than a direct *VMService pointer so
+// the ws↔vm construction order doesn't recurse: the closures read d.vm
+// at call time, by which point it is populated.
+func wireServices(d *Daemon) {
+	if d.priv == nil {
+		clientUID, clientGID := d.clientUID, d.clientGID
+		if clientUID == 0 && clientGID == 0 {
+			clientUID, clientGID = -1, -1
 		}
-		return vm, nil
-	})
-}
-
-func (d *Daemon) withVMLockByRef(ctx context.Context, idOrName string, fn func(model.VMRecord) (model.VMRecord, error)) (model.VMRecord, error) {
-	vm, err := d.FindVM(ctx, idOrName)
-	if err != nil {
-		return model.VMRecord{}, err
+		d.priv = newLocalPrivilegedOps(d.runner, d.logger, d.config, d.layout, clientUID, clientGID)
 	}
-	return d.withVMLockByID(ctx, vm.ID, fn)
-}
-
-func (d *Daemon) withVMLockByID(ctx context.Context, id string, fn func(model.VMRecord) (model.VMRecord, error)) (model.VMRecord, error) {
-	if strings.TrimSpace(id) == "" {
-		return model.VMRecord{}, errors.New("vm id is required")
+	if d.net == nil {
+		d.net = newHostNetwork(hostNetworkDeps{
+			runner:  d.runner,
+			logger:  d.logger,
+			config:  d.config,
+			layout:  d.layout,
+			closing: d.closing,
+			priv:    d.priv,
+		})
 	}
-	unlock := d.lockVMID(id)
-	defer unlock()
-
-	vm, err := d.store.GetVMByID(ctx, id)
-	if err != nil {
-		if errors.Is(err, sql.ErrNoRows) {
-			return model.VMRecord{}, fmt.Errorf("vm %q not found", id)
-		}
-		return model.VMRecord{}, err
+	if d.img == nil {
+		d.img = newImageService(imageServiceDeps{
+			runner:         d.runner,
+			logger:         d.logger,
+			config:         d.config,
+			layout:         d.layout,
+			store:          d.store,
+			beginOperation: d.beginOperation,
+		})
 	}
-	return fn(vm)
-}
-
-func (d *Daemon) withVMLockByIDErr(ctx context.Context, id string, fn func(model.VMRecord) error) error {
-	_, err := d.withVMLockByID(ctx, id, func(vm model.VMRecord) (model.VMRecord, error) {
-		if err := fn(vm); err != nil {
-			return model.VMRecord{}, err
-		}
-		return vm, nil
-	})
-	return err
-}
-
-func (d *Daemon) lockVMID(id string) func() {
-	d.vmLocksMu.Lock()
-	if d.vmLocks == nil {
-		d.vmLocks = make(map[string]*sync.Mutex)
+	if d.ws == nil {
+		d.ws = newWorkspaceService(workspaceServiceDeps{
+			runner:        d.runner,
+			logger:        d.logger,
+			config:        d.config,
+			layout:        d.layout,
+			store:         d.store,
+			repoInspector: ws.NewInspector(),
+			vmResolver: func(ctx context.Context, idOrName string) (model.VMRecord, error) {
+				return d.vm.FindVM(ctx, idOrName)
+			},
+			aliveChecker: func(vm model.VMRecord) bool {
+				return d.vm.vmAlive(vm)
+			},
+			waitGuestSSH: d.waitForGuestSSH,
+			dialGuest:    d.dialGuest,
+			imageResolver: func(ctx context.Context, idOrName string) (model.Image, error) {
+				return d.FindImage(ctx, idOrName)
+			},
+			imageWorkSeed: func(ctx context.Context, image model.Image, fingerprint string) error {
+				return d.img.refreshManagedWorkSeedFingerprint(ctx, image, fingerprint)
+			},
+			withVMLockByRef: func(ctx context.Context, idOrName string, fn func(model.VMRecord) (model.VMRecord, error)) (model.VMRecord, error) {
+				return d.vm.withVMLockByRef(ctx, idOrName, fn)
+			},
+			beginOperation: d.beginOperation,
+		})
 	}
-	lock, ok := d.vmLocks[id]
-	if !ok {
-		lock = &sync.Mutex{}
-		d.vmLocks[id] = lock
+	if d.vm == nil {
+		d.vm = newVMService(vmServiceDeps{
+			runner:          d.runner,
+			logger:          d.logger,
+			config:          d.config,
+			layout:          d.layout,
+			store:           d.store,
+			net:             d.net,
+			img:             d.img,
+			ws:              d.ws,
+			priv:            d.priv,
+			capHooks:        d.buildCapabilityHooks(),
+			beginOperation:  d.beginOperation,
+			vsockHostDevice: defaultVsockHostDevice,
+		})
+	}
+	if d.stats == nil {
+		// Closures capture d rather than d.vm directly, so they re-read
+		// d.vm at call time. Wire order (d.vm constructed above) makes
+		// the closures safe, but this pattern also protects against a
+		// future test that swaps d.vm after initial wire.
+		d.stats = newStatsService(statsServiceDeps{
+			runner:         d.runner,
+			logger:         d.logger,
+			config:         d.config,
+			store:          d.store,
+			net:            d.net,
+			beginOperation: d.beginOperation,
+			vmAlive:        func(vm model.VMRecord) bool { return d.vm.vmAlive(vm) },
+			vmHandles:      func(id string) model.VMHandles { return d.vm.vmHandles(id) },
+			withVMLockByRef: func(ctx context.Context, idOrName string, fn func(model.VMRecord) (model.VMRecord, error)) (model.VMRecord, error) {
+				return d.vm.withVMLockByRef(ctx, idOrName, fn)
+			},
+			withVMLockByIDErr: func(ctx context.Context, id string, fn func(model.VMRecord) error) error {
+				return d.vm.withVMLockByIDErr(ctx, id, fn)
+			},
+			cleanupRuntime: func(ctx context.Context, vm model.VMRecord, preserve bool) error {
+				return d.vm.cleanupRuntime(ctx, vm, preserve)
+			},
+		})
+	}
+	if len(d.vmCaps) == 0 {
+		d.vmCaps = d.defaultCapabilities()
 	}
-	d.vmLocksMu.Unlock()
-
-	lock.Lock()
-	return lock.Unlock
 }
 
 func marshalResultOrError(v any, err error) rpc.Response {
diff --git a/internal/daemon/daemon_test.go b/internal/daemon/daemon_test.go
index dc43c59..7b19cb6 100644
--- a/internal/daemon/daemon_test.go
+++ b/internal/daemon/daemon_test.go
@@ -2,27 +2,82 @@ package daemon
 
 import (
 	"context"
+	"encoding/json"
+	"errors"
+	"io"
+	"log/slog"
+	"net"
 	"os"
 	"path/filepath"
 	"strings"
+	"syscall"
 	"testing"
+	"time"
 
 	"banger/internal/api"
+	"banger/internal/buildinfo"
 	"banger/internal/model"
 	"banger/internal/paths"
+	"banger/internal/rpc"
 	"banger/internal/system"
 )
 
-func TestBuildImageRequiresFromImage(t *testing.T) {
-	d := &Daemon{
-		layout: paths.Layout{ImagesDir: t.TempDir(), StateDir: t.TempDir()},
-		store:  openDaemonStore(t),
-		runner: system.NewRunner(),
+// TestAuthorizeConnRejectsNonUnixConn pins the type guard at the top
+// of authorizeConn: SO_PEERCRED only makes sense on a unix socket, so
+// anything else must be refused outright. net.Pipe gives us a
+// connection that satisfies net.Conn but isn't a *net.UnixConn, which
+// is exactly the shape we need to exercise the early-return.
+func TestAuthorizeConnRejectsNonUnixConn(t *testing.T) {
+	d := &Daemon{}
+	pipeA, pipeB := net.Pipe()
+	defer pipeA.Close()
+	defer pipeB.Close()
+	if err := d.authorizeConn(pipeA); err == nil {
+		t.Fatal("authorizeConn(pipe) succeeded, want error")
 	}
+}
 
-	_, err := d.BuildImage(context.Background(), api.ImageBuildParams{Name: "missing-base"})
-	if err == nil || !strings.Contains(err.Error(), "from-image is required") {
-		t.Fatalf("BuildImage() error = %v", err)
+// TestAuthorizeConnAcceptsOwnerUIDOverUnixSocket pins the happy path:
+// when the test process connects to a freshly bound unix socket as
+// itself, the daemon's peer-cred check matches d.clientUID and lets
+// the connection through.
+func TestAuthorizeConnAcceptsOwnerUIDOverUnixSocket(t *testing.T) { + dir := t.TempDir() + sockPath := filepath.Join(dir, "test.sock") + listener, err := net.Listen("unix", sockPath) + if err != nil { + t.Fatalf("listen: %v", err) + } + defer listener.Close() + + type result struct { + err error + } + got := make(chan result, 1) + go func() { + conn, err := listener.Accept() + if err != nil { + got <- result{err: err} + return + } + defer conn.Close() + d := &Daemon{clientUID: os.Getuid()} + got <- result{err: d.authorizeConn(conn)} + }() + + client, err := net.Dial("unix", sockPath) + if err != nil { + t.Fatalf("dial: %v", err) + } + defer client.Close() + + select { + case r := <-got: + if r.err != nil { + t.Fatalf("authorizeConn(unix self) = %v, want nil", r.err) + } + case <-time.After(2 * time.Second): + t.Fatal("authorizeConn never returned") } } @@ -32,8 +87,9 @@ func TestRegisterImageRequiresKernel(t *testing.T) { t.Fatalf("write rootfs: %v", err) } d := &Daemon{store: openDaemonStore(t)} + wireServices(d) - _, err := d.RegisterImage(context.Background(), api.ImageRegisterParams{ + _, err := d.img.RegisterImage(context.Background(), api.ImageRegisterParams{ Name: "missing-kernel", RootfsPath: rootfs, }) @@ -42,6 +98,98 @@ func TestRegisterImageRequiresKernel(t *testing.T) { } } +func TestDispatchPingIncludesBuildInfo(t *testing.T) { + d := &Daemon{pid: 42} + wireServices(d) + + resp := d.dispatch(context.Background(), rpc.Request{Version: rpc.Version, Method: "ping"}) + if !resp.OK { + t.Fatalf("dispatch(ping) = %+v, want ok", resp) + } + + var got api.PingResult + if err := json.Unmarshal(resp.Result, &got); err != nil { + t.Fatalf("Unmarshal(PingResult): %v", err) + } + + info := buildinfo.Current() + if got.Status != "ok" || got.PID != 42 { + t.Fatalf("PingResult = %+v, want status/pid populated", got) + } + if got.Version != info.Version || got.Commit != info.Commit || got.BuiltAt != info.BuiltAt { + t.Fatalf("PingResult build info = %+v, want %+v", 
got, info) + } +} + +func TestServeReturnsOnContextCancel(t *testing.T) { + dir := t.TempDir() + runtimeDir := filepath.Join(dir, "runtime") + if err := os.MkdirAll(runtimeDir, 0o755); err != nil { + t.Fatalf("MkdirAll runtime: %v", err) + } + socketPath := filepath.Join(runtimeDir, "bangerd.sock") + probe, err := net.Listen("unix", filepath.Join(runtimeDir, "probe.sock")) + if err != nil { + if errors.Is(err, syscall.EPERM) || strings.Contains(err.Error(), "operation not permitted") { + t.Skipf("unix socket listen blocked in this environment: %v", err) + } + t.Fatalf("probe listen: %v", err) + } + _ = probe.Close() + _ = os.Remove(filepath.Join(runtimeDir, "probe.sock")) + d := &Daemon{ + layout: paths.Layout{ + RuntimeDir: runtimeDir, + SocketPath: socketPath, + }, + config: model.DaemonConfig{ + StatsPollInterval: time.Hour, + }, + store: openDaemonStore(t), + runner: system.NewRunner(), + logger: slog.New(slog.NewTextHandler(io.Discard, nil)), + closing: make(chan struct{}), + clientUID: -1, + clientGID: -1, + } + wireServices(d) + + ctx, cancel := context.WithCancel(context.Background()) + defer cancel() + + serveErr := make(chan error, 1) + go func() { + serveErr <- d.Serve(ctx) + }() + + deadline := time.Now().Add(2 * time.Second) + for { + if _, err := os.Stat(socketPath); err == nil { + break + } + select { + case err := <-serveErr: + t.Fatalf("Serve() returned before socket was ready: %v", err) + default: + } + if time.Now().After(deadline) { + t.Fatalf("socket %s not created before deadline", socketPath) + } + time.Sleep(25 * time.Millisecond) + } + + cancel() + + select { + case err := <-serveErr: + if err != nil { + t.Fatalf("Serve() error = %v, want nil on context cancel", err) + } + case <-time.After(2 * time.Second): + t.Fatal("Serve() did not return after context cancel") + } +} + func TestPromoteImageCopiesBootArtifactsIntoArtifactDir(t *testing.T) { dir := t.TempDir() rootfs := filepath.Join(dir, "rootfs.ext4") @@ -65,7 +213,7 @@ func 
TestPromoteImageCopiesBootArtifactsIntoArtifactDir(t *testing.T) { db := openDaemonStore(t) image := model.Image{ ID: "img-promote", - Name: "void-exp", + Name: "void", Managed: false, RootfsPath: rootfs, KernelPath: kernel, @@ -88,7 +236,8 @@ func TestPromoteImageCopiesBootArtifactsIntoArtifactDir(t *testing.T) { store: db, runner: system.NewRunner(), } - got, err := d.PromoteImage(context.Background(), image.Name) + wireServices(d) + got, err := d.img.PromoteImage(context.Background(), image.Name) if err != nil { t.Fatalf("PromoteImage: %v", err) } diff --git a/internal/daemon/daemon_testing_test.go b/internal/daemon/daemon_testing_test.go new file mode 100644 index 0000000..00d7944 --- /dev/null +++ b/internal/daemon/daemon_testing_test.go @@ -0,0 +1,241 @@ +package daemon + +import ( + "bytes" + "io" + "log/slog" + "path/filepath" + "testing" + + "banger/internal/model" + "banger/internal/paths" + "banger/internal/store" + "banger/internal/system" +) + +// testDaemonOpts collects everything newTestDaemon knows how to +// override. Nothing is exported: the zero value is "sensible defaults", +// tests pick overrides by option function. +type testDaemonOpts struct { + runner system.CommandRunner + config *model.DaemonConfig + store *store.Store + logger *slog.Logger + layout *paths.Layout + vmCaps []vmCapability + vmCapsSet bool + vsockHostDevice string +} + +// testDaemonOption applies a single override to testDaemonOpts. Pass +// any combination to newTestDaemon; later options win on conflict. +type testDaemonOption func(*testDaemonOpts) + +// withRunner sets the system.CommandRunner used by HostNetwork, +// ImageService, WorkspaceService, and VMService. Most tests want +// permissiveRunner or scriptedRunner; the default is a permissive +// runner that returns empty output with no error. +func withRunner(r system.CommandRunner) testDaemonOption { + return func(o *testDaemonOpts) { o.runner = r } +} + +// withConfig replaces the DaemonConfig. 
Useful for exercising config- +// dependent code paths (bridge name, firecracker binary path, +// default image name, etc.) without going through config.Load. +func withConfig(cfg model.DaemonConfig) testDaemonOption { + return func(o *testDaemonOpts) { o.config = &cfg } +} + +// withStore reuses an externally-opened store instead of opening a +// fresh tempdir DB. Useful when the test needs to pre-seed rows +// before the daemon is wired. +func withStore(st *store.Store) testDaemonOption { + return func(o *testDaemonOpts) { o.store = st } +} + +// withLogger routes daemon logs somewhere specific. Default is +// io.Discard so a passing test run stays quiet; failing tests that +// want structured log content can pass their own buffer-backed slog. +func withLogger(l *slog.Logger) testDaemonOption { + return func(o *testDaemonOpts) { o.logger = l } +} + +// withLayout overrides the paths.Layout. Defaults build all dirs +// under t.TempDir() so tests don't interfere with each other and +// don't write into the user's real ~/.local/state/banger. +func withLayout(layout paths.Layout) testDaemonOption { + return func(o *testDaemonOpts) { o.layout = &layout } +} + +// withVMCaps installs a specific capability list on the daemon. +// Default is an empty slice, which means wireServices skips the +// built-in workDisk/dns/nat capabilities — most harness tests don't +// want those firing real side-effects. Pass capability fakes to +// exercise dispatch paths. +func withVMCaps(caps ...vmCapability) testDaemonOption { + return func(o *testDaemonOpts) { + o.vmCaps = caps + o.vmCapsSet = true + } +} + +// withVsockHostDevice overrides the /dev/vhost-vsock path VMService +// checks during preflight. Useful for tests that need RequireFile to +// succeed against a tempfile without root access to the real device. 
+func withVsockHostDevice(path string) testDaemonOption { + return func(o *testDaemonOpts) { o.vsockHostDevice = path } +} + +// newTestDaemon builds a wired *Daemon backed by tempdir state, +// ready for tests that drive service methods or dispatch logic. +// All infrastructure comes from either t.TempDir() or the +// provided overrides; nothing touches the invoking user's real +// state. +// +// What the harness gives you by default: +// +// - paths.Layout rooted at t.TempDir() (distinct StateDir, +// ConfigDir, CacheDir, VMsDir, ImagesDir, KernelsDir, SSHDir, +// KnownHostsPath) +// - fresh store.Store opened against a tempdir state.db with all +// migrations run, auto-closed on t.Cleanup +// - permissiveRunner returning empty output + no error for every +// Run/RunSudo call (override with scriptedRunner or any other +// system.CommandRunner when you need assertion-style scripting) +// - io.Discard logger (quiet tests) +// - empty vmCaps (so default capability side-effects don't fire) +// - defaultVsockHostDevice on VMService (tests that need this to +// resolve via RequireFile should pass withVsockHostDevice to a +// tempfile) +// +// Returns the wired *Daemon. Every service pointer is non-nil; +// d.store is non-nil; d.vmCaps is exactly what the test asked for. 
+func newTestDaemon(t *testing.T, opts ...testDaemonOption) *Daemon { + t.Helper() + applied := testDaemonOpts{} + for _, opt := range opts { + opt(&applied) + } + + layout := applied.layout + if layout == nil { + dir := t.TempDir() + layout = &paths.Layout{ + StateDir: filepath.Join(dir, "state"), + ConfigDir: filepath.Join(dir, "config"), + CacheDir: filepath.Join(dir, "cache"), + VMsDir: filepath.Join(dir, "state", "vms"), + ImagesDir: filepath.Join(dir, "state", "images"), + KernelsDir: filepath.Join(dir, "state", "kernels"), + SSHDir: filepath.Join(dir, "state", "ssh"), + KnownHostsPath: filepath.Join(dir, "state", "ssh", "known_hosts"), + DBPath: filepath.Join(dir, "state", "state.db"), + SocketPath: filepath.Join(dir, "state", "banger.sock"), + RuntimeDir: filepath.Join(dir, "runtime"), + } + } + + st := applied.store + if st == nil { + st = openDaemonStore(t) + } + + runner := applied.runner + if runner == nil { + runner = &permissiveRunner{} + } + + logger := applied.logger + if logger == nil { + logger = slog.New(slog.NewTextHandler(io.Discard, nil)) + } + + cfg := model.DaemonConfig{ + StatsPollInterval: model.DefaultStatsPollInterval, + BridgeName: model.DefaultBridgeName, + BridgeIP: model.DefaultBridgeIP, + CIDR: model.DefaultCIDR, + DefaultDNS: model.DefaultDNS, + } + if applied.config != nil { + cfg = *applied.config + } + + d := &Daemon{ + layout: *layout, + config: cfg, + store: st, + runner: runner, + logger: logger, + vmCaps: applied.vmCaps, + } + wireServices(d) + // wireServices fills in the default workDisk/dns/nat capability + // list when vmCaps is empty at call time — that's the production + // path. Harness callers who didn't opt in to capabilities via + // withVMCaps explicitly want them OFF so their test doesn't + // accidentally fire real NAT rules or a DNS publish. Reset to + // nil here; withVMCaps sets vmCapsSet to skip this reset. 
+ if !applied.vmCapsSet { + d.vmCaps = nil + } + if applied.vsockHostDevice != "" { + d.vm.vsockHostDevice = applied.vsockHostDevice + } + return d +} + +// TestNewTestDaemonDefaults pins the contract new callers rely on: +// a zero-option call returns a fully-wired daemon with every service +// pointer populated, a writable tempdir-backed store, and an empty +// capability list (so nothing fires real side-effects). If any of +// those invariants drift, every test that switches to newTestDaemon +// will silently start exercising different behaviour. +func TestNewTestDaemonDefaults(t *testing.T) { + d := newTestDaemon(t) + + if d.net == nil || d.img == nil || d.ws == nil || d.vm == nil { + t.Fatalf("wireServices left a service nil: net=%v img=%v ws=%v vm=%v", + d.net != nil, d.img != nil, d.ws != nil, d.vm != nil) + } + if d.store == nil { + t.Fatal("store is nil; harness must provide a working store") + } + if len(d.vmCaps) != 0 { + t.Fatalf("vmCaps = %d, want 0 (harness default must not fire real capabilities)", len(d.vmCaps)) + } + if d.vm.vsockHostDevice != defaultVsockHostDevice { + t.Fatalf("vsockHostDevice = %q, want default %q", d.vm.vsockHostDevice, defaultVsockHostDevice) + } +} + +// TestNewTestDaemonOptionsOverride verifies the option functions +// actually land on the resulting Daemon. Guard against a silent +// rename breaking option plumbing. 
+func TestNewTestDaemonOptionsOverride(t *testing.T) { + var buf bytes.Buffer + customLogger := slog.New(slog.NewTextHandler(&buf, nil)) + customRunner := &countingRunner{} + customVsock := filepath.Join(t.TempDir(), "vhost-vsock") + customCap := testCapability{name: "marker"} + + d := newTestDaemon(t, + withLogger(customLogger), + withRunner(customRunner), + withVsockHostDevice(customVsock), + withVMCaps(customCap), + ) + + if d.logger != customLogger { + t.Error("withLogger: logger not overridden") + } + if d.runner != customRunner { + t.Error("withRunner: runner not overridden") + } + if d.vm.vsockHostDevice != customVsock { + t.Errorf("withVsockHostDevice: got %q, want %q", d.vm.vsockHostDevice, customVsock) + } + if len(d.vmCaps) != 1 || d.vmCaps[0].Name() != "marker" { + t.Errorf("withVMCaps: vmCaps = %v, want one 'marker' cap", d.vmCaps) + } +} diff --git a/internal/daemon/dashboard.go b/internal/daemon/dashboard.go deleted file mode 100644 index b0953b5..0000000 --- a/internal/daemon/dashboard.go +++ /dev/null @@ -1,63 +0,0 @@ -package daemon - -import ( - "context" - - "banger/internal/api" - "banger/internal/model" - "banger/internal/system" -) - -func (d *Daemon) DashboardSummary(ctx context.Context) (api.DashboardSummary, error) { - summary := api.DashboardSummary{ - GeneratedAt: model.Now(), - Sudo: api.SudoStatus{ - Command: "sudo -v", - }, - } - if err := system.CheckSudo(ctx); err != nil { - summary.Sudo.Error = err.Error() - } else { - summary.Sudo.Available = true - } - - if host, err := system.ReadHostResources(); err == nil { - summary.Host.CPUCount = host.CPUCount - summary.Host.TotalMemoryBytes = host.TotalMemoryBytes - } - if usage, err := system.ReadFilesystemUsage(d.layout.StateDir); err == nil { - summary.Host.StateFilesystemTotalBytes = usage.TotalBytes - summary.Host.StateFilesystemFreeBytes = usage.FreeBytes - } - - images, err := d.store.ListImages(ctx) - if err != nil { - return api.DashboardSummary{}, err - } - for _, image := range 
images { - summary.Banger.ImageCount++ - if image.Managed { - summary.Banger.ManagedImageCount++ - } - } - - vms, err := d.store.ListVMs(ctx) - if err != nil { - return api.DashboardSummary{}, err - } - for _, vm := range vms { - summary.Banger.VMCount++ - summary.Banger.ConfiguredVCPUCount += vm.Spec.VCPUCount - summary.Banger.ConfiguredMemoryBytes += int64(vm.Spec.MemoryMiB) * 1024 * 1024 - summary.Banger.ConfiguredDiskBytes += vm.Spec.WorkDiskSizeBytes - summary.Banger.UsedSystemOverlayBytes += vm.Stats.SystemOverlayBytes - summary.Banger.UsedWorkDiskBytes += vm.Stats.WorkDiskBytes - if vm.State == model.VMStateRunning && system.ProcessRunning(vm.Runtime.PID, vm.Runtime.APISockPath) { - summary.Banger.RunningVMCount++ - summary.Banger.RunningCPUPercent += vm.Stats.CPUPercent - summary.Banger.RunningRSSBytes += vm.Stats.RSSBytes - summary.Banger.RunningVSZBytes += vm.Stats.VSZBytes - } - } - return summary, nil -} diff --git a/internal/daemon/dispatch.go b/internal/daemon/dispatch.go new file mode 100644 index 0000000..20886d5 --- /dev/null +++ b/internal/daemon/dispatch.go @@ -0,0 +1,309 @@ +package daemon + +import ( + "context" + "fmt" + + "banger/internal/api" + "banger/internal/buildinfo" + "banger/internal/rpc" +) + +// handler is the signature every RPC method dispatches through. Keeps +// Daemon.dispatch a one-liner — lookup + invoke — instead of the old +// ~240-line `switch`. Handlers close over a `*Daemon` parameter at +// call time (passed by the driver) rather than baked into the map, +// so tests that stand up a *Daemon with custom wiring re-use the +// same table without re-registering anything. +type handler func(ctx context.Context, d *Daemon, req rpc.Request) rpc.Response + +// paramHandler wraps the common "decode params of type P, call +// service returning (R, error), wrap R" flow that 28 of 34 methods +// follow. Compile-time type-safe — no reflection. 
P and R are +// deduced from the function literal passed in, so per-handler +// registration reads as "what's the RPC shape + what's the service +// call" and nothing else. +func paramHandler[P any, R any](call func(ctx context.Context, d *Daemon, p P) (R, error)) handler { + return func(ctx context.Context, d *Daemon, req rpc.Request) rpc.Response { + p, err := rpc.DecodeParams[P](req) + if err != nil { + return rpc.NewError("bad_request", err.Error()) + } + result, err := call(ctx, d, p) + return marshalResultOrError(result, err) + } +} + +// noParamHandler is the decode-free variant for RPC methods that +// take no params (ping, shutdown, *.list, kernel.catalog). +func noParamHandler[R any](call func(ctx context.Context, d *Daemon) (R, error)) handler { + return func(ctx context.Context, d *Daemon, _ rpc.Request) rpc.Response { + result, err := call(ctx, d) + return marshalResultOrError(result, err) + } +} + +// rpcHandlers maps every supported method name to its handler. Adding +// or removing a method is a single-line diff here — unlike the old +// switch, there's no four-line decode/call/wrap boilerplate to copy. +// The four special-case handlers (vm.logs, vm.ssh, ping, shutdown) +// live below the map; they need pre-service validation or raw result +// encoding that the generic wrapper can't express. 
+var rpcHandlers = map[string]handler{ + "ping": pingHandler, + "shutdown": shutdownHandler, + "daemon.operations.list": noParamHandler(daemonOperationsListDispatch), + + "vm.create": paramHandler(vmCreateDispatch), + "vm.create.begin": paramHandler(vmCreateBeginDispatch), + "vm.create.status": paramHandler(vmCreateStatusDispatch), + "vm.create.cancel": paramHandler(vmCreateCancelDispatch), + "vm.list": noParamHandler(vmListDispatch), + "vm.show": paramHandler(vmShowDispatch), + "vm.start": paramHandler(vmStartDispatch), + "vm.stop": paramHandler(vmStopDispatch), + "vm.kill": paramHandler(vmKillDispatch), + "vm.restart": paramHandler(vmRestartDispatch), + "vm.delete": paramHandler(vmDeleteDispatch), + "vm.set": paramHandler(vmSetDispatch), + "vm.stats": paramHandler(vmStatsDispatch), + "vm.logs": vmLogsHandler, + "vm.ssh": vmSSHHandler, + "vm.health": paramHandler(vmHealthDispatch), + "vm.ping": paramHandler(vmPingDispatch), + "vm.ports": paramHandler(vmPortsDispatch), + + "vm.workspace.prepare": paramHandler(workspacePrepareDispatch), + "vm.workspace.export": paramHandler(workspaceExportDispatch), + + "image.list": noParamHandler(imageListDispatch), + "image.show": paramHandler(imageShowDispatch), + "image.register": paramHandler(imageRegisterDispatch), + "image.promote": paramHandler(imagePromoteDispatch), + "image.delete": paramHandler(imageDeleteDispatch), + "image.pull": paramHandler(imagePullDispatch), + "image.cache.prune": paramHandler(imageCachePruneDispatch), + + "kernel.list": noParamHandler(kernelListDispatch), + "kernel.show": paramHandler(kernelShowDispatch), + "kernel.delete": paramHandler(kernelDeleteDispatch), + "kernel.import": paramHandler(kernelImportDispatch), + "kernel.pull": paramHandler(kernelPullDispatch), + "kernel.catalog": noParamHandler(kernelCatalogDispatch), +} + +// ---- Service-call adapters (kept thin; the interesting shape is up +// ---- in the `paramHandler` generic. 
These exist so the map entries +// ---- stay readable at a glance.) + +func vmCreateDispatch(ctx context.Context, d *Daemon, p api.VMCreateParams) (api.VMShowResult, error) { + vm, err := d.vm.CreateVM(ctx, p) + return api.VMShowResult{VM: vm}, err +} + +func vmCreateBeginDispatch(ctx context.Context, d *Daemon, p api.VMCreateParams) (api.VMCreateBeginResult, error) { + op, err := d.vm.BeginVMCreate(ctx, p) + return api.VMCreateBeginResult{Operation: op}, err +} + +func vmCreateStatusDispatch(ctx context.Context, d *Daemon, p api.VMCreateStatusParams) (api.VMCreateStatusResult, error) { + op, err := d.vm.VMCreateStatus(ctx, p.ID) + return api.VMCreateStatusResult{Operation: op}, err +} + +func vmCreateCancelDispatch(ctx context.Context, d *Daemon, p api.VMCreateStatusParams) (api.Empty, error) { + return api.Empty{}, d.vm.CancelVMCreate(ctx, p.ID) +} + +func vmListDispatch(ctx context.Context, d *Daemon) (api.VMListResult, error) { + vms, err := d.store.ListVMs(ctx) + return api.VMListResult{VMs: vms}, err +} + +func vmShowDispatch(ctx context.Context, d *Daemon, p api.VMRefParams) (api.VMShowResult, error) { + vm, err := d.vm.FindVM(ctx, p.IDOrName) + return api.VMShowResult{VM: vm}, err +} + +func vmStartDispatch(ctx context.Context, d *Daemon, p api.VMRefParams) (api.VMShowResult, error) { + vm, err := d.vm.StartVM(ctx, p.IDOrName) + return api.VMShowResult{VM: vm}, err +} + +func vmStopDispatch(ctx context.Context, d *Daemon, p api.VMRefParams) (api.VMShowResult, error) { + vm, err := d.vm.StopVM(ctx, p.IDOrName) + return api.VMShowResult{VM: vm}, err +} + +func vmKillDispatch(ctx context.Context, d *Daemon, p api.VMKillParams) (api.VMShowResult, error) { + vm, err := d.vm.KillVM(ctx, p) + return api.VMShowResult{VM: vm}, err +} + +func vmRestartDispatch(ctx context.Context, d *Daemon, p api.VMRefParams) (api.VMShowResult, error) { + vm, err := d.vm.RestartVM(ctx, p.IDOrName) + return api.VMShowResult{VM: vm}, err +} + +func vmDeleteDispatch(ctx 
context.Context, d *Daemon, p api.VMRefParams) (api.VMShowResult, error) { + vm, err := d.vm.DeleteVM(ctx, p.IDOrName) + return api.VMShowResult{VM: vm}, err +} + +func vmSetDispatch(ctx context.Context, d *Daemon, p api.VMSetParams) (api.VMShowResult, error) { + vm, err := d.vm.SetVM(ctx, p) + return api.VMShowResult{VM: vm}, err +} + +func vmStatsDispatch(ctx context.Context, d *Daemon, p api.VMRefParams) (api.VMStatsResult, error) { + vm, stats, err := d.stats.GetVMStats(ctx, p.IDOrName) + return api.VMStatsResult{VM: vm, Stats: stats}, err +} + +func vmHealthDispatch(ctx context.Context, d *Daemon, p api.VMRefParams) (api.VMHealthResult, error) { + return d.stats.HealthVM(ctx, p.IDOrName) +} + +func vmPingDispatch(ctx context.Context, d *Daemon, p api.VMRefParams) (api.VMPingResult, error) { + return d.stats.PingVM(ctx, p.IDOrName) +} + +func vmPortsDispatch(ctx context.Context, d *Daemon, p api.VMRefParams) (api.VMPortsResult, error) { + return d.stats.PortsVM(ctx, p.IDOrName) +} + +func workspacePrepareDispatch(ctx context.Context, d *Daemon, p api.VMWorkspacePrepareParams) (api.VMWorkspacePrepareResult, error) { + ws, err := d.ws.PrepareVMWorkspace(ctx, p) + return api.VMWorkspacePrepareResult{Workspace: ws}, err +} + +func workspaceExportDispatch(ctx context.Context, d *Daemon, p api.WorkspaceExportParams) (api.WorkspaceExportResult, error) { + return d.ws.ExportVMWorkspace(ctx, p) +} + +func imageListDispatch(ctx context.Context, d *Daemon) (api.ImageListResult, error) { + images, err := d.store.ListImages(ctx) + return api.ImageListResult{Images: images}, err +} + +func imageShowDispatch(ctx context.Context, d *Daemon, p api.ImageRefParams) (api.ImageShowResult, error) { + image, err := d.img.FindImage(ctx, p.IDOrName) + return api.ImageShowResult{Image: image}, err +} + +func imageRegisterDispatch(ctx context.Context, d *Daemon, p api.ImageRegisterParams) (api.ImageShowResult, error) { + image, err := d.img.RegisterImage(ctx, p) + return 
api.ImageShowResult{Image: image}, err +} + +func imagePromoteDispatch(ctx context.Context, d *Daemon, p api.ImageRefParams) (api.ImageShowResult, error) { + image, err := d.img.PromoteImage(ctx, p.IDOrName) + return api.ImageShowResult{Image: image}, err +} + +func imageDeleteDispatch(ctx context.Context, d *Daemon, p api.ImageRefParams) (api.ImageShowResult, error) { + image, err := d.img.DeleteImage(ctx, p.IDOrName) + return api.ImageShowResult{Image: image}, err +} + +func imagePullDispatch(ctx context.Context, d *Daemon, p api.ImagePullParams) (api.ImageShowResult, error) { + image, err := d.img.PullImage(ctx, p) + return api.ImageShowResult{Image: image}, err +} + +func imageCachePruneDispatch(ctx context.Context, d *Daemon, p api.ImageCachePruneParams) (api.ImageCachePruneResult, error) { + return d.img.PruneOCICache(ctx, p) +} + +func daemonOperationsListDispatch(ctx context.Context, d *Daemon) (api.OperationsListResult, error) { + return d.ListOperations(ctx) +} + +func kernelListDispatch(ctx context.Context, d *Daemon) (api.KernelListResult, error) { + return d.img.KernelList(ctx) +} + +func kernelShowDispatch(ctx context.Context, d *Daemon, p api.KernelRefParams) (api.KernelShowResult, error) { + entry, err := d.img.KernelShow(ctx, p.Name) + return api.KernelShowResult{Entry: entry}, err +} + +func kernelDeleteDispatch(ctx context.Context, d *Daemon, p api.KernelRefParams) (api.Empty, error) { + return api.Empty{}, d.img.KernelDelete(ctx, p.Name) +} + +func kernelImportDispatch(ctx context.Context, d *Daemon, p api.KernelImportParams) (api.KernelShowResult, error) { + entry, err := d.img.KernelImport(ctx, p) + return api.KernelShowResult{Entry: entry}, err +} + +func kernelPullDispatch(ctx context.Context, d *Daemon, p api.KernelPullParams) (api.KernelShowResult, error) { + entry, err := d.img.KernelPull(ctx, p) + return api.KernelShowResult{Entry: entry}, err +} + +func kernelCatalogDispatch(ctx context.Context, d *Daemon) (api.KernelCatalogResult, 
error) { + return d.img.KernelCatalog(ctx) +} + +// ---- Special-case handlers: pre-service validation, custom error +// ---- codes, or raw rpc.NewResult encoding — things the generic +// ---- wrapper can't express. + +// pingHandler is info-only: no service call, just a snapshot of +// build metadata. Raw rpc.NewResult to match the pre-refactor +// encoding; marshalResultOrError would over-wrap this. +func pingHandler(_ context.Context, d *Daemon, _ rpc.Request) rpc.Response { + info := buildinfo.Current() + result, _ := rpc.NewResult(api.PingResult{ + Status: "ok", + PID: d.pid, + Version: info.Version, + Commit: info.Commit, + BuiltAt: info.BuiltAt, + }) + return result +} + +// shutdownHandler triggers async daemon shutdown. `d.Close` runs in +// a goroutine so the RPC response reaches the client before the +// listener closes. +func shutdownHandler(_ context.Context, d *Daemon, _ rpc.Request) rpc.Response { + go d.Close() + result, _ := rpc.NewResult(api.ShutdownResult{Status: "stopping"}) + return result +} + +// vmLogsHandler needs the "not_found" error code (distinct from +// "operation_failed") when FindVM misses, so the CLI can print a +// cleaner message. The generic paramHandler maps every service err +// to "operation_failed". +func vmLogsHandler(ctx context.Context, d *Daemon, req rpc.Request) rpc.Response { + params, err := rpc.DecodeParams[api.VMRefParams](req) + if err != nil { + return rpc.NewError("bad_request", err.Error()) + } + vm, err := d.vm.FindVM(ctx, params.IDOrName) + if err != nil { + return rpc.NewError("not_found", err.Error()) + } + return marshalResultOrError(api.VMLogsResult{LogPath: vm.Runtime.LogPath}, nil) +} + +// vmSSHHandler does two pre-service validations: FindVM / TouchVM +// for "not_found", then vmAlive for "not_running". Both distinct +// error codes feed cleaner CLI output. 
+func vmSSHHandler(ctx context.Context, d *Daemon, req rpc.Request) rpc.Response { + params, err := rpc.DecodeParams[api.VMRefParams](req) + if err != nil { + return rpc.NewError("bad_request", err.Error()) + } + vm, err := d.vm.TouchVM(ctx, params.IDOrName) + if err != nil { + return rpc.NewError("not_found", err.Error()) + } + if !d.vm.vmAlive(vm) { + return rpc.NewError("not_running", fmt.Sprintf("vm %s is not running", vm.Name)) + } + return marshalResultOrError(api.VMSSHResult{Name: vm.Name, GuestIP: vm.Runtime.GuestIP}, nil) +} diff --git a/internal/daemon/dispatch_test.go b/internal/daemon/dispatch_test.go new file mode 100644 index 0000000..602ffbc --- /dev/null +++ b/internal/daemon/dispatch_test.go @@ -0,0 +1,143 @@ +package daemon + +import ( + "context" + "sort" + "strings" + "testing" + + "banger/internal/rpc" +) + +// TestRPCHandlersMatchDocumentedMethods pins the surface of the RPC +// table: adding or removing a method should be an explicit, reviewable +// change. If the keyset drifts and this test isn't updated alongside, +// that's a red flag — either the documented list is stale, or a +// method sneaked in without being discussed. +// +// The expected list is the single source of truth for "methods +// banger speaks." Any production code consulting it (CLI completions, +// docs generator) can grep this test. 
+func TestRPCHandlersMatchDocumentedMethods(t *testing.T) { + expected := []string{ + "image.cache.prune", + "image.delete", + "image.list", + "image.promote", + "image.pull", + "image.register", + "image.show", + + "kernel.catalog", + "kernel.delete", + "kernel.import", + "kernel.list", + "kernel.pull", + "kernel.show", + + "daemon.operations.list", + + "ping", + "shutdown", + + "vm.create", + "vm.create.begin", + "vm.create.cancel", + "vm.create.status", + "vm.delete", + "vm.health", + "vm.kill", + "vm.list", + "vm.logs", + "vm.ping", + "vm.ports", + "vm.restart", + "vm.set", + "vm.show", + "vm.ssh", + "vm.start", + "vm.stats", + "vm.stop", + + "vm.workspace.export", + "vm.workspace.prepare", + } + + got := make([]string, 0, len(rpcHandlers)) + for name := range rpcHandlers { + got = append(got, name) + } + sort.Strings(got) + sort.Strings(expected) + + if len(got) != len(expected) { + t.Fatalf("method count: got %d, want %d\n got: %v\n want: %v", len(got), len(expected), got, expected) + } + for i := range expected { + if got[i] != expected[i] { + t.Fatalf("method[%d]: got %q, want %q\n full got: %v\n full want: %v", i, got[i], expected[i], got, expected) + } + } +} + +// TestRPCHandlersAllNonNil catches a silly-but-possible footgun: +// registering a method with a nil function literal. +func TestRPCHandlersAllNonNil(t *testing.T) { + for name, h := range rpcHandlers { + if h == nil { + t.Errorf("rpcHandlers[%q] = nil", name) + } + } +} + +// TestDispatchStampsOpIDOnError pins the contract that every error +// response leaving dispatch carries an op_id, even on the +// short-circuit paths (bad_version, unknown_method) that never +// reach a handler. Operators rely on this id to correlate a CLI +// failure to a daemon log line. 
+func TestDispatchStampsOpIDOnError(t *testing.T) { + d := &Daemon{} + t.Run("unknown_method", func(t *testing.T) { + resp := d.dispatch(context.Background(), rpc.Request{Version: rpc.Version, Method: "no.such.method"}) + if resp.OK { + t.Fatalf("expected error response, got %+v", resp) + } + if resp.Error == nil || resp.Error.Code != "unknown_method" { + t.Fatalf("error = %+v, want unknown_method", resp.Error) + } + if !strings.HasPrefix(resp.Error.OpID, "op-") { + t.Fatalf("op_id = %q, want op-* prefix", resp.Error.OpID) + } + }) + t.Run("bad_version", func(t *testing.T) { + resp := d.dispatch(context.Background(), rpc.Request{Version: rpc.Version + 99, Method: "ping"}) + if resp.OK { + t.Fatalf("expected error response, got %+v", resp) + } + if resp.Error == nil || resp.Error.Code != "bad_version" { + t.Fatalf("error = %+v, want bad_version", resp.Error) + } + if !strings.HasPrefix(resp.Error.OpID, "op-") { + t.Fatalf("op_id = %q, want op-* prefix", resp.Error.OpID) + } + }) +} + +// TestDispatchPropagatesOpIDFromContext covers the case where a +// handler returns its own rpc.NewError with an empty op_id (most +// service errors do); the dispatch wrapper must stamp the +// dispatch-generated id on the way out. 
+func TestDispatchPropagatesOpIDFromContext(t *testing.T) { + d := &Daemon{ + requestHandler: func(_ context.Context, _ rpc.Request) rpc.Response { + return rpc.NewError("operation_failed", "deliberate test failure") + }, + } + resp := d.dispatch(context.Background(), rpc.Request{Version: rpc.Version, Method: "anything"}) + if resp.OK || resp.Error == nil { + t.Fatalf("expected error response, got %+v", resp) + } + if !strings.HasPrefix(resp.Error.OpID, "op-") { + t.Fatalf("dispatch did not stamp op_id: %+v", resp.Error) + } +} diff --git a/internal/daemon/dmsnap/dmsnap.go b/internal/daemon/dmsnap/dmsnap.go new file mode 100644 index 0000000..cbc5945 --- /dev/null +++ b/internal/daemon/dmsnap/dmsnap.go @@ -0,0 +1,128 @@ +// Package dmsnap wraps the host-side device-mapper snapshot operations used +// to give each VM a copy-on-write view over a shared rootfs image. It issues +// losetup/dmsetup via a system.CommandRunner-compatible runner. +package dmsnap + +import ( + "context" + "errors" + "fmt" + "strings" + "time" +) + +// Runner is the narrow command-runner surface dmsnap needs. system.Runner +// satisfies it. +type Runner interface { + RunSudo(ctx context.Context, args ...string) ([]byte, error) +} + +// Handles records the loop devices and dm target allocated for a snapshot. +// Callers pass it back to Cleanup to unwind in the right order. +type Handles struct { + BaseLoop string + COWLoop string + DMName string + DMDev string +} + +// Create sets up a dm-snapshot named dmName layering cowPath over rootfsPath. +// On failure it cleans up whatever it had attached so far. 
+func Create(ctx context.Context, runner Runner, rootfsPath, cowPath, dmName string) (handles Handles, err error) { + defer func() { + if err == nil { + return + } + if cleanupErr := Cleanup(context.Background(), runner, handles); cleanupErr != nil { + err = errors.Join(err, cleanupErr) + } + }() + + baseBytes, err := runner.RunSudo(ctx, "losetup", "-f", "--show", "--read-only", rootfsPath) + if err != nil { + return handles, err + } + handles.BaseLoop = strings.TrimSpace(string(baseBytes)) + + cowBytes, err := runner.RunSudo(ctx, "losetup", "-f", "--show", cowPath) + if err != nil { + return handles, err + } + handles.COWLoop = strings.TrimSpace(string(cowBytes)) + + sectorsBytes, err := runner.RunSudo(ctx, "blockdev", "--getsz", handles.BaseLoop) + if err != nil { + return handles, err + } + sectors := strings.TrimSpace(string(sectorsBytes)) + + if _, err := runner.RunSudo(ctx, "dmsetup", "create", dmName, "--table", fmt.Sprintf("0 %s snapshot %s %s P 8", sectors, handles.BaseLoop, handles.COWLoop)); err != nil { + return handles, err + } + handles.DMName = dmName + handles.DMDev = "/dev/mapper/" + dmName + return handles, nil +} + +// Cleanup tears down a snapshot: remove the dm target, then detach the loops. +// Missing-handle errors (already cleaned up) are ignored. 
+func Cleanup(ctx context.Context, runner Runner, handles Handles) error { + var cleanupErr error + + switch { + case handles.DMName != "": + if err := Remove(ctx, runner, handles.DMName); err != nil { + cleanupErr = errors.Join(cleanupErr, err) + } + case handles.DMDev != "": + if err := Remove(ctx, runner, handles.DMDev); err != nil { + cleanupErr = errors.Join(cleanupErr, err) + } + } + + if handles.COWLoop != "" { + if _, err := runner.RunSudo(ctx, "losetup", "-d", handles.COWLoop); err != nil { + if !isMissing(err) { + cleanupErr = errors.Join(cleanupErr, err) + } + } + } + if handles.BaseLoop != "" { + if _, err := runner.RunSudo(ctx, "losetup", "-d", handles.BaseLoop); err != nil { + if !isMissing(err) { + cleanupErr = errors.Join(cleanupErr, err) + } + } + } + + return cleanupErr +} + +// Remove retries dmsetup remove while the device is briefly busy after +// detach. Missing targets succeed. +func Remove(ctx context.Context, runner Runner, target string) error { + deadline := time.Now().Add(15 * time.Second) + for { + if _, err := runner.RunSudo(ctx, "dmsetup", "remove", target); err != nil { + if isMissing(err) { + return nil + } + if strings.Contains(err.Error(), "Device or resource busy") && time.Now().Before(deadline) { + time.Sleep(100 * time.Millisecond) + continue + } + return err + } + return nil + } +} + +func isMissing(err error) bool { + if err == nil { + return false + } + msg := err.Error() + return strings.Contains(msg, "No such device or address") || + strings.Contains(msg, "not found") || + strings.Contains(msg, "does not exist") +} diff --git a/internal/daemon/dmsnap/dmsnap_test.go b/internal/daemon/dmsnap/dmsnap_test.go new file mode 100644 index 0000000..f179f2b --- /dev/null +++ b/internal/daemon/dmsnap/dmsnap_test.go @@ -0,0 +1,288 @@ +package dmsnap + +import ( + "context" + "errors" + "strings" + "testing" +) + +// scriptedRunner records every RunSudo call's argv and plays back a +// scripted sequence of (out, err) responses. 
Going past the script is +// a fatal error so an unexpected extra call shows up clearly. Mirrors +// the pattern used by internal/daemon/fcproc/fcproc_test.go but stays +// local to dmsnap (this is a leaf package). +type scriptedRunner struct { + t *testing.T + scripts []scriptedReply + calls [][]string +} + +type scriptedReply struct { + out []byte + err error +} + +func (r *scriptedRunner) RunSudo(_ context.Context, args ...string) ([]byte, error) { + r.t.Helper() + r.calls = append(r.calls, append([]string(nil), args...)) + if len(r.scripts) == 0 { + r.t.Fatalf("unexpected RunSudo call %d: %v", len(r.calls), args) + } + step := r.scripts[0] + r.scripts = r.scripts[1:] + return step.out, step.err +} + +func argsContain(args []string, want ...string) bool { + if len(args) < len(want) { + return false + } + for i, w := range want { + if args[i] != w { + return false + } + } + return true +} + +// TestCreateOrdersOpsAndPopulatesHandles pins the four-step setup +// sequence Create runs in: losetup base (read-only), losetup cow, +// blockdev getsz, dmsetup create with a snapshot table. If the order +// drifts the helper would build dm targets backed by the wrong +// device, which silently corrupts every VM that uses the snapshot. 
+func TestCreateOrdersOpsAndPopulatesHandles(t *testing.T) { + runner := &scriptedRunner{ + t: t, + scripts: []scriptedReply{ + {out: []byte("/dev/loop0\n")}, // losetup -f --show --read-only rootfs + {out: []byte("/dev/loop1\n")}, // losetup -f --show cow + {out: []byte("16384\n")}, // blockdev --getsz /dev/loop0 + {}, // dmsetup create + }, + } + + handles, err := Create(context.Background(), runner, "/state/rootfs.ext4", "/state/cow.img", "fc-rootfs-test") + if err != nil { + t.Fatalf("Create: %v", err) + } + + if len(runner.calls) != 4 { + t.Fatalf("got %d RunSudo calls, want 4", len(runner.calls)) + } + if !argsContain(runner.calls[0], "losetup", "-f", "--show", "--read-only", "/state/rootfs.ext4") { + t.Fatalf("call 0 = %v, want read-only losetup of rootfs", runner.calls[0]) + } + if !argsContain(runner.calls[1], "losetup", "-f", "--show", "/state/cow.img") { + t.Fatalf("call 1 = %v, want losetup of cow", runner.calls[1]) + } + if !argsContain(runner.calls[2], "blockdev", "--getsz", "/dev/loop0") { + t.Fatalf("call 2 = %v, want blockdev getsz on base loop", runner.calls[2]) + } + if !argsContain(runner.calls[3], "dmsetup", "create", "fc-rootfs-test") { + t.Fatalf("call 3 = %v, want dmsetup create of dm name", runner.calls[3]) + } + // The snapshot table must reference the base + cow loops in that + // order. Pin it so a future refactor can't accidentally swap them + // (which would make the COW the read-only side and corrupt every + // write). 
+ tableArg := runner.calls[3][len(runner.calls[3])-1] + if !strings.Contains(tableArg, "snapshot /dev/loop0 /dev/loop1") { + t.Fatalf("dmsetup table = %q, want 'snapshot /dev/loop0 /dev/loop1'", tableArg) + } + + if handles.BaseLoop != "/dev/loop0" || handles.COWLoop != "/dev/loop1" { + t.Fatalf("loops = %+v, want base=loop0 cow=loop1", handles) + } + if handles.DMName != "fc-rootfs-test" || handles.DMDev != "/dev/mapper/fc-rootfs-test" { + t.Fatalf("dm names = %+v, want fc-rootfs-test", handles) + } +} + +// TestCreateFailureRunsCleanup verifies that a partial setup is +// unwound on failure: if dmsetup create fails after both loops are +// attached, Create must release them via losetup -d before returning. +// Without this the host accumulates orphan loop devices on every +// failed VM start. +func TestCreateFailureRunsCleanup(t *testing.T) { + dmCreateErr := errors.New("dmsetup table refused") + runner := &scriptedRunner{ + t: t, + scripts: []scriptedReply{ + {out: []byte("/dev/loop0\n")}, // losetup base + {out: []byte("/dev/loop1\n")}, // losetup cow + {out: []byte("16384\n")}, // blockdev getsz + {err: dmCreateErr}, // dmsetup create fails + {}, // cleanup: losetup -d /dev/loop1 + {}, // cleanup: losetup -d /dev/loop0 + }, + } + + _, err := Create(context.Background(), runner, "/state/rootfs.ext4", "/state/cow.img", "fc-rootfs-test") + if !errors.Is(err, dmCreateErr) { + t.Fatalf("Create error = %v, want dmsetup error to bubble", err) + } + if len(runner.calls) != 6 { + t.Fatalf("got %d RunSudo calls, want 6 (4 setup + 2 cleanup)", len(runner.calls)) + } + // Cleanup order: cow first, then base, mirroring stack unwind. 
+ if !argsContain(runner.calls[4], "losetup", "-d", "/dev/loop1") { + t.Fatalf("call 4 = %v, want losetup -d on cow loop", runner.calls[4]) + } + if !argsContain(runner.calls[5], "losetup", "-d", "/dev/loop0") { + t.Fatalf("call 5 = %v, want losetup -d on base loop", runner.calls[5]) + } +} + +// TestCleanupOrdersDmsetupBeforeLosetup pins the destruction order: +// the dm target must come down BEFORE the loops it sits on are +// detached, otherwise dmsetup remove sees EBUSY because the target's +// backing devices vanished mid-flight. +func TestCleanupOrdersDmsetupBeforeLosetup(t *testing.T) { + runner := &scriptedRunner{ + t: t, + scripts: []scriptedReply{ + {}, // dmsetup remove fc-rootfs-test + {}, // losetup -d cow + {}, // losetup -d base + }, + } + + handles := Handles{ + BaseLoop: "/dev/loop0", + COWLoop: "/dev/loop1", + DMName: "fc-rootfs-test", + DMDev: "/dev/mapper/fc-rootfs-test", + } + if err := Cleanup(context.Background(), runner, handles); err != nil { + t.Fatalf("Cleanup: %v", err) + } + if len(runner.calls) != 3 { + t.Fatalf("got %d RunSudo calls, want 3", len(runner.calls)) + } + if !argsContain(runner.calls[0], "dmsetup", "remove", "fc-rootfs-test") { + t.Fatalf("call 0 = %v, want dmsetup remove first", runner.calls[0]) + } + if !argsContain(runner.calls[1], "losetup", "-d", "/dev/loop1") { + t.Fatalf("call 1 = %v, want cow loop detach second", runner.calls[1]) + } + if !argsContain(runner.calls[2], "losetup", "-d", "/dev/loop0") { + t.Fatalf("call 2 = %v, want base loop detach last", runner.calls[2]) + } +} + +// TestCleanupFallsBackToDMDevWhenNameEmpty covers the "we only know +// the /dev/mapper path" branch — Remove accepts either form, and +// Cleanup picks DMDev when DMName isn't recorded (older state files +// only stored the path). 
+func TestCleanupFallsBackToDMDevWhenNameEmpty(t *testing.T) { + runner := &scriptedRunner{ + t: t, + scripts: []scriptedReply{ + {}, // dmsetup remove /dev/mapper/fc-rootfs-test + {}, // losetup -d cow + {}, // losetup -d base + }, + } + handles := Handles{ + BaseLoop: "/dev/loop0", + COWLoop: "/dev/loop1", + DMDev: "/dev/mapper/fc-rootfs-test", + // DMName intentionally empty. + } + if err := Cleanup(context.Background(), runner, handles); err != nil { + t.Fatalf("Cleanup: %v", err) + } + if !argsContain(runner.calls[0], "dmsetup", "remove", "/dev/mapper/fc-rootfs-test") { + t.Fatalf("call 0 = %v, want dmsetup remove of DMDev path", runner.calls[0]) + } +} + +// TestCleanupTolerantOfMissingLoops pins the idempotency contract: +// running cleanup against handles whose loops are already detached +// (e.g. a daemon crash mid-cleanup, then a second pass) returns nil +// rather than failing. dmsnap.isMissing recognises kernel/losetup's +// "No such device" wording. +func TestCleanupTolerantOfMissingLoops(t *testing.T) { + missing := errors.New("losetup: /dev/loop1: No such device or address") + runner := &scriptedRunner{ + t: t, + scripts: []scriptedReply{ + {}, // dmsetup remove ok + {err: missing}, // losetup -d cow: already gone + {err: missing}, // losetup -d base: already gone + }, + } + handles := Handles{ + BaseLoop: "/dev/loop0", + COWLoop: "/dev/loop1", + DMName: "fc-rootfs-test", + } + if err := Cleanup(context.Background(), runner, handles); err != nil { + t.Fatalf("Cleanup: %v, want nil for already-gone loops", err) + } +} + +// TestCleanupSurfacesUnexpectedLoopErrors confirms that NON-missing +// errors do bubble up — the idempotency guard is narrow on purpose, +// so an EBUSY or permission error from losetup actually fails the +// cleanup. 
+func TestCleanupSurfacesUnexpectedLoopErrors(t *testing.T) { + wedged := errors.New("losetup: /dev/loop1: device is busy") + runner := &scriptedRunner{ + t: t, + scripts: []scriptedReply{ + {}, + {err: wedged}, + {}, + }, + } + handles := Handles{ + BaseLoop: "/dev/loop0", + COWLoop: "/dev/loop1", + DMName: "fc-rootfs-test", + } + err := Cleanup(context.Background(), runner, handles) + if !errors.Is(err, wedged) { + t.Fatalf("Cleanup error = %v, want busy error to bubble", err) + } +} + +// TestRemoveReturnsNilOnMissingTarget mirrors the loop-cleanup +// idempotency guard: an absent dm target is the desired end state, so +// Remove returns nil without retrying. +func TestRemoveReturnsNilOnMissingTarget(t *testing.T) { + missing := errors.New("dmsetup: target not found") + runner := &scriptedRunner{ + t: t, + scripts: []scriptedReply{ + {err: missing}, + }, + } + if err := Remove(context.Background(), runner, "fc-rootfs-test"); err != nil { + t.Fatalf("Remove: %v, want nil for missing target", err) + } + if len(runner.calls) != 1 { + t.Fatalf("got %d RunSudo calls, want 1 (missing should not retry)", len(runner.calls)) + } +} + +// TestRemoveBubblesNonRetryableErrors covers the third Remove branch: +// errors that aren't busy and aren't missing must surface immediately +// so the daemon can record the failure and clean up by other means. 
+func TestRemoveBubblesNonRetryableErrors(t *testing.T) { + denied := errors.New("dmsetup: permission denied") + runner := &scriptedRunner{ + t: t, + scripts: []scriptedReply{ + {err: denied}, + }, + } + err := Remove(context.Background(), runner, "fc-rootfs-test") + if !errors.Is(err, denied) { + t.Fatalf("Remove error = %v, want permission error to bubble", err) + } + if len(runner.calls) != 1 { + t.Fatalf("got %d RunSudo calls, want 1 (permission error should not retry)", len(runner.calls)) + } +} diff --git a/internal/daemon/dns_routing.go b/internal/daemon/dns_routing.go new file mode 100644 index 0000000..0167c5a --- /dev/null +++ b/internal/daemon/dns_routing.go @@ -0,0 +1,47 @@ +package daemon + +import ( + "context" + "strings" +) + +const vmResolverRouteDomain = "~vm" + +func (n *HostNetwork) syncVMDNSResolverRouting(ctx context.Context) error { + if n == nil || n.vmDNS == nil { + return nil + } + if strings.TrimSpace(n.config.BridgeName) == "" { + return nil + } + if _, err := n.lookupExecutable("resolvectl"); err != nil { + return nil + } + if _, err := n.runner.Run(ctx, "ip", "link", "show", n.config.BridgeName); err != nil { + return nil + } + serverAddr := strings.TrimSpace(n.vmDNSAddr(n.vmDNS)) + if serverAddr == "" { + return nil + } + return n.privOps().SyncResolverRouting(ctx, serverAddr) +} + +func (n *HostNetwork) clearVMDNSResolverRouting(ctx context.Context) error { + if n == nil || strings.TrimSpace(n.config.BridgeName) == "" { + return nil + } + if _, err := n.lookupExecutable("resolvectl"); err != nil { + return nil + } + if _, err := n.runner.Run(ctx, "ip", "link", "show", n.config.BridgeName); err != nil { + return nil + } + return n.privOps().ClearResolverRouting(ctx) +} + +func (n *HostNetwork) ensureVMDNSResolverRouting(ctx context.Context) { + if err := n.syncVMDNSResolverRouting(ctx); err != nil && n.logger != nil { + n.logger.Warn("vm dns resolver route sync failed", "bridge", n.config.BridgeName, "error", err.Error()) + } +} diff 
--git a/internal/daemon/dns_routing_test.go b/internal/daemon/dns_routing_test.go new file mode 100644 index 0000000..fb5c056 --- /dev/null +++ b/internal/daemon/dns_routing_test.go @@ -0,0 +1,62 @@ +package daemon + +import ( + "context" + "testing" + + "banger/internal/model" + "banger/internal/vmdns" +) + +func TestSyncVMDNSResolverRoutingConfiguresResolved(t *testing.T) { + runner := &scriptedRunner{ + t: t, + steps: []runnerStep{ + {call: runnerCall{name: "ip", args: []string{"link", "show", model.DefaultBridgeName}}, out: []byte("1: br-fc\n")}, + sudoStep("", nil, "resolvectl", "dns", model.DefaultBridgeName, "127.0.0.1:42069"), + sudoStep("", nil, "resolvectl", "domain", model.DefaultBridgeName, vmResolverRouteDomain), + sudoStep("", nil, "resolvectl", "default-route", model.DefaultBridgeName, "no"), + }, + } + cfg := model.DaemonConfig{BridgeName: model.DefaultBridgeName} + n := &HostNetwork{ + runner: runner, config: cfg, vmDNS: new(vmdns.Server), + lookupExecutable: func(name string) (string, error) { + if name == "resolvectl" { + return "/usr/bin/resolvectl", nil + } + return "", nil + }, + vmDNSAddr: func(*vmdns.Server) string { return "127.0.0.1:42069" }, + } + + if err := n.syncVMDNSResolverRouting(context.Background()); err != nil { + t.Fatalf("syncVMDNSResolverRouting: %v", err) + } + runner.assertExhausted() +} + +func TestClearVMDNSResolverRoutingRevertsBridgeConfig(t *testing.T) { + runner := &scriptedRunner{ + t: t, + steps: []runnerStep{ + {call: runnerCall{name: "ip", args: []string{"link", "show", model.DefaultBridgeName}}, out: []byte("1: br-fc\n")}, + sudoStep("", nil, "resolvectl", "revert", model.DefaultBridgeName), + }, + } + cfg := model.DaemonConfig{BridgeName: model.DefaultBridgeName} + n := &HostNetwork{ + runner: runner, config: cfg, + lookupExecutable: func(name string) (string, error) { + if name == "resolvectl" { + return "/usr/bin/resolvectl", nil + } + return "", nil + }, + } + + if err := 
n.clearVMDNSResolverRouting(context.Background()); err != nil { + t.Fatalf("clearVMDNSResolverRouting: %v", err) + } + runner.assertExhausted() +} diff --git a/internal/daemon/doc.go b/internal/daemon/doc.go new file mode 100644 index 0000000..d20dbf1 --- /dev/null +++ b/internal/daemon/doc.go @@ -0,0 +1,87 @@ +// Package daemon hosts the Banger owner-daemon process. +// +// The daemon exposes a JSON-RPC endpoint over a Unix socket. The +// *Daemon type is a thin composition root: it holds shared +// infrastructure (store, runner, logger, layout, config, listener, +// privileged-ops adapter) plus pointers to four focused services and +// forwards RPCs to them. +// +// On the supported systemd install path, this package runs inside +// `bangerd.service` as the configured owner user and delegates +// privileged host-kernel operations to `bangerd-root.service` through +// the privileged-ops seam. Non-system/dev paths use the same seam with +// an in-process adapter instead. +// +// Services: +// +// *HostNetwork Bridge / tap pool / NAT / DNS / firecracker +// process / DM snapshots / vsock readiness. +// Owns tapPool and vmDNS. +// *ImageService Register / promote / delete / pull (bundle + +// OCI) / kernel catalog / managed-seed refresh. +// Owns imageOpsMu. +// *WorkspaceService workspace.prepare / workspace.export + the +// per-VM authorised-key and git-identity sync +// that runs at start. Owns workspaceLocks. +// *VMService VM lifecycle (create/start/stop/restart/kill/ +// delete/set), stats, ports, preflight. Owns +// vmLocks, createVMMu, createOps, handles. +// +// Subpackages (stateless helpers): +// +// internal/daemon/opstate Generic Registry[T AsyncOp]. +// internal/daemon/dmsnap Device-mapper COW snapshot lifecycle. +// internal/daemon/fcproc Firecracker process helpers. +// internal/daemon/imagemgr Image subsystem helpers. +// internal/daemon/workspace Workspace helpers. 
+// +// File inventory: +// +// daemon.go Composition root, Open/Close/Serve, dispatch, +// reconcile orchestrator, backgroundLoop. +// host_network.go HostNetwork struct + constructor. +// image_service.go ImageService struct + constructor + FindImage. +// workspace_service.go WorkspaceService struct + constructor. +// vm_service.go VMService struct + constructor + FindVM, +// TouchVM, withVMLock* family, lockVMID. +// +// nat.go, dns_routing.go, tap_pool.go, snapshot.go HostNetwork methods. +// images.go, images_pull.go, image_seed.go, kernels.go ImageService methods. +// workspace.go, vm_authsync.go WorkspaceService methods. +// vm_lifecycle.go, vm_create.go, vm_create_ops.go, +// vm_stats.go, vm_set.go, vm_disk.go, vm_handles.go, +// ports.go, preflight.go VMService methods. +// +// vm.go Cross-service constants, rebuildDNS / +// cleanupRuntime / generateName (*VMService), +// and small stateless utilities. +// capabilities.go Pluggable capability hooks executed at VM +// start. Each capability is a plain struct +// with explicit service-pointer fields +// (workDiskCapability carries vm+ws+store, +// dnsCapability carries net, natCapability +// carries vm+net+logger). wireServices builds +// the default list; VMService invokes hooks +// through a capabilityHooks seam. No hook +// reaches back to *Daemon. +// vm_locks.go vmLockSet primitive. +// guest_ssh.go guestSSHClient, dialGuest, waitForGuestSSH. +// ssh_client_config.go Daemon-managed SSH client key material. +// doctor.go Host diagnostics. +// logger.go slog configuration. +// runtime_assets.go Companion-binary paths. +// +// Lock ordering: +// +// VMService.vmLocks[id] → WorkspaceService.workspaceLocks[id] +// → {VMService.createVMMu, ImageService.imageOpsMu} +// → subsystem-local locks +// +// vmLocks[id] and workspaceLocks[id] are NEVER held at the same +// time. 
workspace.prepare acquires vmLocks[id] only long enough to +// validate VM state, releases it, then acquires workspaceLocks[id] +// for the slow guest I/O phase. Lifecycle ops (start/stop/delete/ +// set) hold vmLocks[id] across the whole flow. Subsystem-local +// locks (tapPool.mu, opstate.Registry mu, handleCache.mu) are +// leaves. See ARCHITECTURE.md for details. +package daemon diff --git a/internal/daemon/doctor.go b/internal/daemon/doctor.go index b29c312..1f563d6 100644 --- a/internal/daemon/doctor.go +++ b/internal/daemon/doctor.go @@ -2,106 +2,593 @@ package daemon import ( "context" - "database/sql" + "fmt" + "os" + "path/filepath" + "runtime" "strings" + "syscall" + "time" + + "banger/internal/buildinfo" "banger/internal/config" + "banger/internal/firecracker" + "banger/internal/imagecat" + "banger/internal/installmeta" "banger/internal/model" "banger/internal/paths" "banger/internal/store" "banger/internal/system" ) +// systemdSystemDir is the path systemd reads enabled units from. Pulled +// out as a var (not a const) so the security-posture tests can swap it +// for a tempdir without faking /etc/systemd/system on the test host. +var systemdSystemDir = "/etc/systemd/system" + func Doctor(ctx context.Context) (system.Report, error) { - layout, err := paths.Resolve() + userLayout, err := paths.Resolve() if err != nil { return system.Report{}, err } - cfg, err := config.Load(layout) + cfg, err := config.Load(userLayout) if err != nil { return system.Report{}, err } + layout := paths.ResolveSystem() + // Doctor must be read-only: running it should never mutate the + // state DB (no migrations, no WAL checkpoint, no pragma writes). + // Skip OpenReadOnly entirely when the DB file doesn't exist — + // that's a fresh install, not an error condition. The first + // daemon start will create the file. storeMissing differentiates + // "no DB yet" (pass) from "DB present but unreadable" (fail) in + // the report. 
d := &Daemon{ - layout: layout, - config: cfg, - runner: system.NewRunner(), + layout: layout, + userLayout: userLayout, + config: cfg, + runner: system.NewRunner(), } - db, err := store.Open(layout.DBPath) - if err == nil { - defer db.Close() - d.store = db + var storeErr error + storeMissing := false + if _, statErr := os.Stat(layout.DBPath); statErr != nil { + if os.IsNotExist(statErr) { + storeMissing = true + } else { + storeErr = statErr + } + } else { + db, err := store.OpenReadOnly(layout.DBPath) + if err != nil { + storeErr = err + } else { + defer db.Close() + d.store = db + } } - return d.doctorReport(ctx), nil + wireServices(d) + return d.doctorReport(ctx, storeErr, storeMissing), nil } -func (d *Daemon) doctorReport(ctx context.Context) system.Report { +func (d *Daemon) doctorReport(ctx context.Context, storeErr error, storeMissing bool) system.Report { report := system.Report{} + addArchitectureCheck(&report) + addBangerVersionCheck(&report, installmeta.DefaultPath) + + switch { + case storeMissing: + report.AddPass("state store", "will be created on first daemon start at "+d.layout.DBPath) + case storeErr != nil: + report.AddFail( + "state store", + fmt.Sprintf("open %s: %v", d.layout.DBPath, storeErr), + "remove or restore the file if corrupt; otherwise check its permissions", + ) + default: + report.AddPass("state store", "readable at "+d.layout.DBPath) + } + report.AddPreflight("host runtime", d.runtimeChecks(), runtimeStatus(d.config)) report.AddPreflight("core vm lifecycle", d.coreVMLifecycleChecks(), "required host tools available") report.AddPreflight("vsock guest agent", d.vsockChecks(), "vsock guest agent prerequisites available") + d.addVMDefaultsCheck(&report) + d.addSSHShortcutCheck(&report) d.addCapabilityDoctorChecks(ctx, &report) - report.AddPreflight("image build", d.imageBuildChecks(ctx), "image build prerequisites available") + d.addFirecrackerVersionCheck(ctx, &report) + d.addSecurityPostureChecks(ctx, &report) return report } +// 
addFirecrackerVersionCheck verifies the configured firecracker
+// binary exists, is recent enough for banger's expectations
+// (firecracker.MinSupportedVersion), and surfaces a distro-aware
+// install hint if it's missing. Four outcomes:
+//
+//   - present + version in [Min, Tested]: PASS.
+//   - present + version above Tested: WARN. Newer firecracker
+//     usually works (the API is stable within a major), but it's
+//     outside banger's tested window.
+//   - present + version below Min: FAIL with the upgrade hint.
+//   - missing entirely: FAIL with a guess at the user's package
+//     manager plus the upstream Releases URL.
+//
+// We intentionally don't use the generic RequireExecutable preflight
+// for this check — its static hint string can't carry the distro
+// dispatch.
+func (d *Daemon) addFirecrackerVersionCheck(ctx context.Context, report *system.Report) {
+	binPath := strings.TrimSpace(d.config.FirecrackerBin)
+	if binPath == "" {
+		binPath = "firecracker"
+	}
+	resolved, err := system.LookupExecutable(binPath)
+	if err != nil {
+		details := []string{fmt.Sprintf("not found: %s", binPath)}
+		details = append(details, firecrackerInstallHint(osReleaseSource)...)
+		report.AddFail("firecracker binary", details...)
+ return + } + parsed, err := firecracker.QueryVersion(ctx, d.runner, resolved) + if err != nil { + report.AddFail("firecracker binary", + fmt.Sprintf("`%s --version` failed: %v", resolved, err), + "reinstall firecracker; see https://github.com/firecracker-microvm/firecracker/releases") + return + } + reported := parsed.String() + min := firecracker.MustParseSemVer(firecracker.MinSupportedVersion) + tested := firecracker.MustParseSemVer(firecracker.KnownTestedVersion) + switch { + case parsed.Compare(min) < 0: + report.AddFail("firecracker binary", + fmt.Sprintf("%s at %s; banger requires ≥ v%s", reported, resolved, firecracker.MinSupportedVersion), + "upgrade firecracker — see https://github.com/firecracker-microvm/firecracker/releases") + case parsed.Compare(tested) > 0: + report.AddWarn("firecracker binary", + fmt.Sprintf("%s at %s (newer than banger's tested v%s; usually works)", reported, resolved, firecracker.KnownTestedVersion)) + default: + report.AddPass("firecracker binary", + fmt.Sprintf("%s at %s (within tested range; min v%s, tested v%s)", + reported, resolved, firecracker.MinSupportedVersion, firecracker.KnownTestedVersion)) + } +} + +// osReleaseSource is the file the install-hint reads to detect the +// host distro. Var rather than const so doctor tests can swap in a +// fixture. +var osReleaseSource = "/etc/os-release" + +// firecrackerInstallHint returns 1-2 detail lines describing how to +// install firecracker on the current host: a one-line guess based on +// /etc/os-release when the distro is recognised, plus the upstream +// Releases URL as a universal fallback. Anything we can't recognise +// gets only the URL — better silence than wrong instructions. 
+func firecrackerInstallHint(osReleasePath string) []string { + hints := []string{} + if cmd := guessFirecrackerInstallCommand(osReleasePath); cmd != "" { + hints = append(hints, "install: "+cmd) + } + hints = append(hints, "or download a static binary from https://github.com/firecracker-microvm/firecracker/releases") + return hints +} + +// guessFirecrackerInstallCommand reads osReleasePath and returns a +// short, copy-pasteable install command for the detected distro, or +// "" when no reliable mapping applies. We only suggest commands for +// distros where firecracker is actually packaged — guessing wrong +// here would send users on a wild goose chase. +func guessFirecrackerInstallCommand(osReleasePath string) string { + data, err := os.ReadFile(osReleasePath) + if err != nil { + return "" + } + id, idLike := parseOSReleaseIDs(string(data)) + candidates := append([]string{id}, strings.Fields(idLike)...) + for _, c := range candidates { + switch c { + case "debian": + // Packaged in Debian since trixie / bookworm-backports. + return "sudo apt install firecracker" + case "arch", "manjaro", "endeavouros": + // AUR; we don't assume a specific helper, but `paru` is the + // common one. Users who prefer yay/makepkg/etc. will + // substitute mentally. + return "paru -S firecracker # or your preferred AUR helper" + case "nixos": + return "nix-env -iA nixos.firecracker # or add to your configuration.nix" + } + } + return "" +} + +// parseOSReleaseIDs extracts the ID and ID_LIKE values from an +// /etc/os-release blob. Both are returned with surrounding quotes +// stripped; missing keys return empty strings. We don't validate +// the format beyond `KEY=value` — os-release is a simple format and +// any drift would manifest as a quiet "no distro hint" rather than +// a false positive. 
+func parseOSReleaseIDs(content string) (id, idLike string) { + for _, line := range strings.Split(content, "\n") { + line = strings.TrimSpace(line) + if rest, ok := strings.CutPrefix(line, "ID="); ok { + id = strings.Trim(rest, `"`) + } + if rest, ok := strings.CutPrefix(line, "ID_LIKE="); ok { + idLike = strings.Trim(rest, `"`) + } + } + return id, idLike +} + +// addSecurityPostureChecks verifies the install matches what +// docs/privileges.md describes: helper + owner-daemon units active, +// sockets at the expected mode/owner, unit files carrying the +// hardening directives, and the firecracker binary owned by root + +// non-writable. Drift between the doc and the running install would +// silently weaken the trust model; surfacing it here makes the doc +// load-bearing rather than aspirational. +// +// In non-system mode (no /etc/banger/install.toml) emits a single +// warn pointing at the docs section that explains the looser dev-mode +// trust model — a doctor PASS row in that mode would imply guarantees +// the install isn't actually providing. +func (d *Daemon) addSecurityPostureChecks(ctx context.Context, report *system.Report) { + d.addSecurityPostureChecksAt(ctx, report, installmeta.DefaultPath, systemdSystemDir) +} + +// addSecurityPostureChecksAt is the seam tests use: pass a fake +// install.toml + systemd dir to exercise the system-mode branch +// without writing to /etc. 
+func (d *Daemon) addSecurityPostureChecksAt(ctx context.Context, report *system.Report, installPath, systemdDir string) {
+	meta, err := installmeta.Load(installPath)
+	if err != nil {
+		report.AddWarn("security posture",
+			"running outside the system install (no "+installPath+")",
+			"helper SO_PEERCRED, narrow CapabilityBoundingSet, NoNewPrivileges, and ProtectSystem=strict are bypassed in this mode",
+			"see docs/privileges.md > 'Running outside the system install'; install via `sudo banger system install --owner $USER` for the supported trust model")
+		return
+	}
+	addServiceActiveCheck(ctx, d.runner, report, "helper service", installmeta.DefaultRootHelperService)
+	addServiceActiveCheck(ctx, d.runner, report, "owner daemon service", installmeta.DefaultService)
+	addSocketPermsCheck(report, "helper socket", installmeta.DefaultRootHelperSocketPath, meta.OwnerUID, 0o600)
+	addSocketPermsCheck(report, "daemon socket", installmeta.DefaultSocketPath, meta.OwnerUID, 0o600)
+	addUnitHardeningCheck(report, "helper unit hardening",
+		filepath.Join(systemdDir, installmeta.DefaultRootHelperService),
+		[]string{
+			"NoNewPrivileges=yes",
+			"ProtectSystem=strict",
+			"ProtectHome=yes",
+			"RestrictSUIDSGID=yes",
+			"LockPersonality=yes",
+			"CapabilityBoundingSet=",
+		})
+	addUnitHardeningCheck(report, "daemon unit hardening",
+		filepath.Join(systemdDir, installmeta.DefaultService),
+		[]string{
+			"User=" + meta.OwnerUser,
+			"NoNewPrivileges=yes",
+			"ProtectSystem=strict",
+			"ProtectHome=read-only",
+			"RestrictSUIDSGID=yes",
+			"LockPersonality=yes",
+		})
+	addExecutableOwnershipCheck(report, "firecracker binary ownership", d.config.FirecrackerBin)
+}
+
+// addServiceActiveCheck shells out to `systemctl is-active <service>`
+// and surfaces the result. is-active exits non-zero for inactive/failed
+// states but always prints the state on stdout, so we read the trimmed
+// output and ignore the exit code. Anything other than "active" is a
+// fail with a systemctl-restart hint.
+func addServiceActiveCheck(ctx context.Context, runner system.CommandRunner, report *system.Report, name, service string) { + out, _ := runner.Run(ctx, "systemctl", "is-active", service) + state := strings.TrimSpace(string(out)) + if state == "" { + state = "unknown" + } + if state == "active" { + report.AddPass(name, fmt.Sprintf("%s is active", service)) + return + } + report.AddFail(name, + fmt.Sprintf("%s is %s, not active", service, state), + fmt.Sprintf("run `sudo systemctl restart %s` and re-run `banger doctor`", service)) +} + +// addSocketPermsCheck stat()s the socket path and compares mode + +// owner against the values the install promises. Both daemon and +// helper sockets are 0600 chowned to the registered owner UID; any +// drift means filesystem perms aren't gating access the way the docs +// describe. +func addSocketPermsCheck(report *system.Report, name, path string, expectedUID int, expectedMode os.FileMode) { + info, err := os.Stat(path) + if err != nil { + report.AddFail(name, + fmt.Sprintf("%s: %v", path, err), + "is the service running? 
`sudo systemctl status` and check the runtime dir") + return + } + stat, ok := info.Sys().(*syscall.Stat_t) + if !ok { + report.AddWarn(name, fmt.Sprintf("%s: cannot read ownership metadata on this platform", path)) + return + } + actualMode := info.Mode().Perm() + var problems []string + if actualMode != expectedMode { + problems = append(problems, fmt.Sprintf("mode is %#o, want %#o", actualMode, expectedMode)) + } + if int(stat.Uid) != expectedUID { + problems = append(problems, fmt.Sprintf("uid is %d, want %d", stat.Uid, expectedUID)) + } + if len(problems) > 0 { + problems = append(problems, "restart the service so the socket gets recreated with correct perms") + report.AddFail(name, fmt.Sprintf("%s: %s", path, strings.Join(problems, "; "))) + return + } + report.AddPass(name, fmt.Sprintf("%s: mode %#o, uid %d", path, actualMode, expectedUID)) +} + +// addUnitHardeningCheck reads the systemd unit file and confirms +// every required directive is present as a literal substring. Brittle +// to formatting changes (a comment-out would slip through), but +// strong enough to catch the "someone hand-edited the unit and +// dropped NoNewPrivileges" failure mode that motivates this check. +// The directives list captures the security-relevant subset of the +// renderer in commands_system.go; everything else (Description, +// ExecStart, etc.) is operational and not worth pinning here. 
+func addUnitHardeningCheck(report *system.Report, name, path string, required []string) { + data, err := os.ReadFile(path) + if err != nil { + report.AddFail(name, + fmt.Sprintf("%s: %v", path, err), + "reinstall via `sudo banger system install` to refresh the unit") + return + } + content := string(data) + var missing []string + for _, directive := range required { + if !strings.Contains(content, directive) { + missing = append(missing, directive) + } + } + if len(missing) > 0 { + report.AddFail(name, + fmt.Sprintf("%s missing directives: %s", path, strings.Join(missing, ", ")), + "reinstall via `sudo banger system install` to refresh the unit") + return + } + report.AddPass(name, fmt.Sprintf("%s: %d hardening directives present", path, len(required))) +} + +// addExecutableOwnershipCheck mirrors validateRootExecutable's runtime +// check at doctor time: regular file, root-owned, executable, not +// group/world writable, not a symlink. Doctor catching this once at +// install time beats the helper failing every launch with a less +// helpful message. 
+func addExecutableOwnershipCheck(report *system.Report, name, path string) { + if strings.TrimSpace(path) == "" { + report.AddWarn(name, "no firecracker binary path configured") + return + } + info, err := os.Lstat(path) + if err != nil { + report.AddFail(name, fmt.Sprintf("%s: %v", path, err)) + return + } + if info.Mode()&os.ModeSymlink != 0 { + report.AddFail(name, + fmt.Sprintf("%s is a symlink", path), + "the helper opens the binary with O_NOFOLLOW; resolve the symlink and update firecracker_bin in the daemon config") + return + } + if !info.Mode().IsRegular() { + report.AddFail(name, fmt.Sprintf("%s is not a regular file", path)) + return + } + mode := info.Mode().Perm() + if mode&0o111 == 0 { + report.AddFail(name, + fmt.Sprintf("%s mode %#o is not executable", path, mode), + "chmod +x the binary") + return + } + if mode&0o022 != 0 { + report.AddFail(name, + fmt.Sprintf("%s mode %#o is group/world writable", path, mode), + "chmod g-w,o-w the binary so the helper accepts it") + return + } + stat, ok := info.Sys().(*syscall.Stat_t) + if !ok { + report.AddWarn(name, fmt.Sprintf("%s: cannot read ownership metadata on this platform", path)) + return + } + if stat.Uid != 0 { + report.AddFail(name, + fmt.Sprintf("%s is owned by uid %d, want 0", path, stat.Uid), + "`sudo chown root` the firecracker binary") + return + } + report.AddPass(name, fmt.Sprintf("%s: regular, root-owned, mode %#o", path, mode)) +} + +// addSSHShortcutCheck surfaces a gentle warning when banger maintains +// an ssh_config file but the user hasn't wired it into ~/.ssh/config. +// This is intentionally a warn, not a fail — the shortcut is opt-in +// convenience and `banger vm ssh` works either way. +func (d *Daemon) addSSHShortcutCheck(report *system.Report) { + bangerConfig := BangerSSHConfigPath(d.userLayout) + if strings.TrimSpace(bangerConfig) == "" { + return + } + if _, err := os.Stat(bangerConfig); err != nil { + // No banger ssh_config rendered yet — nothing to include. 
+		return
+	}
+	installed, err := UserSSHIncludeInstalled()
+	if err != nil {
+		report.AddWarn("ssh shortcut", fmt.Sprintf("could not read ~/.ssh/config: %v", err))
+		return
+	}
+	if installed {
+		report.AddPass("ssh shortcut", "enabled — `ssh .vm` routes through banger")
+		return
+	}
+	report.AddWarn(
+		"ssh shortcut",
+		fmt.Sprintf("`ssh .vm` not enabled (opt-in); run `banger ssh-config --install` or add `Include %s` to ~/.ssh/config", bangerConfig),
+	)
+}
+
+// addBangerVersionCheck reports the running CLI's version + commit
+// alongside whatever's recorded in /etc/banger/install.toml. When
+// the installed copy and the running binary disagree on version or
+// commit, doctor warns: a stale `banger` running against a freshly-
+// installed daemon (or vice versa) is the most common version-skew
+// pitfall, and a one-line warning is friendlier than tracking down
+// which side is wrong from a launch failure.
+//
+// Drift detection is suppressed when the CLI is an untagged "dev"
+// build or the install metadata has no recorded version; commit
+// comparison is likewise skipped when either commit is unknown or
+// empty. Those builds have no real version to compare against.
+func addBangerVersionCheck(report *system.Report, installPath string) {
+	cli := buildinfo.Current()
+	cliLine := fmt.Sprintf("CLI %s (commit %s, built %s)", cli.Version, shortCommit(cli.Commit), cli.BuiltAt)
+
+	meta, err := installmeta.Load(installPath)
+	if err != nil {
+		// Non-system mode (no install.toml). Just report what we have.
+ report.AddPass("banger version", cliLine) + return + } + installLine := fmt.Sprintf("install %s (commit %s, installed %s)", meta.Version, shortCommit(meta.Commit), meta.InstalledAt.Format(time.RFC3339)) + if versionsDrift(cli, meta) { + report.AddWarn("banger version", + cliLine, + installLine, + "CLI and installed banger disagree; run `sudo banger system install` to refresh, or run the matching CLI binary") + return + } + report.AddPass("banger version", cliLine, installLine+" (matches CLI)") +} + +func versionsDrift(cli buildinfo.Info, meta installmeta.Metadata) bool { + // Treat dev/unknown as "no real version on this side" — comparing + // a dev build against a tagged install is the local-development + // case, not a drift problem worth surfacing. + if cli.Version == "dev" || strings.TrimSpace(meta.Version) == "" { + return false + } + if cli.Version != meta.Version { + return true + } + if cli.Commit != "unknown" && strings.TrimSpace(meta.Commit) != "" && cli.Commit != meta.Commit { + return true + } + return false +} + +func shortCommit(c string) string { + if len(c) > 8 { + return c[:8] + } + return c +} + +// addArchitectureCheck surfaces a hard-fail when banger is running on +// a non-amd64 host. Companion binaries are pinned to amd64 in the +// Makefile, the published kernel catalog ships only x86_64 images, and +// OCI import pulls linux/amd64 layers. Letting users discover this +// through cryptic downstream failures is worse than saying it up front. +func addArchitectureCheck(report *system.Report) { + if runtime.GOARCH == "amd64" { + report.AddPass("host architecture", "amd64") + return + } + report.AddFail( + "host architecture", + fmt.Sprintf("running on %s; banger today only supports amd64/x86_64 hosts", runtime.GOARCH), + "companion build, kernel catalog, and OCI import all assume linux/amd64", + ) +} + +// addVMDefaultsCheck surfaces the effective VM sizing that `vm run` / +// `vm create` will apply when the user omits the flags. 
Shown as a +// PASS check so it always renders, with per-field provenance +// (config|auto|builtin) so users can tell what's driving each number. +func (d *Daemon) addVMDefaultsCheck(report *system.Report) { + host, err := system.ReadHostResources() + var cpus int + var memBytes int64 + if err == nil { + cpus = host.CPUCount + memBytes = host.TotalMemoryBytes + } + defaults := model.ResolveVMDefaults(d.config.VMDefaults, cpus, memBytes) + details := []string{ + fmt.Sprintf("vcpu: %d (%s)", defaults.VCPUCount, defaults.VCPUSource), + fmt.Sprintf("memory: %d MiB (%s)", defaults.MemoryMiB, defaults.MemorySource), + fmt.Sprintf("disk: %s (%s)", model.FormatSizeBytes(defaults.WorkDiskSizeBytes), defaults.WorkDiskSource), + "override any of these in ~/.config/banger/config.toml under [vm_defaults]", + } + report.AddPass("vm defaults", details...) +} + func (d *Daemon) runtimeChecks() *system.Preflight { checks := system.NewPreflight() - checks.RequireExecutable(d.config.FirecrackerBin, "firecracker binary", `install firecracker or set "firecracker_bin"`) + // Firecracker presence + version is a separate top-level check (see + // addFirecrackerVersionCheck) so the report can carry a distro-aware + // install hint when the binary is missing — RequireExecutable's + // static `hint` string can't do that. 
checks.RequireFile(d.config.SSHKeyPath, "ssh private key", `set "ssh_key_path" or let banger create its default key`) - if helper, err := d.vsockAgentBinary(); err == nil { + if helper, err := vsockAgentBinary(d.layout); err == nil { checks.RequireExecutable(helper, "vsock agent helper", `run 'make build' or reinstall banger`) } else { checks.Addf("%v", err) } if d.store != nil && strings.TrimSpace(d.config.DefaultImageName) != "" { - image, err := d.store.GetImageByName(context.Background(), d.config.DefaultImageName) - switch { - case err == nil: + name := d.config.DefaultImageName + image, err := d.store.GetImageByName(context.Background(), name) + if err == nil { checks.RequireFile(image.RootfsPath, "default image rootfs", `re-register or rebuild the default image`) checks.RequireFile(image.KernelPath, "default image kernel", `re-register or rebuild the default image`) if strings.TrimSpace(image.InitrdPath) != "" { checks.RequireFile(image.InitrdPath, "default image initrd", `re-register or rebuild the default image`) } - case err != nil && err != sql.ErrNoRows: - checks.Addf("failed to inspect default image %q: %v", d.config.DefaultImageName, err) - default: - checks.Addf("default image %q is not registered", d.config.DefaultImageName) + } else if !defaultImageInCatalog(name) { + checks.Addf("default image %q is not registered and not in the imagecat catalog", name) } + // If the default image isn't local but is cataloged, vm create + // will auto-pull it on first use — no error to surface. 
} return checks } +func defaultImageInCatalog(name string) bool { + catalog, err := imagecat.LoadEmbedded() + if err != nil { + return false + } + _, err = catalog.Lookup(name) + return err == nil +} + func (d *Daemon) coreVMLifecycleChecks() *system.Preflight { checks := system.NewPreflight() - d.addBaseStartCommandPrereqs(checks) - return checks -} - -func (d *Daemon) imageBuildChecks(ctx context.Context) *system.Preflight { - checks := system.NewPreflight() - if d.store == nil || strings.TrimSpace(d.config.DefaultImageName) == "" { - checks.Addf("default image is not available for build inheritance") - return checks - } - image, err := d.store.GetImageByName(ctx, d.config.DefaultImageName) - if err != nil { - checks.Addf("default image %q is not registered", d.config.DefaultImageName) - return checks - } - d.addImageBuildPrereqs(ctx, checks, image.RootfsPath, image.KernelPath, image.InitrdPath, image.ModulesDir, "") + d.vm.addBaseStartCommandPrereqs(checks) return checks } func (d *Daemon) vsockChecks() *system.Preflight { checks := system.NewPreflight() - if helper, err := d.vsockAgentBinary(); err == nil { + if helper, err := vsockAgentBinary(d.layout); err == nil { checks.RequireExecutable(helper, "vsock agent helper", `run 'make build' or reinstall banger`) } else { checks.Addf("%v", err) } - checks.RequireFile(vsockHostDevicePath, "vsock host device", "load the vhost_vsock kernel module on the host") + checks.RequireFile(d.vm.vsockHostDevice, "vsock host device", "load the vhost_vsock kernel module on the host") return checks } diff --git a/internal/daemon/doctor_test.go b/internal/daemon/doctor_test.go new file mode 100644 index 0000000..37f766c --- /dev/null +++ b/internal/daemon/doctor_test.go @@ -0,0 +1,590 @@ +package daemon + +import ( + "context" + "errors" + "os" + "path/filepath" + "strings" + "testing" + + "banger/internal/buildinfo" + "banger/internal/firecracker" + "banger/internal/installmeta" + "banger/internal/model" + "banger/internal/paths" 
+ "banger/internal/system" +) + +// permissiveRunner satisfies system.CommandRunner by returning a +// configurable response for every call. Doctor tests don't care about +// the exact ip/iptables commands run — they care that the aggregated +// report surfaces each feature check correctly, so a one-size runner +// keeps the test prelude short. +type permissiveRunner struct { + out []byte + err error +} + +func (r *permissiveRunner) Run(_ context.Context, _ string, _ ...string) ([]byte, error) { + return r.out, r.err +} + +func (r *permissiveRunner) RunSudo(_ context.Context, _ ...string) ([]byte, error) { + return r.out, r.err +} + +// buildDoctorDaemon stands up a Daemon the way doctorReport expects: +// fake PATH with every tool the preflights look for, fake firecracker +// + vsock companion binaries, fake vsock host device file, and a +// permissive runner that claims a default-route via eth0 so NAT's +// defaultUplink call succeeds. Returns the wired *Daemon. +func buildDoctorDaemon(t *testing.T) *Daemon { + t.Helper() + binDir := t.TempDir() + for _, name := range []string{ + "sudo", "ip", "dmsetup", "losetup", "blockdev", "truncate", "pgrep", + "chown", "chmod", "kill", "e2cp", "e2rm", "debugfs", + "iptables", "sysctl", "mkfs.ext4", "mount", "umount", "cp", + } { + writeFakeExecutable(t, filepath.Join(binDir, name)) + } + t.Setenv("PATH", binDir) + + firecrackerBin := filepath.Join(t.TempDir(), "firecracker") + if err := os.WriteFile(firecrackerBin, []byte("#!/bin/sh\nexit 0\n"), 0o755); err != nil { + t.Fatalf("write firecracker: %v", err) + } + vsockHelper := filepath.Join(t.TempDir(), "banger-vsock-agent") + if err := os.WriteFile(vsockHelper, []byte("#!/bin/sh\nexit 0\n"), 0o755); err != nil { + t.Fatalf("write vsock helper: %v", err) + } + t.Setenv("BANGER_VSOCK_AGENT_BIN", vsockHelper) + + sshKey := filepath.Join(t.TempDir(), "id_ed25519") + if err := os.WriteFile(sshKey, []byte("unused"), 0o600); err != nil { + t.Fatalf("write ssh key: %v", err) + } + 
+ vsockHostDevice := filepath.Join(t.TempDir(), "vhost-vsock") + if err := os.WriteFile(vsockHostDevice, []byte{}, 0o644); err != nil { + t.Fatalf("write vsock host device: %v", err) + } + + runner := &permissiveRunner{out: []byte("default via 10.0.0.1 dev eth0 proto static\n")} + + d := &Daemon{ + layout: paths.Layout{ + ConfigDir: t.TempDir(), + StateDir: t.TempDir(), + DBPath: filepath.Join(t.TempDir(), "state.db"), + }, + config: model.DaemonConfig{ + FirecrackerBin: firecrackerBin, + SSHKeyPath: sshKey, + BridgeName: model.DefaultBridgeName, + BridgeIP: model.DefaultBridgeIP, + StatsPollInterval: model.DefaultStatsPollInterval, + }, + runner: runner, + } + wireServices(d) + d.vm.vsockHostDevice = vsockHostDevice + // HostNetwork defaults its own runner to the one on the struct, but + // wireServices only copies the Daemon's runner if d.net is nil + // before that call — in this test we constructed d.net implicitly, + // so belt-and-braces the permissive runner onto HostNetwork too. + d.net.runner = runner + return d +} + +// findCheck returns the first CheckResult with the given name, or nil +// if no such check was emitted. The test helper rather than a method +// on Report so the field scope stays tight. +func findCheck(report system.Report, name string) *system.CheckResult { + for i := range report.Checks { + if report.Checks[i].Name == name { + return &report.Checks[i] + } + } + return nil +} + +// TestDoctorReport_NonSystemModeEmitsSecurityWarn pins the non- +// system-mode branch: when install.toml is absent the security +// posture check must surface a warn that points at the dev-mode +// caveat in docs/privileges.md. A pass row in this mode would +// imply guarantees the install isn't actually providing. Drives +// the seam variant so the test is independent of whether the host +// happens to have /etc/banger/install.toml. 
+func TestDoctorReport_NonSystemModeEmitsSecurityWarn(t *testing.T) { + d := buildDoctorDaemon(t) + report := system.Report{} + missingInstall := filepath.Join(t.TempDir(), "install.toml") + d.addSecurityPostureChecksAt(context.Background(), &report, missingInstall, t.TempDir()) + + check := findCheck(report, "security posture") + if check == nil { + t.Fatal("security posture check missing from report") + } + if check.Status != system.CheckStatusWarn { + t.Fatalf("security posture status = %q, want warn", check.Status) + } + joined := strings.Join(check.Details, " ") + if !strings.Contains(joined, "outside the system install") { + t.Fatalf("warn details = %q, want mention of non-system mode", joined) + } + if !strings.Contains(joined, "docs/privileges.md") { + t.Fatalf("warn details = %q, want pointer to docs/privileges.md", joined) + } +} + +func TestAddSocketPermsCheckRejectsWrongMode(t *testing.T) { + socketPath := filepath.Join(t.TempDir(), "fake.sock") + if err := os.WriteFile(socketPath, []byte{}, 0o644); err != nil { + t.Fatalf("write fake socket: %v", err) + } + report := system.Report{} + addSocketPermsCheck(&report, "test socket", socketPath, os.Getuid(), 0o600) + check := findCheck(report, "test socket") + if check == nil { + t.Fatal("expected test socket check") + } + if check.Status != system.CheckStatusFail { + t.Fatalf("status = %q, want fail when mode is 0644 vs 0600 expected", check.Status) + } + joined := strings.Join(check.Details, " ") + if !strings.Contains(joined, "mode is") { + t.Fatalf("details = %q, want mode-mismatch message", joined) + } +} + +func TestAddSocketPermsCheckPassesWhenModeAndOwnerMatch(t *testing.T) { + socketPath := filepath.Join(t.TempDir(), "fake.sock") + if err := os.WriteFile(socketPath, []byte{}, 0o600); err != nil { + t.Fatalf("write fake socket: %v", err) + } + report := system.Report{} + addSocketPermsCheck(&report, "test socket", socketPath, os.Getuid(), 0o600) + check := findCheck(report, "test socket") + if check 
== nil { + t.Fatal("expected test socket check") + } + if check.Status != system.CheckStatusPass { + t.Fatalf("status = %q, want pass when mode + uid match; details = %v", check.Status, check.Details) + } +} + +func TestAddUnitHardeningCheckFlagsMissingDirective(t *testing.T) { + unitPath := filepath.Join(t.TempDir(), "bangerd.service") + if err := os.WriteFile(unitPath, []byte("[Service]\nUser=alice\nProtectSystem=strict\n"), 0o644); err != nil { + t.Fatalf("write unit: %v", err) + } + report := system.Report{} + addUnitHardeningCheck(&report, "unit hardening", unitPath, []string{"User=alice", "NoNewPrivileges=yes", "ProtectSystem=strict"}) + check := findCheck(report, "unit hardening") + if check == nil { + t.Fatal("expected unit hardening check") + } + if check.Status != system.CheckStatusFail { + t.Fatalf("status = %q, want fail when NoNewPrivileges is missing", check.Status) + } + joined := strings.Join(check.Details, " ") + if !strings.Contains(joined, "NoNewPrivileges=yes") { + t.Fatalf("details = %q, want mention of the missing directive", joined) + } +} + +func TestAddUnitHardeningCheckPassesWhenAllPresent(t *testing.T) { + unitPath := filepath.Join(t.TempDir(), "bangerd-root.service") + body := "[Service]\nNoNewPrivileges=yes\nProtectSystem=strict\nProtectHome=yes\nCapabilityBoundingSet=CAP_CHOWN\n" + if err := os.WriteFile(unitPath, []byte(body), 0o644); err != nil { + t.Fatalf("write unit: %v", err) + } + report := system.Report{} + addUnitHardeningCheck(&report, "unit hardening", unitPath, []string{"NoNewPrivileges=yes", "ProtectSystem=strict", "CapabilityBoundingSet="}) + check := findCheck(report, "unit hardening") + if check == nil { + t.Fatal("expected unit hardening check") + } + if check.Status != system.CheckStatusPass { + t.Fatalf("status = %q, want pass when every directive is present; details = %v", check.Status, check.Details) + } +} + +func TestAddExecutableOwnershipCheckRejectsSymlink(t *testing.T) { + dir := t.TempDir() + real := 
filepath.Join(dir, "fc")
+	if err := os.WriteFile(real, []byte{}, 0o755); err != nil {
+		t.Fatalf("write fc: %v", err)
+	}
+	link := filepath.Join(dir, "fc-symlink")
+	if err := os.Symlink(real, link); err != nil {
+		t.Fatalf("symlink: %v", err)
+	}
+	report := system.Report{}
+	addExecutableOwnershipCheck(&report, "fc binary", link)
+	check := findCheck(report, "fc binary")
+	if check == nil {
+		t.Fatal("expected fc binary check")
+	}
+	if check.Status != system.CheckStatusFail {
+		t.Fatalf("status = %q, want fail for symlinked binary", check.Status)
+	}
+	joined := strings.Join(check.Details, " ")
+	if !strings.Contains(joined, "symlink") {
+		t.Fatalf("details = %q, want symlink rejection message", joined)
+	}
+}
+
+func TestAddExecutableOwnershipCheckRejectsGroupWritable(t *testing.T) {
+	if os.Getuid() == 0 {
+		t.Skip("test runs as root; can't construct a non-root-owned check target meaningfully")
+	}
+	path := filepath.Join(t.TempDir(), "fc")
+	if err := os.WriteFile(path, []byte{}, 0o775); err != nil {
+		t.Fatalf("write fc: %v", err)
+	}
+	// os.WriteFile's mode argument is masked by the process umask (a
+	// typical 022 strips the group-write bit, which would make the
+	// check pass), so chmod explicitly to guarantee the 0o775 we
+	// assert on.
+	if err := os.Chmod(path, 0o775); err != nil {
+		t.Fatalf("chmod fc: %v", err)
+	}
+	report := system.Report{}
+	addExecutableOwnershipCheck(&report, "fc binary", path)
+	check := findCheck(report, "fc binary")
+	if check == nil {
+		t.Fatal("expected fc binary check")
+	}
+	if check.Status != system.CheckStatusFail {
+		t.Fatalf("status = %q, want fail when binary is group/world writable", check.Status)
+	}
+}
+
+// TestDoctorReport_SystemModeRunsAllSecurityChecks pins the system-mode
+// branch end-to-end: with a fake install.toml + fake systemd dir it
+// must contribute every security row (services, sockets, unit
+// hardening, fc ownership). Statuses themselves vary because we can't
+// easily fake root-owned files in a test, but every check name must
+// appear so a future refactor can't silently drop one.
+func TestDoctorReport_SystemModeRunsAllSecurityChecks(t *testing.T) { + d := buildDoctorDaemon(t) + + installDir := t.TempDir() + installPath := filepath.Join(installDir, "install.toml") + if err := os.WriteFile(installPath, []byte("owner_user = \"alice\"\nowner_uid = 1000\nowner_gid = 1000\nowner_home = \"/home/alice\"\ninstalled_at = 2026-04-28T00:00:00Z\n"), 0o644); err != nil { + t.Fatalf("write install.toml: %v", err) + } + systemdDir := t.TempDir() + for _, svc := range []string{"bangerd.service", "bangerd-root.service"} { + if err := os.WriteFile(filepath.Join(systemdDir, svc), []byte(""), 0o644); err != nil { + t.Fatalf("write fake unit %s: %v", svc, err) + } + } + + report := system.Report{} + d.addSecurityPostureChecksAt(context.Background(), &report, installPath, systemdDir) + + for _, name := range []string{ + "helper service", + "owner daemon service", + "helper socket", + "daemon socket", + "helper unit hardening", + "daemon unit hardening", + "firecracker binary ownership", + } { + if findCheck(report, name) == nil { + t.Errorf("system-mode security check %q missing from report", name) + } + } + if findCheck(report, "security posture") != nil { + t.Error("system mode should NOT emit the non-system-mode warn") + } +} + +func TestDoctorReport_StoreErrorSurfacesAsFail(t *testing.T) { + d := buildDoctorDaemon(t) + report := d.doctorReport(context.Background(), errors.New("simulated open failure"), false) + + check := findCheck(report, "state store") + if check == nil { + t.Fatal("state store check missing from report") + } + if check.Status != system.CheckStatusFail { + t.Fatalf("state store status = %q, want fail (store error should surface)", check.Status) + } + joined := strings.Join(check.Details, " ") + if !strings.Contains(joined, "simulated open failure") { + t.Fatalf("state store details = %q, want the storeErr message", joined) + } +} + +func TestDoctorReport_StoreMissingSurfacesAsPassForFreshInstall(t *testing.T) { + d := buildDoctorDaemon(t) 
+ // Fresh install: the DB file simply doesn't exist yet. doctor must + // not treat that as a failure — nothing's broken, the first daemon + // start will create the file. The status message should say so, + // so a user running `banger doctor` before ever booting a VM + // doesn't see a scary red check. + report := d.doctorReport(context.Background(), nil, true) + + check := findCheck(report, "state store") + if check == nil { + t.Fatal("state store check missing from report") + } + if check.Status != system.CheckStatusPass { + t.Fatalf("state store status = %q, want pass for a missing DB on fresh install", check.Status) + } + joined := strings.Join(check.Details, " ") + if !strings.Contains(joined, "will be created") { + t.Fatalf("state store details = %q, want mention of 'will be created' so users know this is expected", joined) + } +} + +func TestDoctorReport_StoreSuccessSurfacesAsPass(t *testing.T) { + d := buildDoctorDaemon(t) + report := d.doctorReport(context.Background(), nil, false) + + check := findCheck(report, "state store") + if check == nil { + t.Fatal("state store check missing from report") + } + if check.Status != system.CheckStatusPass { + t.Fatalf("state store status = %q, want pass", check.Status) + } +} + +func TestDoctorReport_MissingFirecrackerFailsFirecrackerBinaryCheck(t *testing.T) { + d := buildDoctorDaemon(t) + // Point at a nonexistent path. Note: the doctor's PATH lookup + // looks for the basename, so use an absolute non-existent path + // (that's the configured-path branch — bare-name lookups would + // fall through to the test-fixture binDir which DOES contain a + // fake `firecracker`). 
+ d.config.FirecrackerBin = filepath.Join(t.TempDir(), "does-not-exist") + + report := d.doctorReport(context.Background(), nil, false) + check := findCheck(report, "firecracker binary") + if check == nil { + t.Fatal("firecracker binary check missing from report") + } + if check.Status != system.CheckStatusFail { + t.Fatalf("firecracker binary status = %q, want fail when binary missing", check.Status) + } + joined := strings.Join(check.Details, " ") + if !strings.Contains(joined, "firecracker-microvm/firecracker/releases") { + t.Fatalf("missing-binary report should include the upstream URL; got %q", joined) + } +} + +// TestVersionsDriftToleratesDevAndUnknown pins the suppression +// branches: a "dev"/"unknown" build on either side is the local- +// development case, not a drift problem; we don't want every +// developer-machine doctor run to emit a noisy warn. +func TestVersionsDriftToleratesDevAndUnknown(t *testing.T) { + t.Parallel() + cliReleased := buildinfo.Info{Version: "0.1.0", Commit: "abcd1234efgh", BuiltAt: "2026-04-28"} + metaReleased := installmeta.Metadata{Version: "0.1.0", Commit: "abcd1234efgh"} + + // Match → no drift. + if versionsDrift(cliReleased, metaReleased) { + t.Fatal("identical CLI and install metadata reported as drifted") + } + // Real version mismatch → drift. + bumped := metaReleased + bumped.Version = "0.2.0" + if !versionsDrift(cliReleased, bumped) { + t.Fatal("differing version not flagged as drift") + } + // Same version, different commit → drift (rebuilt without retag). + differCommit := metaReleased + differCommit.Commit = "deadbeefdead" + if !versionsDrift(cliReleased, differCommit) { + t.Fatal("differing commit at same version not flagged as drift") + } + // "dev" CLI vs released install → suppressed. 
+ devCLI := buildinfo.Info{Version: "dev", Commit: "f00fb00b", BuiltAt: "now"} + if versionsDrift(devCLI, metaReleased) { + t.Fatal("dev CLI vs released install incorrectly flagged as drift") + } + // Empty install version → suppressed (predates the field). + emptyMeta := installmeta.Metadata{} + if versionsDrift(cliReleased, emptyMeta) { + t.Fatal("empty install metadata incorrectly flagged as drift") + } +} + +// TestFirecrackerInstallHintDispatchesByDistro pins the per-distro +// install command guess. Pinned IDs are the ones banger is willing to +// suggest a concrete command for; everything else gets only the +// upstream URL. +func TestFirecrackerInstallHintDispatchesByDistro(t *testing.T) { + t.Parallel() + for _, tc := range []struct { + name string + release string + wantSub string + wantNone bool + }{ + {name: "debian", release: "ID=debian\nVERSION_CODENAME=bookworm\n", wantSub: "apt install firecracker"}, + {name: "ubuntu_id_like_debian", release: "ID=ubuntu\nID_LIKE=debian\n", wantSub: "apt install firecracker"}, + {name: "arch", release: "ID=arch\n", wantSub: "paru -S firecracker"}, + {name: "manjaro_via_id_like", release: "ID=manjaro\nID_LIKE=arch\n", wantSub: "paru -S firecracker"}, + {name: "nixos", release: "ID=nixos\n", wantSub: "nixos.firecracker"}, + {name: "fedora_falls_back_to_url", release: "ID=fedora\n", wantNone: true}, + {name: "missing_file", release: "", wantNone: true}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + osPath := filepath.Join(t.TempDir(), "os-release") + if tc.release != "" { + if err := os.WriteFile(osPath, []byte(tc.release), 0o644); err != nil { + t.Fatalf("write os-release: %v", err) + } + } + hints := firecrackerInstallHint(osPath) + joined := strings.Join(hints, " ") + if !strings.Contains(joined, "firecracker-microvm/firecracker/releases") { + t.Fatalf("hints missing upstream URL; got %q", joined) + } + if tc.wantNone { + // Distro-specific hint must NOT be present — only the URL. 
+			if len(hints) != 1 {
+				t.Fatalf("unrecognised distro got distro-specific hint(s); want only the URL line, got %v", hints)
+			}
+				return
+			}
+			if !strings.Contains(joined, tc.wantSub) {
+				t.Fatalf("hints %q do not contain expected substring %q", joined, tc.wantSub)
+			}
+			if len(hints) < 2 {
+				t.Fatalf("expected distro hint + URL; got only %v", hints)
+			}
+		})
+	}
+}
+
+// firecrackerVersionRunner is a CommandRunner that actually executes
+// firecracker --version (via system.Runner) but stubs everything else
+// with the permissive default. The doctor uses d.runner for the
+// firecracker version query AND for several other checks; this tiny
+// dispatcher lets us run a real script for one command without
+// rewiring the rest.
+type firecrackerVersionRunner struct {
+	real   system.Runner
+	canned []byte
+	bin    string
+}
+
+func (r *firecrackerVersionRunner) Run(ctx context.Context, name string, args ...string) ([]byte, error) {
+	if name == r.bin {
+		return r.real.Run(ctx, name, args...)
+	}
+	return r.canned, nil
+}
+
+func (r *firecrackerVersionRunner) RunSudo(_ context.Context, _ ...string) ([]byte, error) {
+	return r.canned, nil
+}
+
+// stubFirecrackerVersion replaces the test daemon's firecracker
+// stub with a script that prints the requested version line, then
+// swaps d.runner for one that actually executes the script when the
+// firecracker path is queried. It mutates d in place, leaving the
+// daemon ready for doctorReport.
+func stubFirecrackerVersion(t *testing.T, d *Daemon, version string) { + t.Helper() + if err := os.WriteFile(d.config.FirecrackerBin, []byte("#!/bin/sh\necho 'Firecracker v"+version+"'\n"), 0o755); err != nil { + t.Fatalf("write firecracker stub: %v", err) + } + d.runner = &firecrackerVersionRunner{ + real: system.NewRunner(), + canned: []byte("default via 10.0.0.1 dev eth0 proto static\n"), + bin: d.config.FirecrackerBin, + } +} + +// TestFirecrackerVersionCheckPasses pins the happy path: when the +// configured firecracker reports a tested-range version, doctor +// emits a PASS row. +func TestFirecrackerVersionCheckPasses(t *testing.T) { + d := buildDoctorDaemon(t) + stubFirecrackerVersion(t, d, firecracker.KnownTestedVersion) + report := d.doctorReport(context.Background(), nil, false) + check := findCheck(report, "firecracker binary") + if check == nil { + t.Fatal("firecracker binary check missing from report") + } + if check.Status != system.CheckStatusPass { + t.Fatalf("status = %q, want pass; details=%v", check.Status, check.Details) + } +} + +// TestFirecrackerVersionCheckFailsBelowMin pins the too-old path: +// a binary reporting a version below MinSupportedVersion must FAIL +// with the upgrade hint. +func TestFirecrackerVersionCheckFailsBelowMin(t *testing.T) { + d := buildDoctorDaemon(t) + stubFirecrackerVersion(t, d, "0.25.0") + report := d.doctorReport(context.Background(), nil, false) + check := findCheck(report, "firecracker binary") + if check == nil { + t.Fatal("firecracker binary check missing from report") + } + if check.Status != system.CheckStatusFail { + t.Fatalf("status = %q, want fail for below-min version", check.Status) + } +} + +// TestFirecrackerVersionCheckWarnsAboveTested pins the over-tested +// path: a binary reporting a version newer than KnownTestedVersion +// must WARN — newer firecracker usually works, but it's outside the +// tested window. 
+func TestFirecrackerVersionCheckWarnsAboveTested(t *testing.T) { + d := buildDoctorDaemon(t) + stubFirecrackerVersion(t, d, "99.0.0") + report := d.doctorReport(context.Background(), nil, false) + check := findCheck(report, "firecracker binary") + if check == nil { + t.Fatal("firecracker binary check missing from report") + } + if check.Status != system.CheckStatusWarn { + t.Fatalf("status = %q, want warn for above-tested version", check.Status) + } +} + +func TestDoctorReport_IncludesEveryDefaultCapability(t *testing.T) { + d := buildDoctorDaemon(t) + report := d.doctorReport(context.Background(), nil, false) + + // Every registered capability that implements doctorCapability must + // contribute a check. Current defaults: work-disk, dns, nat. If a + // capability is added later it should either extend this list or + // register its own check name — either way, the assertion makes + // the contract visible. + for _, name := range []string{ + "feature /root work disk", + "feature vm dns", + "feature nat", + } { + if findCheck(report, name) == nil { + t.Errorf("capability check %q missing from report", name) + } + } +} + +func TestDoctorReport_EmitsVMDefaultsProvenance(t *testing.T) { + d := buildDoctorDaemon(t) + report := d.doctorReport(context.Background(), nil, false) + + check := findCheck(report, "vm defaults") + if check == nil { + t.Fatal("vm defaults check missing from report") + } + if check.Status != system.CheckStatusPass { + t.Fatalf("vm defaults status = %q, want pass (this is an always-pass informational check)", check.Status) + } + joined := strings.Join(check.Details, "\n") + for _, needle := range []string{"vcpu:", "memory:", "disk:"} { + if !strings.Contains(joined, needle) { + t.Errorf("vm defaults details missing %q; got:\n%s", needle, joined) + } + } +} diff --git a/internal/daemon/fake_firecracker_test.go b/internal/daemon/fake_firecracker_test.go new file mode 100644 index 0000000..2ad1555 --- /dev/null +++ 
b/internal/daemon/fake_firecracker_test.go
@@ -0,0 +1,26 @@
+package daemon
+
+import (
+	"fmt"
+	"os/exec"
+	"testing"
+)
+
+// startFakeFirecracker launches a bash sleep-loop with its argv[0]
+// rewritten (via `exec -a`) to match the firecracker command line a
+// real process would expose, so reconcile / handle-cache paths that
+// grep /proc/<pid>/cmdline accept it as a firecracker process.
+// Killed on test cleanup.
+func startFakeFirecracker(t *testing.T, apiSock string) *exec.Cmd {
+	t.Helper()
+	cmd := exec.Command("bash", "-lc", fmt.Sprintf("exec -a %q sleep 60", "firecracker --api-sock "+apiSock))
+	if err := cmd.Start(); err != nil {
+		t.Fatalf("start fake firecracker: %v", err)
+	}
+	t.Cleanup(func() {
+		if cmd.Process != nil {
+			_ = cmd.Process.Kill()
+			_, _ = cmd.Process.Wait()
+		}
+	})
+	return cmd
+}
diff --git a/internal/daemon/fastpath_test.go b/internal/daemon/fastpath_test.go
index aeafe7e..a68272f 100644
--- a/internal/daemon/fastpath_test.go
+++ b/internal/daemon/fastpath_test.go
@@ -16,43 +16,6 @@ import (
 	"banger/internal/model"
 )
 
-func TestEnsureWorkDiskClonesSeedImageAndResizes(t *testing.T) {
-	t.Parallel()
-
-	vmDir := t.TempDir()
-	seedPath := filepath.Join(t.TempDir(), "root.work-seed.ext4")
-	if err := os.WriteFile(seedPath, []byte("seed-data"), 0o644); err != nil {
-		t.Fatalf("WriteFile(seed): %v", err)
-	}
-	workDiskPath := filepath.Join(vmDir, "root.ext4")
-	runner := &scriptedRunner{
-		t: t,
-		steps: []runnerStep{
-			{call: runnerCall{name: "e2fsck", args: []string{"-p", "-f", workDiskPath}}},
-			{call: runnerCall{name: "resize2fs", args: []string{workDiskPath}}},
-		},
-	}
-	d := &Daemon{runner: runner}
-	vm := testVM("seeded", "image-seeded", "172.16.0.60")
-	vm.Runtime.WorkDiskPath = workDiskPath
-	vm.Spec.WorkDiskSizeBytes = 2 * 1024 * 1024
-	image := testImage("image-seeded")
-	image.WorkSeedPath = seedPath
-
-	if _, err := d.ensureWorkDisk(context.Background(), &vm, image); err != nil {
-		t.Fatalf("ensureWorkDisk: %v", err)
-	}
-	runner.assertExhausted()
-
-	info,
err := os.Stat(workDiskPath) - if err != nil { - t.Fatalf("Stat(work disk): %v", err) - } - if info.Size() != vm.Spec.WorkDiskSizeBytes { - t.Fatalf("work disk size = %d, want %d", info.Size(), vm.Spec.WorkDiskSizeBytes) - } -} - func TestTapPoolWarmsAndReusesIdleTap(t *testing.T) { t.Parallel() @@ -74,19 +37,20 @@ func TestTapPoolWarmsAndReusesIdleTap(t *testing.T) { }, closing: make(chan struct{}), } + wireServices(d) - d.ensureTapPool(context.Background()) - tapName, err := d.acquireTap(context.Background(), "tap-fallback") + d.net.ensureTapPool(context.Background()) + tapName, err := d.net.acquireTap(context.Background(), "tap-fallback") if err != nil { t.Fatalf("acquireTap: %v", err) } if tapName != "tap-pool-0" { t.Fatalf("tapName = %q, want tap-pool-0", tapName) } - if err := d.releaseTap(context.Background(), tapName); err != nil { + if err := d.net.releaseTap(context.Background(), tapName); err != nil { t.Fatalf("releaseTap: %v", err) } - tapName, err = d.acquireTap(context.Background(), "tap-fallback") + tapName, err = d.net.acquireTap(context.Background(), "tap-fallback") if err != nil { t.Fatalf("acquireTap second time: %v", err) } @@ -121,11 +85,12 @@ func TestEnsureAuthorizedKeyOnWorkDiskSkipsRepairForMatchingSeededFingerprint(t runner: runner, config: model.DaemonConfig{SSHKeyPath: sshKeyPath}, } + wireServices(d) vm := testVM("seeded-fastpath", "image-seeded-fastpath", "172.16.0.62") vm.Runtime.WorkDiskPath = filepath.Join(t.TempDir(), "root.ext4") image := model.Image{SeededSSHPublicKeyFingerprint: fingerprint} - if err := d.ensureAuthorizedKeyOnWorkDisk(context.Background(), &vm, image, workDiskPreparation{ClonedFromSeed: true}); err != nil { + if err := d.ws.ensureAuthorizedKeyOnWorkDisk(context.Background(), &vm, image, workDiskPreparation{ClonedFromSeed: true}); err != nil { t.Fatalf("ensureAuthorizedKeyOnWorkDisk: %v", err) } runner.assertExhausted() diff --git a/internal/daemon/fcproc/fcproc.go b/internal/daemon/fcproc/fcproc.go new file mode 
100644 index 0000000..fd23402 --- /dev/null +++ b/internal/daemon/fcproc/fcproc.go @@ -0,0 +1,773 @@ +// Package fcproc owns the host-side process primitives needed to launch, +// inspect, and tear down Firecracker VMs: bridge/tap setup, binary +// resolution, socket permissions, PID lookup, graceful and forceful +// shutdown. Shared by the VM lifecycle and image build paths so neither +// needs to import the other. +package fcproc + +import ( + "context" + "errors" + "fmt" + "log/slog" + "os" + "path/filepath" + "sort" + "strconv" + "strings" + "sync" + "syscall" + "time" + + "golang.org/x/sys/unix" + + "banger/internal/firecracker" + "banger/internal/system" +) + +// errFirecrackerPIDNotFound is returned by findByJailerPidfile when the +// pidfile is missing, unreadable, or doesn't point at a live firecracker +// process. Surfaces to callers as a "this VM isn't running" signal, not +// as a hard failure. +var errFirecrackerPIDNotFound = errors.New("firecracker pid not found") + +// procDir is the kernel's per-process inspection directory. Var so tests +// can swap in a fake /proc-shaped fixture in t.TempDir(). +var procDir = "/proc" + +// ErrWaitForExitTimeout is returned by WaitForExit when the deadline passes +// before the process exits. Callers use errors.Is to detect it. +var ErrWaitForExitTimeout = errors.New("timed out waiting for VM to exit") + +// Runner is the command-runner surface fcproc needs. system.Runner satisfies +// it. +type Runner interface { + Run(ctx context.Context, name string, args ...string) ([]byte, error) + RunSudo(ctx context.Context, args ...string) ([]byte, error) +} + +// Config captures the host networking + runtime paths fcproc operations need. +type Config struct { + FirecrackerBin string + BridgeName string + BridgeIP string + CIDR string + RuntimeDir string +} + +// Manager owns the shared configuration + runner and exposes the per-process +// helpers. Stateless beyond its dependencies — safe to share. 
+type Manager struct {
+	runner Runner
+	cfg    Config
+	logger *slog.Logger
+}
+
+// New returns a Manager that issues commands through runner using cfg.
+func New(runner Runner, cfg Config, logger *slog.Logger) *Manager {
+	return &Manager{runner: runner, cfg: cfg, logger: logger}
+}
+
+// EnsureBridge makes sure the host bridge exists and is up.
+func (m *Manager) EnsureBridge(ctx context.Context) error {
+	if _, err := m.runner.Run(ctx, "ip", "link", "show", m.cfg.BridgeName); err == nil {
+		_, err = m.runner.RunSudo(ctx, "ip", "link", "set", m.cfg.BridgeName, "up")
+		return err
+	}
+	if _, err := m.runner.RunSudo(ctx, "ip", "link", "add", "name", m.cfg.BridgeName, "type", "bridge"); err != nil {
+		return err
+	}
+	if _, err := m.runner.RunSudo(ctx, "ip", "addr", "add", fmt.Sprintf("%s/%s", m.cfg.BridgeIP, m.cfg.CIDR), "dev", m.cfg.BridgeName); err != nil {
+		return err
+	}
+	_, err := m.runner.RunSudo(ctx, "ip", "link", "set", m.cfg.BridgeName, "up")
+	return err
+}
+
+// EnsureSocketDir creates the runtime socket directory: 0700 normally,
+// 0711 (owner-only plus world traverse, no listing) when running as
+// root. This is the directory the daemon socket, per-VM firecracker API
+// sockets, and vsock sockets all live inside, so its contents must not
+// be listable by other users.
+func (m *Manager) EnsureSocketDir() error {
+	mode := os.FileMode(0o700)
+	if os.Geteuid() == 0 {
+		mode = 0o711
+	}
+	if err := os.MkdirAll(m.cfg.RuntimeDir, mode); err != nil {
+		return err
+	}
+	return os.Chmod(m.cfg.RuntimeDir, mode)
+}
+
+// CreateTap (re)creates a TAP owned by the current uid/gid, attaches it to
+// the bridge, and brings both up.
+func (m *Manager) CreateTap(ctx context.Context, tap string) error {
+	return m.CreateTapOwned(ctx, tap, os.Getuid(), os.Getgid())
+}
+
+// CreateTapOwned (re)creates a TAP owned by uid:gid, attaches it to the
+// bridge, and brings both up.
+func (m *Manager) CreateTapOwned(ctx context.Context, tap string, uid, gid int) error { + if _, err := m.runner.Run(ctx, "ip", "link", "show", tap); err == nil { + _, _ = m.runner.RunSudo(ctx, "ip", "link", "del", tap) + } + if _, err := m.runner.RunSudo(ctx, "ip", "tuntap", "add", "dev", tap, "mode", "tap", "user", strconv.Itoa(uid), "group", strconv.Itoa(gid)); err != nil { + return err + } + if _, err := m.runner.RunSudo(ctx, "ip", "link", "set", tap, "master", m.cfg.BridgeName); err != nil { + return err + } + if _, err := m.runner.RunSudo(ctx, "ip", "link", "set", tap, "up"); err != nil { + return err + } + _, err := m.runner.RunSudo(ctx, "ip", "link", "set", m.cfg.BridgeName, "up") + return err +} + +// ResolveBinary returns the path to the firecracker binary: either an +// absolute path from config, or the first hit on PATH. +func (m *Manager) ResolveBinary() (string, error) { + if m.cfg.FirecrackerBin == "" { + return "", fmt.Errorf("firecracker binary not configured; install firecracker or set firecracker_bin") + } + path := m.cfg.FirecrackerBin + if strings.ContainsRune(path, os.PathSeparator) { + if _, err := os.Stat(path); err != nil { + return "", fmt.Errorf("firecracker binary not found at %s; install firecracker or set firecracker_bin", path) + } + return path, nil + } + resolved, err := system.LookupExecutable(path) + if err != nil { + return "", fmt.Errorf("firecracker binary %q not found in PATH; install firecracker or set firecracker_bin", path) + } + return resolved, nil +} + +// EnsureSocketAccess waits for the socket to appear then chowns/chmods it to +// the current uid/gid, mode 0600. +func (m *Manager) EnsureSocketAccess(ctx context.Context, socketPath, label string) error { + return m.EnsureSocketAccessFor(ctx, socketPath, label, os.Getuid(), os.Getgid()) +} + +// EnsureSocketAccessFor waits for the socket to appear then chowns/chmods it +// to uid:gid, mode 0600. 
+func (m *Manager) EnsureSocketAccessFor(ctx context.Context, socketPath, label string, uid, gid int) error { + return m.ensureSocketAccessFor(ctx, socketPath, label, uid, gid, 5*time.Second, 100*time.Millisecond) +} + +// EnsureSocketAccessForAsync runs EnsureSocketAccessFor concurrently for each +// non-empty path and returns a channel that receives a single error (nil on +// full success) once all per-path operations complete. Caller MUST receive on +// the channel to unblock the goroutine. +// +// Used during firecracker boot: the SDK's HTTP probe inside Machine.Start +// connects to the API socket the moment it appears. When firecracker is +// launched under sudo the socket is created root-owned, and the daemon's +// connect(2) gets EACCES until something chowns it. Running the chown +// concurrently with Start (instead of after Start returns, which deadlocks) +// closes the race without a shell-level chown_watcher. +// +// Uses a 25ms poll cadence (vs 100ms for the synchronous variant) to win +// against the SDK's tight HTTP retry loop. 
+func (m *Manager) EnsureSocketAccessForAsync(ctx context.Context, socketPaths []string, uid, gid int) <-chan error { + var clean []string + for _, p := range socketPaths { + if strings.TrimSpace(p) != "" { + clean = append(clean, p) + } + } + done := make(chan error, 1) + if len(clean) == 0 { + done <- nil + close(done) + return done + } + go func() { + defer close(done) + var wg sync.WaitGroup + errCh := make(chan error, len(clean)) + for _, p := range clean { + wg.Add(1) + go func(path string) { + defer wg.Done() + if err := m.ensureSocketAccessFor(ctx, path, "firecracker socket", uid, gid, 3*time.Second, 25*time.Millisecond); err != nil { + errCh <- err + } + }(p) + } + wg.Wait() + close(errCh) + for err := range errCh { + if err != nil { + done <- err + return + } + } + done <- nil + }() + return done +} + +func (m *Manager) ensureSocketAccessFor(ctx context.Context, socketPath, label string, uid, gid int, timeout, interval time.Duration) error { + if err := pollPath(ctx, socketPath, timeout, interval, label); err != nil { + return err + } + return chownChmodNoFollow(ctx, m.runner, socketPath, uid, gid, 0o600) +} + +// chownChmodNoFollow sets owner/group/mode on path without following +// symlinks at the leaf. Required because the helper RPCs that drive +// socket access run as root: a follow-symlink chmod/chown becomes an +// arbitrary file-ownership primitive if the caller can plant a symlink +// at the target. +// +// Linux idiom: open with O_PATH|O_NOFOLLOW (errors out if the leaf is a +// symlink), Fstat the fd to confirm the file is a unix socket, then +// chown via Fchownat(AT_EMPTY_PATH) and chmod via /proc/self/fd/N +// (fchmod on an O_PATH fd returns EBADF, but the /proc path resolves +// straight back to the inode the fd already pins, so no leaf re-traversal +// happens). 
+// +// Falls back to `sudo chown -h` + `sudo chmod` for the local-priv mode +// where the daemon isn't root and can't issue the syscalls itself; the +// `-h` flag still avoids the symlink-follow on the chown side. +func chownChmodNoFollow(ctx context.Context, runner Runner, path string, uid, gid int, mode os.FileMode) error { + if os.Geteuid() != 0 { + // Mode-then-owner ordering preserves the pre-existing failure + // semantics of the legacy `chmod 600 / chown` shell-out path + // (chmod-failure tests expect chown to be skipped). `chown -h` + // keeps the symlink-no-follow guarantee on this branch. + if _, err := runner.RunSudo(ctx, "chmod", fmt.Sprintf("%o", mode.Perm()), path); err != nil { + return err + } + _, err := runner.RunSudo(ctx, "chown", "-h", fmt.Sprintf("%d:%d", uid, gid), path) + return err + } + fd, err := unix.Open(path, unix.O_PATH|unix.O_NOFOLLOW|unix.O_CLOEXEC, 0) + if err != nil { + return fmt.Errorf("open %s: %w", path, err) + } + defer unix.Close(fd) + var st unix.Stat_t + if err := unix.Fstat(fd, &st); err != nil { + return fmt.Errorf("fstat %s: %w", path, err) + } + if st.Mode&unix.S_IFMT != unix.S_IFSOCK { + return fmt.Errorf("%s is not a unix socket (mode %#o)", path, st.Mode&unix.S_IFMT) + } + procPath := "/proc/self/fd/" + strconv.Itoa(fd) + if err := unix.Fchmodat(unix.AT_FDCWD, procPath, uint32(mode.Perm()), 0); err != nil { + return fmt.Errorf("chmod %s: %w", path, err) + } + if err := unix.Fchownat(fd, "", uid, gid, unix.AT_EMPTY_PATH); err != nil { + return fmt.Errorf("chown %s: %w", path, err) + } + return nil +} + +// FindPID returns the PID of the firecracker process backing apiSock. +// +// Two strategies, tried in order: +// +// 1. pgrep -n -f apiSock — cheap, works for direct (non-jailer) launches +// because the host-side socket path appears verbatim in firecracker's +// cmdline. +// 2. 
Jailer pidfile — for jailer'd firecrackers, pgrep can't match
+//     because the cmdline only carries the chroot-relative
+//     `--api-sock /firecracker.socket`. Jailer (v1.x) writes the
+//     post-exec firecracker PID to `<chroot>/firecracker.pid` by
+//     default. Read it; verify the PID is alive and its comm is
+//     `firecracker`. Caller must run with read access to the pidfile
+//     (root in the system-mode helper; daemon UID in dev mode where
+//     banger doesn't drop privs).
+//
+// This is what makes post-restart reconcile re-attach to surviving
+// guests instead of mistaking them for stale.
+func (m *Manager) FindPID(ctx context.Context, apiSock string) (int, error) {
+	if pid, err := m.findPIDByPgrep(ctx, apiSock); err == nil && pid > 0 {
+		return pid, nil
+	}
+	if pid, err := findByJailerPidfile(apiSock); err == nil && pid > 0 {
+		return pid, nil
+	}
+	return 0, errFirecrackerPIDNotFound
+}
+
+func (m *Manager) findPIDByPgrep(ctx context.Context, apiSock string) (int, error) {
+	out, err := m.runner.Run(ctx, "pgrep", "-n", "-f", apiSock)
+	if err != nil {
+		return 0, err
+	}
+	return strconv.Atoi(strings.TrimSpace(string(out)))
+}
+
+// findByJailerPidfile reads the jailer-written pidfile that lives at
+// `<chroot>/firecracker.pid` (sibling of the api socket inside the
+// chroot), verifies the PID is alive and its /proc/<pid>/comm is
+// `firecracker`, and returns it.
+//
+// Returns errFirecrackerPIDNotFound when the api-sock isn't a symlink
+// (direct launch — pidfile shape doesn't apply), the pidfile is
+// missing or unreadable (VM stopped, or caller lacks privileges),
+// the pidfile content is garbage, or the PID points at a process
+// that's gone or never was firecracker.
+func findByJailerPidfile(apiSock string) (int, error) { + target, err := os.Readlink(apiSock) + if err != nil { + return 0, errFirecrackerPIDNotFound + } + if !filepath.IsAbs(target) { + target = filepath.Join(filepath.Dir(apiSock), target) + } + pidPath := filepath.Join(filepath.Dir(target), "firecracker.pid") + pidBytes, err := os.ReadFile(pidPath) + if err != nil { + return 0, errFirecrackerPIDNotFound + } + pid, err := strconv.Atoi(strings.TrimSpace(string(pidBytes))) + if err != nil || pid <= 0 { + return 0, errFirecrackerPIDNotFound + } + commBytes, err := os.ReadFile(filepath.Join(procDir, strconv.Itoa(pid), "comm")) + if err != nil { + return 0, errFirecrackerPIDNotFound + } + if strings.TrimSpace(string(commBytes)) != "firecracker" { + return 0, errFirecrackerPIDNotFound + } + return pid, nil +} + +// ResolvePID prefers pgrep and falls back to the firecracker machine PID. +// Returns 0 if neither source yields a PID. +func (m *Manager) ResolvePID(ctx context.Context, machine *firecracker.Machine, apiSock string) int { + if pid, err := m.FindPID(ctx, apiSock); err == nil && pid > 0 { + return pid + } + if machine != nil { + if pid, err := machine.PID(); err == nil && pid > 0 { + return pid + } + } + return 0 +} + +// SendCtrlAltDel requests a graceful guest shutdown via the firecracker API +// socket. +func (m *Manager) SendCtrlAltDel(ctx context.Context, apiSock string) error { + if err := m.EnsureSocketAccess(ctx, apiSock, "firecracker api socket"); err != nil { + return err + } + client := firecracker.New(apiSock, m.logger) + return client.SendCtrlAltDel(ctx) +} + +// WaitForExit polls until the process is gone or the timeout fires. Returns +// ErrWaitForExitTimeout on timeout, ctx.Err() on cancellation. 
+func (m *Manager) WaitForExit(ctx context.Context, pid int, apiSock string, timeout time.Duration) error {
+	deadline := time.Now().Add(timeout)
+	for {
+		if !system.ProcessRunning(pid, apiSock) {
+			return nil
+		}
+		if time.Now().After(deadline) {
+			return ErrWaitForExitTimeout
+		}
+		select {
+		case <-ctx.Done():
+			return ctx.Err()
+		case <-time.After(100 * time.Millisecond):
+		}
+	}
+}
+
+// Kill sends SIGKILL to pid.
+func (m *Manager) Kill(ctx context.Context, pid int) error {
+	_, err := m.runner.RunSudo(ctx, "kill", "-KILL", strconv.Itoa(pid))
+	return err
+}
+
+// ChrootDriveSpec describes how a single drive should appear inside the
+// jailer chroot. HostPath is the host-side source (a regular file or a
+// /dev/mapper/* block device); ChrootName is the bare filename it should
+// be reachable as inside the chroot (e.g. "rootfs"). The DM block device
+// case is detected via os.Stat (S_IFBLK) — the helper mknods a matching
+// node; everything else is hard-linked.
+type ChrootDriveSpec struct {
+	ChrootName string
+	HostPath   string
+}
+
+// PrepareJailerChroot stages the chroot tree at chrootRoot for the jailer
+// to take over on launch. After this call:
+//
+//   - chrootRoot exists, mode 0700, owned by uid:gid.
+//   - chrootRoot/<kernelName> is a hard link of kernelHostPath, owned uid:gid.
+//   - chrootRoot/<initrdName> is a hard link of initrdHostPath if set.
+//   - For each drive: a hard link (regular file source) or a freshly
+//     mknod'd block device with the source's major/minor (DM source).
+//   - If wantVSock, /dev/vhost-vsock is mknod'd into the chroot so
+//     firecracker can open it after chroot.
+//
+// All filesystem mutations go through runner.RunSudo when the caller isn't
+// root, so this works in dev (sudo) and system (root helper) modes alike.
+// Path components are validated by the caller (roothelper) — this helper
+// trusts them.
+func (m *Manager) PrepareJailerChroot(ctx context.Context, chrootRoot string, uid, gid int, firecrackerHostPath, kernelHostPath, kernelName, initrdHostPath, initrdName string, drives []ChrootDriveSpec, wantVSock bool) error {
+	if strings.TrimSpace(chrootRoot) == "" {
+		return fmt.Errorf("chroot root is required")
+	}
+	if err := m.sudo(ctx, "mkdir", "-p", chrootRoot); err != nil {
+		return fmt.Errorf("create chroot root: %w", err)
+	}
+	if err := m.sudo(ctx, "chmod", "0700", chrootRoot); err != nil {
+		return fmt.Errorf("chmod chroot root: %w", err)
+	}
+	if err := m.chown(ctx, chrootRoot, uid, gid); err != nil {
+		return fmt.Errorf("chown chroot root: %w", err)
+	}
+	// The daemon (uid) needs to traverse the intermediate directories to reach
+	// the sockets firecracker creates inside the chroot. The per-VM dir
+	// (<jail>/firecracker/<id>/) is chowned to uid so the daemon can reach
+	// <id>/root/. The <jail>/firecracker/ base and <jail>/ dirs get
+	// world-execute (--x) so any UID can traverse through them without listing
+	// their contents (the per-VM dirs are still protected by their own mode).
+	vmDir := filepath.Dir(chrootRoot)
+	if err := m.chown(ctx, vmDir, uid, gid); err != nil {
+		return fmt.Errorf("chown vm dir: %w", err)
+	}
+	fcBaseDir := filepath.Dir(vmDir)
+	if err := m.sudo(ctx, "chmod", "0711", fcBaseDir); err != nil {
+		return fmt.Errorf("chmod firecracker base dir: %w", err)
+	}
+	jailBaseDir := filepath.Dir(fcBaseDir)
+	if err := m.sudo(ctx, "chmod", "0711", jailBaseDir); err != nil {
+		return fmt.Errorf("chmod jail base dir: %w", err)
+	}
+	// Order matters: hard-link the kernel + file-backed drives BEFORE
+	// the self-bind below. link(2) refuses to cross mount points even
+	// when the underlying superblock is the same — once chrootRoot is a
+	// mount point, `ln /var/lib/.../kernel <chroot>/vmlinux` returns
+	// EXDEV.
+ if err := m.linkInto(ctx, chrootRoot, kernelHostPath, kernelName, uid, gid); err != nil { + return fmt.Errorf("link kernel: %w", err) + } + if strings.TrimSpace(initrdHostPath) != "" { + if err := m.linkInto(ctx, chrootRoot, initrdHostPath, initrdName, uid, gid); err != nil { + return fmt.Errorf("link initrd: %w", err) + } + } + for _, d := range drives { + if err := m.stageDrive(ctx, chrootRoot, d, uid, gid); err != nil { + return fmt.Errorf("stage drive %s: %w", d.ChrootName, err) + } + } + if wantVSock { + // The jailer creates /dev inside the chroot, but /dev/vhost-vsock must + // be pre-staged so firecracker can open it after the jailer chroots. + devDir := chrootRoot + "/dev" + if err := m.sudo(ctx, "mkdir", "-p", devDir); err != nil { + return fmt.Errorf("create chroot/dev: %w", err) + } + if err := m.chown(ctx, devDir, uid, gid); err != nil { + return fmt.Errorf("chown chroot/dev: %w", err) + } + if err := m.stageDevice(ctx, chrootRoot, "dev/vhost-vsock", "/dev/vhost-vsock", uid, gid); err != nil { + return fmt.Errorf("stage vhost-vsock: %w", err) + } + } + // Bind firecracker + the host libdirs into the chroot read-only. + // firecracker is dynamically linked (interpreter /lib64/ld-linux-*, + // libc, libgcc), and inside the chroot ENOENT on those is reported + // as "Failed to exec into Firecracker: No such file or directory" — + // the kernel's misleading ENOENT-for-missing-interpreter error. + // + // Done last so the link/mknod steps above don't have to cross the + // self-bind mount boundary (link(2) returns EXDEV at mount edges). + // Self-bind first so CleanupJailerChroot's `umount -lR` can recurse + // from chrootRoot itself; --make-private blocks propagation back to + // the host mount namespace. + // firecracker is copied (not bind-mounted) because jailer opens the + // binary O_RDWR — apparently to seal it or rewrite something — and + // fails with EROFS on a ro-bind. 
+ chrootFC := chrootRoot + "/" + filepath.Base(firecrackerHostPath) + if err := m.sudo(ctx, "cp", "-f", firecrackerHostPath, chrootFC); err != nil { + return fmt.Errorf("copy firecracker into chroot: %w", err) + } + if err := m.sudo(ctx, "chmod", "0755", chrootFC); err != nil { + return fmt.Errorf("chmod firecracker in chroot: %w", err) + } + if err := m.chown(ctx, chrootFC, uid, gid); err != nil { + return fmt.Errorf("chown firecracker in chroot: %w", err) + } + if err := m.sudo(ctx, "mount", "--bind", chrootRoot, chrootRoot); err != nil { + return fmt.Errorf("self-bind chroot: %w", err) + } + // Remount without nosuid: the helper unit's ReadWritePaths binding marks + // /var/lib/banger nosuid, and bind mounts inherit that flag. The jailer + // needs to exec /firecracker as UID 1000, which the kernel denies on a + // nosuid mount when NoNewPrivileges is set on the unit. + if err := m.sudo(ctx, "mount", "-o", "remount,bind,suid", chrootRoot, chrootRoot); err != nil { + return fmt.Errorf("remount chroot suid: %w", err) + } + if err := m.sudo(ctx, "mount", "--make-private", chrootRoot); err != nil { + return fmt.Errorf("make-private chroot: %w", err) + } + // Pre-create /usr with world-traversable permissions. UMask=0077 on the + // helper unit causes plain mkdir to produce 0700 dirs; UID 1000 must be + // able to traverse /usr/ to reach the dynamic linker via lib64 → usr/lib. + if err := m.sudo(ctx, "install", "-d", "-m", "0755", chrootRoot+"/usr"); err != nil { + return fmt.Errorf("create chroot/usr: %w", err) + } + // Bind real libdirs and replicate the host's compat symlinks + // (/lib64 → /usr/lib, etc) inside the chroot so firecracker's + // PT_INTERP path (/lib64/ld-linux-*) resolves to the bound libs. 
+ for _, libDir := range []string{"/usr/lib", "/usr/lib64", "/lib", "/lib64"} { + info, err := os.Lstat(libDir) + if err != nil { + continue + } + target := chrootRoot + libDir + if info.Mode()&os.ModeSymlink != 0 { + link, err := os.Readlink(libDir) + if err != nil { + continue + } + if err := m.sudo(ctx, "ln", "-sfn", link, target); err != nil { + return fmt.Errorf("symlink %s -> %s: %w", target, link, err) + } + continue + } + if !info.IsDir() { + continue + } + if err := m.bindDir(ctx, libDir, target, true); err != nil { + return fmt.Errorf("bind %s: %w", libDir, err) + } + } + return nil +} + +// CleanupJailerChroot tears down a chroot built by PrepareJailerChroot: +// lazy-recursive umount of every mount under (or at) chrootRoot, then a +// findmnt-guarded `rm -rf`. The guard is load-bearing: if any bind mount +// remained, `rm -rf` would descend into the bind source (e.g. /usr/lib) +// and start deleting host files. The umount runs `-l` (lazy) so an in-use +// bind point still gets detached from the namespace; the guarded check +// then catches the rare case where detachment didn't happen. +func (m *Manager) CleanupJailerChroot(ctx context.Context, chrootRoot string) error { + if strings.TrimSpace(chrootRoot) == "" { + return nil + } + // Lstat (not Stat): if chrootRoot is a symlink the umount/rm shell-outs + // below would chase it. The handler-side validateNotSymlink also catches + // this, but lifting the check inside fcproc closes the TOCTOU window + // between the handler check and our umount command. 
+ info, err := os.Lstat(chrootRoot) + if err != nil { + if os.IsNotExist(err) { + return nil + } + return fmt.Errorf("inspect chroot %s: %w", chrootRoot, err) + } + if info.Mode()&os.ModeSymlink != 0 { + return fmt.Errorf("refusing to clean up %q: path is a symlink", chrootRoot) + } + if !info.IsDir() { + return fmt.Errorf("refusing to clean up %q: not a directory", chrootRoot) + } + // Resolve any intermediate symlinks and require the result equals the + // input — that catches a planted `…/jail/firecracker/ → /` even + // though the leaf "/root" component is itself a real directory inside + // the redirected target. Equality + Lstat together cover both top and + // intermediate symlink shapes. + resolved, err := filepath.EvalSymlinks(chrootRoot) + if err != nil { + return fmt.Errorf("resolve chroot %s: %w", chrootRoot, err) + } + if filepath.Clean(resolved) != filepath.Clean(chrootRoot) { + return fmt.Errorf("refusing to clean up %q: resolves to %q via symlink", chrootRoot, resolved) + } + // Switch from `umount --recursive --lazy ` (shell-resolved, + // follows symlinks at exec time) to direct umount2() syscalls per child + // mount with UMOUNT_NOFOLLOW. That fully closes the residual TOCTOU + // between the EvalSymlinks check above and the unmount: even if a daemon- + // uid attacker swapped a child mount's path to a symlink in the gap, the + // kernel refuses to follow it. The findmnt guard below still catches any + // mount we couldn't detach. + mounts, err := m.mountsUnder(ctx, chrootRoot) + if err != nil { + return fmt.Errorf("inspect chroot mounts: %w", err) + } + // Deepest-first so child mounts come off before parents; otherwise a + // parent unmount would EBUSY against in-use children. 
+ sort.Slice(mounts, func(i, j int) bool { + return strings.Count(mounts[i], "/") > strings.Count(mounts[j], "/") + }) + for _, mt := range mounts { + if err := m.detachMount(ctx, mt); err != nil { + return fmt.Errorf("detach %q: %w", mt, err) + } + } + if remaining, err := m.mountsUnder(ctx, chrootRoot); err != nil { + return fmt.Errorf("re-inspect chroot mounts: %w", err) + } else if len(remaining) > 0 { + return fmt.Errorf("refusing to rm -rf %q: still has %d mount(s): %v", chrootRoot, len(remaining), remaining) + } + return m.sudo(ctx, "rm", "-rf", "--", chrootRoot) +} + +// detachMount tears down a single mount target with MNT_DETACH (lazy) + +// UMOUNT_NOFOLLOW (refuse symlinks). Falls back to `sudo umount --lazy` +// when not running as root, since umount2() requires CAP_SYS_ADMIN. +// +// ENOENT and EINVAL on the syscall path are treated as "already gone" — +// findmnt's snapshot can race with parallel cleanups, and a missing +// mount is the desired end state. +func (m *Manager) detachMount(ctx context.Context, target string) error { + if os.Geteuid() == 0 { + err := unix.Unmount(target, unix.MNT_DETACH|unix.UMOUNT_NOFOLLOW) + if err == nil || errors.Is(err, unix.ENOENT) || errors.Is(err, unix.EINVAL) { + return nil + } + return err + } + // Local-priv fallback: shell `umount --lazy` resolves the path through + // the kernel without UMOUNT_NOFOLLOW, but the EvalSymlinks check earlier + // already constrained the chroot tree. The dev-mode caveat in + // docs/privileges.md covers this branch's looser guarantees. 
+ _, err := m.runner.RunSudo(ctx, "umount", "--lazy", target) + return err +} + +func (m *Manager) bindFile(ctx context.Context, source, target string, readOnly bool) error { + if err := m.sudo(ctx, "install", "-D", "-m", "0644", "/dev/null", target); err != nil { + return fmt.Errorf("create bind target file: %w", err) + } + return m.bindMount(ctx, source, target, readOnly) +} + +func (m *Manager) bindDir(ctx context.Context, source, target string, readOnly bool) error { + if err := m.sudo(ctx, "mkdir", "-p", target); err != nil { + return fmt.Errorf("create bind target dir: %w", err) + } + return m.bindMount(ctx, source, target, readOnly) +} + +func (m *Manager) bindMount(ctx context.Context, source, target string, readOnly bool) error { + if err := m.sudo(ctx, "mount", "--bind", source, target); err != nil { + return err + } + if !readOnly { + return nil + } + // Single-step ro bind isn't honored by all kernels — the bind happens + // rw and the ro flag is silently ignored. Remount makes it stick. + return m.sudo(ctx, "mount", "-o", "remount,bind,ro", target) +} + +// mountsUnder returns the list of mount targets at or under chrootRoot. +// findmnt's output is one path per line; an empty list means no leftovers. 
+func (m *Manager) mountsUnder(ctx context.Context, chrootRoot string) ([]string, error) { + out, err := m.runner.Run(ctx, "findmnt", "--output", "TARGET", "--list", "--noheadings") + if err != nil { + return nil, err + } + var mounts []string + prefix := chrootRoot + string(os.PathSeparator) + for _, line := range strings.Split(string(out), "\n") { + t := strings.TrimSpace(line) + if t == chrootRoot || strings.HasPrefix(t, prefix) { + mounts = append(mounts, t) + } + } + return mounts, nil +} + +func (m *Manager) stageDrive(ctx context.Context, chrootRoot string, d ChrootDriveSpec, uid, gid int) error { + info, err := os.Stat(d.HostPath) + if err != nil { + return err + } + if info.Mode()&os.ModeDevice != 0 { + stat, ok := info.Sys().(*syscall.Stat_t) + if !ok { + return fmt.Errorf("stat %s: cannot read device numbers", d.HostPath) + } + major := unix.Major(stat.Rdev) + minor := unix.Minor(stat.Rdev) + return m.mknodBlock(ctx, chrootRoot, d.ChrootName, major, minor, uid, gid) + } + return m.linkInto(ctx, chrootRoot, d.HostPath, d.ChrootName, uid, gid) +} + +func (m *Manager) stageDevice(ctx context.Context, chrootRoot, chrootName, hostDevice string, uid, gid int) error { + info, err := os.Stat(hostDevice) + if err != nil { + return err + } + stat, ok := info.Sys().(*syscall.Stat_t) + if !ok { + return fmt.Errorf("stat %s: cannot read device numbers", hostDevice) + } + major := unix.Major(stat.Rdev) + minor := unix.Minor(stat.Rdev) + target := chrootRoot + "/" + chrootName + if err := m.sudo(ctx, "mknod", "-m", "0660", target, "c", strconv.FormatUint(uint64(major), 10), strconv.FormatUint(uint64(minor), 10)); err != nil { + return err + } + return m.chown(ctx, target, uid, gid) +} + +func (m *Manager) mknodBlock(ctx context.Context, chrootRoot, name string, major, minor uint32, uid, gid int) error { + target := chrootRoot + "/" + name + if err := m.sudo(ctx, "mknod", "-m", "0660", target, "b", strconv.FormatUint(uint64(major), 10), strconv.FormatUint(uint64(minor), 
10)); err != nil { + return err + } + return m.chown(ctx, target, uid, gid) +} + +func (m *Manager) linkInto(ctx context.Context, chrootRoot, source, name string, uid, gid int) error { + target := chrootRoot + "/" + name + if err := m.sudo(ctx, "ln", "-f", source, target); err != nil { + return err + } + return m.chown(ctx, target, uid, gid) +} + +func (m *Manager) chown(ctx context.Context, target string, uid, gid int) error { + return m.sudo(ctx, "chown", fmt.Sprintf("%d:%d", uid, gid), target) +} + +func (m *Manager) sudo(ctx context.Context, name string, args ...string) error { + if os.Geteuid() == 0 { + _, err := m.runner.Run(ctx, name, args...) + return err + } + _, err := m.runner.RunSudo(ctx, append([]string{name}, args...)...) + return err +} + +func waitForPath(ctx context.Context, path string, timeout time.Duration, label string) error { + return pollPath(ctx, path, timeout, 100*time.Millisecond, label) +} + +func pollPath(ctx context.Context, path string, timeout, interval time.Duration, label string) error { + deadline := time.Now().Add(timeout) + for { + if _, err := os.Stat(path); err == nil { + return nil + } else if err != nil && !os.IsNotExist(err) { + return err + } + if time.Now().After(deadline) { + return fmt.Errorf("%s not ready: %s: %w", label, path, context.DeadlineExceeded) + } + select { + case <-ctx.Done(): + return ctx.Err() + case <-time.After(interval): + } + } +} diff --git a/internal/daemon/fcproc/fcproc_test.go b/internal/daemon/fcproc/fcproc_test.go new file mode 100644 index 0000000..99ff665 --- /dev/null +++ b/internal/daemon/fcproc/fcproc_test.go @@ -0,0 +1,471 @@ +package fcproc + +import ( + "context" + "errors" + "log/slog" + "os" + "path/filepath" + "strings" + "testing" + "time" +) + +// scriptedRunner is a minimal Runner that records every call and +// plays back a pre-scripted sequence of (name, args, out, err) +// steps. Failing to match or running past the script fails the +// test. 
Mirrors the pattern from internal/daemon/snapshot_test.go
+// but lives here because fcproc is a leaf package — it can't import
+// its parent's test helpers.
+type scriptedRunner struct {
+	t     *testing.T
+	runs  []scriptedCall
+	sudos []scriptedCall
+}
+
+type scriptedCall struct {
+	matchName string   // empty for RunSudo (sudo has no distinct name arg)
+	matchArgs []string // nil means "don't care"
+	out       []byte
+	err       error
+}
+
+func (r *scriptedRunner) Run(_ context.Context, name string, args ...string) ([]byte, error) {
+	r.t.Helper()
+	if len(r.runs) == 0 {
+		r.t.Fatalf("unexpected Run(%q, %v)", name, args)
+	}
+	step := r.runs[0]
+	r.runs = r.runs[1:]
+	if step.matchName != "" && step.matchName != name {
+		r.t.Fatalf("Run name = %q, want %q", name, step.matchName)
+	}
+	if step.matchArgs != nil && !equalArgs(args, step.matchArgs) {
+		r.t.Fatalf("Run(%q) args = %v, want %v", name, args, step.matchArgs)
+	}
+	return step.out, step.err
+}
+
+func (r *scriptedRunner) RunSudo(_ context.Context, args ...string) ([]byte, error) {
+	r.t.Helper()
+	if len(r.sudos) == 0 {
+		r.t.Fatalf("unexpected RunSudo(%v)", args)
+	}
+	step := r.sudos[0]
+	r.sudos = r.sudos[1:]
+	if step.matchArgs != nil && !equalArgs(args, step.matchArgs) {
+		r.t.Fatalf("RunSudo args = %v, want %v", args, step.matchArgs)
+	}
+	return step.out, step.err
+}
+
+// equalArgs enforces the matchArgs contract above: when a scripted
+// step pins an argv, the call must match it element for element.
+func equalArgs(got, want []string) bool {
+	if len(got) != len(want) {
+		return false
+	}
+	for i := range got {
+		if got[i] != want[i] {
+			return false
+		}
+	}
+	return true
+}
+
+// TestWaitForPathReturnsDeadlineExceededWhenSocketNeverAppears pins
+// the timeout branch of waitForPath. If this drifts, every callsite
+// that wraps it (EnsureSocketAccess on the firecracker API +
+// vsock sockets) loses its bounded wait.
+func TestWaitForPathReturnsDeadlineExceededWhenSocketNeverAppears(t *testing.T) {
+	missing := filepath.Join(t.TempDir(), "never-created.sock")
+	start := time.Now()
+	err := waitForPath(context.Background(), missing, 150*time.Millisecond, "api socket")
+	elapsed := time.Since(start)
+
+	if !errors.Is(err, context.DeadlineExceeded) {
+		t.Fatalf("err = %v, want wrapped context.DeadlineExceeded", err)
+	}
+	if !contains(err.Error(), "api socket") {
+		t.Fatalf("err = %v, want label 'api socket' in message", err)
+	}
+	// Timeout should fire close to the configured budget, not zero
+	// (tight-loop regression) and not way over (missing select
+	// regression).
The 100ms poll tick plus the initial stat makes + // the lower bound noisy; check we at least waited a tick. + if elapsed < 90*time.Millisecond { + t.Fatalf("returned after %s; waitForPath exited before its timeout budget", elapsed) + } +} + +// TestWaitForPathReturnsOnceSocketAppears pins the happy path: +// when the file materialises mid-wait, the function returns nil +// without having to walk to its deadline. +func TestWaitForPathReturnsOnceSocketAppears(t *testing.T) { + socketPath := filepath.Join(t.TempDir(), "will-appear.sock") + go func() { + time.Sleep(50 * time.Millisecond) + _ = os.WriteFile(socketPath, []byte{}, 0o600) + }() + if err := waitForPath(context.Background(), socketPath, 2*time.Second, "api socket"); err != nil { + t.Fatalf("waitForPath: %v", err) + } +} + +// TestWaitForPathRespectsContextCancellation pins the ctx.Done() +// branch — a canceled request must not be blocked by the poll +// interval. +func TestWaitForPathRespectsContextCancellation(t *testing.T) { + missing := filepath.Join(t.TempDir(), "never.sock") + ctx, cancel := context.WithCancel(context.Background()) + go func() { + time.Sleep(30 * time.Millisecond) + cancel() + }() + err := waitForPath(ctx, missing, 5*time.Second, "api socket") + if !errors.Is(err, context.Canceled) { + t.Fatalf("err = %v, want context.Canceled when ctx is cancelled mid-wait", err) + } +} + +// TestEnsureSocketAccessChmodFailureBubbles verifies the chmod step +// fails fast before any ownership handoff. Once chown runs, the +// bounded helper no longer owns the socket and can't tighten its mode +// without CAP_FOWNER, so the order matters. 
+func TestEnsureSocketAccessChmodFailureBubbles(t *testing.T) {
+	socketPath := filepath.Join(t.TempDir(), "present.sock")
+	if err := os.WriteFile(socketPath, []byte{}, 0o600); err != nil {
+		t.Fatalf("WriteFile: %v", err)
+	}
+
+	chmodErr := errors.New("sudo chmod failed")
+	runner := &scriptedRunner{
+		t:     t,
+		sudos: []scriptedCall{{err: chmodErr}},
+	}
+	mgr := New(runner, Config{}, slog.Default())
+
+	err := mgr.EnsureSocketAccess(context.Background(), socketPath, "api socket")
+	if !errors.Is(err, chmodErr) {
+		t.Fatalf("err = %v, want chmod error", err)
+	}
+	// chown must not have been attempted.
+	if len(runner.sudos) != 0 {
+		t.Fatalf("chown was attempted after chmod failed: %d sudo calls left", len(runner.sudos))
+	}
+}
+
+// TestEnsureSocketAccessChownFailureBubbles verifies the ownership
+// handoff still surfaces errors after chmod succeeds.
+func TestEnsureSocketAccessChownFailureBubbles(t *testing.T) {
+	socketPath := filepath.Join(t.TempDir(), "present.sock")
+	if err := os.WriteFile(socketPath, []byte{}, 0o600); err != nil {
+		t.Fatalf("WriteFile: %v", err)
+	}
+
+	chownErr := errors.New("sudo chown failed")
+	runner := &scriptedRunner{
+		t: t,
+		sudos: []scriptedCall{
+			{},              // chmod succeeds
+			{err: chownErr}, // chown fails
+		},
+	}
+	mgr := New(runner, Config{}, slog.Default())
+
+	err := mgr.EnsureSocketAccess(context.Background(), socketPath, "api socket")
+	if !errors.Is(err, chownErr) {
+		t.Fatalf("err = %v, want chown error", err)
+	}
+}
+
+// TestEnsureSocketAccessTimesOutBeforeTouchingRunner pins the
+// ordering contract: if waitForPath never sees the socket, the
+// sudo commands must not run. Running chown/chmod against a
+// non-existent path would just add noise to the logs.
+func TestEnsureSocketAccessTimesOutBeforeTouchingRunner(t *testing.T) { + missing := filepath.Join(t.TempDir(), "never.sock") + runner := &scriptedRunner{t: t} // no scripted calls — any runner invocation fails the test + mgr := New(runner, Config{}, slog.Default()) + + // EnsureSocketAccess's waitForPath has a hardcoded 5s timeout, + // and we can't inject a shorter one without widening the API. + // Use a short context instead — cancellation short-circuits + // waitForPath via the ctx.Done() branch. + ctx, cancel := context.WithTimeout(context.Background(), 150*time.Millisecond) + defer cancel() + + err := mgr.EnsureSocketAccess(ctx, missing, "api socket") + if err == nil { + t.Fatal("EnsureSocketAccess: want error when socket never appears") + } +} + +// TestEnsureSocketAccessForAsyncReturnsImmediatelyWhenNoPaths pins the +// fast-path: callers can hand the helper an empty list (e.g. when VSockPath +// is unset) and get a no-op channel back without spinning a goroutine. +func TestEnsureSocketAccessForAsyncReturnsImmediatelyWhenNoPaths(t *testing.T) { + runner := &scriptedRunner{t: t} // any runner call would fail the test + mgr := New(runner, Config{}, slog.Default()) + + done := mgr.EnsureSocketAccessForAsync(context.Background(), []string{"", " "}, 1000, 1000) + select { + case err := <-done: + if err != nil { + t.Fatalf("got %v, want nil for empty input", err) + } + case <-time.After(time.Second): + t.Fatal("EnsureSocketAccessForAsync did not signal completion") + } +} + +// TestEnsureSocketAccessForAsyncWaitsForSocketThenChowns pins the boot-time +// race fix: while Machine.Start spins up firecracker, the helper polls for the +// socket and runs chmod + chown the moment it appears. If this drifts, the +// SDK's HTTP probe gets EACCES on a root-owned socket and Start times out. 
+func TestEnsureSocketAccessForAsyncWaitsForSocketThenChowns(t *testing.T) { + socketPath := filepath.Join(t.TempDir(), "delayed.sock") + go func() { + time.Sleep(50 * time.Millisecond) + _ = os.WriteFile(socketPath, []byte{}, 0o600) + }() + + runner := &scriptedRunner{ + t: t, + sudos: []scriptedCall{ + {}, // chmod 600 + {}, // chown uid:gid + }, + } + mgr := New(runner, Config{}, slog.Default()) + + done := mgr.EnsureSocketAccessForAsync(context.Background(), []string{socketPath}, 4242, 4242) + select { + case err := <-done: + if err != nil { + t.Fatalf("EnsureSocketAccessForAsync: %v", err) + } + case <-time.After(2 * time.Second): + t.Fatal("EnsureSocketAccessForAsync did not signal completion") + } + if len(runner.sudos) != 0 { + t.Fatalf("expected both chmod and chown to run, %d sudo calls remaining", len(runner.sudos)) + } +} + +// recordingRunner captures every Run/RunSudo invocation's full +// argv. Used to assert that ensureSocketAccessFor's fallback path +// passes `chown -h` rather than the symlink-following plain `chown`. +type recordingRunner struct { + sudos [][]string + runs [][]string +} + +func (r *recordingRunner) Run(_ context.Context, name string, args ...string) ([]byte, error) { + r.runs = append(r.runs, append([]string{name}, args...)) + return nil, nil +} + +func (r *recordingRunner) RunSudo(_ context.Context, args ...string) ([]byte, error) { + r.sudos = append(r.sudos, append([]string(nil), args...)) + return nil, nil +} + +// TestCleanupJailerChrootRejectsSymlink pins the TOCTOU-closing +// fcproc-side check: even if a daemon-uid attacker somehow bypasses +// the helper handler's validateNotSymlink (or races it), the cleanup +// itself refuses a symlinked path before any umount/rm shells. 
+func TestCleanupJailerChrootRejectsSymlink(t *testing.T) { + dir := t.TempDir() + target := filepath.Join(dir, "real") + if err := os.Mkdir(target, 0o700); err != nil { + t.Fatalf("mkdir target: %v", err) + } + link := filepath.Join(dir, "link") + if err := os.Symlink(target, link); err != nil { + t.Fatalf("symlink: %v", err) + } + + // scriptedRunner with no scripted calls — any shell invocation + // trips r.t.Fatalf, proving rejection happened before umount/rm. + runner := &scriptedRunner{t: t} + mgr := New(runner, Config{}, slog.Default()) + if err := mgr.CleanupJailerChroot(context.Background(), link); err == nil { + t.Fatal("CleanupJailerChroot(symlink) succeeded, want error") + } +} + +// TestCleanupJailerChrootRejectsIntermediateSymlink covers the +// `/jail/firecracker/ → /` shape: the leaf "/root" component +// is a real directory inside the redirected target, but EvalSymlinks +// resolves to a different path so we still bail. +func TestCleanupJailerChrootRejectsIntermediateSymlink(t *testing.T) { + dir := t.TempDir() + realParent := filepath.Join(dir, "real-parent") + if err := os.MkdirAll(filepath.Join(realParent, "root"), 0o700); err != nil { + t.Fatalf("mkdir real: %v", err) + } + linkParent := filepath.Join(dir, "link-parent") + if err := os.Symlink(realParent, linkParent); err != nil { + t.Fatalf("symlink: %v", err) + } + chrootViaSymlink := filepath.Join(linkParent, "root") + + runner := &scriptedRunner{t: t} + mgr := New(runner, Config{}, slog.Default()) + if err := mgr.CleanupJailerChroot(context.Background(), chrootViaSymlink); err == nil { + t.Fatal("CleanupJailerChroot(symlinked-parent) succeeded, want error") + } +} + +// TestCleanupJailerChrootHappyPathWithoutMounts pins the no-leak case: +// when findmnt reports zero mounts under the chroot, the cleanup +// skips straight to `sudo rm -rf` without invoking umount2 / sudo +// umount at all. 
Regression guard for the umount2 rewrite — if the +// new logic leaks an extra runner call here, this test will fail. +func TestCleanupJailerChrootHappyPathWithoutMounts(t *testing.T) { + dir := t.TempDir() + chroot := filepath.Join(dir, "root") + if err := os.Mkdir(chroot, 0o700); err != nil { + t.Fatalf("mkdir chroot: %v", err) + } + runner := &scriptedRunner{ + t: t, + runs: []scriptedCall{ + // First mountsUnder() — pre-detach. Empty stdout = no mounts. + {matchName: "findmnt", out: nil}, + // Second mountsUnder() — post-detach guard. Same. + {matchName: "findmnt", out: nil}, + }, + // sudo rm -rf -- chroot. + sudos: []scriptedCall{{}}, + } + mgr := New(runner, Config{}, slog.Default()) + if err := mgr.CleanupJailerChroot(context.Background(), chroot); err != nil { + t.Fatalf("CleanupJailerChroot: %v", err) + } + if len(runner.runs) != 0 { + t.Fatalf("findmnt scripted calls left over: %d", len(runner.runs)) + } + if len(runner.sudos) != 0 { + t.Fatalf("sudo scripted calls left over: %d", len(runner.sudos)) + } +} + +// TestCleanupJailerChrootDetachesMountsDeepestFirst pins the ordering +// contract for the umount2 rewrite: child mounts come off before +// parents, otherwise the parent unmount would race against in-use +// children. The non-root code path shells `sudo umount --lazy`, which +// the recording runner captures so we can assert order + the --lazy +// flag. 
+func TestCleanupJailerChrootDetachesMountsDeepestFirst(t *testing.T) { + if os.Geteuid() == 0 { + t.Skip("euid 0 takes the umount2 syscall branch; this test exercises the sudo fallback") + } + dir := t.TempDir() + chroot := filepath.Join(dir, "root") + if err := os.Mkdir(chroot, 0o700); err != nil { + t.Fatalf("mkdir chroot: %v", err) + } + parent := chroot + child := filepath.Join(chroot, "lib") + deep := filepath.Join(child, "deep") + findmntOut := []byte(strings.Join([]string{parent, child, deep}, "\n")) + runner := &mountRecordingRunner{findmntOut: findmntOut} + mgr := New(runner, Config{}, slog.Default()) + if err := mgr.CleanupJailerChroot(context.Background(), chroot); err != nil { + t.Fatalf("CleanupJailerChroot: %v", err) + } + // Three umount + final rm -rf. The umount targets must be deep, + // child, parent in that order. + wantTargets := []string{deep, child, parent} + if len(runner.umountTargets) != len(wantTargets) { + t.Fatalf("umount calls = %v, want %d", runner.umountTargets, len(wantTargets)) + } + for i, want := range wantTargets { + if runner.umountTargets[i] != want { + t.Fatalf("umount[%d] = %q, want %q", i, runner.umountTargets[i], want) + } + } + if !runner.lazyFlagSeen { + t.Fatalf("expected umount --lazy on the sudo branch, args = %v", runner.umountArgs) + } + if !runner.rmCalled { + t.Fatal("rm -rf was never invoked after the umount sweep") + } +} + +// mountRecordingRunner stubs out findmnt + sudo for the cleanup path: +// the first findmnt call returns the canned mount list (pre-detach), +// subsequent calls return empty to simulate the kernel having dropped +// each mount as we asked. sudo umount/rm calls are captured and +// answer success. 
+type mountRecordingRunner struct { + findmntOut []byte + findmntCalls int + umountTargets []string + umountArgs [][]string + lazyFlagSeen bool + rmCalled bool +} + +func (r *mountRecordingRunner) Run(_ context.Context, name string, _ ...string) ([]byte, error) { + if name == "findmnt" { + r.findmntCalls++ + if r.findmntCalls == 1 { + return r.findmntOut, nil + } + return nil, nil + } + return nil, nil +} + +func (r *mountRecordingRunner) RunSudo(_ context.Context, args ...string) ([]byte, error) { + if len(args) == 0 { + return nil, nil + } + switch args[0] { + case "umount": + // Last arg is the target. Earlier args are flags. + if len(args) >= 2 { + r.umountTargets = append(r.umountTargets, args[len(args)-1]) + } + r.umountArgs = append(r.umountArgs, append([]string(nil), args...)) + for _, a := range args[1 : len(args)-1] { + if a == "--lazy" || a == "-l" { + r.lazyFlagSeen = true + } + } + case "rm": + r.rmCalled = true + } + return nil, nil +} + +// TestEnsureSocketAccessSudoBranchUsesChownNoFollow pins the +// symlink-defence on the local-priv (non-root) path: a follow-symlink +// chown on a daemon-uid attacker-planted symlink is the same arbitrary +// file-ownership primitive we close in the root branch via +// O_PATH|O_NOFOLLOW. Test only runs as non-root (the syscall branch is +// taken when euid == 0, which CI doesn't see). 
+func TestEnsureSocketAccessSudoBranchUsesChownNoFollow(t *testing.T) { + if os.Geteuid() == 0 { + t.Skip("euid 0 takes the syscall branch; the sudo branch is only reachable as a regular user") + } + socketPath := filepath.Join(t.TempDir(), "present.sock") + if err := os.WriteFile(socketPath, []byte{}, 0o600); err != nil { + t.Fatalf("WriteFile: %v", err) + } + runner := &recordingRunner{} + mgr := New(runner, Config{}, slog.Default()) + + if err := mgr.EnsureSocketAccess(context.Background(), socketPath, "api socket"); err != nil { + t.Fatalf("EnsureSocketAccess: %v", err) + } + if len(runner.sudos) != 2 { + t.Fatalf("got %d sudo calls, want 2 (chmod, chown)", len(runner.sudos)) + } + chown := runner.sudos[1] + if len(chown) < 2 || chown[0] != "chown" { + t.Fatalf("second sudo call = %v, want chown", chown) + } + hasNoFollow := false + for _, arg := range chown[1:] { + if arg == "-h" { + hasNoFollow = true + break + } + } + if !hasNoFollow { + t.Fatalf("chown args = %v, missing the -h symlink-no-follow flag", chown) + } +} + +func contains(s, sub string) bool { + for i := 0; i+len(sub) <= len(s); i++ { + if s[i:i+len(sub)] == sub { + return true + } + } + return false +} diff --git a/internal/daemon/fcproc/findpid_jailer_test.go b/internal/daemon/fcproc/findpid_jailer_test.go new file mode 100644 index 0000000..ae89deb --- /dev/null +++ b/internal/daemon/fcproc/findpid_jailer_test.go @@ -0,0 +1,173 @@ +package fcproc + +import ( + "errors" + "fmt" + "os" + "path/filepath" + "testing" +) + +// pidfileFixture builds the on-disk shape findByJailerPidfile inspects: +// a /proc-like tree (one entry per pid with comm), an api-sock symlink +// pointing into a faux chroot, and the chroot's firecracker.pid file. 
+type pidfileFixture struct { + root string + proc string + runtime string + chroots string +} + +func newPidfileFixture(t *testing.T) *pidfileFixture { + t.Helper() + root := t.TempDir() + f := &pidfileFixture{ + root: root, + proc: filepath.Join(root, "proc"), + runtime: filepath.Join(root, "runtime"), + chroots: filepath.Join(root, "chroots"), + } + for _, dir := range []string{f.proc, f.runtime, f.chroots} { + if err := os.MkdirAll(dir, 0o755); err != nil { + t.Fatalf("mkdir %s: %v", dir, err) + } + } + prev := procDir + procDir = f.proc + t.Cleanup(func() { procDir = prev }) + return f +} + +// addProc writes /proc//comm. Mirrors the real /proc shape (comm +// has a trailing newline; production code TrimSpaces it). +func (f *pidfileFixture) addProc(t *testing.T, pid int, comm string) { + t.Helper() + pidDir := filepath.Join(f.proc, fmt.Sprint(pid)) + if err := os.MkdirAll(pidDir, 0o755); err != nil { + t.Fatalf("mkdir %s: %v", pidDir, err) + } + if err := os.WriteFile(filepath.Join(pidDir, "comm"), []byte(comm+"\n"), 0o644); err != nil { + t.Fatalf("write comm: %v", err) + } +} + +// buildVMSocket lays out the chroot for a VM and returns the api-sock +// path the test points findByJailerPidfile at. pidfileContent is what +// `cat /firecracker.pid` will return — pass an empty string to +// skip writing the pidfile. 
+func (f *pidfileFixture) buildVMSocket(t *testing.T, vmid, pidfileContent string) (apiSock string) { + t.Helper() + chroot := filepath.Join(f.chroots, vmid, "root") + if err := os.MkdirAll(chroot, 0o755); err != nil { + t.Fatalf("mkdir chroot: %v", err) + } + socketTarget := filepath.Join(chroot, "firecracker.socket") + if err := os.WriteFile(socketTarget, nil, 0o600); err != nil { + t.Fatalf("write socket placeholder: %v", err) + } + if pidfileContent != "" { + if err := os.WriteFile(filepath.Join(chroot, "firecracker.pid"), []byte(pidfileContent), 0o600); err != nil { + t.Fatalf("write pidfile: %v", err) + } + } + apiSock = filepath.Join(f.runtime, "fc-"+vmid+".sock") + if err := os.Symlink(socketTarget, apiSock); err != nil { + t.Fatalf("symlink api sock: %v", err) + } + return apiSock +} + +func TestFindByJailerPidfileHappyPath(t *testing.T) { + f := newPidfileFixture(t) + apiSock := f.buildVMSocket(t, "abc", "100\n") + f.addProc(t, 100, "firecracker") + + got, err := findByJailerPidfile(apiSock) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + if got != 100 { + t.Fatalf("pid = %d, want 100", got) + } +} + +func TestFindByJailerPidfileMissingPidfile(t *testing.T) { + f := newPidfileFixture(t) + // VM exists in the chroot layout but no pidfile (e.g. VM was created + // but never started, or stopped and pidfile cleared). + apiSock := f.buildVMSocket(t, "abc", "") + + _, err := findByJailerPidfile(apiSock) + if !errors.Is(err, errFirecrackerPIDNotFound) { + t.Fatalf("err = %v, want errFirecrackerPIDNotFound", err) + } +} + +func TestFindByJailerPidfileStalePID(t *testing.T) { + f := newPidfileFixture(t) + // Pidfile points at a PID with no /proc entry — the FC died but the + // pidfile was left behind. Reconcile must treat this as "not running" + // so the rediscoverHandles path can mark the VM stopped cleanly. + apiSock := f.buildVMSocket(t, "abc", "100\n") + // Deliberately don't addProc(100, ...). 
+ + _, err := findByJailerPidfile(apiSock) + if !errors.Is(err, errFirecrackerPIDNotFound) { + t.Fatalf("err = %v, want errFirecrackerPIDNotFound", err) + } +} + +func TestFindByJailerPidfileWrongComm(t *testing.T) { + f := newPidfileFixture(t) + // PID was recycled by the kernel and now belongs to some other + // process. The comm check is what catches this — pidfile content is + // untrusted across reboots / PID-wraparound. + apiSock := f.buildVMSocket(t, "abc", "100\n") + f.addProc(t, 100, "bash") + + _, err := findByJailerPidfile(apiSock) + if !errors.Is(err, errFirecrackerPIDNotFound) { + t.Fatalf("err = %v, want errFirecrackerPIDNotFound", err) + } +} + +func TestFindByJailerPidfileGarbageContent(t *testing.T) { + f := newPidfileFixture(t) + apiSock := f.buildVMSocket(t, "abc", "not-a-pid\n") + + _, err := findByJailerPidfile(apiSock) + if !errors.Is(err, errFirecrackerPIDNotFound) { + t.Fatalf("err = %v, want errFirecrackerPIDNotFound", err) + } +} + +func TestFindByJailerPidfileNonSymlinkApiSock(t *testing.T) { + f := newPidfileFixture(t) + // Direct (non-jailer) launches produce a regular-file api sock with + // no chroot beside it. Pidfile lookup can't help; fall through cleanly. + apiSock := filepath.Join(f.runtime, "direct-launch.sock") + if err := os.WriteFile(apiSock, nil, 0o600); err != nil { + t.Fatalf("write apiSock: %v", err) + } + + _, err := findByJailerPidfile(apiSock) + if !errors.Is(err, errFirecrackerPIDNotFound) { + t.Fatalf("err = %v, want errFirecrackerPIDNotFound", err) + } +} + +func TestFindByJailerPidfileTrimsWhitespace(t *testing.T) { + f := newPidfileFixture(t) + // Some FC versions write the pidfile with stray whitespace; the + // parser must tolerate it. 
+ apiSock := f.buildVMSocket(t, "abc", " 100 \n\n") + f.addProc(t, 100, "firecracker") + + got, err := findByJailerPidfile(apiSock) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + if got != 100 { + t.Fatalf("pid = %d, want 100", got) + } +} diff --git a/internal/daemon/guest_ssh.go b/internal/daemon/guest_ssh.go new file mode 100644 index 0000000..de05991 --- /dev/null +++ b/internal/daemon/guest_ssh.go @@ -0,0 +1,35 @@ +package daemon + +import ( + "context" + "io" + "os" + "time" + + "banger/internal/guest" +) + +// guestSSHClient is the narrow guest-SSH surface the daemon uses for +// workspace prepare / export and ad-hoc guest interactions. +type guestSSHClient interface { + Close() error + RunScript(context.Context, string, io.Writer) error + RunScriptOutput(context.Context, string) ([]byte, error) + UploadFile(context.Context, string, os.FileMode, []byte, io.Writer) error + StreamTar(context.Context, string, string, io.Writer) error + StreamTarEntries(context.Context, string, []string, string, io.Writer) error +} + +func (d *Daemon) waitForGuestSSH(ctx context.Context, address string, interval time.Duration) error { + if d != nil && d.guestWaitForSSH != nil { + return d.guestWaitForSSH(ctx, address, d.config.SSHKeyPath, interval) + } + return guest.WaitForSSH(ctx, address, d.config.SSHKeyPath, d.layout.KnownHostsPath, interval) +} + +func (d *Daemon) dialGuest(ctx context.Context, address string) (guestSSHClient, error) { + if d != nil && d.guestDial != nil { + return d.guestDial(ctx, address, d.config.SSHKeyPath) + } + return guest.Dial(ctx, address, d.config.SSHKeyPath, d.layout.KnownHostsPath) +} diff --git a/internal/daemon/host_network.go b/internal/daemon/host_network.go new file mode 100644 index 0000000..9d1aa26 --- /dev/null +++ b/internal/daemon/host_network.go @@ -0,0 +1,252 @@ +package daemon + +import ( + "context" + "errors" + "fmt" + "log/slog" + "net" + "path/filepath" + "strings" + "time" + + "banger/internal/daemon/fcproc" + 
"banger/internal/firecracker" + "banger/internal/model" + "banger/internal/paths" + "banger/internal/system" + "banger/internal/vmdns" + "banger/internal/vsockagent" +) + +// HostNetwork owns the daemon's side of host networking: the TAP +// interface pool, the bridge, per-VM tap/NAT/DNS wiring, and the +// firecracker-process primitives (bridge setup, socket access, +// pgrep-based PID resolution, ctrl-alt-del, wait/kill) plus DM +// snapshot helpers. The Daemon holds one *HostNetwork and routes +// lifecycle calls through it instead of reaching into host-state +// directly. +// +// Fields stay unexported so peer services (VMService, etc.) access +// HostNetwork only through consumer-defined interfaces, not by +// fishing around in its struct. Construction goes through +// newHostNetwork with an explicit dependency bag so the wiring is +// auditable. +type HostNetwork struct { + runner system.CommandRunner + logger *slog.Logger + config model.DaemonConfig + layout paths.Layout + closing chan struct{} + priv privilegedOps + + tapPool tapPool + vmDNS *vmdns.Server + + // Test seams. Default to real implementations at construction; + // tests build HostNetwork with stubs instead of mutating package + // globals, so parallel tests can't race each other's fake state. + lookupExecutable func(name string) (string, error) + vmDNSAddr func(server *vmdns.Server) string +} + +// hostNetworkDeps is the explicit wiring bag newHostNetwork expects. +// Keeping the deps in a dedicated struct rather than positional args +// makes the construction site in Daemon.Open read like a declaration. 
+type hostNetworkDeps struct { + runner system.CommandRunner + logger *slog.Logger + config model.DaemonConfig + layout paths.Layout + closing chan struct{} + priv privilegedOps +} + +func newHostNetwork(deps hostNetworkDeps) *HostNetwork { + return &HostNetwork{ + runner: deps.runner, + logger: deps.logger, + config: deps.config, + layout: deps.layout, + closing: deps.closing, + priv: deps.priv, + lookupExecutable: system.LookupExecutable, + vmDNSAddr: func(server *vmdns.Server) string { return server.Addr() }, + } +} + +// --- DNS server lifecycle ------------------------------------------- + +func (n *HostNetwork) startVMDNS(addr string) error { + server, err := vmdns.New(addr, n.logger) + if err != nil { + return err + } + n.vmDNS = server + if n.logger != nil { + n.logger.Info("vm dns serving", "dns_addr", server.Addr()) + } + return nil +} + +func (n *HostNetwork) stopVMDNS() error { + if n == nil || n.vmDNS == nil { + return nil + } + err := n.vmDNS.Close() + n.vmDNS = nil + return err +} + +func (n *HostNetwork) setDNS(ctx context.Context, vmName, guestIP string) error { + if n.vmDNS == nil { + return nil + } + if err := n.vmDNS.Set(vmdns.RecordName(vmName), guestIP); err != nil { + return err + } + n.ensureVMDNSResolverRouting(ctx) + return nil +} + +func (n *HostNetwork) removeDNS(dnsName string) error { + if dnsName == "" || n.vmDNS == nil { + return nil + } + return n.vmDNS.Remove(dnsName) +} + +// replaceDNS replaces the DNS server's full record set. Callers +// (Daemon.rebuildDNS) filter by vm-alive first; HostNetwork just +// takes the pre-filtered map. +func (n *HostNetwork) replaceDNS(records map[string]string) error { + if n.vmDNS == nil { + return nil + } + return n.vmDNS.Replace(records) +} + +// --- Firecracker process helpers ------------------------------------ + +// fc builds a fresh fcproc.Manager from the HostNetwork's current +// runner, config, and layout. 
Manager is stateless beyond those +// handles, so constructing per call keeps tests that build literals +// working without extra wiring. +func (n *HostNetwork) fc() *fcproc.Manager { + return fcproc.New(n.runner, fcproc.Config{ + FirecrackerBin: n.config.FirecrackerBin, + BridgeName: n.config.BridgeName, + BridgeIP: n.config.BridgeIP, + CIDR: n.config.CIDR, + RuntimeDir: n.layout.RuntimeDir, + }, n.logger) +} + +func (n *HostNetwork) ensureBridge(ctx context.Context) error { + return n.privOps().EnsureBridge(ctx) +} + +func (n *HostNetwork) ensureSocketDir() error { + return n.fc().EnsureSocketDir() +} + +func (n *HostNetwork) createTap(ctx context.Context, tap string) error { + return n.privOps().CreateTap(ctx, tap) +} + +func (n *HostNetwork) firecrackerBinary(ctx context.Context) (string, error) { + return n.privOps().ResolveFirecrackerBinary(ctx, n.config.FirecrackerBin) +} + +func (n *HostNetwork) ensureSocketAccess(ctx context.Context, socketPath, label string) error { + return n.privOps().EnsureSocketAccess(ctx, socketPath, label) +} + +func (n *HostNetwork) findFirecrackerPID(ctx context.Context, apiSock string) (int, error) { + return n.privOps().FindFirecrackerPID(ctx, apiSock) +} + +func (n *HostNetwork) resolveFirecrackerPID(ctx context.Context, machine *firecracker.Machine, apiSock string) int { + return n.fc().ResolvePID(ctx, machine, apiSock) +} + +func (n *HostNetwork) sendCtrlAltDel(ctx context.Context, apiSockPath string) error { + if err := n.ensureSocketAccess(ctx, apiSockPath, "firecracker api socket"); err != nil { + return err + } + return firecracker.New(apiSockPath, n.logger).SendCtrlAltDel(ctx) +} + +func (n *HostNetwork) waitForExit(ctx context.Context, pid int, apiSock string, timeout time.Duration) error { + deadline := time.Now().Add(timeout) + for { + running, err := n.privOps().ProcessRunning(ctx, pid, apiSock) + if err != nil { + return err + } + if !running { + return nil + } + if time.Now().After(deadline) { + return 
errWaitForExitTimeout + } + select { + case <-ctx.Done(): + return ctx.Err() + case <-time.After(100 * time.Millisecond): + } + } +} + +func (n *HostNetwork) killVMProcess(ctx context.Context, pid int) error { + return n.privOps().KillProcess(ctx, pid) +} + +// waitForGuestVSockAgent is a HostNetwork helper because it's +// fundamentally about waiting for a vsock socket the firecracker +// process is serving on. No daemon state needed. +func (n *HostNetwork) waitForGuestVSockAgent(ctx context.Context, socketPath string, timeout time.Duration) error { + if strings.TrimSpace(socketPath) == "" { + return errors.New("vsock path is required") + } + + waitCtx, cancel := context.WithTimeout(ctx, timeout) + defer cancel() + + ticker := time.NewTicker(vsockReadyPoll) + defer ticker.Stop() + + var lastErr error + for { + pingCtx, pingCancel := context.WithTimeout(waitCtx, 3*time.Second) + err := vsockagent.Health(pingCtx, n.logger, socketPath) + pingCancel() + if err == nil { + return nil + } + lastErr = err + + select { + case <-waitCtx.Done(): + if lastErr != nil { + return fmt.Errorf("guest vsock agent not ready: %w", lastErr) + } + return errors.New("guest vsock agent not ready before timeout") + case <-ticker.C: + } + } +} + +// --- Utilities used across networking ------------------------------ + +func defaultVSockPath(runtimeDir, vmID string) string { + return filepath.Join(runtimeDir, "fc-"+system.ShortID(vmID)+".vsock") +} + +func defaultVSockCID(guestIP string) (uint32, error) { + ip := net.ParseIP(strings.TrimSpace(guestIP)).To4() + if ip == nil { + return 0, fmt.Errorf("guest IP is not IPv4: %q", guestIP) + } + return 10000 + uint32(ip[3]), nil +} diff --git a/internal/daemon/image_build_ops.go b/internal/daemon/image_build_ops.go deleted file mode 100644 index 813a7a2..0000000 --- a/internal/daemon/image_build_ops.go +++ /dev/null @@ -1,218 +0,0 @@ -package daemon - -import ( - "context" - "fmt" - "strings" - "sync" - "time" - - "banger/internal/api" - 
"banger/internal/model" -) - -type imageBuildProgressKey struct{} - -type imageBuildOperationState struct { - mu sync.Mutex - cancel context.CancelFunc - op api.ImageBuildOperation -} - -func newImageBuildOperationState() (*imageBuildOperationState, error) { - id, err := model.NewID() - if err != nil { - return nil, err - } - now := model.Now() - return &imageBuildOperationState{ - op: api.ImageBuildOperation{ - ID: id, - Stage: "queued", - Detail: "waiting to start", - StartedAt: now, - UpdatedAt: now, - }, - }, nil -} - -func withImageBuildProgress(ctx context.Context, op *imageBuildOperationState) context.Context { - if op == nil { - return ctx - } - return context.WithValue(ctx, imageBuildProgressKey{}, op) -} - -func imageBuildProgressFromContext(ctx context.Context) *imageBuildOperationState { - if ctx == nil { - return nil - } - op, _ := ctx.Value(imageBuildProgressKey{}).(*imageBuildOperationState) - return op -} - -func imageBuildStage(ctx context.Context, stage, detail string) { - if op := imageBuildProgressFromContext(ctx); op != nil { - op.stage(stage, detail) - } -} - -func imageBuildBindImage(ctx context.Context, image model.Image) { - if op := imageBuildProgressFromContext(ctx); op != nil { - op.bindImage(image) - } -} - -func imageBuildSetLogPath(ctx context.Context, path string) { - if op := imageBuildProgressFromContext(ctx); op != nil { - op.setLogPath(path) - } -} - -func (op *imageBuildOperationState) setCancel(cancel context.CancelFunc) { - op.mu.Lock() - defer op.mu.Unlock() - op.cancel = cancel -} - -func (op *imageBuildOperationState) setLogPath(path string) { - op.mu.Lock() - defer op.mu.Unlock() - op.op.BuildLogPath = strings.TrimSpace(path) - op.op.UpdatedAt = model.Now() -} - -func (op *imageBuildOperationState) bindImage(image model.Image) { - op.mu.Lock() - defer op.mu.Unlock() - op.op.ImageID = image.ID - op.op.ImageName = image.Name -} - -func (op *imageBuildOperationState) stage(stage, detail string) { - op.mu.Lock() - defer 
op.mu.Unlock() - stage = strings.TrimSpace(stage) - detail = strings.TrimSpace(detail) - if stage == "" { - stage = op.op.Stage - } - if stage == op.op.Stage && detail == op.op.Detail { - return - } - op.op.Stage = stage - op.op.Detail = detail - op.op.UpdatedAt = model.Now() -} - -func (op *imageBuildOperationState) done(image model.Image) { - op.mu.Lock() - defer op.mu.Unlock() - imageCopy := image - op.op.ImageID = image.ID - op.op.ImageName = image.Name - op.op.Stage = "ready" - op.op.Detail = "image is ready" - op.op.Done = true - op.op.Success = true - op.op.Error = "" - op.op.Image = &imageCopy - op.op.UpdatedAt = model.Now() -} - -func (op *imageBuildOperationState) fail(err error) { - op.mu.Lock() - defer op.mu.Unlock() - op.op.Done = true - op.op.Success = false - if err != nil { - op.op.Error = err.Error() - } - if strings.TrimSpace(op.op.Detail) == "" { - op.op.Detail = "image build failed" - } - op.op.UpdatedAt = model.Now() -} - -func (op *imageBuildOperationState) snapshot() api.ImageBuildOperation { - op.mu.Lock() - defer op.mu.Unlock() - snapshot := op.op - if snapshot.Image != nil { - imageCopy := *snapshot.Image - snapshot.Image = &imageCopy - } - return snapshot -} - -func (op *imageBuildOperationState) cancelOperation() { - op.mu.Lock() - cancel := op.cancel - op.mu.Unlock() - if cancel != nil { - cancel() - } -} - -func (d *Daemon) BeginImageBuild(_ context.Context, params api.ImageBuildParams) (api.ImageBuildOperation, error) { - op, err := newImageBuildOperationState() - if err != nil { - return api.ImageBuildOperation{}, err - } - buildCtx, cancel := context.WithCancel(context.Background()) - op.setCancel(cancel) - - d.imageBuildOpsMu.Lock() - if d.imageBuildOps == nil { - d.imageBuildOps = map[string]*imageBuildOperationState{} - } - d.imageBuildOps[op.op.ID] = op - d.imageBuildOpsMu.Unlock() - - go d.runImageBuildOperation(withImageBuildProgress(buildCtx, op), op, params) - return op.snapshot(), nil -} - -func (d *Daemon) 
runImageBuildOperation(ctx context.Context, op *imageBuildOperationState, params api.ImageBuildParams) { - image, err := d.BuildImage(ctx, params) - if err != nil { - op.fail(err) - return - } - op.done(image) -} - -func (d *Daemon) ImageBuildStatus(_ context.Context, id string) (api.ImageBuildOperation, error) { - d.imageBuildOpsMu.Lock() - op, ok := d.imageBuildOps[strings.TrimSpace(id)] - d.imageBuildOpsMu.Unlock() - if !ok { - return api.ImageBuildOperation{}, fmt.Errorf("image build operation not found: %s", id) - } - return op.snapshot(), nil -} - -func (d *Daemon) CancelImageBuild(_ context.Context, id string) error { - d.imageBuildOpsMu.Lock() - op, ok := d.imageBuildOps[strings.TrimSpace(id)] - d.imageBuildOpsMu.Unlock() - if !ok { - return fmt.Errorf("image build operation not found: %s", id) - } - op.cancelOperation() - return nil -} - -func (d *Daemon) pruneImageBuildOperations(olderThan time.Time) { - d.imageBuildOpsMu.Lock() - defer d.imageBuildOpsMu.Unlock() - for id, op := range d.imageBuildOps { - snapshot := op.snapshot() - if !snapshot.Done { - continue - } - if snapshot.UpdatedAt.Before(olderThan) { - delete(d.imageBuildOps, id) - } - } -} diff --git a/internal/daemon/image_cache.go b/internal/daemon/image_cache.go new file mode 100644 index 0000000..fd2049f --- /dev/null +++ b/internal/daemon/image_cache.go @@ -0,0 +1,112 @@ +package daemon + +import ( + "context" + crand "crypto/rand" + "encoding/hex" + "fmt" + "io/fs" + "os" + "path/filepath" + + "banger/internal/api" +) + +// PruneOCICache removes every blob under the OCI layer cache. The +// cache is purely a re-pull-avoidance measure (every flattened image is +// independent of the blobs that sourced it), so the worst-case +// outcome of pruning is "next pull of the same ref re-downloads its +// layers" — a reasonable disk-hygiene knob. 
+// +// DryRun=true walks the cache and returns the size that WOULD be +// freed without touching anything; tests and CLI consumers print +// that summary so the operator can decide. +// +// Concurrent in-flight pulls may break if they're mid-fetch when +// the rename happens. That tradeoff is documented in the CLI help +// and docs/oci-import.md; the prune is an operator action, not a +// background sweep. +func (s *ImageService) PruneOCICache(_ context.Context, params api.ImageCachePruneParams) (api.ImageCachePruneResult, error) { + cacheDir := s.layout.OCICacheDir + bytes, blobs, err := walkCacheUsage(cacheDir) + if err != nil { + return api.ImageCachePruneResult{}, fmt.Errorf("inspect oci cache: %w", err) + } + res := api.ImageCachePruneResult{ + BytesFreed: bytes, + BlobsFreed: blobs, + DryRun: params.DryRun, + CacheDir: cacheDir, + } + if params.DryRun || blobs == 0 { + return res, nil + } + // Atomic rename aside so a follow-up pull doesn't see a half- + // removed tree, then rm -rf the renamed dir, then recreate the + // empty cache so future pulls find their write target. + aside, err := renameAside(cacheDir) + if err != nil { + if os.IsNotExist(err) { + return res, nil + } + return api.ImageCachePruneResult{}, fmt.Errorf("rename oci cache aside: %w", err) + } + if err := os.MkdirAll(cacheDir, 0o755); err != nil { + // Best-effort restore: try to rename back so the caller + // isn't left with a vanished cache dir. If both moves + // failed, surface both — the operator needs to know. 
+ if restoreErr := os.Rename(aside, cacheDir); restoreErr != nil { + return api.ImageCachePruneResult{}, fmt.Errorf("recreate oci cache: %w (also failed to restore from %s: %v)", err, aside, restoreErr) + } + return api.ImageCachePruneResult{}, fmt.Errorf("recreate oci cache: %w", err) + } + if err := os.RemoveAll(aside); err != nil { + return api.ImageCachePruneResult{}, fmt.Errorf("remove old oci cache (%s): %w", aside, err) + } + return res, nil +} + +func walkCacheUsage(cacheDir string) (int64, int, error) { + var bytes int64 + var blobs int + err := filepath.WalkDir(cacheDir, func(path string, d fs.DirEntry, err error) error { + if err != nil { + // Cache dir doesn't exist yet (fresh install, no OCI + // pulls so far) — that's not a prune error, it's a + // 0-byte / 0-blob result. + if os.IsNotExist(err) && path == cacheDir { + return filepath.SkipAll + } + return err + } + if d.IsDir() { + return nil + } + info, err := d.Info() + if err != nil { + return err + } + bytes += info.Size() + blobs++ + return nil + }) + if err != nil { + return 0, 0, err + } + return bytes, blobs, nil +} + +// renameAside moves cacheDir to a sibling temp path so the prune can +// rm-rf it without racing against fresh writes. Returns the aside +// path on success. 
+func renameAside(cacheDir string) (string, error) { + var suffix [8]byte + if _, err := crand.Read(suffix[:]); err != nil { + return "", err + } + aside := cacheDir + ".pruning-" + hex.EncodeToString(suffix[:]) + if err := os.Rename(cacheDir, aside); err != nil { + return "", err + } + return aside, nil +} diff --git a/internal/daemon/image_cache_test.go b/internal/daemon/image_cache_test.go new file mode 100644 index 0000000..89b96c7 --- /dev/null +++ b/internal/daemon/image_cache_test.go @@ -0,0 +1,125 @@ +package daemon + +import ( + "context" + "os" + "path/filepath" + "strings" + "testing" + + "banger/internal/api" + "banger/internal/paths" +) + +// seedFakeOCICache drops a few fixed-size files that mimic an OCI +// layer cache layout (blobs/sha256/) so tests don't depend on +// real registry round-trips. +func seedFakeOCICache(t *testing.T, cacheDir string) (totalBytes int64, blobCount int) { + t.Helper() + blobsDir := filepath.Join(cacheDir, "blobs", "sha256") + if err := os.MkdirAll(blobsDir, 0o755); err != nil { + t.Fatalf("MkdirAll: %v", err) + } + for i, payload := range []string{"layer-a", "layer-b-bigger", "layer-c"} { + name := strings.Repeat("ab", 32) // 64 hex chars stand-in + path := filepath.Join(blobsDir, name+"-"+string(rune('0'+i))) + if err := os.WriteFile(path, []byte(payload), 0o644); err != nil { + t.Fatalf("write blob: %v", err) + } + totalBytes += int64(len(payload)) + blobCount++ + } + return totalBytes, blobCount +} + +func TestPruneOCICacheDryRunReportsSizeWithoutDeleting(t *testing.T) { + cacheRoot := t.TempDir() + cacheDir := filepath.Join(cacheRoot, "oci") + wantBytes, wantBlobs := seedFakeOCICache(t, cacheDir) + + d := &Daemon{layout: paths.Layout{OCICacheDir: cacheDir}} + wireServices(d) + + res, err := d.img.PruneOCICache(context.Background(), api.ImageCachePruneParams{DryRun: true}) + if err != nil { + t.Fatalf("PruneOCICache: %v", err) + } + if res.BytesFreed != wantBytes { + t.Fatalf("BytesFreed = %d, want %d", 
res.BytesFreed, wantBytes) + } + if res.BlobsFreed != wantBlobs { + t.Fatalf("BlobsFreed = %d, want %d", res.BlobsFreed, wantBlobs) + } + if !res.DryRun { + t.Error("result.DryRun = false, want true") + } + // Blobs must still exist. + entries, _ := os.ReadDir(filepath.Join(cacheDir, "blobs", "sha256")) + if len(entries) != wantBlobs { + t.Fatalf("blobs dir: got %d entries, want %d (dry-run must not delete)", len(entries), wantBlobs) + } +} + +func TestPruneOCICacheRemovesAllBlobs(t *testing.T) { + cacheRoot := t.TempDir() + cacheDir := filepath.Join(cacheRoot, "oci") + wantBytes, wantBlobs := seedFakeOCICache(t, cacheDir) + + d := &Daemon{layout: paths.Layout{OCICacheDir: cacheDir}} + wireServices(d) + + res, err := d.img.PruneOCICache(context.Background(), api.ImageCachePruneParams{}) + if err != nil { + t.Fatalf("PruneOCICache: %v", err) + } + if res.BytesFreed != wantBytes { + t.Fatalf("BytesFreed = %d, want %d", res.BytesFreed, wantBytes) + } + if res.BlobsFreed != wantBlobs { + t.Fatalf("BlobsFreed = %d, want %d", res.BlobsFreed, wantBlobs) + } + if res.DryRun { + t.Error("result.DryRun = true on a real prune") + } + // Cache dir must exist (recreated empty) so the next pull has a + // place to write blobs. + info, err := os.Stat(cacheDir) + if err != nil { + t.Fatalf("cache dir gone after prune: %v", err) + } + if !info.IsDir() { + t.Fatal("cache path is not a directory after prune") + } + // Blobs subdir is gone (the rename took everything aside; the + // recreate left only the bare cache dir). + if _, err := os.Stat(filepath.Join(cacheDir, "blobs")); !os.IsNotExist(err) { + t.Fatalf("blobs dir survived prune: %v", err) + } + // Aside dirs must have been cleaned up too. 
+ roots, _ := os.ReadDir(cacheRoot) + for _, e := range roots { + if strings.Contains(e.Name(), ".pruning-") { + t.Errorf("aside dir leaked: %s", e.Name()) + } + } +} + +// TestPruneOCICacheMissingDirIsZeroResult covers the fresh-install +// case: no OCI pulls have ever happened, so the cache dir doesn't +// exist. Prune must report zero, not error. +func TestPruneOCICacheMissingDirIsZeroResult(t *testing.T) { + cacheRoot := t.TempDir() + cacheDir := filepath.Join(cacheRoot, "oci") + // Don't create cacheDir. + + d := &Daemon{layout: paths.Layout{OCICacheDir: cacheDir}} + wireServices(d) + + res, err := d.img.PruneOCICache(context.Background(), api.ImageCachePruneParams{}) + if err != nil { + t.Fatalf("PruneOCICache(missing): %v", err) + } + if res.BytesFreed != 0 || res.BlobsFreed != 0 { + t.Fatalf("missing cache should be zero; got %+v", res) + } +} diff --git a/internal/daemon/image_seed.go b/internal/daemon/image_seed.go index 97f6c34..0b12d97 100644 --- a/internal/daemon/image_seed.go +++ b/internal/daemon/image_seed.go @@ -4,83 +4,85 @@ import ( "context" "fmt" "os" - "path/filepath" "strings" + "time" "banger/internal/guest" "banger/internal/model" "banger/internal/system" ) -func (d *Daemon) seedAuthorizedKeyOnExt4Image(ctx context.Context, imagePath string) (string, error) { - if strings.TrimSpace(d.config.SSHKeyPath) == "" { +func (s *ImageService) seedAuthorizedKeyOnExt4Image(ctx context.Context, imagePath string) (string, error) { + if strings.TrimSpace(s.config.SSHKeyPath) == "" { return "", nil } - fingerprint, err := guest.AuthorizedPublicKeyFingerprint(d.config.SSHKeyPath) + fingerprint, err := guest.AuthorizedPublicKeyFingerprint(s.config.SSHKeyPath) if err != nil { return "", fmt.Errorf("derive authorized ssh key fingerprint: %w", err) } - publicKey, err := guest.AuthorizedPublicKey(d.config.SSHKeyPath) + publicKey, err := guest.AuthorizedPublicKey(s.config.SSHKeyPath) if err != nil { return "", fmt.Errorf("derive authorized ssh key: %w", err) } - 
mountDir, cleanup, err := system.MountTempDir(ctx, d.runner, imagePath, false) - if err != nil { - return "", err - } - defer cleanup() - - if err := d.flattenNestedWorkHome(ctx, mountDir); err != nil { - return "", err - } - - sshDir := filepath.Join(mountDir, ".ssh") - if _, err := d.runner.RunSudo(ctx, "mkdir", "-p", sshDir); err != nil { - return "", err - } - if _, err := d.runner.RunSudo(ctx, "chmod", "700", sshDir); err != nil { - return "", err - } - - authorizedKeysPath := filepath.Join(sshDir, "authorized_keys") - existing, err := d.runner.RunSudo(ctx, "cat", authorizedKeysPath) - if err != nil { - existing = nil - } - merged := mergeAuthorizedKey(existing, publicKey) - tmpFile, err := os.CreateTemp("", "banger-image-authorized-keys-*") - if err != nil { - return "", err - } - tmpPath := tmpFile.Name() - if _, err := tmpFile.Write(merged); err != nil { - _ = tmpFile.Close() - _ = os.Remove(tmpPath) - return "", err - } - if err := tmpFile.Close(); err != nil { - _ = os.Remove(tmpPath) - return "", err - } - defer os.Remove(tmpPath) - if _, err := d.runner.RunSudo(ctx, "install", "-m", "600", tmpPath, authorizedKeysPath); err != nil { + if err := provisionAuthorizedKey(ctx, s.runner, imagePath, publicKey); err != nil { return "", err } return fingerprint, nil } -func (d *Daemon) refreshManagedWorkSeedFingerprint(ctx context.Context, image model.Image, fingerprint string) error { +// refreshManagedWorkSeedFingerprint re-seeds work-seed.ext4 with the +// daemon's current SSH key when a previously-stored fingerprint has +// gone stale (host key rotated, image rebuilt without a new seed). +// +// This path is reachable from concurrent vm.create RPCs: each one +// reads the same stale image.SeededSSHPublicKeyFingerprint from the +// store and races into here. 
Modifying the seed in place via +// e2rm/e2cp is not concurrent-read-safe — peer vm.create calls doing +// `MaterializeWorkDisk` in parallel `RdumpExt4Dir` the seed and +// observe a torn ext4 image ("Superblock checksum does not match"). +// +// Fix: stage the rewrite on a sibling tmpfile and atomic-rename. A +// concurrent reader either has the file open (kernel keeps the +// pre-rename inode alive) or opens after the rename (sees the new +// inode) — never observes a partial state. Two concurrent refreshes +// are idempotent (same daemon, same SSH key) so unique tmp suffixes +// are enough; whichever rename lands last wins, with identical +// content. UpsertImage runs after the rename so the recorded +// fingerprint always matches what's actually on disk for any reader +// that picks up the image record after this point. +func (s *ImageService) refreshManagedWorkSeedFingerprint(ctx context.Context, image model.Image, fingerprint string) error { if !image.Managed || strings.TrimSpace(image.WorkSeedPath) == "" || strings.TrimSpace(fingerprint) == "" { return nil } - seededFingerprint, err := d.seedAuthorizedKeyOnExt4Image(ctx, image.WorkSeedPath) + + // Unique sibling tmp path: same dir guarantees a same-FS rename. + // Two concurrent refreshes get distinct paths so they don't clobber + // each other's tmpfile mid-write. 
+ tmpPath := fmt.Sprintf("%s.refresh.%d-%d.tmp", image.WorkSeedPath, os.Getpid(), time.Now().UnixNano()) + if err := system.CopyFilePreferClone(image.WorkSeedPath, tmpPath); err != nil { + return fmt.Errorf("stage seed for refresh: %w", err) + } + committed := false + defer func() { + if !committed { + _ = os.Remove(tmpPath) + } + }() + + seededFingerprint, err := s.seedAuthorizedKeyOnExt4Image(ctx, tmpPath) + if err != nil { + return err + } + if seededFingerprint == "" || seededFingerprint == image.SeededSSHPublicKeyFingerprint { + return nil + } + + if err := os.Rename(tmpPath, image.WorkSeedPath); err != nil { + return fmt.Errorf("commit seed refresh: %w", err) + } + committed = true + + image.SeededSSHPublicKeyFingerprint = seededFingerprint + image.UpdatedAt = model.Now() - return d.store.UpsertImage(ctx, image) + return s.store.UpsertImage(ctx, image) } diff --git a/internal/daemon/image_service.go b/internal/daemon/image_service.go new file mode 100644 index 0000000..fd0de12 --- /dev/null +++ b/internal/daemon/image_service.go @@ -0,0 +1,194 @@ +package daemon + +import ( + "context" + "fmt" + "log/slog" + "strings" + "sync" + + "banger/internal/imagecat" + "banger/internal/imagepull" + "banger/internal/model" + "banger/internal/paths" + "banger/internal/store" + "banger/internal/system" +) + +// ImageService owns everything image-registry-related: register / +// promote / delete / pull (bundle + OCI), plus the kernel catalog +// operations that share the same lifecycle primitives. The publication +// lock imageOpsMu lives here so its scope is obvious at the field +// definition, and the three OCI-pull test seams (pullAndFlatten, +// finalizePulledRootfs, bundleFetch) are fields on the service rather +// than mutable globals on Daemon. +// +// Most of the surface is kept unexported; where peer services +// (VMService) need access, it goes through consumer-defined +// interfaces, not direct struct poking. 
+type ImageService struct { + runner system.CommandRunner + logger *slog.Logger + config model.DaemonConfig + layout paths.Layout + store *store.Store + + // imageOpsMu is the publication-window lock: held only across the + // "recheck name free + atomic rename + UpsertImage" commit. See + // internal/daemon/ARCHITECTURE.md. + imageOpsMu sync.Mutex + + // kernelPullLocksMu guards the kernelPullLocks map itself. Per-name + // channel locks inside the map serialise concurrent pulls of the + // same kernel ref. Without this, two parallel `vm run` callers + // that auto-pull the same kernel race on + // /var/lib/banger/kernels//manifest.json: one is mid-write + // from kernelcat.Fetch's WriteLocal while the other is reading it + // back, yielding "unexpected end of JSON input". The map keeps + // pulls of *different* kernels parallel. + // + // chan struct{} (cap 1) instead of sync.Mutex: acquire is a + // `select` that respects ctx.Done(), so a peer waiting behind a + // pull whose RPC was cancelled can bail out instead of blocking + // forever on a pull that nobody is consuming. + kernelPullLocksMu sync.Mutex + kernelPullLocks map[string]chan struct{} + + // imagePullLocksMu / imagePullLocks: same per-name pattern for + // image auto-pulls. Without this, parallel `vm.create` callers + // resolving a missing image both run the full OCI fetch + ext4 + // build (each ~minutes), and the loser hits the "image already + // exists" recheck inside publishImage and fails after doing all + // the work for nothing. Locking around the FindImage-recheck + + // PullImage section means only one caller does the heavy work + // per image name; peers see the freshly-published image on the + // post-lock recheck. + imagePullLocksMu sync.Mutex + imagePullLocks map[string]chan struct{} + + // Test seams; nil → real implementation. 
+	pullAndFlatten       func(ctx context.Context, ref, cacheDir, destDir string) (imagepull.Metadata, error)
+	finalizePulledRootfs func(ctx context.Context, ext4File string, meta imagepull.Metadata) error
+	bundleFetch          func(ctx context.Context, destDir string, entry imagecat.CatEntry) (imagecat.Manifest, error)
+	workSeedBuilder      func(ctx context.Context, rootfsExt4, outPath string) error
+
+	// beginOperation is a test seam used by a couple of image ops that
+	// want structured operation logging. Nil → Daemon's beginOperation,
+	// injected at construction.
+	beginOperation func(ctx context.Context, name string, attrs ...any) *operationLog
+}
+
+// imageServiceDeps names every handle ImageService needs from the
+// Daemon composition root. Using a struct (rather than positional args)
+// makes the wiring site in Daemon.Open read as a declaration.
+type imageServiceDeps struct {
+	runner         system.CommandRunner
+	logger         *slog.Logger
+	config         model.DaemonConfig
+	layout         paths.Layout
+	store          *store.Store
+	beginOperation func(ctx context.Context, name string, attrs ...any) *operationLog
+}
+
+func newImageService(deps imageServiceDeps) *ImageService {
+	return &ImageService{
+		runner:         deps.runner,
+		logger:         deps.logger,
+		config:         deps.config,
+		layout:         deps.layout,
+		store:          deps.store,
+		beginOperation: deps.beginOperation,
+	}
+}
+
+// acquireKernelPullLock blocks until the per-name lock for `name` is
+// free or ctx is cancelled. On success it returns a release func that
+// the caller must invoke (typically via defer). On ctx cancellation it
+// returns ctx.Err() and a nil release. The map entry is created on
+// first access and lives for the daemon's lifetime — kernels rarely
+// churn, and keeping the entry around keeps the second-acquire path
+// branchless.
+func (s *ImageService) acquireKernelPullLock(ctx context.Context, name string) (func(), error) {
+	ch := s.kernelPullLockChan(name)
+	select {
+	case ch <- struct{}{}:
+		return func() { <-ch }, nil
+	case <-ctx.Done():
+		return nil, ctx.Err()
+	}
+}
+
+func (s *ImageService) kernelPullLockChan(name string) chan struct{} {
+	s.kernelPullLocksMu.Lock()
+	defer s.kernelPullLocksMu.Unlock()
+	if s.kernelPullLocks == nil {
+		s.kernelPullLocks = make(map[string]chan struct{})
+	}
+	ch, ok := s.kernelPullLocks[name]
+	if !ok {
+		ch = make(chan struct{}, 1)
+		s.kernelPullLocks[name] = ch
+	}
+	return ch
+}
+
+// acquireImagePullLock is the image-name peer of acquireKernelPullLock;
+// same semantics and lifetime.
+func (s *ImageService) acquireImagePullLock(ctx context.Context, name string) (func(), error) {
+	ch := s.imagePullLockChan(name)
+	select {
+	case ch <- struct{}{}:
+		return func() { <-ch }, nil
+	case <-ctx.Done():
+		return nil, ctx.Err()
+	}
+}
+
+func (s *ImageService) imagePullLockChan(name string) chan struct{} {
+	s.imagePullLocksMu.Lock()
+	defer s.imagePullLocksMu.Unlock()
+	if s.imagePullLocks == nil {
+		s.imagePullLocks = make(map[string]chan struct{})
+	}
+	ch, ok := s.imagePullLocks[name]
+	if !ok {
+		ch = make(chan struct{}, 1)
+		s.imagePullLocks[name] = ch
+	}
+	return ch
+}
+
+// FindImage is the service-owned lookup helper. It falls back from
+// exact-name → exact-id → prefix match, matching the historical
+// daemon.FindImage behaviour. Kept on ImageService because image
+// lookup is inherently a service concern.
+func (s *ImageService) FindImage(ctx context.Context, idOrName string) (model.Image, error) {
+	if idOrName == "" {
+		return model.Image{}, fmt.Errorf("image id or name is required")
+	}
+	if image, err := s.store.GetImageByName(ctx, idOrName); err == nil {
+		return image, nil
+	}
+	if image, err := s.store.GetImageByID(ctx, idOrName); err == nil {
+		return image, nil
+	}
+	images, err := s.store.ListImages(ctx)
+	if err != nil {
+		return model.Image{}, err
+	}
+	matchCount := 0
+	var match model.Image
+	for _, image := range images {
+		if strings.HasPrefix(image.ID, idOrName) || strings.HasPrefix(image.Name, idOrName) {
+			match = image
+			matchCount++
+		}
+	}
+	if matchCount == 1 {
+		return match, nil
+	}
+	if matchCount > 1 {
+		return model.Image{}, fmt.Errorf("multiple images match %q", idOrName)
+	}
+	return model.Image{}, fmt.Errorf("image %q not found", idOrName)
+}
diff --git a/internal/daemon/imagebuild.go b/internal/daemon/imagebuild.go
deleted file mode 100644
index fbff27b..0000000
--- a/internal/daemon/imagebuild.go
+++ /dev/null
@@ -1,432 +0,0 @@
-package daemon
-
-import (
-	"bytes"
-	"context"
-	"errors"
-	"fmt"
-	"io"
-	"os"
-	"path/filepath"
-	"strings"
-	"time"
-
-	"banger/internal/firecracker"
-	"banger/internal/guest"
-	"banger/internal/guestnet"
-	"banger/internal/hostnat"
-	"banger/internal/imagepreset"
-	"banger/internal/model"
-	"banger/internal/opencode"
-	"banger/internal/system"
-	"banger/internal/vsockagent"
-)
-
-const (
-	defaultMiseVersion      = "v2025.12.0"
-	defaultMiseInstallPath  = "/usr/local/bin/mise"
-	defaultMiseActivateLine = `eval "$(/usr/local/bin/mise activate bash)"`
-	defaultOpenCodeTool     = "github:anomalyco/opencode"
-	defaultTPMRepo          = "https://github.com/tmux-plugins/tpm"
-	defaultResurrectRepo    = "https://github.com/tmux-plugins/tmux-resurrect"
-	defaultContinuumRepo    = "https://github.com/tmux-plugins/tmux-continuum"
-	defaultTMUXPluginDir    = "/root/.tmux/plugins"
-	defaultTMUXResurrectDir = "/root/.tmux/resurrect"
-
tmuxManagedBlockStart = "# >>> banger tmux plugins >>>" - tmuxManagedBlockEnd = "# <<< banger tmux plugins <<<" -) - -type imageBuildSpec struct { - ID string - Name string - SourceRootfs string - RootfsPath string - BuildLog io.Writer - KernelPath string - InitrdPath string - ModulesDir string - Packages []string - InstallDocker bool - Size string -} - -type imageBuildVM struct { - Name string - GuestIP string - TapDevice string - APISock string - PID int -} - -func (d *Daemon) runImageBuild(ctx context.Context, spec imageBuildSpec) error { - if d.imageBuild != nil { - return d.imageBuild(ctx, spec) - } - return d.runImageBuildNative(ctx, spec) -} - -func (d *Daemon) runImageBuildNative(ctx context.Context, spec imageBuildSpec) (err error) { - if err := system.CopyFilePreferClone(spec.SourceRootfs, spec.RootfsPath); err != nil { - return err - } - if spec.Size != "" { - if err := resizeRootfs(spec.SourceRootfs, spec.RootfsPath, spec.Size); err != nil { - return err - } - } - - vm, cleanup, err := d.startImageBuildVM(ctx, spec) - if err != nil { - return err - } - defer func() { - cleanupErr := cleanup(context.Background()) - if cleanupErr != nil { - err = errors.Join(err, cleanupErr) - } - }() - - sshAddress := vm.GuestIP + ":22" - if _, err := fmt.Fprintf(spec.BuildLog, "[image.build] waiting for ssh on %s\n", sshAddress); err != nil { - return err - } - waitCtx, cancel := context.WithTimeout(ctx, 60*time.Second) - defer cancel() - if err := guest.WaitForSSH(waitCtx, sshAddress, d.config.SSHKeyPath, time.Second); err != nil { - return err - } - - client, err := guest.Dial(ctx, sshAddress, d.config.SSHKeyPath) - if err != nil { - return err - } - defer client.Close() - authorizedKey, err := guest.AuthorizedPublicKey(d.config.SSHKeyPath) - if err != nil { - return err - } - - vsockAgentPath, err := d.vsockAgentBinary() - if err != nil { - return err - } - helperBytes, err := os.ReadFile(vsockAgentPath) - if err != nil { - return err - } - if err := 
writeBuildLog(spec.BuildLog, "installing vsock agent"); err != nil { - return err - } - if err := client.UploadFile(ctx, vsockagent.GuestInstallPath, 0o755, helperBytes, spec.BuildLog); err != nil { - return err - } - if err := writeBuildLog(spec.BuildLog, "configuring guest"); err != nil { - return err - } - if err := client.RunScript(ctx, buildProvisionScript(vm.Name, d.config.DefaultDNS, string(authorizedKey), spec.Packages, spec.InstallDocker), spec.BuildLog); err != nil { - return err - } - if strings.TrimSpace(spec.ModulesDir) != "" { - if err := writeBuildLog(spec.BuildLog, "copying kernel modules"); err != nil { - return err - } - if err := client.StreamTar(ctx, spec.ModulesDir, buildModulesCommand(filepath.Base(spec.ModulesDir)), spec.BuildLog); err != nil { - return err - } - } - if err := writeBuildLog(spec.BuildLog, "shutting down guest"); err != nil { - return err - } - if err := client.RunScript(ctx, "set -e\nsync\n", spec.BuildLog); err != nil { - return err - } - return d.shutdownImageBuildVM(ctx, vm) -} - -func resizeRootfs(baseRootfs, rootfsPath, sizeSpec string) error { - sizeBytes, err := model.ParseSize(sizeSpec) - if err != nil { - return err - } - info, err := os.Stat(baseRootfs) - if err != nil { - return err - } - if sizeBytes < info.Size() { - return fmt.Errorf("size must be >= base image size") - } - return system.ResizeExt4Image(context.Background(), system.NewRunner(), rootfsPath, sizeBytes) -} - -func (d *Daemon) startImageBuildVM(ctx context.Context, spec imageBuildSpec) (imageBuildVM, func(context.Context) error, error) { - if err := d.ensureBridge(ctx); err != nil { - return imageBuildVM{}, nil, err - } - if err := d.ensureSocketDir(); err != nil { - return imageBuildVM{}, nil, err - } - fcPath, err := d.firecrackerBinary() - if err != nil { - return imageBuildVM{}, nil, err - } - - shortID := system.ShortID(spec.ID) - guestIP, err := d.store.NextGuestIP(ctx, bridgePrefix(d.config.BridgeIP)) - if err != nil { - return 
imageBuildVM{}, nil, err - } - vm := imageBuildVM{ - Name: "image-build-" + shortID, - GuestIP: guestIP, - TapDevice: "tap-img-" + shortID, - APISock: filepath.Join(d.layout.RuntimeDir, "img-"+shortID+".sock"), - } - if err := os.RemoveAll(vm.APISock); err != nil && !os.IsNotExist(err) { - return imageBuildVM{}, nil, err - } - if err := d.createTap(ctx, vm.TapDevice); err != nil { - return imageBuildVM{}, nil, err - } - if err := hostnat.Ensure(ctx, d.runner, vm.GuestIP, vm.TapDevice, true); err != nil { - _, _ = d.runner.RunSudo(ctx, "ip", "link", "del", vm.TapDevice) - return imageBuildVM{}, nil, err - } - - firecrackerCtx := context.Background() - machine, err := firecracker.NewMachine(firecrackerCtx, firecracker.MachineConfig{ - BinaryPath: fcPath, - VMID: spec.ID, - SocketPath: vm.APISock, - LogPath: spec.RootfsPath + ".firecracker.log", - MetricsPath: filepath.Join(filepath.Dir(spec.RootfsPath), "metrics.json"), - KernelImagePath: spec.KernelPath, - InitrdPath: spec.InitrdPath, - KernelArgs: system.BuildBootArgsWithKernelIP(vm.Name, vm.GuestIP, d.config.BridgeIP, d.config.DefaultDNS), - Drives: []firecracker.DriveConfig{{ - ID: "rootfs", - Path: spec.RootfsPath, - ReadOnly: false, - IsRoot: true, - }}, - TapDevice: vm.TapDevice, - VCPUCount: model.DefaultVCPUCount, - MemoryMiB: model.DefaultMemoryMiB, - Logger: d.logger, - }) - if err != nil { - _ = hostnat.Ensure(ctx, d.runner, vm.GuestIP, vm.TapDevice, false) - _, _ = d.runner.RunSudo(ctx, "ip", "link", "del", vm.TapDevice) - return imageBuildVM{}, nil, err - } - if err := machine.Start(firecrackerCtx); err != nil { - _ = hostnat.Ensure(ctx, d.runner, vm.GuestIP, vm.TapDevice, false) - _, _ = d.runner.RunSudo(ctx, "ip", "link", "del", vm.TapDevice) - return imageBuildVM{}, nil, err - } - vm.PID = d.resolveFirecrackerPID(firecrackerCtx, machine, vm.APISock) - if err := d.ensureSocketAccess(ctx, vm.APISock, "firecracker api socket"); err != nil { - _ = d.killVMProcess(context.Background(), vm.PID) - _ = 
hostnat.Ensure(ctx, d.runner, vm.GuestIP, vm.TapDevice, false) - _, _ = d.runner.RunSudo(ctx, "ip", "link", "del", vm.TapDevice) - return imageBuildVM{}, nil, err - } - - cleanup := func(cleanupCtx context.Context) error { - if vm.PID > 0 && system.ProcessRunning(vm.PID, vm.APISock) { - _ = d.killVMProcess(cleanupCtx, vm.PID) - _ = d.waitForExit(cleanupCtx, vm.PID, vm.APISock, 10*time.Second) - } - _ = hostnat.Ensure(cleanupCtx, d.runner, vm.GuestIP, vm.TapDevice, false) - if vm.TapDevice != "" { - _, _ = d.runner.RunSudo(cleanupCtx, "ip", "link", "del", vm.TapDevice) - } - if vm.APISock != "" { - _ = os.Remove(vm.APISock) - } - return nil - } - return vm, cleanup, nil -} - -func (d *Daemon) shutdownImageBuildVM(ctx context.Context, vm imageBuildVM) error { - buildVM := model.VMRecord{Runtime: model.VMRuntime{APISockPath: vm.APISock}} - if err := d.sendCtrlAltDel(ctx, buildVM); err != nil { - return err - } - return d.waitForExit(ctx, vm.PID, vm.APISock, 15*time.Second) -} - -func buildProvisionScript(vmName, dnsServer, authorizedKey string, packages []string, installDocker bool) string { - var script bytes.Buffer - script.WriteString("set -euo pipefail\n") - fmt.Fprintf(&script, "printf 'nameserver %%s\\n' %s > /etc/resolv.conf\n", shellQuote(dnsServer)) - fmt.Fprintf(&script, "printf '%%s\\n' %s > /etc/hostname\n", shellQuote(vmName)) - fmt.Fprintf(&script, "printf '127.0.0.1 localhost\\n127.0.1.1 %%s\\n' %s > /etc/hosts\n", shellQuote(vmName)) - script.WriteString("touch /etc/fstab\n") - script.WriteString("sed -i '\\|^/dev/vdb[[:space:]]\\+/home[[:space:]]|d; \\|^/dev/vdc[[:space:]]\\+/var[[:space:]]|d' /etc/fstab\n") - script.WriteString("if ! grep -q '^tmpfs /run ' /etc/fstab; then echo 'tmpfs /run tmpfs defaults,nodev,nosuid,mode=0755 0 0' >> /etc/fstab; fi\n") - script.WriteString("if ! 
grep -q '^tmpfs /tmp ' /etc/fstab; then echo 'tmpfs /tmp tmpfs defaults,nodev,nosuid,mode=1777 0 0' >> /etc/fstab; fi\n") - appendAuthorizedKeySetup(&script, authorizedKey) - script.WriteString("apt-get update\n") - script.WriteString("DEBIAN_FRONTEND=noninteractive apt-get -y upgrade\n") - fmt.Fprintf(&script, "PACKAGES=%s\n", shellArray(packages)) - script.WriteString("DEBIAN_FRONTEND=noninteractive apt-get -y install \"${PACKAGES[@]}\"\n") - appendGuestNetworkSetup(&script) - appendMiseSetup(&script) - appendOpenCodeServiceSetup(&script) - appendTmuxSetup(&script) - appendVSockPingSetup(&script) - if installDocker { - script.WriteString("DEBIAN_FRONTEND=noninteractive apt-get -y remove containerd || true\n") - script.WriteString("if ! DEBIAN_FRONTEND=noninteractive apt-get -y install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin; then\n") - script.WriteString(" DEBIAN_FRONTEND=noninteractive apt-get -y install docker.io\n") - script.WriteString("fi\n") - script.WriteString("if command -v systemctl >/dev/null 2>&1; then systemctl enable --now docker || true; fi\n") - } - appendGuestCleanup(&script) - script.WriteString("git config --system init.defaultBranch main\n") - return script.String() -} - -func appendAuthorizedKeySetup(script *bytes.Buffer, authorizedKey string) { - script.WriteString("mkdir -p /root/.ssh\n") - script.WriteString("chmod 700 /root/.ssh\n") - script.WriteString("cat > /root/.ssh/authorized_keys <<'EOF'\n") - script.WriteString(strings.TrimSpace(authorizedKey)) - script.WriteString("\nEOF\n") - script.WriteString("chmod 600 /root/.ssh/authorized_keys\n") -} - -func buildModulesCommand(modulesBase string) string { - return fmt.Sprintf("bash -se <<'EOF'\nset -euo pipefail\nmkdir -p /lib/modules\ntar -C /lib/modules -xf -\ndepmod -a %s\nmkdir -p /etc/modules-load.d\nprintf 'nf_tables\\nnft_chain_nat\\nveth\\nbr_netfilter\\noverlay\\n' > /etc/modules-load.d/docker-netfilter.conf\nmkdir -p /etc/sysctl.d\ncat > 
/etc/sysctl.d/99-docker.conf <<'SYSCTL'\nnet.bridge.bridge-nf-call-iptables = 1\nnet.bridge.bridge-nf-call-ip6tables = 1\nnet.ipv4.ip_forward = 1\nSYSCTL\nsysctl --system >/dev/null 2>&1 || true\nEOF", shellQuote(modulesBase)) -} - -func appendMiseSetup(script *bytes.Buffer) { - fmt.Fprintf(script, "curl -fsSL https://mise.run | MISE_INSTALL_PATH=%s MISE_VERSION=%s sh\n", shellQuote(defaultMiseInstallPath), shellQuote(defaultMiseVersion)) - fmt.Fprintf(script, "%s use -g %s\n", shellQuote(defaultMiseInstallPath), shellQuote(defaultOpenCodeTool)) - fmt.Fprintf(script, "%s reshim\n", shellQuote(defaultMiseInstallPath)) - fmt.Fprintf(script, "if [[ ! -e %s ]]; then echo 'opencode shim not found after mise install' >&2; exit 1; fi\n", shellQuote(opencode.ShimPath)) - fmt.Fprintf(script, "ln -snf %s %s\n", shellQuote(opencode.ShimPath), shellQuote(opencode.GuestBinaryPath)) - script.WriteString("mkdir -p /etc/profile.d\n") - script.WriteString("cat > /etc/profile.d/mise.sh <<'EOF'\n") - fmt.Fprintf(script, "if [ -n \"${BASH_VERSION:-}\" ] && [ -x %s ]; then\n", shellQuote(defaultMiseInstallPath)) - fmt.Fprintf(script, " %s\n", defaultMiseActivateLine) - script.WriteString("fi\n") - script.WriteString("EOF\n") - script.WriteString("chmod 0644 /etc/profile.d/mise.sh\n") - appendLineIfMissing(script, "/etc/bash.bashrc", defaultMiseActivateLine) -} - -func appendGuestNetworkSetup(script *bytes.Buffer) { - script.WriteString("mkdir -p /usr/local/libexec /etc/systemd/system\n") - script.WriteString("cat > " + guestnet.GuestScriptPath + " <<'EOF'\n") - script.WriteString(guestnet.BootstrapScript()) - script.WriteString("EOF\n") - script.WriteString("chmod 0755 " + guestnet.GuestScriptPath + "\n") - script.WriteString("cat > /etc/systemd/system/" + guestnet.SystemdServiceName + " <<'EOF'\n") - script.WriteString(guestnet.SystemdServiceUnit()) - script.WriteString("EOF\n") - script.WriteString("chmod 0644 /etc/systemd/system/" + guestnet.SystemdServiceName + "\n") - 
script.WriteString("if command -v systemctl >/dev/null 2>&1; then systemctl daemon-reload || true; systemctl enable --now " + guestnet.SystemdServiceName + " || true; fi\n") -} - -func appendOpenCodeServiceSetup(script *bytes.Buffer) { - script.WriteString("mkdir -p /etc/systemd/system\n") - script.WriteString("cat > /etc/systemd/system/" + opencode.ServiceName + " <<'EOF'\n") - script.WriteString(opencode.ServiceUnit()) - script.WriteString("EOF\n") - script.WriteString("chmod 0644 /etc/systemd/system/" + opencode.ServiceName + "\n") - script.WriteString("if command -v systemctl >/dev/null 2>&1; then systemctl daemon-reload || true; systemctl enable --now " + opencode.ServiceName + " || true; fi\n") -} - -func appendTmuxSetup(script *bytes.Buffer) { - fmt.Fprintf(script, "TMUX_PLUGIN_DIR=%s\n", shellQuote(defaultTMUXPluginDir)) - fmt.Fprintf(script, "TMUX_RESURRECT_DIR=%s\n", shellQuote(defaultTMUXResurrectDir)) - script.WriteString("mkdir -p \"$TMUX_PLUGIN_DIR\" \"$TMUX_RESURRECT_DIR\"\n") - appendGitRepo(script, "$TMUX_PLUGIN_DIR/tpm", defaultTPMRepo) - appendGitRepo(script, "$TMUX_PLUGIN_DIR/tmux-resurrect", defaultResurrectRepo) - appendGitRepo(script, "$TMUX_PLUGIN_DIR/tmux-continuum", defaultContinuumRepo) - script.WriteString("TMUX_CONF=/root/.tmux.conf\n") - fmt.Fprintf(script, "TMUX_MANAGED_START=%s\n", shellQuote(tmuxManagedBlockStart)) - fmt.Fprintf(script, "TMUX_MANAGED_END=%s\n", shellQuote(tmuxManagedBlockEnd)) - script.WriteString("tmp_tmux_conf=$(mktemp)\n") - script.WriteString("if [[ -f \"$TMUX_CONF\" ]]; then\n") - script.WriteString(" awk -v begin=\"$TMUX_MANAGED_START\" -v end=\"$TMUX_MANAGED_END\" '$0 == begin { skip = 1; next } $0 == end { skip = 0; next } !skip { print }' \"$TMUX_CONF\" > \"$tmp_tmux_conf\"\n") - script.WriteString("else\n") - script.WriteString(" : > \"$tmp_tmux_conf\"\n") - script.WriteString("fi\n") - script.WriteString("if [[ -s \"$tmp_tmux_conf\" ]]; then\n") - script.WriteString(" printf '\\n' >> 
\"$tmp_tmux_conf\"\n") - script.WriteString("fi\n") - script.WriteString("cat >> \"$tmp_tmux_conf\" <<'EOF'\n") - script.WriteString(tmuxManagedBlockStart + "\n") - script.WriteString("set -g @plugin 'tmux-plugins/tpm'\n") - script.WriteString("set -g @plugin 'tmux-plugins/tmux-resurrect'\n") - script.WriteString("set -g @plugin 'tmux-plugins/tmux-continuum'\n") - script.WriteString("set -g @continuum-save-interval '15'\n") - script.WriteString("set -g @continuum-restore 'off'\n") - script.WriteString("set -g @resurrect-dir '/root/.tmux/resurrect'\n") - script.WriteString("run '~/.tmux/plugins/tpm/tpm'\n") - script.WriteString(tmuxManagedBlockEnd + "\n") - script.WriteString("EOF\n") - script.WriteString("mv \"$tmp_tmux_conf\" \"$TMUX_CONF\"\n") - script.WriteString("chmod 0644 \"$TMUX_CONF\"\n") -} - -func appendVSockPingSetup(script *bytes.Buffer) { - script.WriteString("mkdir -p /etc/modules-load.d /etc/systemd/system\n") - script.WriteString("cat > /etc/modules-load.d/banger-vsock.conf <<'EOF'\n") - script.WriteString(vsockagent.ModulesLoadConfig()) - script.WriteString("EOF\n") - script.WriteString("chmod 0644 /etc/modules-load.d/banger-vsock.conf\n") - script.WriteString("cat > /etc/systemd/system/" + vsockagent.ServiceName + " <<'EOF'\n") - script.WriteString(vsockagent.ServiceUnit()) - script.WriteString("EOF\n") - script.WriteString("chmod 0644 /etc/systemd/system/" + vsockagent.ServiceName + "\n") - script.WriteString("if command -v systemctl >/dev/null 2>&1; then systemctl daemon-reload || true; systemctl enable --now " + vsockagent.ServiceName + " || true; fi\n") -} - -func appendGitRepo(script *bytes.Buffer, dir, repo string) { - fmt.Fprintf(script, "if [[ -d \"%s/.git\" ]]; then\n", dir) - fmt.Fprintf(script, " git -C \"%s\" fetch --depth 1 origin\n", dir) - fmt.Fprintf(script, " git -C \"%s\" reset --hard FETCH_HEAD\n", dir) - script.WriteString("else\n") - fmt.Fprintf(script, " rm -rf \"%s\"\n", dir) - fmt.Fprintf(script, " git clone --depth 1 %s 
\"%s\"\n", shellQuote(repo), dir) - script.WriteString("fi\n") -} - -func appendGuestCleanup(script *bytes.Buffer) { - script.WriteString("rm -f /root/get-docker /root/get-docker.sh /tmp/get-docker /tmp/get-docker.sh\n") -} - -func appendLineIfMissing(script *bytes.Buffer, path, line string) { - fmt.Fprintf(script, "touch %s\n", shellQuote(path)) - fmt.Fprintf(script, "if ! grep -Fqx %s %s; then\n", shellQuote(line), shellQuote(path)) - fmt.Fprintf(script, " printf '\\n%%s\\n' %s >> %s\n", shellQuote(line), shellQuote(path)) - script.WriteString("fi\n") -} - -func shellArray(values []string) string { - quoted := make([]string, 0, len(values)) - for _, value := range values { - quoted = append(quoted, shellQuote(value)) - } - return "(" + strings.Join(quoted, " ") + ")" -} - -func shellQuote(value string) string { - return "'" + strings.ReplaceAll(value, "'", `'"'"'`) + "'" -} - -func writeBuildLog(w io.Writer, message string) error { - if w == nil { - return nil - } - _, err := fmt.Fprintf(w, "[image.build] %s\n", message) - return err -} - -func packagesHash(lines []string) string { - return imagepreset.Hash(lines) -} diff --git a/internal/daemon/imagebuild_test.go b/internal/daemon/imagebuild_test.go deleted file mode 100644 index 3a42612..0000000 --- a/internal/daemon/imagebuild_test.go +++ /dev/null @@ -1,54 +0,0 @@ -package daemon - -import ( - "strings" - "testing" -) - -func TestBuildProvisionScriptInstallsDefaultTools(t *testing.T) { - t.Parallel() - - script := buildProvisionScript("devbox", "1.1.1.1", "ssh-ed25519 AAAATESTKEY banger", []string{"git", "curl"}, false) - for _, snippet := range []string{ - "mkdir -p /root/.ssh", - "cat > /root/.ssh/authorized_keys <<'EOF'", - "ssh-ed25519 AAAATESTKEY banger", - "cat > /usr/local/libexec/banger-network-bootstrap <<'EOF'", - "ip addr replace \"$guest_ip/$prefix\" dev \"$iface\"", - "cat > /etc/systemd/system/banger-network.service <<'EOF'", - "systemctl enable --now banger-network.service || true", - "curl 
-fsSL https://mise.run | MISE_INSTALL_PATH='/usr/local/bin/mise' MISE_VERSION='v2025.12.0' sh", - "'/usr/local/bin/mise' use -g 'github:anomalyco/opencode'", - "'/usr/local/bin/mise' reshim", - "if [[ ! -e '/root/.local/share/mise/shims/opencode' ]]; then echo 'opencode shim not found after mise install' >&2; exit 1; fi", - "ln -snf '/root/.local/share/mise/shims/opencode' '/usr/local/bin/opencode'", - "cat > /etc/profile.d/mise.sh <<'EOF'", - "if [ -n \"${BASH_VERSION:-}\" ] && [ -x '/usr/local/bin/mise' ]; then", - `eval "$(/usr/local/bin/mise activate bash)"`, - `if ! grep -Fqx 'eval "$(/usr/local/bin/mise activate bash)"' '/etc/bash.bashrc'; then`, - "cat > /etc/systemd/system/banger-opencode.service <<'EOF'", - "RequiresMountsFor=/root", - "ExecStart=/usr/local/bin/opencode serve --hostname 0.0.0.0 --port 4096", - "systemctl enable --now banger-opencode.service || true", - `git clone --depth 1 'https://github.com/tmux-plugins/tpm' "$TMUX_PLUGIN_DIR/tpm"`, - `git clone --depth 1 'https://github.com/tmux-plugins/tmux-resurrect' "$TMUX_PLUGIN_DIR/tmux-resurrect"`, - `git clone --depth 1 'https://github.com/tmux-plugins/tmux-continuum' "$TMUX_PLUGIN_DIR/tmux-continuum"`, - "# >>> banger tmux plugins >>>", - "set -g @plugin 'tmux-plugins/tmux-resurrect'", - "set -g @plugin 'tmux-plugins/tmux-continuum'", - "set -g @continuum-save-interval '15'", - "set -g @continuum-restore 'off'", - "set -g @resurrect-dir '/root/.tmux/resurrect'", - "run '~/.tmux/plugins/tpm/tpm'", - "cat > /etc/modules-load.d/banger-vsock.conf <<'EOF'", - "vmw_vsock_virtio_transport", - "cat > /etc/systemd/system/banger-vsock-agent.service <<'EOF'", - "ExecStart=/usr/local/bin/banger-vsock-agent", - "systemctl enable --now banger-vsock-agent.service || true", - "rm -f /root/get-docker /root/get-docker.sh /tmp/get-docker /tmp/get-docker.sh", - } { - if !strings.Contains(script, snippet) { - t.Fatalf("buildProvisionScript missing snippet %q\nscript:\n%s", snippet, script) - } - } -} diff --git 
a/internal/daemon/imagemgr/paths.go b/internal/daemon/imagemgr/paths.go
new file mode 100644
index 0000000..22f4b03
--- /dev/null
+++ b/internal/daemon/imagemgr/paths.go
@@ -0,0 +1,145 @@
+// Package imagemgr contains the pure helpers of the banger image subsystem:
+// path validators, artifact staging, managed-image metadata, and the guest
+// provisioning script generator used by image build.
+//
+// The orchestrator methods (BuildImage, RegisterImage, PromoteImage,
+// DeleteImage) still live in the daemon package and compose these helpers.
+package imagemgr
+
+import (
+	"context"
+	"crypto/sha256"
+	"fmt"
+	"os"
+	"path/filepath"
+	"strings"
+
+	"banger/internal/system"
+)
+
+// debianBasePackages is the apt package list applied by
+// `image build --from-image` to Debian-based managed rootfses. Small
+// curated set: most of the developer tooling the golden image ships
+// lives in the Dockerfile, not here.
+var debianBasePackages = []string{
+	"make",
+	"git",
+	"less",
+	"tree",
+	"ca-certificates",
+	"curl",
+	"wget",
+	"iproute2",
+	"vim",
+	"tmux",
+}
+
+// DebianBasePackages returns a copy of the base package set.
+func DebianBasePackages() []string {
+	return append([]string(nil), debianBasePackages...)
+}
+
+// hashPackages returns the hex sha256 of the package list, used as
+// drift-detection metadata alongside a built rootfs.
+func hashPackages(lines []string) string {
+	sum := sha256.Sum256([]byte(strings.Join(lines, "\n") + "\n"))
+	return fmt.Sprintf("%x", sum)
+}
+
+// ValidateRegisterPaths checks that rootfs + kernel exist and that optional
+// artifacts, when provided, also exist.
+func ValidateRegisterPaths(rootfsPath, workSeedPath, kernelPath, initrdPath, modulesDir string) error {
+	checks := system.NewPreflight()
+	checks.RequireFile(rootfsPath, "rootfs image", `pass --rootfs `)
+	if workSeedPath != "" {
+		checks.RequireFile(workSeedPath, "work-seed image", `pass --work-seed or rebuild the image with a work seed`)
+	}
+	addKernelChecks(checks, kernelPath, initrdPath, modulesDir)
+	return checks.Err("image register failed")
+}
+
+// ValidateKernelPaths checks the kernel triple alone; it is used by flows
+// (e.g. image pull) that produce the rootfs themselves.
+func ValidateKernelPaths(kernelPath, initrdPath, modulesDir string) error {
+	checks := system.NewPreflight()
+	addKernelChecks(checks, kernelPath, initrdPath, modulesDir)
+	return checks.Err("kernel preflight failed")
+}
+
+func addKernelChecks(checks *system.Preflight, kernelPath, initrdPath, modulesDir string) {
+	checks.RequireFile(kernelPath, "kernel image", `pass --kernel `)
+	if initrdPath != "" {
+		checks.RequireFile(initrdPath, "initrd image", `pass --initrd `)
+	}
+	if modulesDir != "" {
+		checks.RequireDir(modulesDir, "kernel modules dir", `pass --modules `)
+	}
+}
+
+// ValidatePromotePaths checks that an existing registered image's artifacts
+// are still present before promoting it to daemon-owned storage.
+func ValidatePromotePaths(rootfsPath, kernelPath, initrdPath, modulesDir string) error {
+	checks := system.NewPreflight()
+	checks.RequireFile(rootfsPath, "rootfs image", `re-register the image with a valid rootfs`)
+	checks.RequireFile(kernelPath, "kernel image", `re-register the image with a valid kernel`)
+	if initrdPath != "" {
+		checks.RequireFile(initrdPath, "initrd image", `re-register the image with a valid initrd`)
+	}
+	if modulesDir != "" {
+		checks.RequireDir(modulesDir, "kernel modules dir", `re-register the image with a valid modules dir`)
+	}
+	return checks.Err("image promote failed")
+}
+
+// StageBootArtifacts copies kernel/initrd/modules into artifactDir and
+// returns the staged paths. initrd and modules are optional; an empty
+// source yields an empty staged path.
+func StageBootArtifacts(ctx context.Context, runner system.CommandRunner, artifactDir, kernelSource, initrdSource, modulesSource string) (string, string, string, error) {
+	kernelPath := filepath.Join(artifactDir, "kernel")
+	if err := system.CopyFilePreferClone(kernelSource, kernelPath); err != nil {
+		return "", "", "", err
+	}
+	initrdPath := ""
+	if strings.TrimSpace(initrdSource) != "" {
+		initrdPath = filepath.Join(artifactDir, "initrd.img")
+		if err := system.CopyFilePreferClone(initrdSource, initrdPath); err != nil {
+			return "", "", "", err
+		}
+	}
+	modulesDir := ""
+	if strings.TrimSpace(modulesSource) != "" {
+		modulesDir = filepath.Join(artifactDir, "modules")
+		if err := os.MkdirAll(modulesDir, 0o755); err != nil {
+			return "", "", "", err
+		}
+		if err := system.CopyDirContents(ctx, runner, modulesSource, modulesDir, false); err != nil {
+			return "", "", "", err
+		}
+	}
+	return kernelPath, initrdPath, modulesDir, nil
+}
+
+// StageOptionalArtifactPath returns the destination path for an optional
+// artifact in artifactDir, or "" when stagedPath is empty (artifact absent).
+func StageOptionalArtifactPath(artifactDir, stagedPath, name string) string {
+	if strings.TrimSpace(stagedPath) == "" {
+		return ""
+	}
+	return filepath.Join(artifactDir, name)
+}
+
+// BuildMetadataPackages returns the canonical package set recorded for a
+// managed image build.
+func BuildMetadataPackages() []string {
+	return DebianBasePackages()
+}
+
+// WritePackagesMetadata writes the hash of packages next to rootfsPath so
+// future builds can detect drift. An empty packages list or an empty
+// rootfsPath is a no-op.
+func WritePackagesMetadata(rootfsPath string, packages []string) error {
+	if rootfsPath == "" || len(packages) == 0 {
+		return nil
+	}
+	metadataPath := rootfsPath + ".packages.sha256"
+	return os.WriteFile(metadataPath, []byte(hashPackages(packages)+"\n"), 0o644)
+}
diff --git a/internal/daemon/imagemgr/paths_test.go b/internal/daemon/imagemgr/paths_test.go
new file mode 100644
index 0000000..668eb8a
--- /dev/null
+++ b/internal/daemon/imagemgr/paths_test.go
@@ -0,0 +1,169 @@
+package imagemgr
+
+import (
+	"crypto/sha256"
+	"fmt"
+	"os"
+	"path/filepath"
+	"strings"
+	"testing"
+)
+
+// TestDebianBasePackagesReturnsCopy pins the contract that mutating the
+// slice returned by DebianBasePackages() can't poison subsequent calls.
+// hashPackages digests this list, so a caller that sorts or appends in
+// place would silently change every image's package metadata.
+func TestDebianBasePackagesReturnsCopy(t *testing.T) {
+	t.Parallel()
+	first := DebianBasePackages()
+	original := append([]string(nil), first...)
+	if len(first) == 0 {
+		t.Fatal("DebianBasePackages returned empty slice")
+	}
+	first[0] = "tampered"
+	second := DebianBasePackages()
+	if second[0] == "tampered" {
+		t.Fatalf("DebianBasePackages leaks internal state; second[0] = %q after first[0] mutation", second[0])
+	}
+	for i := range original {
+		if second[i] != original[i] {
+			t.Fatalf("DebianBasePackages drifted at %d: got %q, want %q", i, second[i], original[i])
+		}
+	}
+}
+
+// TestBuildMetadataPackagesMatchesDebianBase confirms the metadata
+// packages used for image-drift detection are the same set we apply
+// during build. If these diverge the hash recorded next to a rootfs
+// stops matching the actual installed package set.
+func TestBuildMetadataPackagesMatchesDebianBase(t *testing.T) {
+	t.Parallel()
+	build := BuildMetadataPackages()
+	debian := DebianBasePackages()
+	if len(build) != len(debian) {
+		t.Fatalf("BuildMetadataPackages len = %d, DebianBasePackages len = %d", len(build), len(debian))
+	}
+	for i := range build {
+		if build[i] != debian[i] {
+			t.Fatalf("BuildMetadataPackages[%d] = %q, want %q", i, build[i], debian[i])
+		}
+	}
+}
+
+func TestHashPackagesStableForSameInput(t *testing.T) {
+	t.Parallel()
+	pkgs := []string{"git", "make", "vim"}
+	first := hashPackages(pkgs)
+	second := hashPackages(append([]string(nil), pkgs...))
+	if first != second {
+		t.Fatalf("hashPackages drifted between identical calls: %q vs %q", first, second)
+	}
+	// Sanity: hash differs when input differs.
+	if first == hashPackages([]string{"git", "make"}) {
+		t.Fatal("hashPackages collapsed two distinct inputs to the same hash")
+	}
+	// Verify the format is hex sha256 of "git\nmake\nvim\n" — pin the
+	// concrete digest so a future refactor that changes joining (e.g.
+	// drops the trailing newline) trips this test.
+	want := fmt.Sprintf("%x", sha256.Sum256([]byte("git\nmake\nvim\n")))
+	if first != want {
+		t.Fatalf("hashPackages format drifted: got %q, want %q", first, want)
+	}
+}
+
+func TestStageOptionalArtifactPathEmptyStaysEmpty(t *testing.T) {
+	t.Parallel()
+	if got := StageOptionalArtifactPath("/tmp/artifacts", "", "initrd.img"); got != "" {
+		t.Fatalf("StageOptionalArtifactPath(empty staged) = %q, want empty", got)
+	}
+	if got := StageOptionalArtifactPath("/tmp/artifacts", " ", "initrd.img"); got != "" {
+		t.Fatalf("StageOptionalArtifactPath(whitespace staged) = %q, want empty", got)
+	}
+}
+
+func TestStageOptionalArtifactPathJoinsName(t *testing.T) {
+	t.Parallel()
+	got := StageOptionalArtifactPath("/tmp/artifacts", "/host/path/initrd.img", "initrd.img")
+	want := filepath.Join("/tmp/artifacts", "initrd.img")
+	if got != want {
+		t.Fatalf("StageOptionalArtifactPath = %q, want %q", got, want)
+	}
+}
+
+func TestWritePackagesMetadataWritesHashFile(t *testing.T) {
+	t.Parallel()
+	dir := t.TempDir()
+	rootfs := filepath.Join(dir, "rootfs.ext4")
+	if err := os.WriteFile(rootfs, []byte("rootfs"), 0o644); err != nil {
+		t.Fatalf("write rootfs: %v", err)
+	}
+	pkgs := []string{"git", "vim"}
+	if err := WritePackagesMetadata(rootfs, pkgs); err != nil {
+		t.Fatalf("WritePackagesMetadata: %v", err)
+	}
+	got, err := os.ReadFile(rootfs + ".packages.sha256")
+	if err != nil {
+		t.Fatalf("read metadata: %v", err)
+	}
+	want := hashPackages(pkgs) + "\n"
+	if string(got) != want {
+		t.Fatalf("metadata content = %q, want %q", got, want)
+	}
+}
+
+func TestWritePackagesMetadataNoOpOnEmptyInputs(t *testing.T) {
+	t.Parallel()
+	dir := t.TempDir()
+	rootfs := filepath.Join(dir, "rootfs.ext4")
+	if err := os.WriteFile(rootfs, []byte("rootfs"), 0o644); err != nil {
+		t.Fatalf("write rootfs: %v", err)
+	}
+
+	// Empty package list is the "managed-image build skipped apt" case.
+	if err := WritePackagesMetadata(rootfs, nil); err != nil {
+		t.Fatalf("WritePackagesMetadata(nil packages): %v", err)
+	}
+	if _, err := os.Stat(rootfs + ".packages.sha256"); !os.IsNotExist(err) {
+		t.Fatalf("metadata file was created for empty packages; err = %v", err)
+	}
+
+	// Empty rootfs path is a no-op too — callers pass "" when they
+	// haven't decided where to write yet.
+	if err := WritePackagesMetadata("", []string{"git"}); err != nil {
+		t.Fatalf("WritePackagesMetadata(empty rootfs): %v", err)
+	}
+}
+
+// TestHashPackagesOrderSensitive confirms the canonical join is
+// strictly order-sensitive: callers must keep the ordering they want the
+// hash to digest. Pin this so a future "convenience" sort doesn't
+// silently invalidate every recorded image hash on disk.
+func TestHashPackagesOrderSensitive(t *testing.T) {
+	t.Parallel()
+	a := hashPackages([]string{"git", "make"})
+	b := hashPackages([]string{"make", "git"})
+	if a == b {
+		t.Fatal("hashPackages collapsed two orderings to the same hash; metadata-on-disk would be ambiguous")
+	}
+	// An empty trailing element must change the hash rather than be
+	// silently absorbed: the joiner appends exactly one trailing
+	// newline itself, so a padded list is a distinct input.
+	withTrailing := hashPackages([]string{"git", "make", ""})
+	if withTrailing == a {
+		t.Fatalf("hashPackages silently absorbed an empty trailing element; got %q == %q", withTrailing, a)
+	}
+}
+
+// TestDebianBasePackagesContainsCriticalEntries pins the small core of
+// packages every managed image must have. It stops a future refactor
+// from dropping (say) ca-certificates without the owner noticing — a
+// rebuilt image without it can't talk to TLS endpoints.
+func TestDebianBasePackagesContainsCriticalEntries(t *testing.T) { + t.Parallel() + pkgs := strings.Join(DebianBasePackages(), " ") + for _, must := range []string{"ca-certificates", "curl", "git"} { + if !strings.Contains(pkgs, must) { + t.Errorf("DebianBasePackages missing critical entry %q; got %q", must, pkgs) + } + } +} diff --git a/internal/daemon/images.go b/internal/daemon/images.go index b20873e..0f806d8 100644 --- a/internal/daemon/images.go +++ b/internal/daemon/images.go @@ -10,155 +10,17 @@ import ( "strings" "banger/internal/api" - "banger/internal/imagepreset" + "banger/internal/daemon/imagemgr" + "banger/internal/kernelcat" "banger/internal/model" "banger/internal/system" ) -func (d *Daemon) BuildImage(ctx context.Context, params api.ImageBuildParams) (image model.Image, err error) { - d.mu.Lock() - defer d.mu.Unlock() - op := d.beginOperation("image.build") - buildLogPath := "" - defer func() { - if err != nil { - err = annotateLogPath(err, buildLogPath) - op.fail(err, imageLogAttrs(image)...) - return - } - op.done(imageLogAttrs(image)...) 
- }() - - name := params.Name - imageBuildStage(ctx, "resolve_image", "resolving image build inputs") - if name == "" { - name = fmt.Sprintf("image-%d", model.Now().Unix()) - } - if _, err := d.FindImage(ctx, name); err == nil { - return model.Image{}, fmt.Errorf("image name already exists: %s", name) - } - fromImage := strings.TrimSpace(params.FromImage) - if fromImage == "" { - return model.Image{}, fmt.Errorf("from-image is required") - } - baseImage, err := d.FindImage(ctx, fromImage) - if err != nil { - return model.Image{}, err - } - id, err := model.NewID() - if err != nil { - return model.Image{}, err - } - now := model.Now() - artifactDir := filepath.Join(d.layout.ImagesDir, id) - buildLogDir := filepath.Join(d.layout.StateDir, "image-build") - if err := os.MkdirAll(buildLogDir, 0o755); err != nil { - return model.Image{}, err - } - buildLogPath = filepath.Join(buildLogDir, id+".log") - imageBuildSetLogPath(ctx, buildLogPath) - logFile, err := os.OpenFile(buildLogPath, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644) - if err != nil { - return model.Image{}, err - } - defer logFile.Close() - stageDir, err := os.MkdirTemp(d.layout.ImagesDir, id+".build-") - if err != nil { - return model.Image{}, err - } - cleanupStage := true - defer func() { - if cleanupStage { - _ = os.RemoveAll(stageDir) - } - }() - rootfsPath := filepath.Join(stageDir, "rootfs.ext4") - workSeedPath := filepath.Join(stageDir, "work-seed.ext4") - kernelSource := firstNonEmpty(params.KernelPath, baseImage.KernelPath) - initrdSource := firstNonEmpty(params.InitrdPath, baseImage.InitrdPath) - modulesSource := firstNonEmpty(params.ModulesDir, baseImage.ModulesDir) - if err := d.validateImageBuildPrereqs(ctx, baseImage.RootfsPath, kernelSource, initrdSource, modulesSource, params.Size); err != nil { - return model.Image{}, err - } - kernelPath, initrdPath, modulesDir, err := stageManagedBootArtifacts(ctx, d.runner, stageDir, kernelSource, initrdSource, modulesSource) - if err != nil { - return 
model.Image{}, err - } - packages := imagepreset.DebianBasePackages() - metadataPackages := imageBuildMetadataPackages(params.Docker) - spec := imageBuildSpec{ - ID: id, - Name: name, - SourceRootfs: baseImage.RootfsPath, - RootfsPath: rootfsPath, - BuildLog: logFile, - KernelPath: kernelPath, - InitrdPath: initrdPath, - ModulesDir: modulesDir, - Packages: packages, - InstallDocker: params.Docker, - Size: params.Size, - } - op.stage("launch_builder", "build_log_path", buildLogPath, "artifact_dir", artifactDir, "from_image", baseImage.Name) - imageBuildStage(ctx, "launch_builder", "building rootfs from base image") - if err := d.runImageBuild(ctx, spec); err != nil { - _ = logFile.Sync() - return model.Image{}, err - } - imageBuildStage(ctx, "prepare_work_seed", "building reusable work seed") - if err := system.BuildWorkSeedImage(ctx, d.runner, rootfsPath, workSeedPath); err != nil { - _ = logFile.Sync() - return model.Image{}, err - } - imageBuildStage(ctx, "seed_ssh", "seeding runtime SSH access") - seededSSHPublicKeyFingerprint, err := d.seedAuthorizedKeyOnExt4Image(ctx, workSeedPath) - if err != nil { - _ = logFile.Sync() - return model.Image{}, err - } - imageBuildStage(ctx, "write_metadata", "writing image metadata") - if err := writePackagesMetadata(rootfsPath, metadataPackages); err != nil { - _ = logFile.Sync() - return model.Image{}, err - } - op.stage("activate_artifacts", "artifact_dir", artifactDir) - if err := os.Rename(stageDir, artifactDir); err != nil { - return model.Image{}, err - } - cleanupStage = false - image = model.Image{ - ID: id, - Name: name, - Managed: true, - ArtifactDir: artifactDir, - RootfsPath: filepath.Join(artifactDir, "rootfs.ext4"), - WorkSeedPath: filepath.Join(artifactDir, "work-seed.ext4"), - KernelPath: filepath.Join(artifactDir, "kernel"), - InitrdPath: stageOptionalArtifactPath(artifactDir, initrdPath, "initrd.img"), - ModulesDir: stageOptionalArtifactPath(artifactDir, modulesDir, "modules"), - BuildSize: params.Size, - 
SeededSSHPublicKeyFingerprint: seededSSHPublicKeyFingerprint, - Docker: params.Docker, - CreatedAt: now, - UpdatedAt: now, - } - imageBuildBindImage(ctx, image) - if err := d.store.UpsertImage(ctx, image); err != nil { - return model.Image{}, err - } - op.stage("persisted", "build_log_path", buildLogPath) - imageBuildStage(ctx, "persisted", "image metadata saved") - if d.logger != nil { - d.logger.Info("image build log preserved", append(imageLogAttrs(image), "build_log_path", buildLogPath)...) - } - _ = logFile.Sync() - return image, nil -} - -func (d *Daemon) RegisterImage(ctx context.Context, params api.ImageRegisterParams) (image model.Image, err error) { - d.mu.Lock() - defer d.mu.Unlock() - +// RegisterImage creates or updates an unmanaged image row. Path +// validation + kernel resolution run without imageOpsMu — only the +// lookup-then-upsert atom is held under the lock so concurrent +// registers of the same name don't race. +func (s *ImageService) RegisterImage(ctx context.Context, params api.ImageRegisterParams) (image model.Image, err error) { name := strings.TrimSpace(params.Name) if name == "" { return model.Image{}, fmt.Errorf("image name is required") @@ -177,19 +39,20 @@ func (d *Daemon) RegisterImage(ctx context.Context, params api.ImageRegisterPara } } } - kernelPath := strings.TrimSpace(params.KernelPath) - if kernelPath == "" { - return model.Image{}, fmt.Errorf("kernel path is required") - } - initrdPath := strings.TrimSpace(params.InitrdPath) - modulesDir := strings.TrimSpace(params.ModulesDir) - - if err := validateImageRegisterPaths(rootfsPath, workSeedPath, kernelPath, initrdPath, modulesDir); err != nil { + kernelPath, initrdPath, modulesDir, err := s.resolveKernelInputs(ctx, params.KernelRef, params.KernelPath, params.InitrdPath, params.ModulesDir) + if err != nil { return model.Image{}, err } + if err := imagemgr.ValidateRegisterPaths(rootfsPath, workSeedPath, kernelPath, initrdPath, modulesDir); err != nil { + return model.Image{}, 
err + } + + s.imageOpsMu.Lock() + defer s.imageOpsMu.Unlock() + now := model.Now() - existing, lookupErr := d.store.GetImageByName(ctx, name) + existing, lookupErr := s.store.GetImageByName(ctx, name) switch { case lookupErr == nil: if existing.Managed { @@ -201,7 +64,6 @@ func (d *Daemon) RegisterImage(ctx context.Context, params api.ImageRegisterPara image.KernelPath = kernelPath image.InitrdPath = initrdPath image.ModulesDir = modulesDir - image.Docker = params.Docker image.UpdatedAt = now case errors.Is(lookupErr, sql.ErrNoRows): id, idErr := model.NewID() @@ -217,7 +79,6 @@ func (d *Daemon) RegisterImage(ctx context.Context, params api.ImageRegisterPara KernelPath: kernelPath, InitrdPath: initrdPath, ModulesDir: modulesDir, - Docker: params.Docker, CreatedAt: now, UpdatedAt: now, } @@ -225,17 +86,19 @@ func (d *Daemon) RegisterImage(ctx context.Context, params api.ImageRegisterPara return model.Image{}, lookupErr } - if err := d.store.UpsertImage(ctx, image); err != nil { + if err := s.store.UpsertImage(ctx, image); err != nil { return model.Image{}, err } return image, nil } -func (d *Daemon) PromoteImage(ctx context.Context, idOrName string) (image model.Image, err error) { - d.mu.Lock() - defer d.mu.Unlock() - - op := d.beginOperation("image.promote") +// PromoteImage copies an unmanaged image's files into the managed +// artifacts dir and flips its managed bit. The expensive file copy, +// SSH-key seeding, and boot-artifact staging all happen outside +// imageOpsMu — only the find/rename/upsert commit atom holds the +// lock. +func (s *ImageService) PromoteImage(ctx context.Context, idOrName string) (image model.Image, err error) { + op := s.beginOperation(ctx, "image.promote") defer func() { if err != nil { op.fail(err, imageLogAttrs(image)...) @@ -244,31 +107,31 @@ func (d *Daemon) PromoteImage(ctx context.Context, idOrName string) (image model op.done(imageLogAttrs(image)...) 
}() - image, err = d.FindImage(ctx, idOrName) + image, err = s.FindImage(ctx, idOrName) if err != nil { return model.Image{}, err } if image.Managed { return model.Image{}, fmt.Errorf("image %s is already managed", image.Name) } - if err := validateImagePromotePaths(image.RootfsPath, image.KernelPath, image.InitrdPath, image.ModulesDir); err != nil { + if err := imagemgr.ValidatePromotePaths(image.RootfsPath, image.KernelPath, image.InitrdPath, image.ModulesDir); err != nil { return model.Image{}, err } - if strings.TrimSpace(d.layout.ImagesDir) == "" { + if strings.TrimSpace(s.layout.ImagesDir) == "" { return model.Image{}, errors.New("images dir is not configured") } - if err := os.MkdirAll(d.layout.ImagesDir, 0o755); err != nil { + if err := os.MkdirAll(s.layout.ImagesDir, 0o755); err != nil { return model.Image{}, err } - artifactDir := filepath.Join(d.layout.ImagesDir, image.ID) + artifactDir := filepath.Join(s.layout.ImagesDir, image.ID) if _, statErr := os.Stat(artifactDir); statErr == nil { return model.Image{}, fmt.Errorf("artifact dir already exists: %s", artifactDir) } else if !os.IsNotExist(statErr) { return model.Image{}, statErr } - stageDir, err := os.MkdirTemp(d.layout.ImagesDir, image.ID+".promote-") + stageDir, err := os.MkdirTemp(s.layout.ImagesDir, image.ID+".promote-") if err != nil { return model.Image{}, err } @@ -302,24 +165,18 @@ func (d *Daemon) PromoteImage(ctx context.Context, idOrName string) (image model if err := system.CopyFilePreferClone(image.WorkSeedPath, workSeedPath); err != nil { return model.Image{}, err } - image.SeededSSHPublicKeyFingerprint, err = d.seedAuthorizedKeyOnExt4Image(ctx, workSeedPath) + image.SeededSSHPublicKeyFingerprint, err = s.seedAuthorizedKeyOnExt4Image(ctx, workSeedPath) if err != nil { return model.Image{}, err } } else { image.SeededSSHPublicKeyFingerprint = "" } - _, initrdPath, modulesDir, err := stageManagedBootArtifacts(ctx, d.runner, stageDir, image.KernelPath, image.InitrdPath, image.ModulesDir) + 
_, initrdPath, modulesDir, err := imagemgr.StageBootArtifacts(ctx, s.runner, stageDir, image.KernelPath, image.InitrdPath, image.ModulesDir) if err != nil { return model.Image{}, err } - op.stage("activate_artifacts", "artifact_dir", artifactDir) - if err := os.Rename(stageDir, artifactDir); err != nil { - return model.Image{}, err - } - cleanupStage = false - image.Managed = true image.ArtifactDir = artifactDir image.RootfsPath = filepath.Join(artifactDir, "rootfs.ext4") @@ -327,71 +184,51 @@ func (d *Daemon) PromoteImage(ctx context.Context, idOrName string) (image model image.WorkSeedPath = filepath.Join(artifactDir, "work-seed.ext4") } image.KernelPath = filepath.Join(artifactDir, "kernel") - image.InitrdPath = stageOptionalArtifactPath(artifactDir, initrdPath, "initrd.img") - image.ModulesDir = stageOptionalArtifactPath(artifactDir, modulesDir, "modules") + image.InitrdPath = imagemgr.StageOptionalArtifactPath(artifactDir, initrdPath, "initrd.img") + image.ModulesDir = imagemgr.StageOptionalArtifactPath(artifactDir, modulesDir, "modules") image.UpdatedAt = model.Now() - if err := d.store.UpsertImage(ctx, image); err != nil { + + op.stage("activate_artifacts", "artifact_dir", artifactDir) + s.imageOpsMu.Lock() + defer s.imageOpsMu.Unlock() + if err := os.Rename(stageDir, artifactDir); err != nil { + return model.Image{}, err + } + cleanupStage = false + if err := s.store.UpsertImage(ctx, image); err != nil { _ = os.RemoveAll(artifactDir) return model.Image{}, err } return image, nil } -func validateImageRegisterPaths(rootfsPath, workSeedPath, kernelPath, initrdPath, modulesDir string) error { - checks := system.NewPreflight() - checks.RequireFile(rootfsPath, "rootfs image", `pass --rootfs `) - checks.RequireFile(kernelPath, "kernel image", `pass --kernel `) - if workSeedPath != "" { - checks.RequireFile(workSeedPath, "work-seed image", `pass --work-seed or rebuild the image with a work seed`) - } - if initrdPath != "" { - checks.RequireFile(initrdPath, "initrd 
image", `pass --initrd `) - } - if modulesDir != "" { - checks.RequireDir(modulesDir, "kernel modules dir", `pass --modules `) - } - return checks.Err("image register failed") -} - -func validateImagePromotePaths(rootfsPath, kernelPath, initrdPath, modulesDir string) error { - checks := system.NewPreflight() - checks.RequireFile(rootfsPath, "rootfs image", `re-register the image with a valid rootfs`) - checks.RequireFile(kernelPath, "kernel image", `re-register the image with a valid kernel`) - if initrdPath != "" { - checks.RequireFile(initrdPath, "initrd image", `re-register the image with a valid initrd`) - } - if modulesDir != "" { - checks.RequireDir(modulesDir, "kernel modules dir", `re-register the image with a valid modules dir`) - } - return checks.Err("image promote failed") -} - -func writePackagesMetadata(rootfsPath string, packages []string) error { - if rootfsPath == "" || len(packages) == 0 { - return nil - } - metadataPath := rootfsPath + ".packages.sha256" - return os.WriteFile(metadataPath, []byte(packagesHash(packages)+"\n"), 0o644) -} - -func (d *Daemon) DeleteImage(ctx context.Context, idOrName string) (model.Image, error) { - d.mu.Lock() - defer d.mu.Unlock() - - image, err := d.FindImage(ctx, idOrName) +// DeleteImage runs the lookup + reference check + store delete under +// imageOpsMu so a concurrent CreateVM can't slip an image_id reference +// in between the check and the delete. File cleanup happens after the +// lock is released — the store row is the authoritative handle. 
+func (s *ImageService) DeleteImage(ctx context.Context, idOrName string) (model.Image, error) { + image, err := func() (model.Image, error) { + s.imageOpsMu.Lock() + defer s.imageOpsMu.Unlock() + img, err := s.FindImage(ctx, idOrName) + if err != nil { + return model.Image{}, err + } + vms, err := s.store.FindVMsUsingImage(ctx, img.ID) + if err != nil { + return model.Image{}, err + } + if len(vms) > 0 { + return model.Image{}, fmt.Errorf("image %s is still referenced by %d VM(s)", img.Name, len(vms)) + } + if err := s.store.DeleteImage(ctx, img.ID); err != nil { + return model.Image{}, err + } + return img, nil + }() if err != nil { return model.Image{}, err } - vms, err := d.store.FindVMsUsingImage(ctx, image.ID) - if err != nil { - return model.Image{}, err - } - if len(vms) > 0 { - return model.Image{}, fmt.Errorf("image %s is still referenced by %d VM(s)", image.Name, len(vms)) - } - if err := d.store.DeleteImage(ctx, image.ID); err != nil { - return model.Image{}, err - } if image.Managed && image.ArtifactDir != "" { if err := os.RemoveAll(image.ArtifactDir); err != nil { return model.Image{}, err @@ -400,46 +237,6 @@ func (d *Daemon) DeleteImage(ctx context.Context, idOrName string) (model.Image, return image, nil } -func stageManagedBootArtifacts(ctx context.Context, runner system.CommandRunner, artifactDir, kernelSource, initrdSource, modulesSource string) (string, string, string, error) { - kernelPath := filepath.Join(artifactDir, "kernel") - if err := system.CopyFilePreferClone(kernelSource, kernelPath); err != nil { - return "", "", "", err - } - initrdPath := "" - if strings.TrimSpace(initrdSource) != "" { - initrdPath = filepath.Join(artifactDir, "initrd.img") - if err := system.CopyFilePreferClone(initrdSource, initrdPath); err != nil { - return "", "", "", err - } - } - modulesDir := "" - if strings.TrimSpace(modulesSource) != "" { - modulesDir = filepath.Join(artifactDir, "modules") - if err := os.MkdirAll(modulesDir, 0o755); err != nil { - return 
"", "", "", err - } - if err := system.CopyDirContents(ctx, runner, modulesSource, modulesDir, false); err != nil { - return "", "", "", err - } - } - return kernelPath, initrdPath, modulesDir, nil -} - -func imageBuildMetadataPackages(docker bool) []string { - packages := imagepreset.DebianBasePackages() - if docker { - packages = append(packages, "#feature:docker") - } - return packages -} - -func stageOptionalArtifactPath(artifactDir, stagedPath, name string) string { - if strings.TrimSpace(stagedPath) == "" { - return "" - } - return filepath.Join(artifactDir, name) -} - func firstNonEmpty(values ...string) string { for _, value := range values { if strings.TrimSpace(value) != "" { @@ -448,3 +245,73 @@ func firstNonEmpty(values ...string) string { } return "" } + +// resolveKernelInputs canonicalises user-supplied kernel info: either direct +// paths or a kernel-catalog ref. Shared by RegisterImage and PullImage. +// When kernelRef is given but not yet pulled locally, an auto-pull from the +// embedded kernelcat catalog fires so the caller doesn't have to manage +// kernel/image ordering by hand. 
+func (s *ImageService) resolveKernelInputs(ctx context.Context, kernelRef, kernelPath, initrdPath, modulesDir string) (string, string, string, error) {
+ kernelRef = strings.TrimSpace(kernelRef)
+ kernelPath = strings.TrimSpace(kernelPath)
+ initrdPath = strings.TrimSpace(initrdPath)
+ modulesDir = strings.TrimSpace(modulesDir)
+
+ if kernelRef != "" {
+ if kernelPath != "" || initrdPath != "" || modulesDir != "" {
+ return "", "", "", fmt.Errorf("--kernel-ref is mutually exclusive with --kernel/--initrd/--modules")
+ }
+ entry, err := s.readOrAutoPullKernel(ctx, kernelRef)
+ if err != nil {
+ return "", "", "", err
+ }
+ return entry.KernelPath, entry.InitrdPath, entry.ModulesDir, nil
+ }
+
+ if kernelPath == "" {
+ return "", "", "", fmt.Errorf("kernel path is required (pass --kernel <path> or --kernel-ref <ref>)")
+ }
+ return kernelPath, initrdPath, modulesDir, nil
+}
+
+// readOrAutoPullKernel tries the local kernelcat first; on miss, checks
+// the embedded catalog and auto-pulls the bundle.
+//
+// Concurrency-safe: takes the same per-name pull lock as KernelPull and
+// re-checks ReadLocal after acquiring it. If a peer finished the pull
+// while we were waiting, the re-check returns the freshly-pulled entry
+// — we explicitly do NOT call s.KernelPull from here because that path
+// errors with "already pulled" on a successful peer-pull. Auto-pull's
+// contract is "make sure this kernel is local"; "someone beat me to it"
+// is success, not failure.
+func (s *ImageService) readOrAutoPullKernel(ctx context.Context, kernelRef string) (kernelcat.Entry, error) { + if entry, err := kernelcat.ReadLocal(s.layout.KernelsDir, kernelRef); err == nil { + return entry, nil + } else if !os.IsNotExist(err) { + return kernelcat.Entry{}, fmt.Errorf("resolve kernel %q: %w", kernelRef, err) + } + catalog, loadErr := kernelcat.LoadEmbedded() + if loadErr != nil { + return kernelcat.Entry{}, fmt.Errorf("kernel %q not found locally: %w", kernelRef, loadErr) + } + catEntry, lookupErr := catalog.Lookup(kernelRef) + if lookupErr != nil { + return kernelcat.Entry{}, fmt.Errorf("kernel %q not found in catalog; run 'banger kernel list --available' to browse", kernelRef) + } + + release, err := s.acquireKernelPullLock(ctx, kernelRef) + if err != nil { + return kernelcat.Entry{}, err + } + defer release() + if entry, err := kernelcat.ReadLocal(s.layout.KernelsDir, kernelRef); err == nil { + return entry, nil + } + + vmCreateStage(ctx, "auto_pull_kernel", fmt.Sprintf("pulling kernel %s from catalog", kernelRef)) + stored, err := kernelcat.Fetch(ctx, nil, s.layout.KernelsDir, catEntry) + if err != nil { + return kernelcat.Entry{}, fmt.Errorf("auto-pull kernel %q: %w", kernelRef, err) + } + return stored, nil +} diff --git a/internal/daemon/images_helpers_test.go b/internal/daemon/images_helpers_test.go new file mode 100644 index 0000000..0615820 --- /dev/null +++ b/internal/daemon/images_helpers_test.go @@ -0,0 +1,24 @@ +package daemon + +import "testing" + +func TestFirstNonEmpty(t *testing.T) { + cases := []struct { + name string + values []string + want string + }{ + {"all empty", []string{"", " ", "\t"}, ""}, + {"first wins", []string{"a", "b"}, "a"}, + {"skips blanks", []string{"", " ", "first", "second"}, "first"}, + {"nil input", nil, ""}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + got := firstNonEmpty(tc.values...) 
+ if got != tc.want { + t.Errorf("firstNonEmpty(%v) = %q, want %q", tc.values, got, tc.want) + } + }) + } +} diff --git a/internal/daemon/images_pull.go b/internal/daemon/images_pull.go new file mode 100644 index 0000000..3f8a8d4 --- /dev/null +++ b/internal/daemon/images_pull.go @@ -0,0 +1,412 @@ +package daemon + +import ( + "context" + "errors" + "fmt" + "io/fs" + "os" + "path/filepath" + "regexp" + "strings" + + "banger/internal/api" + "banger/internal/daemon/imagemgr" + "banger/internal/imagecat" + "banger/internal/imagepull" + "banger/internal/model" + "banger/internal/paths" + "banger/internal/system" + + "github.com/google/go-containerregistry/pkg/name" +) + +// minPullExt4Size keeps the floor consistent with imagepull.MinExt4Size +// when the caller doesn't override --size and the OCI tree is tiny. +const minPullExt4Size int64 = 1 << 30 // 1 GiB + +// PullImage downloads an image and registers it as a managed banger +// image. Two paths: +// +// - Bundle path: `ref` matches an entry in the embedded imagecat +// catalog. The `.tar.zst` bundle is fetched, `rootfs.ext4` is +// already flattened + ownership-fixed + agent-injected at build +// time, so this path is strictly faster than the OCI one. +// - OCI path: otherwise treat `ref` as an OCI reference, pull its +// layers, flatten, fix ownership, inject agents. +// +// Kernel info falls back through: `params.KernelRef` → catalog entry's +// `kernel_ref` (bundle path only) → `params.Kernel/Initrd/ModulesDir`. +// +// Concurrency: the slow staging work (network fetch, ext4 build, +// ownership fixup, guest-agent injection) runs WITHOUT imageOpsMu so +// parallel pulls of different images interleave. imageOpsMu is taken +// only for the publish window — recheck name is free, rename the +// staging dir to the final artifact dir, insert the store row. If two +// pulls race to the same name, the loser fails fast at the recheck +// and its staging dir is cleaned up via defer. 
+func (s *ImageService) PullImage(ctx context.Context, params api.ImagePullParams) (model.Image, error) { + ref := strings.TrimSpace(params.Ref) + if ref == "" { + return model.Image{}, errors.New("reference is required") + } + + catalog, err := imagecat.LoadEmbedded() + if err != nil { + return model.Image{}, fmt.Errorf("load image catalog: %w", err) + } + if entry, lookupErr := catalog.Lookup(ref); lookupErr == nil { + return s.pullFromBundle(ctx, params, entry) + } + return s.pullFromOCI(ctx, params) +} + +// publishImage is the narrow critical section shared by every image- +// creation path (pull bundle/OCI, register, promote). It re-verifies +// that `image.Name` is still free, atomically renames the staging +// directory to its final home (when applicable), and persists the row. +// The caller owns stagingDir cleanup on failure via its own defer; on +// success, publishImage unsets it so the defer is a no-op. +// +// finalDir == "" means "already published" (the caller built artifacts +// in place, e.g. RegisterImage which only touches the store). When +// non-empty the rename is the publication atom: finalDir must not +// already exist before the rename fires. +func (s *ImageService) publishImage(ctx context.Context, image model.Image, stagingDir, finalDir string) (model.Image, error) { + s.imageOpsMu.Lock() + defer s.imageOpsMu.Unlock() + + if existing, err := s.store.GetImageByName(ctx, image.Name); err == nil { + return model.Image{}, fmt.Errorf("image %q already exists (id=%s); pick a different --name or delete it first", image.Name, existing.ID) + } + if finalDir != "" { + if err := os.Rename(stagingDir, finalDir); err != nil { + return model.Image{}, fmt.Errorf("publish artifact dir: %w", err) + } + } + if err := s.store.UpsertImage(ctx, image); err != nil { + if finalDir != "" { + _ = os.RemoveAll(finalDir) + } + return model.Image{}, err + } + return image, nil +} + +// pullFromOCI is the original OCI-registry-pull path. 
See PullImage for +// the intent. +func (s *ImageService) pullFromOCI(ctx context.Context, params api.ImagePullParams) (image model.Image, err error) { + ref := strings.TrimSpace(params.Ref) + parsed, err := name.ParseReference(ref) + if err != nil { + return model.Image{}, fmt.Errorf("parse oci ref %q: %w", ref, err) + } + + imgName := strings.TrimSpace(params.Name) + if imgName == "" { + imgName = defaultImageNameFromRef(parsed) + if imgName == "" { + return model.Image{}, errors.New("could not derive image name from ref; pass --name") + } + } + if existing, lookupErr := s.store.GetImageByName(ctx, imgName); lookupErr == nil { + return model.Image{}, fmt.Errorf("image %q already exists (id=%s); pick a different --name or delete it first", imgName, existing.ID) + } + + kernelPath, initrdPath, modulesDir, err := s.resolveKernelInputs(ctx, params.KernelRef, params.KernelPath, params.InitrdPath, params.ModulesDir) + if err != nil { + return model.Image{}, err + } + if err := imagemgr.ValidateKernelPaths(kernelPath, initrdPath, modulesDir); err != nil { + return model.Image{}, err + } + + id, err := model.NewID() + if err != nil { + return model.Image{}, err + } + finalDir := filepath.Join(s.layout.ImagesDir, id) + stagingDir := finalDir + ".staging" + if err := os.MkdirAll(stagingDir, 0o755); err != nil { + return model.Image{}, err + } + cleanupStaging := true + defer func() { + if cleanupStaging { + _ = os.RemoveAll(stagingDir) + } + }() + + // Extract OCI layers into a working tree under TempDir so the + // state filesystem doesn't temporarily double in size. 
+ rootfsTree, err := os.MkdirTemp("", "banger-pull-") + if err != nil { + return model.Image{}, err + } + defer os.RemoveAll(rootfsTree) + + meta, err := s.runPullAndFlatten(ctx, ref, s.layout.OCICacheDir, rootfsTree) + if err != nil { + return model.Image{}, fmt.Errorf("pull oci image: %w", err) + } + + sizeBytes := params.SizeBytes + if sizeBytes <= 0 { + treeSize, err := dirSizeBytes(rootfsTree) + if err != nil { + return model.Image{}, fmt.Errorf("size oci tree: %w", err) + } + sizeBytes = treeSize + treeSize/4 // +25% headroom + if sizeBytes < minPullExt4Size { + sizeBytes = minPullExt4Size + } + } + + rootfsExt4 := filepath.Join(stagingDir, "rootfs.ext4") + if err := imagepull.BuildExt4(ctx, s.runner, rootfsTree, rootfsExt4, sizeBytes); err != nil { + return model.Image{}, fmt.Errorf("build rootfs ext4: %w", err) + } + if err := s.runFinalizePulledRootfs(ctx, rootfsExt4, meta); err != nil { + return model.Image{}, err + } + workSeedExt4 := s.runBuildWorkSeed(ctx, rootfsExt4, stagingDir) + + stagedKernel, stagedInitrd, stagedModules, err := imagemgr.StageBootArtifacts(ctx, s.runner, stagingDir, kernelPath, initrdPath, modulesDir) + if err != nil { + return model.Image{}, fmt.Errorf("stage boot artifacts: %w", err) + } + + now := model.Now() + image = model.Image{ + ID: id, + Name: imgName, + Managed: true, + ArtifactDir: finalDir, + RootfsPath: filepath.Join(finalDir, filepath.Base(rootfsExt4)), + KernelPath: rebaseUnder(stagedKernel, stagingDir, finalDir), + InitrdPath: rebaseUnder(stagedInitrd, stagingDir, finalDir), + ModulesDir: rebaseUnder(stagedModules, stagingDir, finalDir), + CreatedAt: now, + UpdatedAt: now, + } + if workSeedExt4 != "" { + image.WorkSeedPath = filepath.Join(finalDir, filepath.Base(workSeedExt4)) + } + published, err := s.publishImage(ctx, image, stagingDir, finalDir) + if err != nil { + return model.Image{}, err + } + cleanupStaging = false + return published, nil +} + +// pullFromBundle is the imagecat-backed path: download a 
ready-to-boot +// bundle (rootfs.ext4 already flattened + ownership-fixed + agent- +// injected at build time), verify its sha256, and register the result +// as a managed image. No flatten / mkfs / debugfs work on the daemon +// host. +func (s *ImageService) pullFromBundle(ctx context.Context, params api.ImagePullParams, entry imagecat.CatEntry) (image model.Image, err error) { + imgName := strings.TrimSpace(params.Name) + if imgName == "" { + imgName = entry.Name + } + if existing, lookupErr := s.store.GetImageByName(ctx, imgName); lookupErr == nil { + return model.Image{}, fmt.Errorf("image %q already exists (id=%s); pick a different --name or delete it first", imgName, existing.ID) + } + + // Kernel resolution precedence: params > catalog entry's kernel_ref. + kernelRef := strings.TrimSpace(params.KernelRef) + if kernelRef == "" && strings.TrimSpace(params.KernelPath) == "" { + kernelRef = strings.TrimSpace(entry.KernelRef) + } + kernelPath, initrdPath, modulesDir, err := s.resolveKernelInputs(ctx, kernelRef, params.KernelPath, params.InitrdPath, params.ModulesDir) + if err != nil { + return model.Image{}, err + } + if err := imagemgr.ValidateKernelPaths(kernelPath, initrdPath, modulesDir); err != nil { + return model.Image{}, err + } + + id, err := model.NewID() + if err != nil { + return model.Image{}, err + } + finalDir := filepath.Join(s.layout.ImagesDir, id) + stagingDir := finalDir + ".staging" + if err := os.MkdirAll(stagingDir, 0o755); err != nil { + return model.Image{}, err + } + cleanupStaging := true + defer func() { + if cleanupStaging { + _ = os.RemoveAll(stagingDir) + } + }() + + if _, err := s.runBundleFetch(ctx, stagingDir, entry); err != nil { + return model.Image{}, fmt.Errorf("fetch bundle: %w", err) + } + // manifest.json is metadata we only need at fetch time; strip it + // so the final artifact dir contains only boot-relevant files. 
+ _ = os.Remove(filepath.Join(stagingDir, imagecat.ManifestFilename)) + rootfsExt4 := filepath.Join(stagingDir, imagecat.RootfsFilename) + workSeedExt4 := s.runBuildWorkSeed(ctx, rootfsExt4, stagingDir) + + stagedKernel, stagedInitrd, stagedModules, err := imagemgr.StageBootArtifacts(ctx, s.runner, stagingDir, kernelPath, initrdPath, modulesDir) + if err != nil { + return model.Image{}, fmt.Errorf("stage boot artifacts: %w", err) + } + + now := model.Now() + image = model.Image{ + ID: id, + Name: imgName, + Managed: true, + ArtifactDir: finalDir, + RootfsPath: filepath.Join(finalDir, filepath.Base(rootfsExt4)), + KernelPath: rebaseUnder(stagedKernel, stagingDir, finalDir), + InitrdPath: rebaseUnder(stagedInitrd, stagingDir, finalDir), + ModulesDir: rebaseUnder(stagedModules, stagingDir, finalDir), + CreatedAt: now, + UpdatedAt: now, + } + if workSeedExt4 != "" { + image.WorkSeedPath = filepath.Join(finalDir, filepath.Base(workSeedExt4)) + } + published, err := s.publishImage(ctx, image, stagingDir, finalDir) + if err != nil { + return model.Image{}, err + } + cleanupStaging = false + return published, nil +} + +// runBundleFetch is the seam tests substitute. nil → real implementation. +func (s *ImageService) runBundleFetch(ctx context.Context, destDir string, entry imagecat.CatEntry) (imagecat.Manifest, error) { + if s.bundleFetch != nil { + return s.bundleFetch(ctx, destDir, entry) + } + return imagecat.Fetch(ctx, nil, destDir, entry) +} + +// runPullAndFlatten is the seam tests substitute. nil → real implementation. 
+func (s *ImageService) runPullAndFlatten(ctx context.Context, ref, cacheDir, destDir string) (imagepull.Metadata, error) { + if s.pullAndFlatten != nil { + return s.pullAndFlatten(ctx, ref, cacheDir, destDir) + } + pulled, err := imagepull.Pull(ctx, ref, cacheDir) + if err != nil { + return imagepull.Metadata{}, err + } + return imagepull.Flatten(ctx, pulled, destDir) +} + +// runFinalizePulledRootfs applies ownership fixup and injects banger's +// guest agents. Tests substitute via s.finalizePulledRootfs; nil → +// real implementation using debugfs + the companion vsock-agent +// binary resolved via paths.CompanionBinaryPath. +func (s *ImageService) runFinalizePulledRootfs(ctx context.Context, ext4File string, meta imagepull.Metadata) error { + if s.finalizePulledRootfs != nil { + return s.finalizePulledRootfs(ctx, ext4File, meta) + } + if err := imagepull.ApplyOwnership(ctx, s.runner, ext4File, meta); err != nil { + return fmt.Errorf("apply ownership: %w", err) + } + vsockBin, err := paths.CompanionBinaryPath("banger-vsock-agent") + if err != nil { + return fmt.Errorf("locate vsock agent binary: %w", err) + } + if err := imagepull.InjectGuestAgents(ctx, s.runner, ext4File, imagepull.GuestAgentAssets{ + VsockAgentBin: vsockBin, + }); err != nil { + return fmt.Errorf("inject guest agents: %w", err) + } + return nil +} + +// runBuildWorkSeed extracts /root from the pulled rootfs into a +// sibling work-seed ext4 image. Any failure is treated as non-fatal: +// the image is still publishable without a seed, and VM create falls +// back to the empty-work-disk path (losing distro dotfiles but keeping +// every other guarantee). Returns the work-seed path on success, "" on +// failure (with a warn logged). Tests substitute via s.workSeedBuilder. 
+func (s *ImageService) runBuildWorkSeed(ctx context.Context, rootfsExt4, stagingDir string) string { + outPath := filepath.Join(stagingDir, "work-seed.ext4") + var err error + if s.workSeedBuilder != nil { + err = s.workSeedBuilder(ctx, rootfsExt4, outPath) + } else { + err = system.BuildWorkSeedImage(ctx, s.runner, rootfsExt4, outPath) + } + if err != nil { + if s.logger != nil { + s.logger.Warn("work-seed build failed; VMs using this image will start with an empty /root", "rootfs", rootfsExt4, "error", err.Error()) + } + _ = os.Remove(outPath) + return "" + } + return outPath +} + +// nameSanitizeRE matches runs of characters outside [a-z0-9]; each run is +// replaced with a single hyphen when deriving image names. +var nameSanitizeRE = regexp.MustCompile(`[^a-z0-9]+`) + +// defaultImageNameFromRef derives a friendly name like "debian-bookworm" +// from "docker.io/library/debian:bookworm". Returns "" if it can't. +func defaultImageNameFromRef(ref name.Reference) string { + repo := ref.Context().RepositoryStr() // e.g. library/debian + parts := strings.Split(repo, "/") + base := parts[len(parts)-1] + + suffix := "" + switch r := ref.(type) { + case name.Tag: + if t := r.TagStr(); t != "" && t != "latest" { + suffix = "-" + t + } + case name.Digest: + // take the first 12 hex chars after sha256: + d := r.DigestStr() + if i := strings.Index(d, ":"); i >= 0 && len(d) >= i+13 { + suffix = "-" + d[i+1:i+13] + } + } + + out := nameSanitizeRE.ReplaceAllString(strings.ToLower(base+suffix), "-") + out = strings.Trim(out, "-") + return out +} + +// rebaseUnder rewrites a path that points inside oldRoot to point inside +// newRoot. Empty input returns empty (kept by StageBootArtifacts when an +// optional artifact is absent). 
+func rebaseUnder(path, oldRoot, newRoot string) string { + if path == "" { + return "" + } + if rel, err := filepath.Rel(oldRoot, path); err == nil && !strings.HasPrefix(rel, "..") { + return filepath.Join(newRoot, rel) + } + return path +} + +// dirSizeBytes returns the sum of regular-file sizes under root, following +// no symlinks (lstat). Suitable for sizing an ext4 image. +func dirSizeBytes(root string) (int64, error) { + var total int64 + err := filepath.WalkDir(root, func(_ string, d fs.DirEntry, err error) error { + if err != nil { + return err + } + if !d.Type().IsRegular() { + return nil + } + info, err := d.Info() + if err != nil { + return err + } + total += info.Size() + return nil + }) + return total, err +} diff --git a/internal/daemon/images_pull_bundle_test.go b/internal/daemon/images_pull_bundle_test.go new file mode 100644 index 0000000..2e2ea29 --- /dev/null +++ b/internal/daemon/images_pull_bundle_test.go @@ -0,0 +1,289 @@ +package daemon + +import ( + "context" + "encoding/json" + "errors" + "os" + "path/filepath" + "strings" + "testing" + + "banger/internal/api" + "banger/internal/imagecat" + "banger/internal/imagepull" + "banger/internal/kernelcat" + "banger/internal/model" + "banger/internal/paths" + "banger/internal/system" +) + +// stubBundleFetch writes a valid-enough rootfs.ext4 + manifest.json +// into destDir, simulating a successful bundle download + extract. +// The returned manifest echoes the entry's declared kernel_ref so the +// orchestration sees the same hints it would from a real fetch. 
+func stubBundleFetch(manifest imagecat.Manifest) func(context.Context, string, imagecat.CatEntry) (imagecat.Manifest, error) { + return func(_ context.Context, destDir string, entry imagecat.CatEntry) (imagecat.Manifest, error) { + if err := os.WriteFile(filepath.Join(destDir, imagecat.RootfsFilename), []byte("rootfs-bytes"), 0o644); err != nil { + return imagecat.Manifest{}, err + } + m := manifest + if m.Name == "" { + m.Name = entry.Name + } + data, err := json.Marshal(m) + if err != nil { + return imagecat.Manifest{}, err + } + if err := os.WriteFile(filepath.Join(destDir, imagecat.ManifestFilename), data, 0o644); err != nil { + return imagecat.Manifest{}, err + } + return m, nil + } +} + +func seedKernel(t *testing.T, kernelsDir, name string) { + t.Helper() + if err := kernelcat.WriteLocal(kernelsDir, kernelcat.Entry{ + Name: name, + Distro: "generic", + Arch: "x86_64", + Source: "test", + }); err != nil { + t.Fatal(err) + } + if err := os.WriteFile(filepath.Join(kernelsDir, name, "vmlinux"), []byte("kernel"), 0o644); err != nil { + t.Fatal(err) + } +} + +func TestPullImageBundlePathRegistersFromCatalog(t *testing.T) { + imagesDir := t.TempDir() + kernelsDir := t.TempDir() + seedKernel(t, kernelsDir, "generic-6.12") + + d := &Daemon{ + layout: paths.Layout{ImagesDir: imagesDir, KernelsDir: kernelsDir}, + store: openDaemonStore(t), + runner: system.NewRunner(), + } + d.img = &ImageService{ + layout: d.layout, + store: d.store, + runner: d.runner, + bundleFetch: stubBundleFetch(imagecat.Manifest{KernelRef: "generic-6.12"}), + workSeedBuilder: stubWorkSeedBuilder, + } + wireServices(d) + + entry := imagecat.CatEntry{ + Name: "debian-bookworm", + Distro: "debian", + Arch: "x86_64", + KernelRef: "generic-6.12", + TarballURL: "https://example.com/x.tar.zst", + TarballSHA256: "abc", + } + image, err := d.img.pullFromBundle(context.Background(), api.ImagePullParams{Ref: "debian-bookworm"}, entry) + if err != nil { + t.Fatalf("pullFromBundle: %v", err) + } + if 
image.Name != "debian-bookworm" { + t.Errorf("Name = %q, want debian-bookworm", image.Name) + } + if !strings.HasPrefix(image.ArtifactDir, imagesDir) { + t.Errorf("ArtifactDir = %q, want under %q", image.ArtifactDir, imagesDir) + } + for _, rel := range []string{"rootfs.ext4", "kernel"} { + if _, err := os.Stat(filepath.Join(image.ArtifactDir, rel)); err != nil { + t.Errorf("missing artifact %s: %v", rel, err) + } + } + // manifest.json should not leak into the published artifact dir. + if _, err := os.Stat(filepath.Join(image.ArtifactDir, imagecat.ManifestFilename)); !os.IsNotExist(err) { + t.Errorf("manifest.json should be stripped, got err=%v", err) + } +} + +func TestPullImageBundlePathOverrideNameAndKernelRef(t *testing.T) { + imagesDir := t.TempDir() + kernelsDir := t.TempDir() + seedKernel(t, kernelsDir, "custom-kernel") + // Overwrite the vmlinux with recognisable bytes so we can verify + // the staged kernel came from the --kernel-ref entry, not the + // catalog's kernel_ref. + customBytes := []byte("custom-kernel-marker") + if err := os.WriteFile(filepath.Join(kernelsDir, "custom-kernel", "vmlinux"), customBytes, 0o644); err != nil { + t.Fatal(err) + } + + d := &Daemon{ + layout: paths.Layout{ImagesDir: imagesDir, KernelsDir: kernelsDir}, + store: openDaemonStore(t), + runner: system.NewRunner(), + } + d.img = &ImageService{ + layout: d.layout, + store: d.store, + runner: d.runner, + bundleFetch: stubBundleFetch(imagecat.Manifest{KernelRef: "generic-6.12"}), + workSeedBuilder: stubWorkSeedBuilder, + } + wireServices(d) + + entry := imagecat.CatEntry{ + Name: "debian-bookworm", Arch: "x86_64", + KernelRef: "generic-6.12", + TarballURL: "https://example.com/x.tar.zst", + TarballSHA256: "abc", + } + image, err := d.img.pullFromBundle(context.Background(), api.ImagePullParams{ + Ref: "debian-bookworm", Name: "my-sandbox", KernelRef: "custom-kernel", + }, entry) + if err != nil { + t.Fatalf("pullFromBundle: %v", err) + } + if image.Name != "my-sandbox" { + 
t.Errorf("Name = %q, want my-sandbox", image.Name) + } + staged, err := os.ReadFile(image.KernelPath) + if err != nil { + t.Fatalf("read staged kernel: %v", err) + } + if !strings.Contains(string(staged), "custom-kernel-marker") { + t.Errorf("staged kernel = %q, want custom-kernel bytes", staged) + } +} + +func TestPullImageBundlePathRejectsExistingName(t *testing.T) { + imagesDir := t.TempDir() + kernelsDir := t.TempDir() + seedKernel(t, kernelsDir, "generic-6.12") + + d := &Daemon{ + layout: paths.Layout{ImagesDir: imagesDir, KernelsDir: kernelsDir}, + store: openDaemonStore(t), + runner: system.NewRunner(), + } + d.img = &ImageService{ + layout: d.layout, + store: d.store, + runner: d.runner, + bundleFetch: stubBundleFetch(imagecat.Manifest{KernelRef: "generic-6.12"}), + workSeedBuilder: stubWorkSeedBuilder, + } + wireServices(d) + id, _ := model.NewID() + if err := d.store.UpsertImage(context.Background(), model.Image{ + ID: id, Name: "debian-bookworm", + CreatedAt: model.Now(), UpdatedAt: model.Now(), + }); err != nil { + t.Fatal(err) + } + + _, err := d.img.pullFromBundle(context.Background(), api.ImagePullParams{Ref: "debian-bookworm"}, imagecat.CatEntry{ + Name: "debian-bookworm", KernelRef: "generic-6.12", + TarballURL: "https://example.com/x.tar.zst", TarballSHA256: "abc", + }) + if err == nil || !strings.Contains(err.Error(), "already exists") { + t.Fatalf("expected already-exists, got %v", err) + } +} + +func TestPullImageBundlePathRequiresSomeKernelSource(t *testing.T) { + d := &Daemon{ + layout: paths.Layout{ImagesDir: t.TempDir(), KernelsDir: t.TempDir()}, + store: openDaemonStore(t), + runner: system.NewRunner(), + } + d.img = &ImageService{ + layout: d.layout, + store: d.store, + runner: d.runner, + bundleFetch: stubBundleFetch(imagecat.Manifest{}), + workSeedBuilder: stubWorkSeedBuilder, + } + wireServices(d) + // Catalog entry has no kernel_ref, no --kernel-ref/--kernel passed. 
+ _, err := d.img.pullFromBundle(context.Background(), api.ImagePullParams{Ref: "x"}, imagecat.CatEntry{ + Name: "x", TarballURL: "https://example.com/x.tar.zst", TarballSHA256: "abc", + }) + if err == nil || !strings.Contains(err.Error(), "kernel") { + t.Fatalf("expected kernel-required error, got %v", err) + } +} + +func TestPullImageBundleFetchFailurePropagates(t *testing.T) { + imagesDir := t.TempDir() + kernelsDir := t.TempDir() + seedKernel(t, kernelsDir, "generic-6.12") + + d := &Daemon{ + layout: paths.Layout{ImagesDir: imagesDir, KernelsDir: kernelsDir}, + store: openDaemonStore(t), + runner: system.NewRunner(), + } + d.img = &ImageService{ + layout: d.layout, + store: d.store, + runner: d.runner, + bundleFetch: func(_ context.Context, _ string, _ imagecat.CatEntry) (imagecat.Manifest, error) { + return imagecat.Manifest{}, errors.New("r2 exploded") + }, + workSeedBuilder: stubWorkSeedBuilder, + } + wireServices(d) + _, err := d.img.pullFromBundle(context.Background(), api.ImagePullParams{Ref: "x"}, imagecat.CatEntry{ + Name: "x", KernelRef: "generic-6.12", + TarballURL: "https://example.com/x.tar.zst", TarballSHA256: "abc", + }) + if err == nil || !strings.Contains(err.Error(), "r2 exploded") { + t.Fatalf("expected fetch failure propagated, got %v", err) + } + // Staging dir cleaned up. 
+ stagings, _ := filepath.Glob(filepath.Join(imagesDir, "*.staging")) + if len(stagings) != 0 { + t.Errorf("staging dirs left behind: %v", stagings) + } +} + +func TestPullImageDispatchFallsThroughToOCIWhenNoCatalogHit(t *testing.T) { + imagesDir := t.TempDir() + kernelsDir := t.TempDir() + seedKernel(t, kernelsDir, "generic-6.12") + + ociCalled := false + d := &Daemon{ + layout: paths.Layout{ImagesDir: imagesDir, KernelsDir: kernelsDir, OCICacheDir: t.TempDir()}, + store: openDaemonStore(t), + runner: system.NewRunner(), + } + d.img = &ImageService{ + layout: d.layout, + store: d.store, + runner: d.runner, + pullAndFlatten: func(_ context.Context, ref, _ string, destDir string) (imagepull.Metadata, error) { + ociCalled = true + if err := os.WriteFile(filepath.Join(destDir, "marker"), []byte("x"), 0o644); err != nil { + return imagepull.Metadata{}, err + } + return imagepull.Metadata{}, errors.New("stop here") + }, + finalizePulledRootfs: stubFinalizePulledRootfs, + bundleFetch: stubBundleFetch(imagecat.Manifest{}), + workSeedBuilder: stubWorkSeedBuilder, + } + wireServices(d) + + _, err := d.img.PullImage(context.Background(), api.ImagePullParams{ + // Not a catalog name (catalog is empty in the embedded default). 
+ Ref: "docker.io/library/debian:bookworm", + KernelRef: "generic-6.12", + }) + if err == nil || !strings.Contains(err.Error(), "stop here") { + t.Fatalf("expected OCI path to be taken, got %v", err) + } + if !ociCalled { + t.Fatal("OCI seam was not invoked") + } +} diff --git a/internal/daemon/images_pull_test.go b/internal/daemon/images_pull_test.go new file mode 100644 index 0000000..f65c046 --- /dev/null +++ b/internal/daemon/images_pull_test.go @@ -0,0 +1,244 @@ +package daemon + +import ( + "context" + "errors" + "os" + "os/exec" + "path/filepath" + "strings" + "testing" + + "banger/internal/api" + "banger/internal/imagepull" + "banger/internal/model" + "banger/internal/paths" + "banger/internal/system" + + "github.com/google/go-containerregistry/pkg/name" +) + +func writeFakeKernelTriple(t *testing.T) (kernelPath, initrdPath, modulesDir string) { + t.Helper() + dir := t.TempDir() + kernelPath = filepath.Join(dir, "vmlinux") + if err := os.WriteFile(kernelPath, []byte("kernel"), 0o644); err != nil { + t.Fatal(err) + } + initrdPath = filepath.Join(dir, "initrd.img") + if err := os.WriteFile(initrdPath, []byte("initrd"), 0o644); err != nil { + t.Fatal(err) + } + modulesDir = filepath.Join(dir, "modules") + if err := os.MkdirAll(modulesDir, 0o755); err != nil { + t.Fatal(err) + } + if err := os.WriteFile(filepath.Join(modulesDir, "modules.dep"), []byte(""), 0o644); err != nil { + t.Fatal(err) + } + return +} + +// stubFinalizePulledRootfs is a no-op seam substitute that skips the real +// debugfs + vsock-agent-binary injection machinery during daemon tests. +func stubFinalizePulledRootfs(_ context.Context, _ string, _ imagepull.Metadata) error { + return nil +} + +// stubWorkSeedBuilder returns an error so runBuildWorkSeed treats +// the step as non-fatal and proceeds without a work-seed. Keeps tests +// off sudo mount without asserting on WorkSeedPath. 
+func stubWorkSeedBuilder(_ context.Context, _ string, _ string) error { + return errWorkSeedBuilderStub +} + +var errWorkSeedBuilderStub = errors.New("work-seed builder stubbed in tests") + +// stubPullAndFlatten writes a fixed file tree into destDir, simulating a +// successful OCI pull without the network or tarball machinery. +func stubPullAndFlatten(_ context.Context, _ string, _ string, destDir string) (imagepull.Metadata, error) { + if err := os.MkdirAll(filepath.Join(destDir, "etc"), 0o755); err != nil { + return imagepull.Metadata{}, err + } + if err := os.WriteFile(filepath.Join(destDir, "etc", "hello"), []byte("world"), 0o644); err != nil { + return imagepull.Metadata{}, err + } + if err := os.WriteFile(filepath.Join(destDir, "marker"), []byte("ok"), 0o644); err != nil { + return imagepull.Metadata{}, err + } + // Tiny synthetic metadata — daemon-level tests exercise the seam + // plumbing, not the ownership pass itself. + return imagepull.Metadata{Entries: map[string]imagepull.FileMeta{}}, nil +} + +func TestPullImageHappyPath(t *testing.T) { + if _, err := exec.LookPath("mkfs.ext4"); err != nil { + t.Skip("mkfs.ext4 not available; skipping") + } + imagesDir := t.TempDir() + cacheDir := t.TempDir() + kernel, initrd, modules := writeFakeKernelTriple(t) + + d := &Daemon{ + layout: paths.Layout{ImagesDir: imagesDir, OCICacheDir: cacheDir}, + store: openDaemonStore(t), + runner: system.NewRunner(), + } + d.img = &ImageService{ + layout: d.layout, + store: d.store, + runner: d.runner, + pullAndFlatten: stubPullAndFlatten, + finalizePulledRootfs: stubFinalizePulledRootfs, + workSeedBuilder: stubWorkSeedBuilder, + } + wireServices(d) + + image, err := d.img.PullImage(context.Background(), api.ImagePullParams{ + Ref: "docker.io/library/debian:bookworm", + KernelPath: kernel, + InitrdPath: initrd, + ModulesDir: modules, + }) + if err != nil { + t.Fatalf("PullImage: %v", err) + } + + if image.Name != "debian-bookworm" { + t.Errorf("Name = %q, want 
debian-bookworm", image.Name) + } + if !image.Managed { + t.Errorf("expected Managed=true") + } + if image.ArtifactDir == "" || !strings.HasPrefix(image.ArtifactDir, imagesDir) { + t.Errorf("ArtifactDir = %q, want under %q", image.ArtifactDir, imagesDir) + } + + for _, rel := range []string{"rootfs.ext4", "kernel", "initrd.img", "modules"} { + if _, err := os.Stat(filepath.Join(image.ArtifactDir, rel)); err != nil { + t.Errorf("missing artifact %s: %v", rel, err) + } + } + + // Staging dir should be gone after publish. + stagings, _ := filepath.Glob(filepath.Join(imagesDir, "*.staging")) + if len(stagings) != 0 { + t.Errorf("staging dirs left behind: %v", stagings) + } +} + +func TestPullImageRejectsExistingName(t *testing.T) { + imagesDir := t.TempDir() + kernel, _, _ := writeFakeKernelTriple(t) + + d := &Daemon{ + layout: paths.Layout{ImagesDir: imagesDir, OCICacheDir: t.TempDir()}, + store: openDaemonStore(t), + runner: system.NewRunner(), + } + d.img = &ImageService{ + layout: d.layout, + store: d.store, + runner: d.runner, + pullAndFlatten: stubPullAndFlatten, + finalizePulledRootfs: stubFinalizePulledRootfs, + workSeedBuilder: stubWorkSeedBuilder, + } + wireServices(d) + // Seed a preexisting image with the would-be derived name. 
+ id, _ := model.NewID() + if err := d.store.UpsertImage(context.Background(), model.Image{ + ID: id, + Name: "debian-bookworm", + CreatedAt: model.Now(), + UpdatedAt: model.Now(), + }); err != nil { + t.Fatal(err) + } + + _, err := d.img.PullImage(context.Background(), api.ImagePullParams{ + Ref: "docker.io/library/debian:bookworm", + KernelPath: kernel, + }) + if err == nil || !strings.Contains(err.Error(), "already exists") { + t.Fatalf("expected already-exists error, got %v", err) + } +} + +func TestPullImageRequiresKernel(t *testing.T) { + d := &Daemon{ + layout: paths.Layout{ImagesDir: t.TempDir(), OCICacheDir: t.TempDir()}, + store: openDaemonStore(t), + runner: system.NewRunner(), + } + d.img = &ImageService{ + layout: d.layout, + store: d.store, + runner: d.runner, + pullAndFlatten: stubPullAndFlatten, + finalizePulledRootfs: stubFinalizePulledRootfs, + workSeedBuilder: stubWorkSeedBuilder, + } + wireServices(d) + _, err := d.img.PullImage(context.Background(), api.ImagePullParams{ + Ref: "docker.io/library/debian:bookworm", + }) + if err == nil || !strings.Contains(err.Error(), "kernel") { + t.Fatalf("expected kernel-required error, got %v", err) + } +} + +func TestPullImageCleansStagingOnFailure(t *testing.T) { + imagesDir := t.TempDir() + kernel, _, _ := writeFakeKernelTriple(t) + failureSeam := func(_ context.Context, _ string, _ string, _ string) (imagepull.Metadata, error) { + return imagepull.Metadata{}, errors.New("network borked") + } + + d := &Daemon{ + layout: paths.Layout{ImagesDir: imagesDir, OCICacheDir: t.TempDir()}, + store: openDaemonStore(t), + runner: system.NewRunner(), + } + d.img = &ImageService{ + layout: d.layout, + store: d.store, + runner: d.runner, + pullAndFlatten: failureSeam, + finalizePulledRootfs: stubFinalizePulledRootfs, + workSeedBuilder: stubWorkSeedBuilder, + } + wireServices(d) + _, err := d.img.PullImage(context.Background(), api.ImagePullParams{ + Ref: "docker.io/library/debian:bookworm", + KernelPath: kernel, + }) + 
if err == nil || !strings.Contains(err.Error(), "network borked") { + t.Fatalf("expected propagated pull error, got %v", err) + } + stagings, _ := filepath.Glob(filepath.Join(imagesDir, "*.staging")) + if len(stagings) != 0 { + t.Errorf("staging dir left behind on failure: %v", stagings) + } +} + +func TestDefaultImageNameFromRef(t *testing.T) { + cases := []struct { + in string + want string + }{ + {"docker.io/library/debian:bookworm", "debian-bookworm"}, + {"alpine:3.20", "alpine-3-20"}, + {"docker.io/library/debian", "debian"}, + {"ghcr.io/some/org/my-image:v2.1", "my-image-v2-1"}, + } + for _, tc := range cases { + ref, err := name.ParseReference(tc.in) + if err != nil { + t.Fatalf("parse %s: %v", tc.in, err) + } + if got := defaultImageNameFromRef(ref); got != tc.want { + t.Errorf("defaultImageNameFromRef(%s) = %q, want %q", tc.in, got, tc.want) + } + } +} diff --git a/internal/daemon/kernels.go b/internal/daemon/kernels.go new file mode 100644 index 0000000..19a5d47 --- /dev/null +++ b/internal/daemon/kernels.go @@ -0,0 +1,243 @@ +package daemon + +import ( + "context" + "errors" + "fmt" + "os" + "path/filepath" + "strings" + "time" + + "banger/internal/api" + "banger/internal/kernelcat" + "banger/internal/system" +) + +func (s *ImageService) KernelList(_ context.Context) (api.KernelListResult, error) { + entries, err := kernelcat.ListLocal(s.layout.KernelsDir) + if err != nil { + return api.KernelListResult{}, err + } + result := api.KernelListResult{Entries: make([]api.KernelEntry, 0, len(entries))} + for _, entry := range entries { + result.Entries = append(result.Entries, kernelEntryToAPI(entry)) + } + return result, nil +} + +func (s *ImageService) KernelShow(_ context.Context, name string) (api.KernelEntry, error) { + entry, err := kernelcat.ReadLocal(s.layout.KernelsDir, name) + if err != nil { + return api.KernelEntry{}, kernelNotFoundIfMissing(name, err) + } + return kernelEntryToAPI(entry), nil +} + +func (s *ImageService) KernelDelete(ctx 
context.Context, name string) error { + if err := kernelcat.ValidateName(name); err != nil { + return err + } + // Hold the same per-name lock KernelPull / readOrAutoPullKernel + // take. Without it, a delete racing a concurrent pull can land + // between the pull's manifest write and the entry's first use, + // or remove files the pull is still writing. + release, err := s.acquireKernelPullLock(ctx, name) + if err != nil { + return err + } + defer release() + return kernelcat.DeleteLocal(s.layout.KernelsDir, name) +} + +// KernelImport copies the kernel / initrd / modules artifacts produced by +// scripts/make-*-kernel.sh (under params.FromDir) into the local catalog +// under params.Name and writes the manifest. It is the primary bridge from +// "I built a kernel with the helper scripts" to "banger kernel list shows +// it and image register --kernel-ref works." +func (s *ImageService) KernelImport(ctx context.Context, params api.KernelImportParams) (api.KernelEntry, error) { + name := strings.TrimSpace(params.Name) + if err := kernelcat.ValidateName(name); err != nil { + return api.KernelEntry{}, err + } + fromDir := strings.TrimSpace(params.FromDir) + if fromDir == "" { + return api.KernelEntry{}, errors.New("--from is required") + } + + discovered, err := kernelcat.DiscoverPaths(fromDir) + if err != nil { + return api.KernelEntry{}, fmt.Errorf("discover artifacts under %s: %w", fromDir, err) + } + + targetDir := kernelcat.EntryDir(s.layout.KernelsDir, name) + // Overwrite-by-default: clear any prior entry so a re-import is clean. 
+ if err := kernelcat.DeleteLocal(s.layout.KernelsDir, name); err != nil { + return api.KernelEntry{}, fmt.Errorf("clear prior catalog entry %q: %w", name, err) + } + if err := os.MkdirAll(targetDir, 0o755); err != nil { + return api.KernelEntry{}, err + } + + kernelTarget := filepath.Join(targetDir, "vmlinux") + if err := system.CopyFilePreferClone(discovered.KernelPath, kernelTarget); err != nil { + return api.KernelEntry{}, fmt.Errorf("copy kernel: %w", err) + } + if discovered.InitrdPath != "" { + initrdTarget := filepath.Join(targetDir, "initrd.img") + if err := system.CopyFilePreferClone(discovered.InitrdPath, initrdTarget); err != nil { + return api.KernelEntry{}, fmt.Errorf("copy initrd: %w", err) + } + } + if discovered.ModulesDir != "" { + modulesTarget := filepath.Join(targetDir, "modules") + if err := os.MkdirAll(modulesTarget, 0o755); err != nil { + return api.KernelEntry{}, err + } + if err := system.CopyDirContents(ctx, s.runner, discovered.ModulesDir, modulesTarget, false); err != nil { + return api.KernelEntry{}, fmt.Errorf("copy modules: %w", err) + } + } + + sum, err := kernelcat.SumFile(kernelTarget) + if err != nil { + return api.KernelEntry{}, fmt.Errorf("sha256 kernel: %w", err) + } + + entry := kernelcat.Entry{ + Name: name, + Distro: strings.TrimSpace(params.Distro), + Arch: strings.TrimSpace(params.Arch), + KernelVersion: inferKernelVersion(discovered.KernelPath, discovered.ModulesDir), + SHA256: sum, + Source: "import:" + fromDir, + ImportedAt: time.Now().UTC(), + } + if err := kernelcat.WriteLocal(s.layout.KernelsDir, entry); err != nil { + return api.KernelEntry{}, fmt.Errorf("write manifest: %w", err) + } + stored, err := kernelcat.ReadLocal(s.layout.KernelsDir, name) + if err != nil { + return api.KernelEntry{}, err + } + return kernelEntryToAPI(stored), nil +} + +// KernelPull downloads a catalog entry by name into the local catalog. It +// refuses to overwrite an existing entry unless params.Force is set. 
+// +// Held under a per-name mutex so concurrent callers (the auto-pull +// path inside vm.create, parallel `banger kernel pull` invocations, +// or a mix) can't tear each other's manifest.json or extracted +// tarball. Lock first, then re-check the local catalog: a peer that +// already finished the pull while we waited produces the same +// "already pulled" error a fully-serial run would. +func (s *ImageService) KernelPull(ctx context.Context, params api.KernelPullParams) (api.KernelEntry, error) { + name := strings.TrimSpace(params.Name) + if err := kernelcat.ValidateName(name); err != nil { + return api.KernelEntry{}, err + } + + release, err := s.acquireKernelPullLock(ctx, name) + if err != nil { + return api.KernelEntry{}, err + } + defer release() + + if !params.Force { + if _, err := kernelcat.ReadLocal(s.layout.KernelsDir, name); err == nil { + return api.KernelEntry{}, fmt.Errorf("kernel %q already pulled; pass --force to re-pull", name) + } else if !os.IsNotExist(err) { + return api.KernelEntry{}, err + } + } + + catalog, err := kernelcat.LoadEmbedded() + if err != nil { + return api.KernelEntry{}, err + } + catEntry, err := catalog.Lookup(name) + if err != nil { + return api.KernelEntry{}, fmt.Errorf("kernel %q not in catalog (run 'banger kernel list --available' to browse)", name) + } + + stored, err := kernelcat.Fetch(ctx, nil, s.layout.KernelsDir, catEntry) + if err != nil { + return api.KernelEntry{}, err + } + return kernelEntryToAPI(stored), nil +} + +// KernelCatalog returns every entry from the embedded catalog annotated +// with whether it has already been pulled locally. 
+func (s *ImageService) KernelCatalog(_ context.Context) (api.KernelCatalogResult, error) { + catalog, err := kernelcat.LoadEmbedded() + if err != nil { + return api.KernelCatalogResult{}, err + } + local, _ := kernelcat.ListLocal(s.layout.KernelsDir) + pulled := make(map[string]bool, len(local)) + for _, entry := range local { + pulled[entry.Name] = true + } + result := api.KernelCatalogResult{Entries: make([]api.KernelCatalogEntry, 0, len(catalog.Entries))} + for _, entry := range catalog.Entries { + result.Entries = append(result.Entries, api.KernelCatalogEntry{ + Name: entry.Name, + Distro: entry.Distro, + Arch: entry.Arch, + KernelVersion: entry.KernelVersion, + SizeBytes: entry.SizeBytes, + Description: entry.Description, + Pulled: pulled[entry.Name], + }) + } + return result, nil +} + +// inferKernelVersion makes a best-effort guess at the kernel version. It +// prefers the modules directory basename (e.g. "6.12.79_1") and falls back +// to stripping a "vmlinux-"/"vmlinuz-" prefix from the source filename. +// Returns "" if nothing looks useful. +func inferKernelVersion(kernelPath, modulesDir string) string { + if modulesDir != "" { + if base := filepath.Base(modulesDir); base != "." 
&& base != string(filepath.Separator) { + return base + } + } + base := filepath.Base(kernelPath) + for _, prefix := range []string{"vmlinux-", "vmlinuz-"} { + if strings.HasPrefix(base, prefix) { + return strings.TrimPrefix(base, prefix) + } + } + return "" +} + +func kernelEntryToAPI(entry kernelcat.Entry) api.KernelEntry { + importedAt := "" + if !entry.ImportedAt.IsZero() { + importedAt = entry.ImportedAt.UTC().Format(time.RFC3339) + } + return api.KernelEntry{ + Name: entry.Name, + Distro: entry.Distro, + Arch: entry.Arch, + KernelVersion: entry.KernelVersion, + SHA256: entry.SHA256, + Source: entry.Source, + ImportedAt: importedAt, + KernelPath: entry.KernelPath, + InitrdPath: entry.InitrdPath, + ModulesDir: entry.ModulesDir, + } +} + +func kernelNotFoundIfMissing(name string, err error) error { + if err == nil { + return nil + } + if os.IsNotExist(err) { + return fmt.Errorf("kernel %q not found", name) + } + return err +} diff --git a/internal/daemon/kernels_test.go b/internal/daemon/kernels_test.go new file mode 100644 index 0000000..1ce708a --- /dev/null +++ b/internal/daemon/kernels_test.go @@ -0,0 +1,285 @@ +package daemon + +import ( + "context" + "encoding/json" + "os" + "path/filepath" + "strings" + "testing" + + "banger/internal/api" + "banger/internal/kernelcat" + "banger/internal/paths" + "banger/internal/rpc" + "banger/internal/system" +) + +func seedKernelEntry(t *testing.T, kernelsDir, name string) { + t.Helper() + entry := kernelcat.Entry{ + Name: name, + Distro: "void", + Arch: "x86_64", + KernelVersion: "6.12.0", + Source: "test", + } + if err := kernelcat.WriteLocal(kernelsDir, entry); err != nil { + t.Fatalf("seed WriteLocal: %v", err) + } + if err := os.WriteFile(filepath.Join(kernelsDir, name, "vmlinux"), []byte("kernel-bytes"), 0o644); err != nil { + t.Fatalf("seed vmlinux: %v", err) + } +} + +func TestKernelListReturnsSeededEntries(t *testing.T) { + kernelsDir := t.TempDir() + seedKernelEntry(t, kernelsDir, "void-6.12") + 
seedKernelEntry(t, kernelsDir, "alpine-3.23") + + d := &Daemon{layout: paths.Layout{KernelsDir: kernelsDir}} + wireServices(d) + result, err := d.img.KernelList(context.Background()) + if err != nil { + t.Fatalf("KernelList: %v", err) + } + if len(result.Entries) != 2 { + t.Fatalf("entries = %d, want 2", len(result.Entries)) + } + // sorted alphabetically by kernelcat + if result.Entries[0].Name != "alpine-3.23" || result.Entries[1].Name != "void-6.12" { + t.Fatalf("entries order = %+v", result.Entries) + } + if result.Entries[0].KernelPath == "" || !strings.HasSuffix(result.Entries[0].KernelPath, "vmlinux") { + t.Fatalf("KernelPath not populated: %+v", result.Entries[0]) + } +} + +func TestKernelShowAndDeleteThroughDispatch(t *testing.T) { + kernelsDir := t.TempDir() + seedKernelEntry(t, kernelsDir, "void-6.12") + + d := &Daemon{layout: paths.Layout{KernelsDir: kernelsDir}} + wireServices(d) + + showParams, _ := json.Marshal(api.KernelRefParams{Name: "void-6.12"}) + resp := d.dispatch(context.Background(), rpc.Request{Version: rpc.Version, Method: "kernel.show", Params: showParams}) + if !resp.OK { + t.Fatalf("kernel.show dispatch failed: %+v", resp) + } + var show api.KernelShowResult + if err := json.Unmarshal(resp.Result, &show); err != nil { + t.Fatalf("unmarshal show: %v", err) + } + if show.Entry.Name != "void-6.12" || show.Entry.Distro != "void" { + t.Fatalf("show.Entry = %+v", show.Entry) + } + + delParams, _ := json.Marshal(api.KernelRefParams{Name: "void-6.12"}) + del := d.dispatch(context.Background(), rpc.Request{Version: rpc.Version, Method: "kernel.delete", Params: delParams}) + if !del.OK { + t.Fatalf("kernel.delete dispatch failed: %+v", del) + } + + if _, err := kernelcat.ReadLocal(kernelsDir, "void-6.12"); !os.IsNotExist(err) { + t.Fatalf("entry still present after delete: err=%v", err) + } +} + +func TestKernelShowMissingEntry(t *testing.T) { + d := &Daemon{layout: paths.Layout{KernelsDir: t.TempDir()}} + wireServices(d) + _, err := 
d.img.KernelShow(context.Background(), "nope") + if err == nil || !strings.Contains(err.Error(), "not found") { + t.Fatalf("KernelShow missing: err=%v", err) + } +} + +func TestKernelDeleteRejectsInvalidName(t *testing.T) { + d := &Daemon{layout: paths.Layout{KernelsDir: t.TempDir()}} + wireServices(d) + if err := d.img.KernelDelete(context.Background(), "../escape"); err == nil { + t.Fatalf("KernelDelete should reject traversal") + } +} + +func TestRegisterImageResolvesKernelRef(t *testing.T) { + kernelsDir := t.TempDir() + seedKernelEntry(t, kernelsDir, "void-6.12") + + rootfs := filepath.Join(t.TempDir(), "rootfs.ext4") + if err := os.WriteFile(rootfs, []byte("rootfs"), 0o644); err != nil { + t.Fatalf("write rootfs: %v", err) + } + + d := &Daemon{ + layout: paths.Layout{KernelsDir: kernelsDir}, + store: openDaemonStore(t), + } + wireServices(d) + + image, err := d.img.RegisterImage(context.Background(), api.ImageRegisterParams{ + Name: "testbox", + RootfsPath: rootfs, + KernelRef: "void-6.12", + }) + if err != nil { + t.Fatalf("RegisterImage: %v", err) + } + want := filepath.Join(kernelsDir, "void-6.12", "vmlinux") + if image.KernelPath != want { + t.Fatalf("image.KernelPath = %q, want %q", image.KernelPath, want) + } +} + +func TestRegisterImageRejectsKernelRefAndPath(t *testing.T) { + kernelsDir := t.TempDir() + seedKernelEntry(t, kernelsDir, "void-6.12") + rootfs := filepath.Join(t.TempDir(), "rootfs.ext4") + if err := os.WriteFile(rootfs, []byte("rootfs"), 0o644); err != nil { + t.Fatal(err) + } + + d := &Daemon{ + layout: paths.Layout{KernelsDir: kernelsDir}, + store: openDaemonStore(t), + } + wireServices(d) + _, err := d.img.RegisterImage(context.Background(), api.ImageRegisterParams{ + Name: "testbox", + RootfsPath: rootfs, + KernelRef: "void-6.12", + KernelPath: "/some/other/vmlinux", + }) + if err == nil || !strings.Contains(err.Error(), "mutually exclusive") { + t.Fatalf("RegisterImage kernel-ref+kernel: err=%v, want mutually-exclusive error", err) + 
} +} + +func TestKernelImportCopiesArtifactsAndWritesManifest(t *testing.T) { + src := t.TempDir() + if err := os.MkdirAll(filepath.Join(src, "boot"), 0o755); err != nil { + t.Fatal(err) + } + if err := os.WriteFile(filepath.Join(src, "boot", "vmlinux-6.12.79_1"), []byte("kernel-bytes"), 0o644); err != nil { + t.Fatal(err) + } + if err := os.WriteFile(filepath.Join(src, "boot", "initramfs-6.12.79_1"), []byte("initrd-bytes"), 0o644); err != nil { + t.Fatal(err) + } + modulesSource := filepath.Join(src, "lib", "modules", "6.12.79_1") + if err := os.MkdirAll(modulesSource, 0o755); err != nil { + t.Fatal(err) + } + if err := os.WriteFile(filepath.Join(modulesSource, "modules.dep"), []byte(""), 0o644); err != nil { + t.Fatal(err) + } + + kernelsDir := t.TempDir() + d := &Daemon{ + layout: paths.Layout{KernelsDir: kernelsDir}, + runner: system.NewRunner(), + } + wireServices(d) + + entry, err := d.img.KernelImport(context.Background(), api.KernelImportParams{ + Name: "void-6.12", + FromDir: src, + Distro: "void", + Arch: "x86_64", + }) + if err != nil { + t.Fatalf("KernelImport: %v", err) + } + if entry.Name != "void-6.12" || entry.Distro != "void" || entry.Arch != "x86_64" { + t.Fatalf("entry metadata = %+v", entry) + } + if entry.KernelVersion != "6.12.79_1" { + t.Errorf("KernelVersion = %q, want 6.12.79_1 (from modules dir)", entry.KernelVersion) + } + if entry.SHA256 == "" { + t.Errorf("SHA256 not populated") + } + + if _, err := os.Stat(filepath.Join(kernelsDir, "void-6.12", "vmlinux")); err != nil { + t.Errorf("kernel not copied: %v", err) + } + if _, err := os.Stat(filepath.Join(kernelsDir, "void-6.12", "initrd.img")); err != nil { + t.Errorf("initrd not copied: %v", err) + } + if _, err := os.Stat(filepath.Join(kernelsDir, "void-6.12", "modules", "modules.dep")); err != nil { + t.Errorf("modules not copied: %v", err) + } +} + +func TestKernelPullRejectsUnknownCatalogEntry(t *testing.T) { + d := &Daemon{ + layout: paths.Layout{KernelsDir: t.TempDir()}, + runner: 
system.NewRunner(), + } + wireServices(d) + _, err := d.img.KernelPull(context.Background(), api.KernelPullParams{Name: "unknown"}) + if err == nil || !strings.Contains(err.Error(), "not in catalog") { + t.Fatalf("KernelPull unknown: err=%v", err) + } +} + +func TestKernelPullRefusesOverwriteWithoutForce(t *testing.T) { + kernelsDir := t.TempDir() + seedKernelEntry(t, kernelsDir, "void-6.12") + + d := &Daemon{ + layout: paths.Layout{KernelsDir: kernelsDir}, + runner: system.NewRunner(), + } + wireServices(d) + _, err := d.img.KernelPull(context.Background(), api.KernelPullParams{Name: "void-6.12"}) + if err == nil || !strings.Contains(err.Error(), "already pulled") { + t.Fatalf("KernelPull without --force: err=%v", err) + } +} + +func TestKernelCatalogReportsPulledStatus(t *testing.T) { + d := &Daemon{layout: paths.Layout{KernelsDir: t.TempDir()}} + wireServices(d) + result, err := d.img.KernelCatalog(context.Background()) + if err != nil { + t.Fatalf("KernelCatalog: %v", err) + } + // Embedded catalog ships empty; CI (phase 5) populates it. 
+ if result.Entries == nil { + t.Fatalf("Entries should be non-nil even when catalog is empty") + } +} + +func TestKernelImportRejectsMissingFromDir(t *testing.T) { + d := &Daemon{ + layout: paths.Layout{KernelsDir: t.TempDir()}, + runner: system.NewRunner(), + } + wireServices(d) + _, err := d.img.KernelImport(context.Background(), api.KernelImportParams{Name: "x"}) + if err == nil || !strings.Contains(err.Error(), "--from") { + t.Fatalf("KernelImport without --from: err=%v", err) + } +} + +func TestRegisterImageMissingKernelRef(t *testing.T) { + rootfs := filepath.Join(t.TempDir(), "rootfs.ext4") + if err := os.WriteFile(rootfs, []byte("rootfs"), 0o644); err != nil { + t.Fatal(err) + } + d := &Daemon{ + layout: paths.Layout{KernelsDir: t.TempDir()}, + store: openDaemonStore(t), + } + wireServices(d) + _, err := d.img.RegisterImage(context.Background(), api.ImageRegisterParams{ + Name: "testbox", + RootfsPath: rootfs, + KernelRef: "never-imported", + }) + if err == nil || !strings.Contains(err.Error(), "not found in catalog") { + t.Fatalf("RegisterImage missing kernel-ref: err=%v", err) + } +} diff --git a/internal/daemon/lifecycle_flow_test.go b/internal/daemon/lifecycle_flow_test.go new file mode 100644 index 0000000..82e39e6 --- /dev/null +++ b/internal/daemon/lifecycle_flow_test.go @@ -0,0 +1,143 @@ +package daemon + +import ( + "context" + "errors" + "os" + "testing" + + "banger/internal/api" + "banger/internal/model" +) + +// TestVMCreateNoStartDeleteFlow is the end-to-end lifecycle harness +// test: one test that drives VMService.CreateVM → VMService.DeleteVM +// through the real code path, using newTestDaemon to stand up +// infrastructure. If a future refactor breaks store persistence, +// VM dir creation, or delete-side cleanup for a never-booted VM, +// this test fails. +// +// Scope: everything except the firecracker boot step. 
CreateVM is +// called with NoStart: true so we skip machine.Start (the upstream +// SDK boundary we can't cross without a real firecracker binary + +// KVM). The flow still exercises image resolution, name/IP +// reservation, VMDir creation, store round-trip, per-VM lock +// lifecycle, handle cache, and the delete-side cleanupRuntime path +// that runs against a never-started VM. +// +// This is the bar for "can we catch a full-lifecycle regression +// without real KVM?" — subsequent harness tests can exercise +// individual error branches (delete while running, create with +// duplicate name, etc.) against the same fixture. +func TestVMCreateNoStartDeleteFlow(t *testing.T) { + d := newTestDaemon(t) + ctx := context.Background() + + // Pre-seed an image record so findOrAutoPullImage finds it + // locally and doesn't try to hit the embedded catalog. + image := testImage("flow-img") + if err := d.store.UpsertImage(ctx, image); err != nil { + t.Fatalf("UpsertImage: %v", err) + } + + // CreateVM with NoStart → reserves name + IP, mkdirs VMDir, + // persists row in state Stopped. Returns the persisted record. 
+ created, err := d.vm.CreateVM(ctx, api.VMCreateParams{ + Name: "flow-vm", + ImageName: image.Name, + NoStart: true, + }) + if err != nil { + t.Fatalf("CreateVM: %v", err) + } + + if created.Name != "flow-vm" { + t.Fatalf("created.Name = %q, want flow-vm", created.Name) + } + if created.ImageID != image.ID { + t.Fatalf("created.ImageID = %q, want %q", created.ImageID, image.ID) + } + if created.State != model.VMStateStopped || created.Runtime.State != model.VMStateStopped { + t.Fatalf("created states = (%q, %q), want both stopped", created.State, created.Runtime.State) + } + if created.Runtime.GuestIP == "" { + t.Fatal("created.Runtime.GuestIP empty — reservation didn't allocate an IP") + } + if created.Runtime.VMDir == "" { + t.Fatal("created.Runtime.VMDir empty — reservation didn't pick a per-VM dir") + } + + // VMDir must exist on disk — reserveVM creates it during the + // reservation window so subsequent lifecycle steps can drop + // handles.json, firecracker.log, etc. inside. + info, err := os.Stat(created.Runtime.VMDir) + if err != nil { + t.Fatalf("VMDir missing after CreateVM: %v", err) + } + if !info.IsDir() { + t.Fatalf("VMDir %q is not a directory", created.Runtime.VMDir) + } + + // Store round-trip: FindVM must return the same record. + found, err := d.vm.FindVM(ctx, created.ID) + if err != nil { + t.Fatalf("FindVM: %v", err) + } + if found.ID != created.ID || found.Name != created.Name { + t.Fatalf("FindVM mismatch: got %+v, created %+v", found, created) + } + + // Duplicate-name rejection: a second CreateVM with the same + // name must fail with a useful error, not persist a second row. 
+ if _, err := d.vm.CreateVM(ctx, api.VMCreateParams{ + Name: "flow-vm", + ImageName: image.Name, + NoStart: true, + }); err == nil { + t.Fatal("second CreateVM with duplicate name succeeded; reserveVM's exact-name check didn't fire") + } + + // DeleteVM against a never-started VM: takes the per-VM lock, + // calls cleanupRuntime (no-op on zero handles), removes the + // store row and the VMDir. Because vmCaps is empty in the + // harness default, capability Cleanup hooks don't fire real + // side effects. + deleted, err := d.vm.DeleteVM(ctx, created.ID) + if err != nil { + t.Fatalf("DeleteVM: %v", err) + } + if deleted.ID != created.ID { + t.Fatalf("DeleteVM returned %+v, want ID %q", deleted, created.ID) + } + + // After delete: store has no record. + if _, err := d.vm.FindVM(ctx, created.ID); err == nil { + t.Fatal("FindVM succeeded after DeleteVM — store row wasn't removed") + } + + // VMDir is gone. + if _, err := os.Stat(created.Runtime.VMDir); !errors.Is(err, os.ErrNotExist) { + t.Fatalf("VMDir %q still present after DeleteVM (stat err = %v)", created.Runtime.VMDir, err) + } +} + +// TestVMCreateWithUnknownImageFails pins the error branch when the +// requested image isn't local and isn't in the embedded catalog. +// The failure must come before any state mutation — in particular, +// no VM row should linger. 
+func TestVMCreateWithUnknownImageFails(t *testing.T) { + d := newTestDaemon(t) + ctx := context.Background() + + if _, err := d.vm.CreateVM(ctx, api.VMCreateParams{ + Name: "ghostly", + ImageName: "nothing-called-this-image", + NoStart: true, + }); err == nil { + t.Fatal("CreateVM: want error for unknown image, got nil") + } + + if _, err := d.vm.FindVM(ctx, "ghostly"); err == nil { + t.Fatal("FindVM found a record for a VM that should never have been persisted") + } +} diff --git a/internal/daemon/logger.go b/internal/daemon/logger.go index abf1582..99ea3f5 100644 --- a/internal/daemon/logger.go +++ b/internal/daemon/logger.go @@ -9,6 +9,7 @@ import ( "time" "banger/internal/model" + "banger/internal/rpc" ) func newDaemonLogger(w io.Writer, rawLevel string) (*slog.Logger, string, error) { @@ -35,9 +36,37 @@ func parseLogLevel(raw string) (slog.Level, string, error) { } } -func (d *Daemon) beginOperation(name string, attrs ...any) *operationLog { +// WithOpID stores the per-RPC correlation id on ctx. Re-exported +// from rpc so daemon-side call sites don't have to import rpc just +// for context plumbing. The dispatch layer calls this on every +// incoming request; capability hooks, lifecycle steps, and the +// privileged-ops shim that crosses into the root helper all read +// the id back via OpIDFromContext so a single id stitches the +// whole chain together in journalctl. +func WithOpID(ctx context.Context, opID string) context.Context { + return rpc.WithOpID(ctx, opID) +} + +// OpIDFromContext returns the dispatch-assigned op id stored on +// ctx, or "" if none was set. +func OpIDFromContext(ctx context.Context) string { + return rpc.OpIDFromContext(ctx) +} + +// beginOperation starts a logged operation. 
When ctx carries a +// dispatch-assigned op id (see WithOpID) every log line emitted +// through the returned operationLog includes it as an "op_id" attr, +// so the daemon journal can be greppable by id from the user's CLI +// error all the way down through capability hooks and the root +// helper. +func (d *Daemon) beginOperation(ctx context.Context, name string, attrs ...any) *operationLog { + opID := OpIDFromContext(ctx) + allAttrs := append([]any(nil), attrs...) + if opID != "" { + allAttrs = append([]any{"op_id", opID}, allAttrs...) + } if d.logger != nil { - d.logger.Info("operation started", append([]any{"operation", name}, attrs...)...) + d.logger.Debug("operation started", append([]any{"operation", name}, allAttrs...)...) } now := time.Now() return &operationLog{ @@ -45,7 +74,8 @@ func (d *Daemon) beginOperation(name string, attrs ...any) *operationLog { name: name, started: now, last: now, - attrs: append([]any(nil), attrs...), + attrs: allAttrs, + opID: opID, } } @@ -55,6 +85,16 @@ type operationLog struct { started time.Time last time.Time attrs []any + opID string +} + +// OpID exposes the correlation id this operation was started with so +// dispatch can stamp it onto an outgoing error response. +func (o *operationLog) OpID() string { + if o == nil { + return "" + } + return o.opID } func (o *operationLog) stage(stage string, attrs ...any) { @@ -98,6 +138,10 @@ func (o operationLog) log(level slog.Level, msg string, attrs ...any) { o.logger.Log(context.Background(), level, msg, base...) } +// vmLogAttrs returns the durable identifying fields for a VM that +// are always safe to log. Transient handles (PID, tap device) moved +// off VMRecord when the schema was split; lifecycle ops log those +// explicitly on the events where they matter (e.g. wait_for_exit). 
func vmLogAttrs(vm model.VMRecord) []any { attrs := []any{ "vm_id", vm.ID, @@ -107,15 +151,9 @@ func vmLogAttrs(vm model.VMRecord) []any { if vm.Runtime.GuestIP != "" { attrs = append(attrs, "guest_ip", vm.Runtime.GuestIP) } - if vm.Runtime.TapDevice != "" { - attrs = append(attrs, "tap_device", vm.Runtime.TapDevice) - } if vm.Runtime.APISockPath != "" { attrs = append(attrs, "api_socket", vm.Runtime.APISockPath) } - if vm.Runtime.PID > 0 { - attrs = append(attrs, "pid", vm.Runtime.PID) - } if vm.Runtime.LogPath != "" { attrs = append(attrs, "log_path", vm.Runtime.LogPath) } diff --git a/internal/daemon/logger_test.go b/internal/daemon/logger_test.go index 4ad9e29..b9758df 100644 --- a/internal/daemon/logger_test.go +++ b/internal/daemon/logger_test.go @@ -11,7 +11,6 @@ import ( "strings" "testing" - "banger/internal/api" "banger/internal/model" "banger/internal/paths" ) @@ -43,11 +42,7 @@ func TestNewDaemonLoggerEmitsJSONAtConfiguredLevel(t *testing.T) { func TestStartVMLockedLogsBridgeFailure(t *testing.T) { ctx := context.Background() - origVsockHostDevicePath := vsockHostDevicePath - vsockHostDevicePath = filepath.Join(t.TempDir(), "vhost-vsock") - t.Cleanup(func() { - vsockHostDevicePath = origVsockHostDevicePath - }) + vsockDevicePath := filepath.Join(t.TempDir(), "vhost-vsock") binDir := t.TempDir() for _, name := range []string{ "sudo", "ip", "dmsetup", "losetup", "blockdev", "truncate", "pgrep", "ps", @@ -63,7 +58,7 @@ func TestStartVMLockedLogsBridgeFailure(t *testing.T) { if err := os.WriteFile(firecrackerBin, []byte("#!/bin/sh\nexit 0\n"), 0o755); err != nil { t.Fatalf("write firecracker: %v", err) } - if err := os.WriteFile(vsockHostDevicePath, []byte{}, 0o644); err != nil { + if err := os.WriteFile(vsockDevicePath, []byte{}, 0o644); err != nil { t.Fatalf("write vsock host device: %v", err) } if err := os.WriteFile(vsockHelper, []byte("#!/bin/sh\nexit 0\n"), 0o755); err != nil { @@ -115,8 +110,10 @@ func TestStartVMLockedLogsBridgeFailure(t *testing.T) 
{ runner: runner, logger: logger, } + wireServices(d) + d.vm.vsockHostDevice = vsockDevicePath - _, err = d.startVMLocked(ctx, vm, image) + _, err = d.vm.startVMLocked(ctx, vm, image) if err == nil || !strings.Contains(err.Error(), "bridge up failed") { t.Fatalf("startVMLocked() error = %v, want bridge failure", err) } @@ -131,119 +128,6 @@ func TestStartVMLockedLogsBridgeFailure(t *testing.T) { } } -func TestBuildImagePreservesBuildLogOnFailure(t *testing.T) { - ctx := context.Background() - store := openDaemonStore(t) - stateDir := filepath.Join(t.TempDir(), "state") - imagesDir := filepath.Join(stateDir, "images") - if err := os.MkdirAll(imagesDir, 0o755); err != nil { - t.Fatalf("mkdir images dir: %v", err) - } - - binDir := t.TempDir() - for _, name := range []string{"sudo", "ip", "pgrep", "chown", "chmod", "kill", "iptables", "sysctl", "e2fsck", "resize2fs", "mkfs.ext4", "mount", "umount", "cp"} { - writeFakeExecutable(t, filepath.Join(binDir, name)) - } - t.Setenv("PATH", binDir) - - baseRootfs := filepath.Join(t.TempDir(), "base.ext4") - kernelPath := filepath.Join(t.TempDir(), "vmlinux") - sshKeyPath := filepath.Join(t.TempDir(), "id_ed25519") - firecrackerBin := filepath.Join(t.TempDir(), "firecracker") - vsockHelper := filepath.Join(t.TempDir(), "banger-vsock-agent") - for _, path := range []string{baseRootfs, kernelPath, sshKeyPath} { - if err := os.WriteFile(path, []byte("artifact"), 0o644); err != nil { - t.Fatalf("write %s: %v", path, err) - } - } - if err := os.WriteFile(vsockHelper, []byte("#!/bin/sh\nexit 0\n"), 0o755); err != nil { - t.Fatalf("write %s: %v", vsockHelper, err) - } - t.Setenv("BANGER_VSOCK_AGENT_BIN", vsockHelper) - if err := os.WriteFile(firecrackerBin, []byte("#!/bin/sh\nexit 0\n"), 0o755); err != nil { - t.Fatalf("write %s: %v", firecrackerBin, err) - } - runner := &scriptedRunner{ - t: t, - steps: []runnerStep{ - {call: runnerCall{name: "ip", args: []string{"route", "show", "default"}}, out: []byte("default via 192.0.2.1 dev 
eth0\n")}, - }, - } - - var buf bytes.Buffer - logger, _, err := newDaemonLogger(&buf, "info") - if err != nil { - t.Fatalf("newDaemonLogger: %v", err) - } - baseImage := model.Image{ - ID: "base-image", - Name: "base-image", - RootfsPath: baseRootfs, - KernelPath: kernelPath, - CreatedAt: model.Now(), - UpdatedAt: model.Now(), - } - if err := store.UpsertImage(ctx, baseImage); err != nil { - t.Fatalf("UpsertImage(base): %v", err) - } - d := &Daemon{ - layout: paths.Layout{ - StateDir: stateDir, - ImagesDir: imagesDir, - }, - config: model.DaemonConfig{ - DefaultImageName: "base-image", - SSHKeyPath: sshKeyPath, - FirecrackerBin: firecrackerBin, - }, - store: store, - runner: runner, - logger: logger, - imageBuild: func(ctx context.Context, spec imageBuildSpec) error { - if _, err := fmt.Fprintln(spec.BuildLog, "builder-stdout"); err != nil { - return err - } - if spec.SourceRootfs != baseRootfs || spec.KernelPath == kernelPath || len(spec.Packages) == 0 { - t.Fatalf("unexpected image build spec: %+v", spec) - } - return errors.New("builder failed") - }, - } - - _, err = d.BuildImage(ctx, api.ImageBuildParams{ - Name: "broken-image", - FromImage: baseImage.Name, - KernelPath: kernelPath, - }) - if err == nil || !strings.Contains(err.Error(), "inspect ") { - t.Fatalf("BuildImage() error = %v, want build log hint", err) - } - - buildLogs, globErr := filepath.Glob(filepath.Join(stateDir, "image-build", "*.log")) - if globErr != nil { - t.Fatalf("glob build logs: %v", globErr) - } - if len(buildLogs) != 1 { - t.Fatalf("build log count = %d, want 1", len(buildLogs)) - } - logData, readErr := os.ReadFile(buildLogs[0]) - if readErr != nil { - t.Fatalf("read build log: %v", readErr) - } - if !strings.Contains(string(logData), "builder-stdout") { - t.Fatalf("build log = %q, want builder output", string(logData)) - } - runner.assertExhausted() - - entries := parseLogEntries(t, buf.Bytes()) - if !hasLogEntry(entries, map[string]string{"msg": "operation stage", "operation": 
"image.build", "stage": "launch_builder"}) { - t.Fatalf("expected launch_builder log, got %v", entries) - } - if !strings.Contains(buf.String(), buildLogs[0]) { - t.Fatalf("daemon logs = %q, want build log path %s", buf.String(), buildLogs[0]) - } -} - func parseLogEntries(t *testing.T, data []byte) []map[string]any { t.Helper() lines := bytes.Split(bytes.TrimSpace(data), []byte("\n")) diff --git a/internal/daemon/nat.go b/internal/daemon/nat.go index e38f6a3..2b3a7f0 100644 --- a/internal/daemon/nat.go +++ b/internal/daemon/nat.go @@ -10,30 +10,55 @@ import ( type natRule = hostnat.Rule -func (d *Daemon) ensureNAT(ctx context.Context, vm model.VMRecord, enable bool) error { - return hostnat.Ensure(ctx, d.runner, vm.Runtime.GuestIP, vm.Runtime.TapDevice, enable) +// ensureNAT takes tap explicitly rather than reading from a handle +// cache so HostNetwork stays decoupled from VM-service state. +// Callers (vm_lifecycle) resolve the tap device from the handle cache +// themselves and pass it in. 
+func (n *HostNetwork) ensureNAT(ctx context.Context, guestIP, tap string, enable bool) error { + return n.privOps().EnsureNAT(ctx, guestIP, tap, enable) } -func (d *Daemon) validateNATPrereqs(ctx context.Context) (string, error) { +func (n *HostNetwork) validateNATPrereqs(ctx context.Context) (string, error) { checks := system.NewPreflight() checks.RequireCommand("ip", toolHint("ip")) - d.addNATPrereqs(ctx, checks) + n.addNATPrereqs(ctx, checks) if err := checks.Err("nat preflight failed"); err != nil { return "", err } - return d.defaultUplink(ctx) + return n.defaultUplink(ctx) } -func (d *Daemon) defaultUplink(ctx context.Context) (string, error) { - return hostnat.DefaultUplink(ctx, d.runner) +func (n *HostNetwork) addNATPrereqs(ctx context.Context, checks *system.Preflight) { + checks.RequireCommand("iptables", toolHint("iptables")) + checks.RequireCommand("sysctl", toolHint("sysctl")) + runner := n.runner + if runner == nil { + runner = system.NewRunner() + } + out, err := runner.Run(ctx, "ip", "route", "show", "default") + if err != nil { + checks.Addf("failed to inspect the default route for NAT: %v", err) + return + } + if _, err := parseDefaultUplink(string(out)); err != nil { + checks.Addf("failed to detect the uplink interface for NAT: %v", err) + } +} + +func (n *HostNetwork) defaultUplink(ctx context.Context) (string, error) { + return hostnat.DefaultUplink(ctx, n.runner) } func parseDefaultUplink(output string) (string, error) { return hostnat.ParseDefaultUplink(output) } -func natRulesForVM(vm model.VMRecord, uplink string) ([]natRule, error) { - return hostnat.Rules(vm.Runtime.GuestIP, vm.Runtime.TapDevice, uplink) +// natRulesForVM builds the iptables rule set for vm + tap + uplink. +// tap is passed explicitly (rather than read from a handle cache) +// because natRulesForVM has no Daemon receiver — it's usable from +// test helpers that build rule expectations without a daemon. 
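The uplink detection that `addNATPrereqs` and `defaultUplink` rely on boils down to scanning `ip route show default` output for the token after `dev`. A minimal sketch under that assumption — the real logic is `hostnat.ParseDefaultUplink`, which is not shown in this diff:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parseUplink extracts the interface name from `ip route show
// default` output such as "default via 10.0.0.1 dev eth0 proto
// static". Sketch only; error wording won't match banger's.
func parseUplink(output string) (string, error) {
	for _, line := range strings.Split(output, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 || fields[0] != "default" {
			continue
		}
		// The interface name is the token following "dev".
		for i := 0; i < len(fields)-1; i++ {
			if fields[i] == "dev" {
				return fields[i+1], nil
			}
		}
	}
	return "", errors.New("no default route with a dev field")
}

func main() {
	uplink, err := parseUplink("default via 192.0.2.1 dev eth0\n")
	fmt.Println(uplink, err) // eth0 <nil>
}
```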
+func natRulesForVM(vm model.VMRecord, tap, uplink string) ([]natRule, error) { + return hostnat.Rules(vm.Runtime.GuestIP, tap, uplink) } func natRuleArgs(action string, rule natRule) []string { diff --git a/internal/daemon/nat_capability_test.go b/internal/daemon/nat_capability_test.go new file mode 100644 index 0000000..f25ea1a --- /dev/null +++ b/internal/daemon/nat_capability_test.go @@ -0,0 +1,199 @@ +package daemon + +import ( + "context" + "path/filepath" + "sync/atomic" + "testing" + "time" + + "banger/internal/model" +) + +// waitForVMAlive polls until VMService.vmAlive reports true for vm or +// t fails out. Bounded so a broken fake can't hang the suite. +func waitForVMAlive(t *testing.T, svc *VMService, vm model.VMRecord) { + t.Helper() + deadline := time.Now().Add(2 * time.Second) + for { + if svc.vmAlive(vm) { + return + } + if time.Now().After(deadline) { + t.Fatal("fake firecracker never became alive per VMService.vmAlive") + } + time.Sleep(5 * time.Millisecond) + } +} + +// countingRunner records Run/RunSudo invocations without caring about +// the specific commands. Good enough for tests that want to assert +// "did the nat capability reach the host at all?" — hostnat.Ensure's +// exact iptables/sysctl sequence is covered in the hostnat package +// tests, so we don't re-enumerate it here. +type countingRunner struct { + runs atomic.Int32 + runSudos atomic.Int32 + out []byte + err error +} + +func (r *countingRunner) Run(_ context.Context, _ string, _ ...string) ([]byte, error) { + r.runs.Add(1) + return r.out, r.err +} + +func (r *countingRunner) RunSudo(_ context.Context, _ ...string) ([]byte, error) { + r.runSudos.Add(1) + return r.out, r.err +} + +func (r *countingRunner) total() int32 { return r.runs.Load() + r.runSudos.Load() } + +// natCapabilityFixture wires just enough daemon state for natCapability +// tests: a HostNetwork + VMService with a countingRunner, a VM record +// whose handles carry a tap device, and the capability itself. 
+type natCapabilityFixture struct { + cap natCapability + runner *countingRunner + d *Daemon + vm model.VMRecord +} + +func newNATCapabilityFixture(t *testing.T, natEnabled bool) natCapabilityFixture { + t.Helper() + runner := &countingRunner{out: []byte("default via 10.0.0.1 dev eth0 proto static\n")} + d := &Daemon{ + runner: runner, + config: model.DaemonConfig{BridgeName: model.DefaultBridgeName}, + } + wireServices(d) + d.net.runner = runner + + // A real firecracker-looking subprocess so VMService.vmAlive — which + // reads /proc//cmdline and checks for "firecracker" + the api + // socket path — returns true. Without this the ApplyConfigChange + // "alive vs not alive" branches can't be exercised. + apiSock := filepath.Join(t.TempDir(), "fc.sock") + fc := startFakeFirecracker(t, apiSock) + + vm := testVM("natbox", "image-nat", "172.16.0.42") + vm.Spec.NATEnabled = natEnabled + vm.State = model.VMStateRunning + vm.Runtime.State = model.VMStateRunning + vm.Runtime.APISockPath = apiSock + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{ + PID: fc.Process.Pid, + TapDevice: "tap-nat-42", + }) + + // startFakeFirecracker uses `exec -a firecracker ...` which renames + // the process after Start returns — on a loaded CI box vmAlive can + // observe the pre-exec cmdline ("bash") for a few ms and false- + // negative. Poll until /proc shows the firecracker name so the + // fixture hands back a VM that's definitely "alive" by banger's + // rules. 
+ waitForVMAlive(t, d.vm, vm) + + return natCapabilityFixture{ + cap: newNATCapability(d.vm, d.net, d.logger), + runner: runner, + d: d, + vm: vm, + } +} + +func TestNATCapabilityApplyConfigChange_NoOpWhenFlagUnchanged(t *testing.T) { + f := newNATCapabilityFixture(t, true) + if err := f.cap.ApplyConfigChange(context.Background(), f.vm, f.vm); err != nil { + t.Fatalf("ApplyConfigChange: %v", err) + } + if n := f.runner.total(); n != 0 { + t.Fatalf("runner calls = %d, want 0 when NATEnabled didn't change", n) + } +} + +func TestNATCapabilityApplyConfigChange_NoOpWhenVMNotAlive(t *testing.T) { + f := newNATCapabilityFixture(t, false) + // Clear handles → vmAlive returns false → ApplyConfigChange must + // skip rather than attempt a tap-less ensureNAT. + f.d.vm.clearVMHandles(f.vm) + + after := f.vm + after.Spec.NATEnabled = true + if err := f.cap.ApplyConfigChange(context.Background(), f.vm, after); err != nil { + t.Fatalf("ApplyConfigChange: %v", err) + } + if n := f.runner.total(); n != 0 { + t.Fatalf("runner calls = %d, want 0 when VM is not alive", n) + } +} + +func TestNATCapabilityApplyConfigChange_TogglesEnsureNATWhenAlive(t *testing.T) { + f := newNATCapabilityFixture(t, false) + after := f.vm + after.Spec.NATEnabled = true + if err := f.cap.ApplyConfigChange(context.Background(), f.vm, after); err != nil { + t.Fatalf("ApplyConfigChange: %v", err) + } + if n := f.runner.total(); n == 0 { + t.Fatal("runner calls = 0, want ensureNAT to reach the host when toggling NAT on a running VM") + } +} + +func TestNATCapabilityCleanup_NoOpWhenNATDisabled(t *testing.T) { + f := newNATCapabilityFixture(t, false) + if err := f.cap.Cleanup(context.Background(), f.vm); err != nil { + t.Fatalf("Cleanup: %v", err) + } + if n := f.runner.total(); n != 0 { + t.Fatalf("runner calls = %d, want 0 when NAT was never enabled", n) + } +} + +func TestNATCapabilityCleanup_NoOpWhenRuntimeHandlesMissing(t *testing.T) { + f := newNATCapabilityFixture(t, true) + // Runtime tap device becomes 
empty — simulates a VM that failed + // before host wiring completed, so Cleanup has nothing to revert. + f.d.vm.clearVMHandles(f.vm) + + if err := f.cap.Cleanup(context.Background(), f.vm); err != nil { + t.Fatalf("Cleanup: %v", err) + } + if n := f.runner.total(); n != 0 { + t.Fatalf("runner calls = %d, want 0 when tap/guestIP are empty", n) + } +} + +func TestNATCapabilityCleanup_ReversesNATWhenRuntimePresent(t *testing.T) { + f := newNATCapabilityFixture(t, true) + if err := f.cap.Cleanup(context.Background(), f.vm); err != nil { + t.Fatalf("Cleanup: %v", err) + } + if n := f.runner.total(); n == 0 { + t.Fatal("runner calls = 0, want ensureNAT(false) to execute when runtime wiring exists") + } +} + +// TestNATCapabilityCleanup_FallsBackToRuntimeTapDevice simulates the +// post-crash / corrupt-handles.json scenario: the in-memory handle +// cache is empty, but the DB-backed VM.Runtime still carries the +// tap name (startVMLocked persists it alongside the handle cache). +// Cleanup must use that fallback so the iptables FORWARD rules +// keyed on the tap are actually removed — if Cleanup short-circuits +// the way it did before this fix, those rules leak forever. +func TestNATCapabilityCleanup_FallsBackToRuntimeTapDevice(t *testing.T) { + f := newNATCapabilityFixture(t, true) + // Wipe the handle cache, as if the daemon had just restarted + // against a corrupt (or missing) handles.json. + f.d.vm.clearVMHandles(f.vm) + // But the VM row in the DB still has the tap recorded. 
+ f.vm.Runtime.TapDevice = "tap-nat-42" + + if err := f.cap.Cleanup(context.Background(), f.vm); err != nil { + t.Fatalf("Cleanup: %v", err) + } + if n := f.runner.total(); n == 0 { + t.Fatal("runner calls = 0, want ensureNAT(false) to execute via the Runtime.TapDevice fallback; NAT rules would leak across daemon restarts") + } +} diff --git a/internal/daemon/nat_test.go b/internal/daemon/nat_test.go index d5a01d0..e844e05 100644 --- a/internal/daemon/nat_test.go +++ b/internal/daemon/nat_test.go @@ -33,11 +33,10 @@ func TestNATRulesForVM(t *testing.T) { vm := model.VMRecord{ Runtime: model.VMRuntime{ - GuestIP: "172.16.0.8", - TapDevice: "tap-fc-abcd1234", + GuestIP: "172.16.0.8", }, } - rules, err := natRulesForVM(vm, "wlan0") + rules, err := natRulesForVM(vm, "tap-fc-abcd1234", "wlan0") if err != nil { t.Fatalf("natRulesForVM returned error: %v", err) } @@ -61,30 +60,25 @@ func TestNATRulesForVMRequiresRuntimeData(t *testing.T) { tests := []struct { name string vm model.VMRecord + tap string uplink string }{ { - name: "guest ip", - vm: model.VMRecord{ - Runtime: model.VMRuntime{TapDevice: "tap-fc-abcd1234"}, - }, + name: "guest ip", + vm: model.VMRecord{}, + tap: "tap-fc-abcd1234", uplink: "eth0", }, { - name: "tap", - vm: model.VMRecord{ - Runtime: model.VMRuntime{GuestIP: "172.16.0.8"}, - }, + name: "tap", + vm: model.VMRecord{Runtime: model.VMRuntime{GuestIP: "172.16.0.8"}}, + tap: "", uplink: "eth0", }, { - name: "uplink", - vm: model.VMRecord{ - Runtime: model.VMRuntime{ - GuestIP: "172.16.0.8", - TapDevice: "tap-fc-abcd1234", - }, - }, + name: "uplink", + vm: model.VMRecord{Runtime: model.VMRuntime{GuestIP: "172.16.0.8"}}, + tap: "tap-fc-abcd1234", uplink: "", }, } @@ -93,7 +87,7 @@ func TestNATRulesForVMRequiresRuntimeData(t *testing.T) { tt := tt t.Run(tt.name, func(t *testing.T) { t.Parallel() - if _, err := natRulesForVM(tt.vm, tt.uplink); err == nil { + if _, err := natRulesForVM(tt.vm, tt.tap, tt.uplink); err == nil { t.Fatalf("expected natRulesForVM 
to fail for missing %s", tt.name) } }) diff --git a/internal/daemon/open_close_test.go b/internal/daemon/open_close_test.go new file mode 100644 index 0000000..feaee22 --- /dev/null +++ b/internal/daemon/open_close_test.go @@ -0,0 +1,146 @@ +package daemon + +import ( + "errors" + "io" + "log/slog" + "sync/atomic" + "testing" + + "banger/internal/model" + "banger/internal/vmdns" +) + +// TestCloseOnPartiallyInitialisedDaemon pins the contract that Open's +// error-path defer relies on: Close must be safe to call when a +// startup step failed before every subsystem was set up. If this +// breaks, `defer d.Close() on err != nil` in Open() starts panicking +// on zero-valued fields. +func TestCloseOnPartiallyInitialisedDaemon(t *testing.T) { + cases := []struct { + name string + build func(t *testing.T) *Daemon + verify func(t *testing.T, d *Daemon) + }{ + { + name: "only store + closing channel (early failure)", + build: func(t *testing.T) *Daemon { + return &Daemon{ + store: openDaemonStore(t), + closing: make(chan struct{}), + logger: slog.New(slog.NewTextHandler(io.Discard, nil)), + } + }, + verify: func(t *testing.T, d *Daemon) { + // closing channel should have been closed. 
+ select { + case <-d.closing: + default: + t.Error("closing channel not closed by Close") + } + }, + }, + { + name: "with vmDNS listener (fail after startVMDNS)", + build: func(t *testing.T) *Daemon { + server, err := vmdns.New("127.0.0.1:0", nil) + if err != nil { + skipIfSocketRestricted(t, err) + t.Fatalf("vmdns.New: %v", err) + } + return &Daemon{ + store: openDaemonStore(t), + closing: make(chan struct{}), + net: &HostNetwork{vmDNS: server}, + logger: slog.New(slog.NewTextHandler(io.Discard, nil)), + } + }, + verify: func(t *testing.T, d *Daemon) { + if d.net.vmDNS != nil { + t.Error("vmDNS not cleared by Close") + } + }, + }, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + d := tc.build(t) + if err := d.Close(); err != nil { + t.Fatalf("Close returned error: %v", err) + } + tc.verify(t, d) + + // Second Close must be a no-op (sync.Once) — must not + // panic on channel or re-close. + if err := d.Close(); err != nil { + t.Fatalf("second Close error: %v", err) + } + }) + } +} + +// TestCloseIdempotentUnderConcurrency catches regressions of the +// sync.Once guard that makes repeated Close calls safe. The open- +// failure defer relies on this: if the user cancels before Open +// returns and also calls Close afterwards, both paths must survive. +func TestCloseIdempotentUnderConcurrency(t *testing.T) { + d := &Daemon{ + store: openDaemonStore(t), + closing: make(chan struct{}), + logger: slog.New(slog.NewTextHandler(io.Discard, nil)), + config: model.DaemonConfig{BridgeName: ""}, + } + wireServices(d) + + var count atomic.Int32 + done := make(chan struct{}) + for i := 0; i < 5; i++ { + go func() { + if err := d.Close(); err != nil { + t.Errorf("Close error: %v", err) + } + // Use Add's return value directly: a separate Load + // would let two goroutines both observe 5 and + // double-close done, which panics. + if count.Add(1) == 5 { + close(done) + } + }() + } + <-done + + // Channel must be closed exactly once (sync.Once covers the + // inner close(d.closing)). 
Reading from a closed channel is + // non-blocking; panicking here would mean the channel wasn't + // closed or was double-closed (close panics are uncatchable). + select { + case <-d.closing: + default: + t.Fatal("closing channel not closed after concurrent Close calls") + } +} + +// TestOpenFailureRunsCloseCleanup is a structural check: confirms +// the deferred rollback in Open actually fires. Can't easily run +// Open() end-to-end (hits paths.Resolve + sudo), but we can simulate +// the pattern by threading a named-return err through the same +// defer and asserting Close runs. +func TestOpenFailureRunsCloseCleanup(t *testing.T) { + closed := false + fakeClose := func() { closed = true } + + runOpen := func() (err error) { + defer func() { + if err != nil { + fakeClose() + } + }() + err = errors.New("simulated late-stage startup failure") + return err + } + + if err := runOpen(); err == nil { + t.Fatal("expected simulated error") + } + if !closed { + t.Fatal("deferred cleanup did not fire on err != nil") + } +} diff --git a/internal/daemon/opencode.go b/internal/daemon/opencode.go deleted file mode 100644 index 791a5e4..0000000 --- a/internal/daemon/opencode.go +++ /dev/null @@ -1,18 +0,0 @@ -package daemon - -import ( - "context" - - "banger/internal/model" - "banger/internal/opencode" -) - -type opencodeCapability struct{} - -func (opencodeCapability) Name() string { return "opencode" } - -func (opencodeCapability) PostStart(ctx context.Context, d *Daemon, vm model.VMRecord, _ model.Image) error { - return opencode.WaitReady(ctx, d.logger, vm.Runtime.VSockPath, func(stage, detail string) { - vmCreateStage(ctx, stage, detail) - }) -} diff --git a/internal/daemon/operations.go b/internal/daemon/operations.go new file mode 100644 index 0000000..00046d1 --- /dev/null +++ b/internal/daemon/operations.go @@ -0,0 +1,37 @@ +package daemon + +import ( + "context" + + "banger/internal/api" +) + +// ListOperations returns a snapshot of every async operation tracked +// 
across the daemon's per-kind registries. Today the only kind is +// vm.create; future async kinds (image build, kernel pull) will plug +// in here. +// +// The primary consumer is `banger update`'s preflight, which refuses +// to swap binaries while anything is in flight. Done operations are +// included in the snapshot so an operator running an interactive +// `banger ... | jq` can see recently-completed work; the update +// preflight filters by Done itself. +func (d *Daemon) ListOperations(_ context.Context) (api.OperationsListResult, error) { + out := api.OperationsListResult{Operations: []api.OperationSummary{}} + if d.vm == nil { + return out, nil + } + for _, op := range d.vm.createOps.List() { + snap := op.snapshot() + out.Operations = append(out.Operations, api.OperationSummary{ + ID: snap.ID, + Kind: "vm.create", + Stage: snap.Stage, + Detail: snap.Detail, + Done: snap.Done, + StartedAt: snap.StartedAt, + UpdatedAt: snap.UpdatedAt, + }) + } + return out, nil +} diff --git a/internal/daemon/opstate/registry.go b/internal/daemon/opstate/registry.go new file mode 100644 index 0000000..f82ac40 --- /dev/null +++ b/internal/daemon/opstate/registry.go @@ -0,0 +1,75 @@ +// Package opstate provides a mutex-guarded registry for long-running +// operations (e.g. async VM create, async image build). A registry stores +// operations by ID and can prune completed ones after a retention window. +package opstate + +import ( + "sync" + "time" +) + +// AsyncOp is the protocol each operation type must satisfy. Implementations +// own their own concurrency for the returned values — the registry treats +// them as opaque. +type AsyncOp interface { + ID() string + IsDone() bool + UpdatedAt() time.Time + Cancel() +} + +// Registry is a mutex-guarded map of in-flight operations keyed by op ID. +// One registry per operation kind; each owns its own lock. +type Registry[T AsyncOp] struct { + mu sync.Mutex + byID map[string]T +} + +// Insert adds op keyed by its ID. 
+func (r *Registry[T]) Insert(op T) { + r.mu.Lock() + defer r.mu.Unlock() + if r.byID == nil { + r.byID = map[string]T{} + } + r.byID[op.ID()] = op +} + +// Get returns the operation with the given ID, if present. +func (r *Registry[T]) Get(id string) (T, bool) { + r.mu.Lock() + defer r.mu.Unlock() + op, ok := r.byID[id] + return op, ok +} + +// List returns a snapshot of every operation currently in the +// registry — both pending and (un-pruned) completed. Callers filter +// by IsDone() if they care about state. The slice is freshly +// allocated; mutating it doesn't affect the registry. +// +// Used by `banger update`'s preflight to detect in-flight operations +// before swapping binaries. +func (r *Registry[T]) List() []T { + r.mu.Lock() + defer r.mu.Unlock() + out := make([]T, 0, len(r.byID)) + for _, op := range r.byID { + out = append(out, op) + } + return out +} + +// Prune drops completed operations last updated before the cutoff. +func (r *Registry[T]) Prune(before time.Time) { + r.mu.Lock() + defer r.mu.Unlock() + for id, op := range r.byID { + if !op.IsDone() { + continue + } + if op.UpdatedAt().Before(before) { + delete(r.byID, id) + } + } +} diff --git a/internal/daemon/opstate/registry_test.go b/internal/daemon/opstate/registry_test.go new file mode 100644 index 0000000..d0965c3 --- /dev/null +++ b/internal/daemon/opstate/registry_test.go @@ -0,0 +1,114 @@ +package opstate + +import ( + "sync/atomic" + "testing" + "time" +) + +type fakeOp struct { + id string + done atomic.Bool + updatedAt time.Time + canceled atomic.Bool +} + +func (f *fakeOp) ID() string { return f.id } +func (f *fakeOp) IsDone() bool { return f.done.Load() } +func (f *fakeOp) UpdatedAt() time.Time { return f.updatedAt } +func (f *fakeOp) Cancel() { f.canceled.Store(true) } + +func TestRegistryInsertAndGet(t *testing.T) { + var r Registry[*fakeOp] + op := &fakeOp{id: "op-1", updatedAt: time.Now()} + r.Insert(op) + got, ok := r.Get("op-1") + if !ok { + t.Fatal("Get after Insert 
missed") + } + if got.ID() != "op-1" { + t.Fatalf("Get().ID = %q", got.ID()) + } + + _, ok = r.Get("missing") + if ok { + t.Fatal("Get on missing key should miss") + } +} + +func TestRegistryPruneDropsCompletedOldOps(t *testing.T) { + var r Registry[*fakeOp] + now := time.Now() + + recent := &fakeOp{id: "recent", updatedAt: now} + recent.done.Store(true) + + stale := &fakeOp{id: "stale", updatedAt: now.Add(-time.Hour)} + stale.done.Store(true) + + pending := &fakeOp{id: "pending", updatedAt: now.Add(-time.Hour)} + // NOT done → stays even though old. + + r.Insert(recent) + r.Insert(stale) + r.Insert(pending) + + cutoff := now.Add(-time.Minute) + r.Prune(cutoff) + + if _, ok := r.Get("stale"); ok { + t.Error("stale op should have been pruned") + } + if _, ok := r.Get("recent"); !ok { + t.Error("recent op should survive (newer than cutoff)") + } + if _, ok := r.Get("pending"); !ok { + t.Error("pending op should survive (not done)") + } +} + +func TestRegistryListReturnsSnapshot(t *testing.T) { + var r Registry[*fakeOp] + now := time.Now() + + a := &fakeOp{id: "a", updatedAt: now} + b := &fakeOp{id: "b", updatedAt: now} + c := &fakeOp{id: "c", updatedAt: now} + c.done.Store(true) + r.Insert(a) + r.Insert(b) + r.Insert(c) + + got := r.List() + if len(got) != 3 { + t.Fatalf("List() returned %d entries, want 3", len(got)) + } + ids := map[string]bool{} + for _, op := range got { + ids[op.ID()] = true + } + for _, want := range []string{"a", "b", "c"} { + if !ids[want] { + t.Errorf("List() missing %q; got %v", want, ids) + } + } + + // Mutating the returned slice must not poison the registry. 
+ got[0] = &fakeOp{id: "tampered"} + if _, ok := r.Get("tampered"); ok { + t.Error("List() returned the registry's internal map, not a copy") + } +} + +func TestRegistryListEmpty(t *testing.T) { + var r Registry[*fakeOp] + if got := r.List(); len(got) != 0 { + t.Fatalf("List() on empty registry returned %d entries, want 0", len(got)) + } +} + +func TestRegistryPruneNoOpOnEmpty(t *testing.T) { + var r Registry[*fakeOp] + // Just shouldn't panic. + r.Prune(time.Now()) +} diff --git a/internal/daemon/ports.go b/internal/daemon/ports.go deleted file mode 100644 index 0c472f0..0000000 --- a/internal/daemon/ports.go +++ /dev/null @@ -1,165 +0,0 @@ -package daemon - -import ( - "context" - "crypto/tls" - "errors" - "fmt" - "io" - "net" - "net/http" - "sort" - "strconv" - "strings" - "time" - - "banger/internal/api" - "banger/internal/model" - "banger/internal/system" - "banger/internal/vmdns" - "banger/internal/vsockagent" -) - -const httpProbeTimeout = 750 * time.Millisecond - -func (d *Daemon) PortsVM(ctx context.Context, idOrName string) (result api.VMPortsResult, err error) { - _, err = d.withVMLockByRef(ctx, idOrName, func(vm model.VMRecord) (model.VMRecord, error) { - result.Name = vm.Name - result.DNSName = strings.TrimSpace(vm.Runtime.DNSName) - if result.DNSName == "" && strings.TrimSpace(vm.Name) != "" { - result.DNSName = vmdns.RecordName(vm.Name) - } - if vm.State != model.VMStateRunning || !system.ProcessRunning(vm.Runtime.PID, vm.Runtime.APISockPath) { - return model.VMRecord{}, fmt.Errorf("vm %s is not running", vm.Name) - } - if strings.TrimSpace(vm.Runtime.GuestIP) == "" { - return model.VMRecord{}, errors.New("vm has no guest IP") - } - if strings.TrimSpace(vm.Runtime.VSockPath) == "" { - return model.VMRecord{}, errors.New("vm has no vsock path") - } - if vm.Runtime.VSockCID == 0 { - return model.VMRecord{}, errors.New("vm has no vsock cid") - } - if err := d.ensureSocketAccess(ctx, vm.Runtime.VSockPath, "firecracker vsock socket"); err != nil { - 
return model.VMRecord{}, err - } - portsCtx, cancel := context.WithTimeout(ctx, 3*time.Second) - defer cancel() - listeners, err := vsockagent.Ports(portsCtx, d.logger, vm.Runtime.VSockPath) - if err != nil { - return model.VMRecord{}, err - } - result.Ports = buildVMPorts(vm, listeners) - return vm, nil - }) - return result, err -} - -func buildVMPorts(vm model.VMRecord, listeners []vsockagent.PortListener) []api.VMPort { - endpointHost := strings.TrimSpace(vm.Runtime.DNSName) - if endpointHost == "" { - endpointHost = strings.TrimSpace(vm.Runtime.GuestIP) - } - probeHost := strings.TrimSpace(vm.Runtime.GuestIP) - ports := make([]api.VMPort, 0, len(listeners)) - for _, listener := range listeners { - if listener.Port <= 0 { - continue - } - port := api.VMPort{ - Proto: strings.ToLower(strings.TrimSpace(listener.Proto)), - BindAddress: strings.TrimSpace(listener.BindAddress), - Port: listener.Port, - PID: listener.PID, - Process: strings.TrimSpace(listener.Process), - Command: strings.TrimSpace(listener.Command), - Endpoint: net.JoinHostPort(endpointHost, strconv.Itoa(listener.Port)), - } - if port.Command == "" { - port.Command = port.Process - } - if port.Proto == "tcp" && probeHost != "" && endpointHost != "" { - if scheme, ok := probeWebListener(probeHost, listener.Port); ok { - port.Proto = scheme - port.Endpoint = scheme + "://" + net.JoinHostPort(endpointHost, strconv.Itoa(listener.Port)) + "/" - } - } - ports = append(ports, port) - } - sort.Slice(ports, func(i, j int) bool { - if ports[i].Proto != ports[j].Proto { - return ports[i].Proto < ports[j].Proto - } - if ports[i].Port != ports[j].Port { - return ports[i].Port < ports[j].Port - } - if ports[i].PID != ports[j].PID { - return ports[i].PID < ports[j].PID - } - if ports[i].Process != ports[j].Process { - return ports[i].Process < ports[j].Process - } - if ports[i].Command != ports[j].Command { - return ports[i].Command < ports[j].Command - } - return ports[i].BindAddress < ports[j].BindAddress - }) - 
return dedupeVMPorts(ports) -} - -func probeWebListener(guestIP string, port int) (string, bool) { - if probeHTTPScheme("https", guestIP, port) { - return "https", true - } - if probeHTTPScheme("http", guestIP, port) { - return "http", true - } - return "", false -} - -func probeHTTPScheme(scheme, guestIP string, port int) bool { - if strings.TrimSpace(guestIP) == "" || port <= 0 { - return false - } - url := scheme + "://" + net.JoinHostPort(strings.TrimSpace(guestIP), strconv.Itoa(port)) + "/" - req, err := http.NewRequest(http.MethodGet, url, nil) - if err != nil { - return false - } - transport := &http.Transport{Proxy: nil} - if scheme == "https" { - transport.TLSClientConfig = &tls.Config{InsecureSkipVerify: true} - } - client := &http.Client{ - Timeout: httpProbeTimeout, - CheckRedirect: func(req *http.Request, via []*http.Request) error { - return http.ErrUseLastResponse - }, - Transport: transport, - } - resp, err := client.Do(req) - if err != nil { - return false - } - defer resp.Body.Close() - _, _ = io.Copy(io.Discard, io.LimitReader(resp.Body, 1)) - return resp.ProtoMajor >= 1 -} - -func dedupeVMPorts(ports []api.VMPort) []api.VMPort { - if len(ports) < 2 { - return ports - } - deduped := make([]api.VMPort, 0, len(ports)) - seen := make(map[string]struct{}, len(ports)) - for _, port := range ports { - key := port.Proto + "\x00" + port.Endpoint - if _, ok := seen[key]; ok { - continue - } - seen[key] = struct{}{} - deduped = append(deduped, port) - } - return deduped -} diff --git a/internal/daemon/preflight.go b/internal/daemon/preflight.go index 0d3c251..b058815 100644 --- a/internal/daemon/preflight.go +++ b/internal/daemon/preflight.go @@ -8,22 +8,20 @@ import ( "banger/internal/system" ) -var vsockHostDevicePath = "/dev/vhost-vsock" +// defaultVsockHostDevice is the vhost-vsock device file every +// Firecracker guest relies on to talk to the host via vsock. 
Tests +// point at a tempfile by setting VMService.vsockHostDevice; production +// wiring defaults the field to this path in wireServices. +const defaultVsockHostDevice = "/dev/vhost-vsock" -func (d *Daemon) validateStartPrereqs(ctx context.Context, vm model.VMRecord, image model.Image) error { +func (s *VMService) validateStartPrereqs(ctx context.Context, vm model.VMRecord, image model.Image) error { checks := system.NewPreflight() - d.addBaseStartPrereqs(checks, image) - d.addCapabilityStartPrereqs(ctx, checks, vm, image) + s.addBaseStartPrereqs(checks, image) + s.capHooks.addStartPrereqs(ctx, checks, vm, image) return checks.Err("vm start preflight failed") } -func (d *Daemon) validateImageBuildPrereqs(ctx context.Context, baseRootfs, kernelPath, initrdPath, modulesDir, sizeSpec string) error { - checks := system.NewPreflight() - d.addImageBuildPrereqs(ctx, checks, baseRootfs, kernelPath, initrdPath, modulesDir, sizeSpec) - return checks.Err("image build preflight failed") -} - -func (d *Daemon) validateWorkDiskResizePrereqs() error { +func (s *VMService) validateWorkDiskResizePrereqs() error { checks := system.NewPreflight() checks.RequireCommand("truncate", toolHint("truncate")) checks.RequireCommand("e2fsck", `install e2fsprogs`) @@ -31,32 +29,15 @@ func (d *Daemon) validateWorkDiskResizePrereqs() error { return checks.Err("work disk resize preflight failed") } -func (d *Daemon) addNATPrereqs(ctx context.Context, checks *system.Preflight) { - checks.RequireCommand("iptables", toolHint("iptables")) - checks.RequireCommand("sysctl", toolHint("sysctl")) - runner := d.runner - if runner == nil { - runner = system.NewRunner() - } - out, err := runner.Run(ctx, "ip", "route", "show", "default") - if err != nil { - checks.Addf("failed to inspect the default route for NAT: %v", err) - return - } - if _, err := parseDefaultUplink(string(out)); err != nil { - checks.Addf("failed to detect the uplink interface for NAT: %v", err) - } -} - -func (d *Daemon) 
addBaseStartPrereqs(checks *system.Preflight, image model.Image) { - d.addBaseStartCommandPrereqs(checks) - checks.RequireExecutable(d.config.FirecrackerBin, "firecracker binary", `install firecracker or set "firecracker_bin"`) - if helper, err := d.vsockAgentBinary(); err == nil { +func (s *VMService) addBaseStartPrereqs(checks *system.Preflight, image model.Image) { + s.addBaseStartCommandPrereqs(checks) + checks.RequireExecutable(s.config.FirecrackerBin, "firecracker binary", `install firecracker or set "firecracker_bin"`) + if helper, err := vsockAgentBinary(s.layout); err == nil { checks.RequireExecutable(helper, "vsock agent helper", `run 'make build' or reinstall banger`) } else { checks.Addf("%v", err) } - checks.RequireFile(vsockHostDevicePath, "vsock host device", "load the vhost_vsock kernel module on the host") + checks.RequireFile(s.vsockHostDevice, "vsock host device", "load the vhost_vsock kernel module on the host") checks.RequireFile(image.RootfsPath, "rootfs image", "select a valid registered image") checks.RequireFile(image.KernelPath, "kernel image", `re-register or rebuild the image with a valid kernel`) if strings.TrimSpace(image.InitrdPath) != "" { @@ -64,41 +45,12 @@ func (d *Daemon) addBaseStartPrereqs(checks *system.Preflight, image model.Image } } -func (d *Daemon) addBaseStartCommandPrereqs(checks *system.Preflight) { - for _, command := range []string{"sudo", "ip", "dmsetup", "losetup", "blockdev", "truncate", "pgrep", "chown", "chmod", "kill", "e2cp", "e2rm", "debugfs"} { +func (s *VMService) addBaseStartCommandPrereqs(checks *system.Preflight) { + for _, command := range []string{"ip", "dmsetup", "losetup", "blockdev", "truncate", "pgrep", "chown", "chmod", "kill", "e2cp", "e2rm", "debugfs"} { checks.RequireCommand(command, toolHint(command)) } } -func (d *Daemon) addImageBuildPrereqs(ctx context.Context, checks *system.Preflight, baseRootfs, kernelPath, initrdPath, modulesDir, sizeSpec string) { - for _, command := range 
[]string{"sudo", "ip", "pgrep", "chown", "chmod", "kill"} { - checks.RequireCommand(command, toolHint(command)) - } - for _, command := range []string{"mkfs.ext4", "mount", "umount", "cp"} { - checks.RequireCommand(command, toolHint(command)) - } - checks.RequireExecutable(d.config.FirecrackerBin, "firecracker binary", `install firecracker or set "firecracker_bin"`) - checks.RequireFile(d.config.SSHKeyPath, "ssh private key", `set "ssh_key_path" or let banger create its default key`) - if helper, err := d.vsockAgentBinary(); err == nil { - checks.RequireExecutable(helper, "vsock agent helper", `run 'make build' or reinstall banger`) - } else { - checks.Addf("%v", err) - } - checks.RequireFile(baseRootfs, "base image rootfs", `pass --from-image with a valid registered image`) - checks.RequireFile(kernelPath, "kernel image", `pass --kernel or build from an image with a valid kernel`) - if strings.TrimSpace(initrdPath) != "" { - checks.RequireFile(initrdPath, "initrd image", `pass --initrd or build from an image with a valid initrd`) - } - if strings.TrimSpace(modulesDir) != "" { - checks.RequireDir(modulesDir, "modules directory", `pass --modules or build from an image with a valid modules dir`) - } - if strings.TrimSpace(sizeSpec) != "" { - checks.RequireCommand("e2fsck", toolHint("e2fsck")) - checks.RequireCommand("resize2fs", toolHint("resize2fs")) - } - d.addNATPrereqs(ctx, checks) -} - func toolHint(command string) string { switch command { case "ip": @@ -117,8 +69,6 @@ func toolHint(command string) string { return "install e2fsprogs" case "e2cp", "e2rm": return "install e2tools" - case "sudo": - return "install sudo" default: return "" } diff --git a/internal/daemon/privileged_ops.go b/internal/daemon/privileged_ops.go new file mode 100644 index 0000000..6d498c6 --- /dev/null +++ b/internal/daemon/privileged_ops.go @@ -0,0 +1,527 @@ +package daemon + +import ( + "context" + "errors" + "fmt" + "log/slog" + "os" + "path/filepath" + "strconv" + "strings" + 
"syscall" + + "banger/internal/daemon/dmsnap" + "banger/internal/daemon/fcproc" + "banger/internal/firecracker" + "banger/internal/hostnat" + "banger/internal/model" + "banger/internal/paths" + "banger/internal/roothelper" + "banger/internal/system" +) + +type privilegedOps interface { + EnsureBridge(context.Context) error + CreateTap(context.Context, string) error + DeleteTap(context.Context, string) error + SyncResolverRouting(context.Context, string) error + ClearResolverRouting(context.Context) error + EnsureNAT(context.Context, string, string, bool) error + CreateDMSnapshot(context.Context, string, string, string) (dmSnapshotHandles, error) + CleanupDMSnapshot(context.Context, dmSnapshotHandles) error + RemoveDMSnapshot(context.Context, string) error + FsckSnapshot(context.Context, string) error + ReadExt4File(context.Context, string, string) ([]byte, error) + WriteExt4Files(context.Context, string, []roothelper.Ext4Write) error + ResolveFirecrackerBinary(context.Context, string) (string, error) + LaunchFirecracker(context.Context, roothelper.FirecrackerLaunchRequest) (int, error) + EnsureSocketAccess(context.Context, string, string) error + FindFirecrackerPID(context.Context, string) (int, error) + KillProcess(context.Context, int) error + SignalProcess(context.Context, int, string) error + ProcessRunning(context.Context, int, string) (bool, error) + CleanupJailerChroot(context.Context, string) error +} + +type localPrivilegedOps struct { + runner system.CommandRunner + logger *slog.Logger + config model.DaemonConfig + layout paths.Layout + clientUID int + clientGID int +} + +func (n *HostNetwork) privOps() privilegedOps { + if n.priv == nil { + n.priv = newLocalPrivilegedOps(n.runner, n.logger, n.config, n.layout, os.Getuid(), os.Getgid()) + } + return n.priv +} + +func (s *VMService) privOps() privilegedOps { + if s.priv == nil { + s.priv = newLocalPrivilegedOps(s.runner, s.logger, s.config, s.layout, os.Getuid(), os.Getgid()) + } + return s.priv +} + +func 
newLocalPrivilegedOps(runner system.CommandRunner, logger *slog.Logger, cfg model.DaemonConfig, layout paths.Layout, clientUID, clientGID int) privilegedOps { + if clientUID < 0 { + clientUID = os.Getuid() + } + if clientGID < 0 { + clientGID = os.Getgid() + } + return &localPrivilegedOps{ + runner: runner, + logger: logger, + config: cfg, + layout: layout, + clientUID: clientUID, + clientGID: clientGID, + } +} + +func (o *localPrivilegedOps) EnsureBridge(ctx context.Context) error { + return o.fc().EnsureBridge(ctx) +} + +func (o *localPrivilegedOps) CreateTap(ctx context.Context, tapName string) error { + return o.fc().CreateTapOwned(ctx, tapName, o.clientUID, o.clientGID) +} + +func (o *localPrivilegedOps) DeleteTap(ctx context.Context, tapName string) error { + _, err := o.runner.RunSudo(ctx, "ip", "link", "del", tapName) + return err +} + +func (o *localPrivilegedOps) SyncResolverRouting(ctx context.Context, serverAddr string) error { + if strings.TrimSpace(o.config.BridgeName) == "" || strings.TrimSpace(serverAddr) == "" { + return nil + } + if _, err := system.LookupExecutable("resolvectl"); err != nil { + return nil + } + if _, err := o.runner.RunSudo(ctx, "resolvectl", "dns", o.config.BridgeName, serverAddr); err != nil { + return err + } + if _, err := o.runner.RunSudo(ctx, "resolvectl", "domain", o.config.BridgeName, vmResolverRouteDomain); err != nil { + return err + } + _, err := o.runner.RunSudo(ctx, "resolvectl", "default-route", o.config.BridgeName, "no") + return err +} + +func (o *localPrivilegedOps) ClearResolverRouting(ctx context.Context) error { + if strings.TrimSpace(o.config.BridgeName) == "" { + return nil + } + if _, err := system.LookupExecutable("resolvectl"); err != nil { + return nil + } + _, err := o.runner.RunSudo(ctx, "resolvectl", "revert", o.config.BridgeName) + return err +} + +func (o *localPrivilegedOps) EnsureNAT(ctx context.Context, guestIP, tap string, enable bool) error { + return hostnat.Ensure(ctx, o.runner, guestIP, tap, 
enable) +} + +func (o *localPrivilegedOps) CreateDMSnapshot(ctx context.Context, rootfsPath, cowPath, dmName string) (dmSnapshotHandles, error) { + return dmsnap.Create(ctx, o.runner, rootfsPath, cowPath, dmName) +} + +func (o *localPrivilegedOps) CleanupDMSnapshot(ctx context.Context, handles dmSnapshotHandles) error { + return dmsnap.Cleanup(ctx, o.runner, handles) +} + +func (o *localPrivilegedOps) RemoveDMSnapshot(ctx context.Context, target string) error { + return dmsnap.Remove(ctx, o.runner, target) +} + +func (o *localPrivilegedOps) FsckSnapshot(ctx context.Context, dmDev string) error { + if _, err := o.runner.RunSudo(ctx, "e2fsck", "-fy", dmDev); err != nil { + if code := system.ExitCode(err); code < 0 || code > 1 { + return err + } + } + return nil +} + +func (o *localPrivilegedOps) ReadExt4File(ctx context.Context, imagePath, guestPath string) ([]byte, error) { + return system.ReadExt4File(ctx, o.runner, imagePath, guestPath) +} + +func (o *localPrivilegedOps) WriteExt4Files(ctx context.Context, imagePath string, files []roothelper.Ext4Write) error { + for _, file := range files { + mode := os.FileMode(file.Mode) + if mode == 0 { + mode = 0o644 + } + if err := system.WriteExt4FileOwned(ctx, o.runner, imagePath, file.GuestPath, mode, 0, 0, file.Data); err != nil { + return err + } + } + return nil +} + +func (o *localPrivilegedOps) ResolveFirecrackerBinary(_ context.Context, requested string) (string, error) { + manager := fcproc.New(o.runner, fcproc.Config{FirecrackerBin: normalizeFirecrackerBinary(requested, o.config.FirecrackerBin)}, o.logger) + return manager.ResolveBinary() +} + +func (o *localPrivilegedOps) LaunchFirecracker(ctx context.Context, req roothelper.FirecrackerLaunchRequest) (int, error) { + mc, err := o.buildLaunchMachineConfig(ctx, req) + if err != nil { + return 0, err + } + // Symlink before Start: with jailer the actual API socket lives at + // `/firecracker.socket` (~120+ bytes — over the AF_UNIX + // sun_path limit of 108). 
The SDK's waitForSocket and connect(2) + // would EINVAL on the long path. Pre-creating the symlink at the + // short req.SocketPath lets the SDK poll/connect via the short + // path; the kernel only enforces sun_path on the path you pass, + // not on the resolved target. + if err := o.exposeJailerSockets(req); err != nil { + return 0, fmt.Errorf("expose jailer sockets: %w", err) + } + machine, err := firecracker.NewMachine(ctx, mc) + if err != nil { + return 0, err + } + chownDone := o.maybeChownSockets(ctx, req, mc) + startErr := machine.Start(ctx) + chownErr := <-chownDone + if startErr != nil { + if pid := o.fc().ResolvePID(context.Background(), machine, mc.SocketPath); pid > 0 { + _ = o.KillProcess(context.Background(), pid) + } + return 0, startErr + } + if chownErr != nil { + return 0, chownErr + } + if req.Jailer == nil { + // Belt-and-suspenders for the legacy direct-firecracker path. + // The jailer path doesn't need this — firecracker drops to the + // configured uid before creating the socket. + if err := o.EnsureSocketAccess(ctx, mc.SocketPath, "firecracker api socket"); err != nil { + return 0, err + } + if strings.TrimSpace(mc.VSockPath) != "" { + if err := o.EnsureSocketAccess(ctx, mc.VSockPath, "firecracker vsock socket"); err != nil { + return 0, err + } + } + } + pid := o.fc().ResolvePID(context.Background(), machine, mc.SocketPath) + if pid <= 0 { + return 0, errors.New("firecracker started but pid could not be resolved") + } + return pid, nil +} + +// maybeChownSockets runs the post-Start sudo-chown race only on the legacy +// direct-firecracker path. With the jailer the firecracker process is +// already running as the configured uid before it creates the socket, so +// no chown is needed (and chown on the symlink would tweak the symlink's +// metadata — not the target's — anyway). 
+func (o *localPrivilegedOps) maybeChownSockets(ctx context.Context, req roothelper.FirecrackerLaunchRequest, mc firecracker.MachineConfig) <-chan error { + if req.Jailer != nil { + ch := make(chan error, 1) + ch <- nil + close(ch) + return ch + } + return o.fc().EnsureSocketAccessForAsync(ctx, []string{mc.SocketPath, mc.VSockPath}, o.clientUID, o.clientGID) +} + +// buildLaunchMachineConfig mirrors the helper-side equivalent: when jailer +// is enabled, stage the chroot tree and rewrite the path fields to their +// chroot-translated form (host-visible for sockets, chroot-internal for +// kernel/drives — see firecracker.MachineConfig.Jailer doc). +func (o *localPrivilegedOps) buildLaunchMachineConfig(ctx context.Context, req roothelper.FirecrackerLaunchRequest) (firecracker.MachineConfig, error) { + mc := firecracker.MachineConfig{ + BinaryPath: req.BinaryPath, + VMID: req.VMID, + SocketPath: req.SocketPath, + LogPath: req.LogPath, + MetricsPath: req.MetricsPath, + KernelImagePath: req.KernelImagePath, + InitrdPath: req.InitrdPath, + KernelArgs: req.KernelArgs, + Drives: req.Drives, + TapDevice: req.TapDevice, + VSockPath: req.VSockPath, + VSockCID: req.VSockCID, + VCPUCount: req.VCPUCount, + MemoryMiB: req.MemoryMiB, + Logger: o.logger, + } + if req.Jailer == nil { + return mc, nil + } + chrootRoot := firecracker.JailerChrootRoot(req.Jailer.ChrootBaseDir, req.VMID) + driveSpecs := make([]fcproc.ChrootDriveSpec, 0, len(req.Drives)) + chrootDrives := make([]firecracker.DriveConfig, 0, len(req.Drives)) + for _, d := range req.Drives { + name := chrootDriveName(d) + driveSpecs = append(driveSpecs, fcproc.ChrootDriveSpec{ChrootName: name, HostPath: d.Path}) + chrootDrives = append(chrootDrives, firecracker.DriveConfig{ + ID: d.ID, + Path: "/" + name, + ReadOnly: d.ReadOnly, + IsRoot: d.IsRoot, + }) + } + wantVSock := strings.TrimSpace(req.VSockPath) != "" + if err := o.fc().PrepareJailerChroot(ctx, chrootRoot, + req.Jailer.UID, req.Jailer.GID, + req.BinaryPath, + 
req.KernelImagePath, "vmlinux",
+		req.InitrdPath, "initrd",
+		driveSpecs, wantVSock,
+	); err != nil {
+		return firecracker.MachineConfig{}, fmt.Errorf("prepare jailer chroot: %w", err)
+	}
+	// SocketPath stays the short request path: the SDK polls/connects
+	// to it via os.Stat / net.Dial("unix", ...), and AF_UNIX sun_path
+	// is hard-capped at 108 bytes — the actual chroot path is well over
+	// that. exposeJailerSockets pre-creates req.SocketPath as a
+	// symlink whose target is the long chroot socket; the kernel only
+	// enforces sun_path on the path you hand to connect, not on the
+	// resolved target.
+	//
+	// VSockPath, by contrast, is sent to firecracker via the API and
+	// resolved from inside the chroot, so it must be the chroot-internal
+	// path. The host-visible vsock socket is reachable via a symlink
+	// at req.VSockPath, also installed by exposeJailerSockets.
+	if wantVSock {
+		mc.VSockPath = firecracker.JailerVSockName
+	}
+	mc.KernelImagePath = "/vmlinux"
+	if strings.TrimSpace(req.InitrdPath) != "" {
+		mc.InitrdPath = "/initrd"
+	} else {
+		mc.InitrdPath = ""
+	}
+	mc.Drives = chrootDrives
+	// LogPath stays set so buildProcessRunner's openLogFile captures firecracker
+	// stderr via cmd.Stderr. buildConfig clears sdk.Config.LogPath for jailer
+	// mode to avoid PUT /logger with a host path firecracker can't open.
+ mc.MetricsPath = "" + mc.Jailer = &firecracker.JailerOpts{ + Binary: req.Jailer.Binary, + ChrootBaseDir: req.Jailer.ChrootBaseDir, + UID: req.Jailer.UID, + GID: req.Jailer.GID, + } + return mc, nil +} + +func (o *localPrivilegedOps) exposeJailerSockets(req roothelper.FirecrackerLaunchRequest) error { + if req.Jailer == nil { + return nil + } + chrootRoot := firecracker.JailerChrootRoot(req.Jailer.ChrootBaseDir, req.VMID) + hostAPI := filepath.Join(chrootRoot, strings.TrimPrefix(firecracker.JailerSocketName, "/")) + if err := atomicSymlink(hostAPI, req.SocketPath); err != nil { + return err + } + if strings.TrimSpace(req.VSockPath) != "" { + hostVSock := filepath.Join(chrootRoot, strings.TrimPrefix(firecracker.JailerVSockName, "/")) + if err := atomicSymlink(hostVSock, req.VSockPath); err != nil { + return err + } + } + return nil +} + +// chrootDriveName mirrors the helper-side helper of the same name; kept as +// a free function so both paths produce identical chroot layouts. +func chrootDriveName(d firecracker.DriveConfig) string { + if id := strings.TrimSpace(d.ID); id != "" { + return id + } + return filepath.Base(d.Path) +} + +func atomicSymlink(target, link string) error { + if err := os.Remove(link); err != nil && !os.IsNotExist(err) { + return err + } + return os.Symlink(target, link) +} + +func (o *localPrivilegedOps) EnsureSocketAccess(ctx context.Context, socketPath, label string) error { + return o.fc().EnsureSocketAccessFor(ctx, socketPath, label, o.clientUID, o.clientGID) +} + +func (o *localPrivilegedOps) FindFirecrackerPID(ctx context.Context, apiSock string) (int, error) { + return o.fc().FindPID(ctx, apiSock) +} + +func (o *localPrivilegedOps) KillProcess(ctx context.Context, pid int) error { + return o.fc().Kill(ctx, pid) +} + +func (o *localPrivilegedOps) SignalProcess(ctx context.Context, pid int, signal string) error { + if strings.TrimSpace(signal) == "" { + signal = "TERM" + } + _, err := o.runner.RunSudo(ctx, "kill", "-"+signal, 
strconv.Itoa(pid)) + return err +} + +func (o *localPrivilegedOps) ProcessRunning(_ context.Context, pid int, apiSock string) (bool, error) { + return system.ProcessRunning(pid, apiSock), nil +} + +func (o *localPrivilegedOps) CleanupJailerChroot(ctx context.Context, chrootRoot string) error { + return o.fc().CleanupJailerChroot(ctx, chrootRoot) +} + +func (o *localPrivilegedOps) fc() *fcproc.Manager { + return fcproc.New(o.runner, fcproc.Config{ + FirecrackerBin: normalizeFirecrackerBinary("", o.config.FirecrackerBin), + BridgeName: o.config.BridgeName, + BridgeIP: o.config.BridgeIP, + CIDR: o.config.CIDR, + RuntimeDir: o.layout.RuntimeDir, + }, o.logger) +} + +type helperPrivilegedOps struct { + client *roothelper.Client + config model.DaemonConfig + layout paths.Layout +} + +func newHelperPrivilegedOps(client *roothelper.Client, cfg model.DaemonConfig, layout paths.Layout) privilegedOps { + return &helperPrivilegedOps{client: client, config: cfg, layout: layout} +} + +func (o *helperPrivilegedOps) EnsureBridge(ctx context.Context) error { + return o.client.EnsureBridge(ctx, o.networkConfig()) +} + +func (o *helperPrivilegedOps) CreateTap(ctx context.Context, tapName string) error { + return o.client.CreateTap(ctx, o.networkConfig(), tapName) +} + +func (o *helperPrivilegedOps) DeleteTap(ctx context.Context, tapName string) error { + return o.client.DeleteTap(ctx, tapName) +} + +func (o *helperPrivilegedOps) SyncResolverRouting(ctx context.Context, serverAddr string) error { + return o.client.SyncResolverRouting(ctx, o.config.BridgeName, serverAddr) +} + +func (o *helperPrivilegedOps) ClearResolverRouting(ctx context.Context) error { + return o.client.ClearResolverRouting(ctx, o.config.BridgeName) +} + +func (o *helperPrivilegedOps) EnsureNAT(ctx context.Context, guestIP, tap string, enable bool) error { + return o.client.EnsureNAT(ctx, guestIP, tap, enable) +} + +func (o *helperPrivilegedOps) CreateDMSnapshot(ctx context.Context, rootfsPath, cowPath, dmName 
string) (dmSnapshotHandles, error) { + return o.client.CreateDMSnapshot(ctx, rootfsPath, cowPath, dmName) +} + +func (o *helperPrivilegedOps) CleanupDMSnapshot(ctx context.Context, handles dmSnapshotHandles) error { + return o.client.CleanupDMSnapshot(ctx, handles) +} + +func (o *helperPrivilegedOps) RemoveDMSnapshot(ctx context.Context, target string) error { + return o.client.RemoveDMSnapshot(ctx, target) +} + +func (o *helperPrivilegedOps) FsckSnapshot(ctx context.Context, dmDev string) error { + return o.client.FsckSnapshot(ctx, dmDev) +} + +func (o *helperPrivilegedOps) ReadExt4File(ctx context.Context, imagePath, guestPath string) ([]byte, error) { + return o.client.ReadExt4File(ctx, imagePath, guestPath) +} + +func (o *helperPrivilegedOps) WriteExt4Files(ctx context.Context, imagePath string, files []roothelper.Ext4Write) error { + return o.client.WriteExt4Files(ctx, imagePath, files) +} + +func (o *helperPrivilegedOps) ResolveFirecrackerBinary(ctx context.Context, requested string) (string, error) { + return o.client.ResolveFirecrackerBinary(ctx, normalizeFirecrackerBinary(requested, o.config.FirecrackerBin)) +} + +func (o *helperPrivilegedOps) LaunchFirecracker(ctx context.Context, req roothelper.FirecrackerLaunchRequest) (int, error) { + req.Network = o.networkConfig() + pid, err := o.client.LaunchFirecracker(ctx, req) + if err != nil { + return 0, err + } + // The root helper runs with PrivateMounts=yes, so symlinks it creates + // (exposeJailerSockets) are invisible to the daemon's namespace. Re-create + // them here so the daemon can reach the API and vsock sockets. 
+ if req.Jailer != nil { + chrootRoot := firecracker.JailerChrootRoot(req.Jailer.ChrootBaseDir, req.VMID) + hostAPI := filepath.Join(chrootRoot, strings.TrimPrefix(firecracker.JailerSocketName, "/")) + if err := atomicSymlink(hostAPI, req.SocketPath); err != nil { + return 0, fmt.Errorf("api socket symlink: %w", err) + } + if strings.TrimSpace(req.VSockPath) != "" { + hostVSock := filepath.Join(chrootRoot, strings.TrimPrefix(firecracker.JailerVSockName, "/")) + if err := atomicSymlink(hostVSock, req.VSockPath); err != nil { + return 0, fmt.Errorf("vsock symlink: %w", err) + } + } + } + return pid, nil +} + +func (o *helperPrivilegedOps) EnsureSocketAccess(ctx context.Context, socketPath, label string) error { + if info, err := os.Stat(socketPath); err == nil { + if stat, ok := info.Sys().(*syscall.Stat_t); ok && int(stat.Uid) == os.Getuid() { + return os.Chmod(socketPath, 0o600) + } + } + return o.client.EnsureSocketAccess(ctx, socketPath, label) +} + +func (o *helperPrivilegedOps) FindFirecrackerPID(ctx context.Context, apiSock string) (int, error) { + return o.client.FindFirecrackerPID(ctx, apiSock) +} + +func (o *helperPrivilegedOps) KillProcess(ctx context.Context, pid int) error { + return o.client.KillProcess(ctx, pid) +} + +func (o *helperPrivilegedOps) SignalProcess(ctx context.Context, pid int, signal string) error { + return o.client.SignalProcess(ctx, pid, signal) +} + +func (o *helperPrivilegedOps) ProcessRunning(ctx context.Context, pid int, apiSock string) (bool, error) { + return o.client.ProcessRunning(ctx, pid, apiSock) +} + +func (o *helperPrivilegedOps) CleanupJailerChroot(ctx context.Context, chrootRoot string) error { + return o.client.CleanupJailerChroot(ctx, chrootRoot) +} + +func (o *helperPrivilegedOps) networkConfig() roothelper.NetworkConfig { + return roothelper.NetworkConfig{ + BridgeName: o.config.BridgeName, + BridgeIP: o.config.BridgeIP, + CIDR: o.config.CIDR, + } +} + +func normalizeFirecrackerBinary(requested, configured string) 
string { + requested = strings.TrimSpace(requested) + if requested != "" { + return requested + } + return strings.TrimSpace(configured) +} diff --git a/internal/daemon/runtime_assets.go b/internal/daemon/runtime_assets.go index 16c4cf6..7584d62 100644 --- a/internal/daemon/runtime_assets.go +++ b/internal/daemon/runtime_assets.go @@ -6,7 +6,11 @@ import ( "banger/internal/paths" ) -func (d *Daemon) vsockAgentBinary() (string, error) { +// vsockAgentBinary resolves the companion helper the daemon ships +// alongside its own binary. It's stateless; the layout argument is +// unused, kept only so callers on *Daemon / *VMService / doctor all +// share one entry point instead of each owning a forwarder method. +func vsockAgentBinary(_ paths.Layout) (string, error) { path, err := paths.CompanionBinaryPath("banger-vsock-agent") if err != nil { return "", fmt.Errorf("vsock agent helper not available: %w", err) } diff --git a/internal/daemon/snapshot.go b/internal/daemon/snapshot.go index f6ce45d..0515b31 100644 --- a/internal/daemon/snapshot.go +++ b/internal/daemon/snapshot.go @@ -2,110 +2,22 @@ package daemon import ( "context" - "errors" - "fmt" - "strings" - "time" + + "banger/internal/daemon/dmsnap" ) -type dmSnapshotHandles struct { - BaseLoop string - COWLoop string - DMName string - DMDev string +// dmSnapshotHandles is retained as a package-local alias for the subpackage +// type so existing call sites and tests read naturally.
+type dmSnapshotHandles = dmsnap.Handles + +func (n *HostNetwork) createDMSnapshot(ctx context.Context, rootfsPath, cowPath, dmName string) (dmSnapshotHandles, error) { + return n.privOps().CreateDMSnapshot(ctx, rootfsPath, cowPath, dmName) } -func (d *Daemon) createDMSnapshot(ctx context.Context, rootfsPath, cowPath, dmName string) (handles dmSnapshotHandles, err error) { - defer func() { - if err == nil { - return - } - if cleanupErr := d.cleanupDMSnapshot(context.Background(), handles); cleanupErr != nil { - err = errors.Join(err, cleanupErr) - } - }() - - baseBytes, err := d.runner.RunSudo(ctx, "losetup", "-f", "--show", "--read-only", rootfsPath) - if err != nil { - return handles, err - } - handles.BaseLoop = strings.TrimSpace(string(baseBytes)) - - cowBytes, err := d.runner.RunSudo(ctx, "losetup", "-f", "--show", cowPath) - if err != nil { - return handles, err - } - handles.COWLoop = strings.TrimSpace(string(cowBytes)) - - sectorsBytes, err := d.runner.RunSudo(ctx, "blockdev", "--getsz", handles.BaseLoop) - if err != nil { - return handles, err - } - sectors := strings.TrimSpace(string(sectorsBytes)) - - if _, err := d.runner.RunSudo(ctx, "dmsetup", "create", dmName, "--table", fmt.Sprintf("0 %s snapshot %s %s P 8", sectors, handles.BaseLoop, handles.COWLoop)); err != nil { - return handles, err - } - handles.DMName = dmName - handles.DMDev = "/dev/mapper/" + dmName - return handles, nil +func (n *HostNetwork) cleanupDMSnapshot(ctx context.Context, handles dmSnapshotHandles) error { + return n.privOps().CleanupDMSnapshot(ctx, handles) } -func (d *Daemon) cleanupDMSnapshot(ctx context.Context, handles dmSnapshotHandles) error { - var cleanupErr error - - switch { - case handles.DMName != "": - if err := d.removeDMSnapshot(ctx, handles.DMName); err != nil { - cleanupErr = errors.Join(cleanupErr, err) - } - case handles.DMDev != "": - if err := d.removeDMSnapshot(ctx, handles.DMDev); err != nil { - cleanupErr = errors.Join(cleanupErr, err) - } - } - - if 
handles.COWLoop != "" { - if _, err := d.runner.RunSudo(ctx, "losetup", "-d", handles.COWLoop); err != nil { - if !isMissingSnapshotHandle(err) { - cleanupErr = errors.Join(cleanupErr, err) - } - } - } - if handles.BaseLoop != "" { - if _, err := d.runner.RunSudo(ctx, "losetup", "-d", handles.BaseLoop); err != nil { - if !isMissingSnapshotHandle(err) { - cleanupErr = errors.Join(cleanupErr, err) - } - } - } - - return cleanupErr -} - -func (d *Daemon) removeDMSnapshot(ctx context.Context, target string) error { - deadline := time.Now().Add(15 * time.Second) - for { - if _, err := d.runner.RunSudo(ctx, "dmsetup", "remove", target); err != nil { - if isMissingSnapshotHandle(err) { - return nil - } - if strings.Contains(err.Error(), "Device or resource busy") && time.Now().Before(deadline) { - time.Sleep(100 * time.Millisecond) - continue - } - return err - } - return nil - } -} - -func isMissingSnapshotHandle(err error) bool { - if err == nil { - return false - } - msg := err.Error() - return strings.Contains(msg, "No such device or address") || - strings.Contains(msg, "not found") || - strings.Contains(msg, "does not exist") +func (n *HostNetwork) removeDMSnapshot(ctx context.Context, target string) error { + return n.privOps().RemoveDMSnapshot(ctx, target) } diff --git a/internal/daemon/snapshot_test.go b/internal/daemon/snapshot_test.go index 2411206..415cda7 100644 --- a/internal/daemon/snapshot_test.go +++ b/internal/daemon/snapshot_test.go @@ -73,8 +73,9 @@ func TestCreateDMSnapshotFailsWithoutRollbackWhenBaseLoopSetupFails(t *testing.T }, } d := &Daemon{runner: runner} + wireServices(d) - _, err := d.createDMSnapshot(context.Background(), "/rootfs.ext4", "/cow.ext4", "fc-rootfs-test") + _, err := d.net.createDMSnapshot(context.Background(), "/rootfs.ext4", "/cow.ext4", "fc-rootfs-test") if !errors.Is(err, attachErr) { t.Fatalf("error = %v, want %v", err, attachErr) } @@ -97,8 +98,9 @@ func TestCreateDMSnapshotRollsBackBaseLoopWhenCowLoopSetupFails(t 
*testing.T) { }, } d := &Daemon{runner: runner} + wireServices(d) - _, err := d.createDMSnapshot(context.Background(), "/rootfs.ext4", "/cow.ext4", "fc-rootfs-test") + _, err := d.net.createDMSnapshot(context.Background(), "/rootfs.ext4", "/cow.ext4", "fc-rootfs-test") if !errors.Is(err, attachErr) { t.Fatalf("error = %v, want %v", err, attachErr) } @@ -120,8 +122,9 @@ func TestCreateDMSnapshotRollsBackBothLoopsWhenBlockdevFails(t *testing.T) { }, } d := &Daemon{runner: runner} + wireServices(d) - _, err := d.createDMSnapshot(context.Background(), "/rootfs.ext4", "/cow.ext4", "fc-rootfs-test") + _, err := d.net.createDMSnapshot(context.Background(), "/rootfs.ext4", "/cow.ext4", "fc-rootfs-test") if !errors.Is(err, blockdevErr) { t.Fatalf("error = %v, want %v", err, blockdevErr) } @@ -144,8 +147,9 @@ func TestCreateDMSnapshotRollsBackLoopsWhenDMSetupFails(t *testing.T) { }, } d := &Daemon{runner: runner} + wireServices(d) - _, err := d.createDMSnapshot(context.Background(), "/rootfs.ext4", "/cow.ext4", "fc-rootfs-test") + _, err := d.net.createDMSnapshot(context.Background(), "/rootfs.ext4", "/cow.ext4", "fc-rootfs-test") if !errors.Is(err, dmErr) { t.Fatalf("error = %v, want %v", err, dmErr) } @@ -173,8 +177,9 @@ func TestCreateDMSnapshotJoinsRollbackErrors(t *testing.T) { }, } d := &Daemon{runner: runner} + wireServices(d) - _, err := d.createDMSnapshot(context.Background(), "/rootfs.ext4", "/cow.ext4", "fc-rootfs-test") + _, err := d.net.createDMSnapshot(context.Background(), "/rootfs.ext4", "/cow.ext4", "fc-rootfs-test") if err == nil { t.Fatal("expected createDMSnapshot to return an error") } @@ -197,8 +202,9 @@ func TestCreateDMSnapshotReturnsHandlesOnSuccess(t *testing.T) { }, } d := &Daemon{runner: runner} + wireServices(d) - handles, err := d.createDMSnapshot(context.Background(), "/rootfs.ext4", "/cow.ext4", "fc-rootfs-test") + handles, err := d.net.createDMSnapshot(context.Background(), "/rootfs.ext4", "/cow.ext4", "fc-rootfs-test") if err != nil { 
t.Fatalf("createDMSnapshot returned error: %v", err) } @@ -226,8 +232,9 @@ func TestCleanupDMSnapshotRemovesResourcesInReverseOrder(t *testing.T) { }, } d := &Daemon{runner: runner} + wireServices(d) - err := d.cleanupDMSnapshot(context.Background(), dmSnapshotHandles{ + err := d.net.cleanupDMSnapshot(context.Background(), dmSnapshotHandles{ BaseLoop: "/dev/loop10", COWLoop: "/dev/loop11", DMName: "fc-rootfs-test", @@ -250,8 +257,9 @@ func TestCleanupDMSnapshotUsesPartialHandles(t *testing.T) { }, } d := &Daemon{runner: runner} + wireServices(d) - err := d.cleanupDMSnapshot(context.Background(), dmSnapshotHandles{ + err := d.net.cleanupDMSnapshot(context.Background(), dmSnapshotHandles{ BaseLoop: "/dev/loop10", DMDev: "/dev/mapper/fc-rootfs-test", }) @@ -276,8 +284,9 @@ func TestCleanupDMSnapshotJoinsTeardownErrors(t *testing.T) { }, } d := &Daemon{runner: runner} + wireServices(d) - err := d.cleanupDMSnapshot(context.Background(), dmSnapshotHandles{ + err := d.net.cleanupDMSnapshot(context.Background(), dmSnapshotHandles{ BaseLoop: "/dev/loop10", COWLoop: "/dev/loop11", DMName: "fc-rootfs-test", @@ -306,8 +315,9 @@ func TestRemoveDMSnapshotRetriesBusyDevice(t *testing.T) { }, } d := &Daemon{runner: runner} + wireServices(d) - if err := d.removeDMSnapshot(context.Background(), "fc-rootfs-test"); err != nil { + if err := d.net.removeDMSnapshot(context.Background(), "fc-rootfs-test"); err != nil { t.Fatalf("removeDMSnapshot returned error: %v", err) } runner.assertExhausted() diff --git a/internal/daemon/ssh_client_config.go b/internal/daemon/ssh_client_config.go new file mode 100644 index 0000000..069cc2d --- /dev/null +++ b/internal/daemon/ssh_client_config.go @@ -0,0 +1,284 @@ +package daemon + +import ( + "fmt" + "log/slog" + "os" + "path/filepath" + "strings" + + "banger/internal/guest" + "banger/internal/model" + "banger/internal/paths" +) + +// Marker sentinels that fence the `Include` block banger writes into +// ~/.ssh/config when the user runs `banger 
ssh-config --install`. +const ( + bangerSSHIncludeBegin = "# BEGIN BANGER SSH INCLUDE" + bangerSSHIncludeEnd = "# END BANGER SSH INCLUDE" +) + +// removeVMKnownHosts drops every host-key pin for vm from the +// banger-owned known_hosts. Best-effort — a failure here only +// matters if the same IP/name is reused by a fresh VM before the +// next daemon restart, and even then it just causes a +// TOFU-mismatch error that the user can clear manually. Logged at +// warn so it shows up if it ever actually breaks things. +func removeVMKnownHosts(knownHostsPath string, vm model.VMRecord, logger *slog.Logger) { + if strings.TrimSpace(knownHostsPath) == "" { + return + } + var hosts []string + if ip := strings.TrimSpace(vm.Runtime.GuestIP); ip != "" { + hosts = append(hosts, ip) + } + if dns := strings.TrimSpace(vm.Runtime.DNSName); dns != "" { + hosts = append(hosts, dns) + } + if len(hosts) == 0 { + return + } + if err := guest.RemoveKnownHosts(knownHostsPath, hosts...); err != nil && logger != nil { + logger.Warn("remove known_hosts entries", "vm_id", vm.ID, "error", err.Error()) + } +} + +// BangerSSHConfigPath is the file banger owns and keeps in sync with +// the current default key + known_hosts locations. Users who want the +// `ssh <name>.vm` shortcut opt in via `banger ssh-config --install`, +// which adds an Include line to ~/.ssh/config pointing at this file. +// The daemon never touches ~/.ssh/config on its own. +func BangerSSHConfigPath(layout paths.Layout) string { + if strings.TrimSpace(layout.ConfigDir) == "" { + return "" + } + return filepath.Join(layout.ConfigDir, "ssh_config") +} + +func (d *Daemon) ensureVMSSHClientConfig() { + if err := SyncVMSSHClientConfig(d.userLayout, d.config.SSHKeyPath); err != nil && d.logger != nil { + d.logger.Warn("vm ssh client config sync failed", "error", err.Error()) + } +} + +// SyncVMSSHClientConfig writes banger's own ssh_config file with the +// current `Host *.vm` stanza.
It does NOT touch ~/.ssh/config; that's +// the job of `banger ssh-config --install` (user-initiated). +// +// The file lives in the banger config dir so users who manage their +// SSH config declaratively can decide how (or whether) to pull it in. +func SyncVMSSHClientConfig(layout paths.Layout, keyPath string) error { + keyPath = strings.TrimSpace(keyPath) + if keyPath == "" { + return nil + } + target := BangerSSHConfigPath(layout) + if target == "" { + return nil + } + if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil { + return err + } + block := renderManagedVMSSHBlock(keyPath, layout.KnownHostsPath) + return writeTextFileIfChanged(target, block, 0o644) +} + +// InstallUserSSHInclude adds an `Include <path>` line +// to ~/.ssh/config inside a banger-owned marker block. Idempotent: +// running it twice leaves a single block. +func InstallUserSSHInclude(layout paths.Layout) error { + bangerConfig := BangerSSHConfigPath(layout) + if bangerConfig == "" { + return fmt.Errorf("banger config dir is not configured") + } + userConfigPath, err := userSSHConfigPath() + if err != nil { + return err + } + existing, err := readTextFileIfExists(userConfigPath) + if err != nil { + return err + } + block := renderBangerSSHIncludeBlock(bangerConfig) + updated, err := upsertManagedBlock(existing, bangerSSHIncludeBegin, bangerSSHIncludeEnd, block) + if err != nil { + return err + } + return writeTextFileIfChanged(userConfigPath, updated, 0o600) +} + +// UninstallUserSSHInclude removes the Include block from +// ~/.ssh/config. Idempotent: missing file or missing block is a +// no-op.
+func UninstallUserSSHInclude() error { + userConfigPath, err := userSSHConfigPath() + if err != nil { + return err + } + existing, err := readTextFileIfExists(userConfigPath) + if err != nil { + return err + } + if existing == "" { + return nil + } + updated, err := removeManagedBlock(existing, bangerSSHIncludeBegin, bangerSSHIncludeEnd) + if err != nil { + return err + } + return writeTextFileIfChanged(userConfigPath, updated, 0o600) +} + +// UserSSHIncludeInstalled reports whether ~/.ssh/config contains the +// banger Include block. Used by `ssh-config` (status readout) and +// `doctor`. +func UserSSHIncludeInstalled() (bool, error) { + userConfigPath, err := userSSHConfigPath() + if err != nil { + return false, err + } + existing, err := readTextFileIfExists(userConfigPath) + if err != nil { + return false, err + } + return strings.Contains(existing, bangerSSHIncludeBegin), nil +} + +func userSSHConfigPath() (string, error) { + home, err := os.UserHomeDir() + if err != nil { + return "", err + } + return filepath.Join(home, ".ssh", "config"), nil +} + +// renderManagedVMSSHBlock produces the body banger writes into its +// own ssh_config file. Host-key verification uses the banger-owned +// known_hosts — NOT the user's ~/.ssh/known_hosts, and NOT /dev/null. +// `accept-new` means first contact pins the key; any later mismatch +// fails the connect. +func renderManagedVMSSHBlock(keyPath, knownHostsPath string) string { + keyPath = strings.TrimSpace(keyPath) + knownHostsPath = strings.TrimSpace(knownHostsPath) + lines := []string{ + "# Generated by banger. 
Edits will be overwritten on daemon start.", + "# Enable the `ssh .vm` shortcut via `banger ssh-config --install`.", + "Host *.vm", + " User root", + " IdentityFile " + keyPath, + " IdentitiesOnly yes", + " BatchMode yes", + " PreferredAuthentications publickey", + " PasswordAuthentication no", + " KbdInteractiveAuthentication no", + } + if knownHostsPath != "" { + lines = append(lines, + " UserKnownHostsFile "+knownHostsPath, + " StrictHostKeyChecking accept-new", + ) + } else { + // Missing known_hosts path is a configuration anomaly — fail + // closed rather than silently disable verification. + lines = append(lines, " StrictHostKeyChecking yes") + } + lines = append(lines, " LogLevel ERROR", "") + return strings.Join(lines, "\n") +} + +// renderBangerSSHIncludeBlock returns the marker-fenced block that +// `ssh-config --install` writes into ~/.ssh/config. +func renderBangerSSHIncludeBlock(bangerConfigPath string) string { + lines := []string{ + bangerSSHIncludeBegin, + "# Added by `banger ssh-config --install`. Remove with", + "# `banger ssh-config --uninstall`, or delete the whole block.", + "Include " + bangerConfigPath, + bangerSSHIncludeEnd, + "", + } + return strings.Join(lines, "\n") +} + +// upsertManagedBlock replaces an existing marker-fenced block with +// `block` (including the begin/end markers), or appends `block` if +// no such block exists. `block` must contain the markers itself. 
+func upsertManagedBlock(existing, beginMarker, endMarker, block string) (string, error) { + existing = normalizeConfigText(existing) + block = normalizeConfigText(block) + + start := strings.Index(existing, beginMarker) + if start >= 0 { + end := strings.Index(existing[start:], endMarker) + if end < 0 { + return "", fmt.Errorf("managed block %q is missing end marker %q", beginMarker, endMarker) + } + end += start + len(endMarker) + for end < len(existing) && existing[end] == '\n' { + end++ + } + existing = strings.TrimRight(existing[:start]+existing[end:], "\n") + } + + if strings.TrimSpace(existing) == "" { + return block, nil + } + return strings.TrimRight(existing, "\n") + "\n\n" + block, nil +} + +// removeManagedBlock strips a marker-fenced block from existing text +// and returns the result (unchanged if no block is present). Missing +// end marker with present begin marker is treated as corruption. +func removeManagedBlock(existing, beginMarker, endMarker string) (string, error) { + existing = normalizeConfigText(existing) + start := strings.Index(existing, beginMarker) + if start < 0 { + return existing, nil + } + end := strings.Index(existing[start:], endMarker) + if end < 0 { + return "", fmt.Errorf("managed block %q is missing end marker %q", beginMarker, endMarker) + } + end += start + len(endMarker) + for end < len(existing) && existing[end] == '\n' { + end++ + } + stripped := strings.TrimRight(existing[:start]+existing[end:], "\n") + return normalizeConfigText(stripped), nil +} + +func normalizeConfigText(text string) string { + text = strings.ReplaceAll(text, "\r\n", "\n") + text = strings.TrimRight(text, "\n") + if text == "" { + return "" + } + return text + "\n" +} + +func readTextFileIfExists(path string) (string, error) { + data, err := os.ReadFile(path) + if err == nil { + return string(data), nil + } + if os.IsNotExist(err) { + return "", nil + } + return "", err +} + +func writeTextFileIfChanged(path, content string, mode os.FileMode) error 
{ + content = normalizeConfigText(content) + existing, err := readTextFileIfExists(path) + if err != nil { + return err + } + if existing == content { + return nil + } + if err := os.MkdirAll(filepath.Dir(path), 0o700); err != nil { + return err + } + return os.WriteFile(path, []byte(content), mode) +} diff --git a/internal/daemon/ssh_client_config_test.go b/internal/daemon/ssh_client_config_test.go new file mode 100644 index 0000000..6133217 --- /dev/null +++ b/internal/daemon/ssh_client_config_test.go @@ -0,0 +1,184 @@ +package daemon + +import ( + "os" + "path/filepath" + "strings" + "testing" + + "banger/internal/paths" +) + +// Under the opt-in contract the daemon writes its own ssh_config file +// and never touches ~/.ssh/config on its own. +func TestSyncVMSSHClientConfigWritesBangerFileOnly(t *testing.T) { + homeDir := t.TempDir() + t.Setenv("HOME", homeDir) + + knownHostsPath := filepath.Join(homeDir, ".local", "state", "banger", "ssh", "known_hosts") + layout := paths.Layout{ + ConfigDir: filepath.Join(homeDir, ".config", "banger"), + KnownHostsPath: knownHostsPath, + } + keyPath := filepath.Join(homeDir, ".config", "banger", "ssh", "id_ed25519") + + if err := SyncVMSSHClientConfig(layout, keyPath); err != nil { + t.Fatalf("SyncVMSSHClientConfig: %v", err) + } + + // Banger's own ssh_config file has the `Host *.vm` stanza. + bangerConfig, err := os.ReadFile(BangerSSHConfigPath(layout)) + if err != nil { + t.Fatalf("ReadFile(banger ssh_config): %v", err) + } + for _, want := range []string{ + "Host *.vm", + "IdentityFile " + keyPath, + "UserKnownHostsFile " + knownHostsPath, + "StrictHostKeyChecking accept-new", + } { + if !strings.Contains(string(bangerConfig), want) { + t.Fatalf("banger ssh_config missing %q:\n%s", want, bangerConfig) + } + } + + // ~/.ssh/config must NOT have been created or modified. 
+ if _, err := os.Stat(filepath.Join(homeDir, ".ssh", "config")); !os.IsNotExist(err) { + t.Fatalf("~/.ssh/config should be untouched; stat err = %v", err) + } +} + +func TestInstallUserSSHIncludeAddsIncludeBlock(t *testing.T) { + homeDir := t.TempDir() + t.Setenv("HOME", homeDir) + + layout := paths.Layout{ConfigDir: filepath.Join(homeDir, ".config", "banger")} + if err := os.MkdirAll(layout.ConfigDir, 0o755); err != nil { + t.Fatalf("MkdirAll: %v", err) + } + // Write a fake banger ssh_config so Install has something to include. + if err := os.WriteFile(BangerSSHConfigPath(layout), []byte("Host *.vm\n"), 0o644); err != nil { + t.Fatalf("WriteFile(banger ssh_config): %v", err) + } + + if err := InstallUserSSHInclude(layout); err != nil { + t.Fatalf("InstallUserSSHInclude: %v", err) + } + got, err := os.ReadFile(filepath.Join(homeDir, ".ssh", "config")) + if err != nil { + t.Fatalf("ReadFile(~/.ssh/config): %v", err) + } + want := "Include " + BangerSSHConfigPath(layout) + if !strings.Contains(string(got), want) { + t.Fatalf("user config missing %q:\n%s", want, got) + } + if !strings.Contains(string(got), bangerSSHIncludeBegin) { + t.Fatalf("user config missing begin marker:\n%s", got) + } +} + +func TestInstallUserSSHIncludeIsIdempotent(t *testing.T) { + homeDir := t.TempDir() + t.Setenv("HOME", homeDir) + + layout := paths.Layout{ConfigDir: filepath.Join(homeDir, ".config", "banger")} + if err := os.MkdirAll(layout.ConfigDir, 0o755); err != nil { + t.Fatalf("MkdirAll: %v", err) + } + if err := os.WriteFile(BangerSSHConfigPath(layout), []byte("Host *.vm\n"), 0o644); err != nil { + t.Fatalf("WriteFile: %v", err) + } + for i := 0; i < 3; i++ { + if err := InstallUserSSHInclude(layout); err != nil { + t.Fatalf("InstallUserSSHInclude (%d): %v", i, err) + } + } + got, err := os.ReadFile(filepath.Join(homeDir, ".ssh", "config")) + if err != nil { + t.Fatalf("ReadFile: %v", err) + } + if n := strings.Count(string(got), bangerSSHIncludeBegin); n != 1 { + t.Fatalf("begin 
markers = %d, want 1:\n%s", n, got) + } +} + +func TestUninstallUserSSHIncludeRemovesIncludeBlock(t *testing.T) { + homeDir := t.TempDir() + t.Setenv("HOME", homeDir) + + sshDir := filepath.Join(homeDir, ".ssh") + if err := os.MkdirAll(sshDir, 0o700); err != nil { + t.Fatalf("MkdirAll: %v", err) + } + seed := strings.Join([]string{ + "Host keep", + " HostName 198.51.100.1", + "", + bangerSSHIncludeBegin, + "Include /tmp/banger-ssh-config", + bangerSSHIncludeEnd, + "", + }, "\n") + if err := os.WriteFile(filepath.Join(sshDir, "config"), []byte(seed), 0o600); err != nil { + t.Fatalf("seed: %v", err) + } + + if err := UninstallUserSSHInclude(); err != nil { + t.Fatalf("UninstallUserSSHInclude: %v", err) + } + got, err := os.ReadFile(filepath.Join(sshDir, "config")) + if err != nil { + t.Fatalf("ReadFile: %v", err) + } + gotStr := string(got) + if strings.Contains(gotStr, bangerSSHIncludeBegin) { + t.Fatalf("begin marker survived uninstall:\n%s", gotStr) + } + if !strings.Contains(gotStr, "Host keep") { + t.Fatalf("lost unrelated entry:\n%s", gotStr) + } +} + +func TestUninstallUserSSHIncludeIsNoOpWhenMissing(t *testing.T) { + homeDir := t.TempDir() + t.Setenv("HOME", homeDir) + if err := UninstallUserSSHInclude(); err != nil { + t.Fatalf("UninstallUserSSHInclude on missing file: %v", err) + } + // Still no ~/.ssh/config. 
+ if _, err := os.Stat(filepath.Join(homeDir, ".ssh", "config")); !os.IsNotExist(err) { + t.Fatalf("~/.ssh/config unexpectedly created; stat err = %v", err) + } +} + +func TestUserSSHIncludeInstalledDetectsMarker(t *testing.T) { + for _, tc := range []struct { + name string + seed string + wantIn bool + }{ + {"missing file", "", false}, + {"unrelated only", "Host other\n HostName 1.2.3.4\n", false}, + {"installed", bangerSSHIncludeBegin + "\nInclude /tmp/banger\n" + bangerSSHIncludeEnd + "\n", true}, + } { + t.Run(tc.name, func(t *testing.T) { + homeDir := t.TempDir() + t.Setenv("HOME", homeDir) + if tc.seed != "" { + if err := os.MkdirAll(filepath.Join(homeDir, ".ssh"), 0o700); err != nil { + t.Fatalf("MkdirAll: %v", err) + } + if err := os.WriteFile(filepath.Join(homeDir, ".ssh", "config"), []byte(tc.seed), 0o600); err != nil { + t.Fatalf("WriteFile: %v", err) + } + } + got, err := UserSSHIncludeInstalled() + if err != nil { + t.Fatalf("UserSSHIncludeInstalled: %v", err) + } + if got != tc.wantIn { + t.Fatalf("got %v, want %v", got, tc.wantIn) + } + }) + } +} diff --git a/internal/daemon/sshd_config_test.go b/internal/daemon/sshd_config_test.go new file mode 100644 index 0000000..46cae4a --- /dev/null +++ b/internal/daemon/sshd_config_test.go @@ -0,0 +1,64 @@ +package daemon + +import ( + "strings" + "testing" +) + +// TestSshdGuestConfig_Hardened is a regression guard for the guest +// SSH posture. An earlier version shipped `LogLevel DEBUG3` and +// `StrictModes no`; both are gone and must not come back without an +// explicit call-out. +func TestSshdGuestConfig_Hardened(t *testing.T) { + cfg := sshdGuestConfig() + + // Posture: key-only, root via pubkey, no password / keyboard- + // interactive fallback, pinned authorized_keys path. 
+ mustContain := []string{ + "PermitRootLogin prohibit-password", + "PubkeyAuthentication yes", + "PasswordAuthentication no", + "KbdInteractiveAuthentication no", + "AuthorizedKeysFile /root/.ssh/authorized_keys", + // Quiet-login: short-lived sandboxes don't need the Debian + // MOTD or the "Last login" line. .hushlogin in /root covers + // pam_motd; these two cover sshd's own paths. + "PrintMotd no", + "PrintLastLog no", + } + for _, line := range mustContain { + if !strings.Contains(cfg, line) { + t.Errorf("sshd drop-in missing %q:\n%s", line, cfg) + } + } + + // Things that must NOT appear. Each has a history and a reason. + mustNotContain := map[string]string{ + "LogLevel DEBUG3": "was debug leftover; floods journald", + "StrictModes no": "masked a /root perm drift; real fix is EnsureExt4RootPerms at authsync time", + // Blanket "PermitRootLogin yes" (without prohibit-password) + // would re-enable password root login if something else + // flipped PasswordAuthentication back to yes. + "PermitRootLogin yes": "use prohibit-password instead", + } + for needle, why := range mustNotContain { + if strings.Contains(cfg, needle) { + t.Errorf("sshd drop-in contains %q (%s):\n%s", needle, why, cfg) + } + } +} + +func TestSshdGuestConfig_IsCompleteLines(t *testing.T) { + // Every directive should be a full line on its own. Trailing + // newline matters — sshd_config.d files without a newline sometimes + // get misparsed when concatenated with other drop-ins. 
+ cfg := sshdGuestConfig() + if !strings.HasSuffix(cfg, "\n") { + t.Errorf("sshd drop-in should end with newline:\n%q", cfg) + } + for _, line := range strings.Split(strings.TrimRight(cfg, "\n"), "\n") { + if strings.TrimSpace(line) == "" { + t.Errorf("sshd drop-in has blank line:\n%s", cfg) + } + } +} diff --git a/internal/daemon/stats_service.go b/internal/daemon/stats_service.go new file mode 100644 index 0000000..a15495b --- /dev/null +++ b/internal/daemon/stats_service.go @@ -0,0 +1,387 @@ +package daemon + +import ( + "context" + "crypto/tls" + "errors" + "fmt" + "io" + "log/slog" + "net" + "net/http" + "sort" + "strconv" + "strings" + "time" + + "banger/internal/api" + "banger/internal/model" + "banger/internal/store" + "banger/internal/system" + "banger/internal/vmdns" + "banger/internal/vsockagent" +) + +// StatsService owns the "observe a VM" surface: stats collection +// (CPU / memory / disk), listening-port enumeration, vsock-agent +// health probes, the background poller that refreshes stats for every +// live VM, and the auto-stop-when-idle sweep. +// +// Split out from VMService (commit 3 of the god-service decomposition): +// nothing here orchestrates lifecycle. The three VMService touch +// points stats genuinely needs — vmAlive, vmHandles, the per-VM lock +// helpers, plus cleanupRuntime for the stale-VM sweep — come in as +// function-typed closures so StatsService has no back-reference to +// its sibling. Same pattern WorkspaceService already uses. +type StatsService struct { + runner system.CommandRunner + logger *slog.Logger + config model.DaemonConfig + store *store.Store + net *HostNetwork + beginOperation func(ctx context.Context, name string, attrs ...any) *operationLog + + // vmAlive / vmHandles are the minimum pair needed to answer "is + // this VM actually running right now?" + "what PID is it?". 
+ // Closures over VMService so we re-read d.vm at call time — wire + // order in wireServices puts d.vm before d.stats, so these are + // safe by the time anything on StatsService fires. + vmAlive func(vm model.VMRecord) bool + vmHandles func(vmID string) model.VMHandles + + // Lock helpers: stats collection and the stale-sweep both mutate + // VM records (persist new stats, flip State to Stopped on auto- + // stop) and so need the same per-VM mutex lifecycle ops hold. + withVMLockByRef func(ctx context.Context, idOrName string, fn func(model.VMRecord) (model.VMRecord, error)) (model.VMRecord, error) + withVMLockByIDErr func(ctx context.Context, id string, fn func(model.VMRecord) error) error + + // cleanupRuntime is the auto-stop-sweep's only call into the + // lifecycle side — forcibly tears down a VM that's been idle past + // AutoStopStaleAfter. Keeping it as a closure means StatsService + // never directly dereferences VMService. + cleanupRuntime func(ctx context.Context, vm model.VMRecord, preserveDisks bool) error +} + +type statsServiceDeps struct { + runner system.CommandRunner + logger *slog.Logger + config model.DaemonConfig + store *store.Store + net *HostNetwork + beginOperation func(ctx context.Context, name string, attrs ...any) *operationLog + vmAlive func(vm model.VMRecord) bool + vmHandles func(vmID string) model.VMHandles + withVMLockByRef func(ctx context.Context, idOrName string, fn func(model.VMRecord) (model.VMRecord, error)) (model.VMRecord, error) + withVMLockByIDErr func(ctx context.Context, id string, fn func(model.VMRecord) error) error + cleanupRuntime func(ctx context.Context, vm model.VMRecord, preserveDisks bool) error +} + +func newStatsService(deps statsServiceDeps) *StatsService { + return &StatsService{ + runner: deps.runner, + logger: deps.logger, + config: deps.config, + store: deps.store, + net: deps.net, + beginOperation: deps.beginOperation, + vmAlive: deps.vmAlive, + vmHandles: deps.vmHandles, + withVMLockByRef: 
deps.withVMLockByRef, + withVMLockByIDErr: deps.withVMLockByIDErr, + cleanupRuntime: deps.cleanupRuntime, + } +} + +// ---- stats ---- + +func (s *StatsService) GetVMStats(ctx context.Context, idOrName string) (model.VMRecord, model.VMStats, error) { + vm, err := s.withVMLockByRef(ctx, idOrName, func(vm model.VMRecord) (model.VMRecord, error) { + return s.getVMStatsLocked(ctx, vm) + }) + if err != nil { + return model.VMRecord{}, model.VMStats{}, err + } + return vm, vm.Stats, nil +} + +func (s *StatsService) HealthVM(ctx context.Context, idOrName string) (result api.VMHealthResult, err error) { + _, err = s.withVMLockByRef(ctx, idOrName, func(vm model.VMRecord) (model.VMRecord, error) { + result.Name = vm.Name + if !s.vmAlive(vm) { + result.Healthy = false + return vm, nil + } + if strings.TrimSpace(vm.Runtime.VSockPath) == "" { + return model.VMRecord{}, errors.New("vm has no vsock path") + } + if vm.Runtime.VSockCID == 0 { + return model.VMRecord{}, errors.New("vm has no vsock cid") + } + if err := s.net.ensureSocketAccess(ctx, vm.Runtime.VSockPath, "firecracker vsock socket"); err != nil { + return model.VMRecord{}, err + } + pingCtx, cancel := context.WithTimeout(ctx, 3*time.Second) + defer cancel() + if err := vsockagent.Health(pingCtx, s.logger, vm.Runtime.VSockPath); err != nil { + return model.VMRecord{}, err + } + result.Healthy = true + return vm, nil + }) + return result, err +} + +func (s *StatsService) PingVM(ctx context.Context, idOrName string) (result api.VMPingResult, err error) { + health, err := s.HealthVM(ctx, idOrName) + if err != nil { + return api.VMPingResult{}, err + } + return api.VMPingResult{Name: health.Name, Alive: health.Healthy}, nil +} + +func (s *StatsService) getVMStatsLocked(ctx context.Context, vm model.VMRecord) (model.VMRecord, error) { + stats, err := s.collectStats(ctx, vm) + if err == nil { + vm.Stats = stats + vm.UpdatedAt = model.Now() + _ = s.store.UpsertVM(ctx, vm) + if s.logger != nil { + s.logger.Debug("vm stats 
collected", append(vmLogAttrs(vm), "rss_bytes", stats.RSSBytes, "vsz_bytes", stats.VSZBytes, "cpu_percent", stats.CPUPercent)...) + } + } + return vm, nil +} + +// pollStats runs on the daemon's background ticker; refreshes stats +// for every VM the store knows about, skipping ones that aren't alive. +func (s *StatsService) pollStats(ctx context.Context) error { + vms, err := s.store.ListVMs(ctx) + if err != nil { + return err + } + for _, vm := range vms { + if err := s.withVMLockByIDErr(ctx, vm.ID, func(vm model.VMRecord) error { + if !s.vmAlive(vm) { + return nil + } + stats, err := s.collectStats(ctx, vm) + if err != nil { + if s.logger != nil { + s.logger.Debug("vm stats collection failed", append(vmLogAttrs(vm), "error", err.Error())...) + } + return nil + } + vm.Stats = stats + vm.UpdatedAt = model.Now() + return s.store.UpsertVM(ctx, vm) + }); err != nil { + return err + } + } + return nil +} + +// stopStaleVMs auto-stops any running VM whose LastTouchedAt is older +// than config.AutoStopStaleAfter. This is the only path through +// StatsService that actually mutates VM lifecycle state — it needs +// cleanupRuntime to tear down the kernel + process side. +func (s *StatsService) stopStaleVMs(ctx context.Context) (err error) { + if s.config.AutoStopStaleAfter <= 0 { + return nil + } + op := s.beginOperation(ctx, "vm.stop_stale") + defer func() { + if err != nil { + op.fail(err) + return + } + op.done() + }() + vms, err := s.store.ListVMs(ctx) + if err != nil { + return err + } + now := model.Now() + for _, vm := range vms { + if err := s.withVMLockByIDErr(ctx, vm.ID, func(vm model.VMRecord) error { + if !s.vmAlive(vm) { + return nil + } + if now.Sub(vm.LastTouchedAt) < s.config.AutoStopStaleAfter { + return nil + } + op.stage("stopping_vm", vmLogAttrs(vm)...) 
+ _ = s.net.sendCtrlAltDel(ctx, vm.Runtime.APISockPath) + _ = s.net.waitForExit(ctx, s.vmHandles(vm.ID).PID, vm.Runtime.APISockPath, 10*time.Second) + _ = s.cleanupRuntime(ctx, vm, true) + vm.State = model.VMStateStopped + vm.Runtime.State = model.VMStateStopped + clearRuntimeTeardownState(&vm) + vm.UpdatedAt = model.Now() + return s.store.UpsertVM(ctx, vm) + }); err != nil { + return err + } + } + return nil +} + +func (s *StatsService) collectStats(ctx context.Context, vm model.VMRecord) (model.VMStats, error) { + stats := model.VMStats{ + CollectedAt: model.Now(), + SystemOverlayBytes: system.AllocatedBytes(vm.Runtime.SystemOverlay), + WorkDiskBytes: system.AllocatedBytes(vm.Runtime.WorkDiskPath), + MetricsRaw: system.ParseMetricsFile(vm.Runtime.MetricsPath), + } + if s.vmAlive(vm) { + if ps, err := system.ReadProcessStats(ctx, s.vmHandles(vm.ID).PID); err == nil { + stats.CPUPercent = ps.CPUPercent + stats.RSSBytes = ps.RSSBytes + stats.VSZBytes = ps.VSZBytes + } + } + return stats, nil +} + +// ---- ports ---- + +const httpProbeTimeout = 750 * time.Millisecond + +func (s *StatsService) PortsVM(ctx context.Context, idOrName string) (result api.VMPortsResult, err error) { + _, err = s.withVMLockByRef(ctx, idOrName, func(vm model.VMRecord) (model.VMRecord, error) { + result.Name = vm.Name + result.DNSName = strings.TrimSpace(vm.Runtime.DNSName) + if result.DNSName == "" && strings.TrimSpace(vm.Name) != "" { + result.DNSName = vmdns.RecordName(vm.Name) + } + if !s.vmAlive(vm) { + return model.VMRecord{}, fmt.Errorf("vm %s is not running", vm.Name) + } + if strings.TrimSpace(vm.Runtime.GuestIP) == "" { + return model.VMRecord{}, errors.New("vm has no guest IP") + } + if strings.TrimSpace(vm.Runtime.VSockPath) == "" { + return model.VMRecord{}, errors.New("vm has no vsock path") + } + if vm.Runtime.VSockCID == 0 { + return model.VMRecord{}, errors.New("vm has no vsock cid") + } + if err := s.net.ensureSocketAccess(ctx, vm.Runtime.VSockPath, "firecracker vsock 
socket"); err != nil { + return model.VMRecord{}, err + } + portsCtx, cancel := context.WithTimeout(ctx, 3*time.Second) + defer cancel() + listeners, err := vsockagent.Ports(portsCtx, s.logger, vm.Runtime.VSockPath) + if err != nil { + return model.VMRecord{}, err + } + result.Ports = buildVMPorts(vm, listeners) + return vm, nil + }) + return result, err +} + +func buildVMPorts(vm model.VMRecord, listeners []vsockagent.PortListener) []api.VMPort { + endpointHost := strings.TrimSpace(vm.Runtime.DNSName) + if endpointHost == "" { + endpointHost = strings.TrimSpace(vm.Runtime.GuestIP) + } + probeHost := strings.TrimSpace(vm.Runtime.GuestIP) + ports := make([]api.VMPort, 0, len(listeners)) + for _, listener := range listeners { + if listener.Port <= 0 { + continue + } + port := api.VMPort{ + Proto: strings.ToLower(strings.TrimSpace(listener.Proto)), + BindAddress: strings.TrimSpace(listener.BindAddress), + Port: listener.Port, + PID: listener.PID, + Process: strings.TrimSpace(listener.Process), + Command: strings.TrimSpace(listener.Command), + Endpoint: net.JoinHostPort(endpointHost, strconv.Itoa(listener.Port)), + } + if port.Command == "" { + port.Command = port.Process + } + if port.Proto == "tcp" && probeHost != "" && endpointHost != "" { + if scheme, ok := probeWebListener(probeHost, listener.Port); ok { + port.Proto = scheme + port.Endpoint = scheme + "://" + net.JoinHostPort(endpointHost, strconv.Itoa(listener.Port)) + "/" + } + } + ports = append(ports, port) + } + sort.Slice(ports, func(i, j int) bool { + if ports[i].Proto != ports[j].Proto { + return ports[i].Proto < ports[j].Proto + } + if ports[i].Port != ports[j].Port { + return ports[i].Port < ports[j].Port + } + if ports[i].PID != ports[j].PID { + return ports[i].PID < ports[j].PID + } + if ports[i].Process != ports[j].Process { + return ports[i].Process < ports[j].Process + } + return ports[i].BindAddress < ports[j].BindAddress + }) + return dedupeVMPorts(ports) +} + +func probeWebListener(guestIP 
string, port int) (string, bool) { + if probeHTTPScheme("https", guestIP, port) { + return "https", true + } + if probeHTTPScheme("http", guestIP, port) { + return "http", true + } + return "", false +} + +func probeHTTPScheme(scheme, guestIP string, port int) bool { + if strings.TrimSpace(guestIP) == "" || port <= 0 { + return false + } + url := scheme + "://" + net.JoinHostPort(strings.TrimSpace(guestIP), strconv.Itoa(port)) + "/" + req, err := http.NewRequest(http.MethodGet, url, nil) + if err != nil { + return false + } + transport := &http.Transport{Proxy: nil} + if scheme == "https" { + transport.TLSClientConfig = &tls.Config{InsecureSkipVerify: true} + } + client := &http.Client{ + Timeout: httpProbeTimeout, + CheckRedirect: func(req *http.Request, via []*http.Request) error { + return http.ErrUseLastResponse + }, + Transport: transport, + } + resp, err := client.Do(req) + if err != nil { + return false + } + defer resp.Body.Close() + _, _ = io.Copy(io.Discard, io.LimitReader(resp.Body, 1)) + return resp.ProtoMajor >= 1 +} + +func dedupeVMPorts(ports []api.VMPort) []api.VMPort { + if len(ports) < 2 { + return ports + } + deduped := make([]api.VMPort, 0, len(ports)) + seen := make(map[string]struct{}, len(ports)) + for _, port := range ports { + key := port.Proto + "\x00" + port.Endpoint + if _, ok := seen[key]; ok { + continue + } + seen[key] = struct{}{} + deduped = append(deduped, port) + } + return deduped +} diff --git a/internal/daemon/stats_service_test.go b/internal/daemon/stats_service_test.go new file mode 100644 index 0000000..83a69e2 --- /dev/null +++ b/internal/daemon/stats_service_test.go @@ -0,0 +1,51 @@ +package daemon + +import ( + "testing" + + "banger/internal/model" + "banger/internal/paths" +) + +// TestWireServicesInstantiatesStatsService pins that wireServices +// leaves d.stats non-nil after construction. 
A wiring-order bug that +left stats unset would break background stats polling and the +vm.stats / vm.health / vm.ping / vm.ports RPC methods: the RPC side +would not nil-deref at cold boot, since the daemon might not see a +call for minutes, but the pollStats ticker would panic on its first +fire. +func TestWireServicesInstantiatesStatsService(t *testing.T) { + d := &Daemon{ + runner: &permissiveRunner{}, + config: model.DaemonConfig{BridgeIP: model.DefaultBridgeIP}, + layout: paths.Layout{ + StateDir: t.TempDir(), + ConfigDir: t.TempDir(), + RuntimeDir: t.TempDir(), + VMsDir: t.TempDir(), + }, + } + wireServices(d) + + if d.stats == nil { + t.Fatal("d.stats is nil after wireServices") + } + // Spot-check the five closures that back every stats method — + // a nil closure would be a less-obvious wiring regression than + // a nil service. + if d.stats.vmAlive == nil { + t.Fatal("d.stats.vmAlive closure is nil") + } + if d.stats.vmHandles == nil { + t.Fatal("d.stats.vmHandles closure is nil") + } + if d.stats.cleanupRuntime == nil { + t.Fatal("d.stats.cleanupRuntime closure is nil") + } + if d.stats.withVMLockByRef == nil { + t.Fatal("d.stats.withVMLockByRef closure is nil") + } + if d.stats.withVMLockByIDErr == nil { + t.Fatal("d.stats.withVMLockByIDErr closure is nil") + } +} diff --git a/internal/daemon/tap_pool.go b/internal/daemon/tap_pool.go index ddf436e..d91debf 100644 --- a/internal/daemon/tap_pool.go +++ b/internal/daemon/tap_pool.go @@ -5,102 +5,162 @@ import ( "fmt" "strconv" "strings" + "sync" + "sync/atomic" ) const tapPoolPrefix = "tap-pool-" -func (d *Daemon) initializeTapPool(ctx context.Context) error { - if d.config.TapPoolSize <= 0 || d.store == nil { - return nil - } - vms, err := d.store.ListVMs(ctx) - if err != nil { - return err +// tapPool owns the idle TAP interface cache plus the monotonic index used to +// name new pool entries. All access goes through mu.
+type tapPool struct { + mu sync.Mutex + entries []string + next int + warming bool +} + +// maxConcurrentTapWarmup caps the number of `priv.create_tap` RPCs the +// warmup loop runs in parallel. Each tap creation is ~4 root-helper +// shell-outs serialized within one RPC handler; running too many at +// once just contends on netlink. 8 is the production sweet spot for +// SMOKE_JOBS=8. +const maxConcurrentTapWarmup = 8 + +// initializeTapPool seeds the monotonic pool index from the set of +// tap names already in use by running/stopped VMs, so newly warmed +// pool entries don't collide with existing ones. Callers (Daemon.Open) +// enumerate used taps from the handle cache and pass them in. +func (n *HostNetwork) initializeTapPool(usedTaps []string) { + if n.config.TapPoolSize <= 0 { + return } next := 0 - for _, vm := range vms { - if index, ok := parseTapPoolIndex(vm.Runtime.TapDevice); ok && index >= next { + for _, tapName := range usedTaps { + if index, ok := parseTapPoolIndex(tapName); ok && index >= next { next = index + 1 } } - d.tapPoolMu.Lock() - d.tapPoolNext = next - d.tapPoolMu.Unlock() - return nil + n.tapPool.mu.Lock() + n.tapPool.next = next + n.tapPool.mu.Unlock() } -func (d *Daemon) ensureTapPool(ctx context.Context) { - if d.config.TapPoolSize <= 0 { +func (n *HostNetwork) ensureTapPool(ctx context.Context) { + if n.config.TapPoolSize <= 0 { return } + + // Dedupe concurrent warmup invocations. Releases trigger a fresh + // ensureTapPool in a goroutine; without this, N parallel releases + // would each spin up their own warmup loop racing on n.tapPool.next. 
+ n.tapPool.mu.Lock() + if n.tapPool.warming { + n.tapPool.mu.Unlock() + return + } + n.tapPool.warming = true + n.tapPool.mu.Unlock() + defer func() { + n.tapPool.mu.Lock() + n.tapPool.warming = false + n.tapPool.mu.Unlock() + }() + for { select { case <-ctx.Done(): return - case <-d.closing: + case <-n.closing: return default: } - d.tapPoolMu.Lock() - if len(d.tapPool) >= d.config.TapPoolSize { - d.tapPoolMu.Unlock() + n.tapPool.mu.Lock() + deficit := n.config.TapPoolSize - len(n.tapPool.entries) + if deficit <= 0 { + n.tapPool.mu.Unlock() return } - tapName := fmt.Sprintf("%s%d", tapPoolPrefix, d.tapPoolNext) - d.tapPoolNext++ - d.tapPoolMu.Unlock() - - if err := d.createTap(ctx, tapName); err != nil { - if d.logger != nil { - d.logger.Warn("tap pool warmup failed", "tap_device", tapName, "error", err.Error()) - } - return + batch := deficit + if batch > maxConcurrentTapWarmup { + batch = maxConcurrentTapWarmup } + // Reserve names up front so concurrent goroutines can't collide + // on n.tapPool.next. 
+ names := make([]string, batch) + for i := range names { + names[i] = fmt.Sprintf("%s%d", tapPoolPrefix, n.tapPool.next) + n.tapPool.next++ + } + n.tapPool.mu.Unlock() - d.tapPoolMu.Lock() - d.tapPool = append(d.tapPool, tapName) - d.tapPoolMu.Unlock() + var ( + wg sync.WaitGroup + progress atomic.Int32 + ) + for _, tapName := range names { + wg.Add(1) + go func(tapName string) { + defer wg.Done() + if err := n.createTap(ctx, tapName); err != nil { + if n.logger != nil { + n.logger.Warn("tap pool warmup failed", "tap_device", tapName, "error", err.Error()) + } + return + } + n.tapPool.mu.Lock() + n.tapPool.entries = append(n.tapPool.entries, tapName) + n.tapPool.mu.Unlock() + progress.Add(1) + if n.logger != nil { + n.logger.Debug("tap added to idle pool", "tap_device", tapName) + } + }(tapName) + } + wg.Wait() - if d.logger != nil { - d.logger.Debug("tap added to idle pool", "tap_device", tapName) + // Whole batch failed → bail rather than burn names indefinitely + // (the original sequential loop bailed on first error too). 
+ if progress.Load() == 0 { + return } } } -func (d *Daemon) acquireTap(ctx context.Context, fallbackName string) (string, error) { - d.tapPoolMu.Lock() - if n := len(d.tapPool); n > 0 { - tapName := d.tapPool[n-1] - d.tapPool = d.tapPool[:n-1] - d.tapPoolMu.Unlock() +func (n *HostNetwork) acquireTap(ctx context.Context, fallbackName string) (string, error) { + n.tapPool.mu.Lock() + if count := len(n.tapPool.entries); count > 0 { + tapName := n.tapPool.entries[count-1] + n.tapPool.entries = n.tapPool.entries[:count-1] + n.tapPool.mu.Unlock() return tapName, nil } - d.tapPoolMu.Unlock() + n.tapPool.mu.Unlock() - if err := d.createTap(ctx, fallbackName); err != nil { + if err := n.createTap(ctx, fallbackName); err != nil { return "", err } return fallbackName, nil } -func (d *Daemon) releaseTap(ctx context.Context, tapName string) error { +func (n *HostNetwork) releaseTap(ctx context.Context, tapName string) error { tapName = strings.TrimSpace(tapName) if tapName == "" { return nil } if isTapPoolName(tapName) { - d.tapPoolMu.Lock() - if len(d.tapPool) < d.config.TapPoolSize { - d.tapPool = append(d.tapPool, tapName) - d.tapPoolMu.Unlock() + n.tapPool.mu.Lock() + if len(n.tapPool.entries) < n.config.TapPoolSize { + n.tapPool.entries = append(n.tapPool.entries, tapName) + n.tapPool.mu.Unlock() return nil } - d.tapPoolMu.Unlock() + n.tapPool.mu.Unlock() } - _, err := d.runner.RunSudo(ctx, "ip", "link", "del", tapName) + err := n.privOps().DeleteTap(ctx, tapName) if err == nil { - go d.ensureTapPool(context.Background()) + go n.ensureTapPool(context.Background()) } return err } diff --git a/internal/daemon/vm.go b/internal/daemon/vm.go index afb34ad..4551c96 100644 --- a/internal/daemon/vm.go +++ b/internal/daemon/vm.go @@ -4,1300 +4,45 @@ import ( "context" "errors" "fmt" - "log/slog" - "net" "os" - "path/filepath" "strconv" "strings" "time" - "banger/internal/api" + "banger/internal/daemon/fcproc" "banger/internal/firecracker" - "banger/internal/guest" - 
"banger/internal/guestconfig" - "banger/internal/guestnet" "banger/internal/model" "banger/internal/namegen" "banger/internal/system" "banger/internal/vmdns" - "banger/internal/vsockagent" ) +// Cross-service constants. Kept in vm.go because both lifecycle +// (VMService) and networking (HostNetwork) reference them; moving +// them to either owner would read as a layering violation. var ( - errWaitForExitTimeout = errors.New("timed out waiting for VM to exit") + errWaitForExitTimeout = fcproc.ErrWaitForExitTimeout gracefulShutdownWait = 10 * time.Second vsockReadyWait = 30 * time.Second vsockReadyPoll = 200 * time.Millisecond ) -const ( - workDiskOpencodeAuthDirRelativePath = ".local/share/opencode" - workDiskOpencodeAuthRelativePath = workDiskOpencodeAuthDirRelativePath + "/auth.json" - hostOpencodeAuthDefaultDisplayPath = "~/" + workDiskOpencodeAuthRelativePath -) - -func (d *Daemon) CreateVM(ctx context.Context, params api.VMCreateParams) (vm model.VMRecord, err error) { - d.mu.Lock() - defer d.mu.Unlock() - op := d.beginOperation("vm.create") - defer func() { - if err != nil { - op.fail(err) - return - } - op.done(vmLogAttrs(vm)...) - }() - if err := validateOptionalPositiveSetting("vcpu", params.VCPUCount); err != nil { - return model.VMRecord{}, err - } - if err := validateOptionalPositiveSetting("memory", params.MemoryMiB); err != nil { - return model.VMRecord{}, err - } - - imageName := params.ImageName - if imageName == "" { - imageName = d.config.DefaultImageName - } - vmCreateStage(ctx, "resolve_image", "resolving image") - image, err := d.FindImage(ctx, imageName) - if err != nil { - return model.VMRecord{}, err - } - vmCreateStage(ctx, "resolve_image", "using image "+image.Name) - op.stage("image_resolved", imageLogAttrs(image)...) 
- name := strings.TrimSpace(params.Name) - if name == "" { - name, err = d.generateName(ctx) - if err != nil { - return model.VMRecord{}, err - } - } - if _, err := d.FindVM(ctx, name); err == nil { - return model.VMRecord{}, fmt.Errorf("vm name already exists: %s", name) - } - id, err := model.NewID() - if err != nil { - return model.VMRecord{}, err - } - unlockVM := d.lockVMID(id) - defer unlockVM() - guestIP, err := d.store.NextGuestIP(ctx, bridgePrefix(d.config.BridgeIP)) - if err != nil { - return model.VMRecord{}, err - } - vmDir := filepath.Join(d.layout.VMsDir, id) - if err := os.MkdirAll(vmDir, 0o755); err != nil { - return model.VMRecord{}, err - } - vsockCID, err := defaultVSockCID(guestIP) - if err != nil { - return model.VMRecord{}, err - } - systemOverlaySize := int64(model.DefaultSystemOverlaySize) - if params.SystemOverlaySize != "" { - systemOverlaySize, err = model.ParseSize(params.SystemOverlaySize) - if err != nil { - return model.VMRecord{}, err - } - } - workDiskSize := int64(model.DefaultWorkDiskSize) - if params.WorkDiskSize != "" { - workDiskSize, err = model.ParseSize(params.WorkDiskSize) - if err != nil { - return model.VMRecord{}, err - } - } - now := model.Now() - spec := model.VMSpec{ - VCPUCount: optionalIntOrDefault(params.VCPUCount, model.DefaultVCPUCount), - MemoryMiB: optionalIntOrDefault(params.MemoryMiB, model.DefaultMemoryMiB), - SystemOverlaySizeByte: systemOverlaySize, - WorkDiskSizeBytes: workDiskSize, - NATEnabled: params.NATEnabled, - } - vm = model.VMRecord{ - ID: id, - Name: name, - ImageID: image.ID, - State: model.VMStateCreated, - CreatedAt: now, - UpdatedAt: now, - LastTouchedAt: now, - Spec: spec, - Runtime: model.VMRuntime{ - State: model.VMStateCreated, - GuestIP: guestIP, - DNSName: vmdns.RecordName(name), - VMDir: vmDir, - VSockPath: defaultVSockPath(d.layout.RuntimeDir, id), - VSockCID: vsockCID, - SystemOverlay: filepath.Join(vmDir, "system.cow"), - WorkDiskPath: filepath.Join(vmDir, "root.ext4"), - LogPath: 
filepath.Join(vmDir, "firecracker.log"), - MetricsPath: filepath.Join(vmDir, "metrics.json"), - }, - } - vmCreateBindVM(ctx, vm) - vmCreateStage(ctx, "reserve_vm", fmt.Sprintf("allocated %s (%s)", vm.Name, vm.Runtime.GuestIP)) - if err := d.store.UpsertVM(ctx, vm); err != nil { - return model.VMRecord{}, err - } - op.stage("persisted", vmLogAttrs(vm)...) - if params.NoStart { - vm.State = model.VMStateStopped - vm.Runtime.State = model.VMStateStopped - if err := d.store.UpsertVM(ctx, vm); err != nil { - return model.VMRecord{}, err - } - return vm, nil - } - return d.startVMLocked(ctx, vm, image) -} - -func (d *Daemon) StartVM(ctx context.Context, idOrName string) (model.VMRecord, error) { - return d.withVMLockByRef(ctx, idOrName, func(vm model.VMRecord) (model.VMRecord, error) { - image, err := d.store.GetImageByID(ctx, vm.ImageID) - if err != nil { - return model.VMRecord{}, err - } - if vm.State == model.VMStateRunning && system.ProcessRunning(vm.Runtime.PID, vm.Runtime.APISockPath) { - if d.logger != nil { - d.logger.Info("vm already running", vmLogAttrs(vm)...) - } - return vm, nil - } - return d.startVMLocked(ctx, vm, image) - }) -} - -func (d *Daemon) startVMLocked(ctx context.Context, vm model.VMRecord, image model.Image) (_ model.VMRecord, err error) { - op := d.beginOperation("vm.start", append(vmLogAttrs(vm), imageLogAttrs(image)...)...) - defer func() { - if err != nil { - err = annotateLogPath(err, vm.Runtime.LogPath) - op.fail(err, vmLogAttrs(vm)...) - return - } - op.done(vmLogAttrs(vm)...) 
- }() - op.stage("preflight") - vmCreateStage(ctx, "preflight", "checking host prerequisites") - if err := d.validateStartPrereqs(ctx, vm, image); err != nil { - return model.VMRecord{}, err - } - if err := os.MkdirAll(vm.Runtime.VMDir, 0o755); err != nil { - return model.VMRecord{}, err - } - op.stage("cleanup_runtime") - if err := d.cleanupRuntime(ctx, vm, true); err != nil { - return model.VMRecord{}, err - } - clearRuntimeHandles(&vm) - op.stage("bridge") - if err := d.ensureBridge(ctx); err != nil { - return model.VMRecord{}, err - } - op.stage("socket_dir") - if err := d.ensureSocketDir(); err != nil { - return model.VMRecord{}, err - } - - shortID := system.ShortID(vm.ID) - apiSock := filepath.Join(d.layout.RuntimeDir, "fc-"+shortID+".sock") - dmName := "fc-rootfs-" + shortID - tapName := "tap-fc-" + shortID - if strings.TrimSpace(vm.Runtime.VSockPath) == "" { - vm.Runtime.VSockPath = defaultVSockPath(d.layout.RuntimeDir, vm.ID) - } - if vm.Runtime.VSockCID == 0 { - vm.Runtime.VSockCID, err = defaultVSockCID(vm.Runtime.GuestIP) - if err != nil { - return model.VMRecord{}, err - } - } - if err := os.RemoveAll(apiSock); err != nil && !os.IsNotExist(err) { - return model.VMRecord{}, err - } - if err := os.RemoveAll(vm.Runtime.VSockPath); err != nil && !os.IsNotExist(err) { - return model.VMRecord{}, err - } - - op.stage("system_overlay", "overlay_path", vm.Runtime.SystemOverlay) - vmCreateStage(ctx, "prepare_rootfs", "preparing system overlay") - if err := d.ensureSystemOverlay(ctx, &vm); err != nil { - return model.VMRecord{}, err - } - - op.stage("dm_snapshot", "dm_name", dmName) - vmCreateStage(ctx, "prepare_rootfs", "creating root filesystem snapshot") - handles, err := d.createDMSnapshot(ctx, image.RootfsPath, vm.Runtime.SystemOverlay, dmName) - if err != nil { - return model.VMRecord{}, err - } - vm.Runtime.BaseLoop = handles.BaseLoop - vm.Runtime.COWLoop = handles.COWLoop - vm.Runtime.DMName = handles.DMName - vm.Runtime.DMDev = handles.DMDev - 
vm.Runtime.APISockPath = apiSock - vm.Runtime.State = model.VMStateRunning - vm.State = model.VMStateRunning - vm.Runtime.LastError = "" - - cleanupOnErr := func(err error) (model.VMRecord, error) { - vm.State = model.VMStateError - vm.Runtime.State = model.VMStateError - vm.Runtime.LastError = err.Error() - op.stage("cleanup_after_failure", "error", err.Error()) - if cleanupErr := d.cleanupRuntime(context.Background(), vm, true); cleanupErr != nil { - err = errors.Join(err, cleanupErr) - } - clearRuntimeHandles(&vm) - _ = d.store.UpsertVM(context.Background(), vm) - return model.VMRecord{}, err - } - - op.stage("patch_root_overlay") - vmCreateStage(ctx, "prepare_rootfs", "writing guest configuration") - if err := d.patchRootOverlay(ctx, vm, image); err != nil { - return cleanupOnErr(err) - } - op.stage("prepare_host_features") - vmCreateStage(ctx, "prepare_host_features", "preparing host-side vm features") - if err := d.prepareCapabilityHosts(ctx, &vm, image); err != nil { - return cleanupOnErr(err) - } - op.stage("tap") - tap, err := d.acquireTap(ctx, tapName) - if err != nil { - return cleanupOnErr(err) - } - vm.Runtime.TapDevice = tap - op.stage("metrics_file", "metrics_path", vm.Runtime.MetricsPath) - if err := os.WriteFile(vm.Runtime.MetricsPath, nil, 0o644); err != nil { - return cleanupOnErr(err) - } - - op.stage("firecracker_binary") - fcPath, err := d.firecrackerBinary() - if err != nil { - return cleanupOnErr(err) - } - op.stage("firecracker_launch", "log_path", vm.Runtime.LogPath, "metrics_path", vm.Runtime.MetricsPath) - vmCreateStage(ctx, "boot_firecracker", "starting firecracker") - firecrackerCtx := context.Background() - machineConfig := firecracker.MachineConfig{ - BinaryPath: fcPath, - VMID: vm.ID, - SocketPath: apiSock, - LogPath: vm.Runtime.LogPath, - MetricsPath: vm.Runtime.MetricsPath, - KernelImagePath: image.KernelPath, - InitrdPath: image.InitrdPath, - KernelArgs: system.BuildBootArgs(vm.Name), - Drives: []firecracker.DriveConfig{{ - ID: 
"rootfs", - Path: vm.Runtime.DMDev, - ReadOnly: false, - IsRoot: true, - }}, - TapDevice: tap, - VSockPath: vm.Runtime.VSockPath, - VSockCID: vm.Runtime.VSockCID, - VCPUCount: vm.Spec.VCPUCount, - MemoryMiB: vm.Spec.MemoryMiB, - Logger: d.logger, - } - d.contributeMachineConfig(&machineConfig, vm, image) - machine, err := firecracker.NewMachine(firecrackerCtx, machineConfig) - if err != nil { - return cleanupOnErr(err) - } - if err := machine.Start(firecrackerCtx); err != nil { - vm.Runtime.PID = d.resolveFirecrackerPID(firecrackerCtx, machine, apiSock) - return cleanupOnErr(err) - } - vm.Runtime.PID = d.resolveFirecrackerPID(firecrackerCtx, machine, apiSock) - op.debugStage("firecracker_started", "pid", vm.Runtime.PID) - op.stage("socket_access", "api_socket", apiSock) - if err := d.ensureSocketAccess(ctx, apiSock, "firecracker api socket"); err != nil { - return cleanupOnErr(err) - } - op.stage("vsock_access", "vsock_path", vm.Runtime.VSockPath, "vsock_cid", vm.Runtime.VSockCID) - if err := d.ensureSocketAccess(ctx, vm.Runtime.VSockPath, "firecracker vsock socket"); err != nil { - return cleanupOnErr(err) - } - vmCreateStage(ctx, "wait_vsock_agent", "waiting for guest vsock agent") - if err := waitForGuestVSockAgent(ctx, d.logger, vm.Runtime.VSockPath, vsockReadyWait); err != nil { - return cleanupOnErr(err) - } - op.stage("post_start_features") - vmCreateStage(ctx, "wait_guest_ready", "waiting for guest services") - if err := d.postStartCapabilities(ctx, vm, image); err != nil { - return cleanupOnErr(err) - } - system.TouchNow(&vm) - op.stage("persist") - vmCreateStage(ctx, "finalize", "saving vm state") - if err := d.store.UpsertVM(ctx, vm); err != nil { - return cleanupOnErr(err) - } - return vm, nil -} - -func (d *Daemon) StopVM(ctx context.Context, idOrName string) (model.VMRecord, error) { - return d.withVMLockByRef(ctx, idOrName, func(vm model.VMRecord) (model.VMRecord, error) { - return d.stopVMLocked(ctx, vm) - }) -} - -func (d *Daemon) stopVMLocked(ctx 
context.Context, current model.VMRecord) (vm model.VMRecord, err error) { - vm = current - op := d.beginOperation("vm.stop", "vm_ref", vm.ID) - defer func() { - if err != nil { - op.fail(err, vmLogAttrs(vm)...) - return - } - op.done(vmLogAttrs(vm)...) - }() - if vm.State != model.VMStateRunning || !system.ProcessRunning(vm.Runtime.PID, vm.Runtime.APISockPath) { - op.stage("cleanup_stale_runtime") - if err := d.cleanupRuntime(ctx, vm, true); err != nil { - return model.VMRecord{}, err - } - vm.State = model.VMStateStopped - vm.Runtime.State = model.VMStateStopped - clearRuntimeHandles(&vm) - if err := d.store.UpsertVM(ctx, vm); err != nil { - return model.VMRecord{}, err - } - return vm, nil - } - op.stage("graceful_shutdown") - if err := d.sendCtrlAltDel(ctx, vm); err != nil { - return model.VMRecord{}, err - } - op.stage("wait_for_exit", "pid", vm.Runtime.PID) - if err := d.waitForExit(ctx, vm.Runtime.PID, vm.Runtime.APISockPath, gracefulShutdownWait); err != nil { - if !errors.Is(err, errWaitForExitTimeout) { - return model.VMRecord{}, err - } - op.stage("graceful_shutdown_timeout", "pid", vm.Runtime.PID) - } - op.stage("cleanup_runtime") - if err := d.cleanupRuntime(ctx, vm, true); err != nil { - return model.VMRecord{}, err - } - vm.State = model.VMStateStopped - vm.Runtime.State = model.VMStateStopped - clearRuntimeHandles(&vm) - system.TouchNow(&vm) - if err := d.store.UpsertVM(ctx, vm); err != nil { - return model.VMRecord{}, err - } - return vm, nil -} - -func (d *Daemon) KillVM(ctx context.Context, params api.VMKillParams) (model.VMRecord, error) { - return d.withVMLockByRef(ctx, params.IDOrName, func(vm model.VMRecord) (model.VMRecord, error) { - return d.killVMLocked(ctx, vm, params.Signal) - }) -} - -func (d *Daemon) killVMLocked(ctx context.Context, current model.VMRecord, signalValue string) (vm model.VMRecord, err error) { - vm = current - op := d.beginOperation("vm.kill", "vm_ref", vm.ID, "signal", signalValue) - defer func() { - if err != nil { - 
op.fail(err, vmLogAttrs(vm)...) - return - } - op.done(vmLogAttrs(vm)...) - }() - if vm.State != model.VMStateRunning || !system.ProcessRunning(vm.Runtime.PID, vm.Runtime.APISockPath) { - op.stage("cleanup_stale_runtime") - if err := d.cleanupRuntime(ctx, vm, true); err != nil { - return model.VMRecord{}, err - } - vm.State = model.VMStateStopped - vm.Runtime.State = model.VMStateStopped - clearRuntimeHandles(&vm) - if err := d.store.UpsertVM(ctx, vm); err != nil { - return model.VMRecord{}, err - } - return vm, nil - } - - signal := strings.TrimSpace(signalValue) - if signal == "" { - signal = "TERM" - } - op.stage("send_signal", "pid", vm.Runtime.PID, "signal", signal) - if _, err := d.runner.RunSudo(ctx, "kill", "-"+signal, strconv.Itoa(vm.Runtime.PID)); err != nil { - return model.VMRecord{}, err - } - op.stage("wait_for_exit", "pid", vm.Runtime.PID) - if err := d.waitForExit(ctx, vm.Runtime.PID, vm.Runtime.APISockPath, 30*time.Second); err != nil { - if !errors.Is(err, errWaitForExitTimeout) { - return model.VMRecord{}, err - } - op.stage("signal_timeout", "pid", vm.Runtime.PID, "signal", signal) - } - op.stage("cleanup_runtime") - if err := d.cleanupRuntime(ctx, vm, true); err != nil { - return model.VMRecord{}, err - } - vm.State = model.VMStateStopped - vm.Runtime.State = model.VMStateStopped - clearRuntimeHandles(&vm) - system.TouchNow(&vm) - if err := d.store.UpsertVM(ctx, vm); err != nil { - return model.VMRecord{}, err - } - return vm, nil -} - -func (d *Daemon) RestartVM(ctx context.Context, idOrName string) (vm model.VMRecord, err error) { - op := d.beginOperation("vm.restart", "vm_ref", idOrName) - defer func() { - if err != nil { - op.fail(err, vmLogAttrs(vm)...) - return - } - op.done(vmLogAttrs(vm)...) 
- }() - resolved, err := d.FindVM(ctx, idOrName) - if err != nil { - return model.VMRecord{}, err - } - return d.withVMLockByID(ctx, resolved.ID, func(vm model.VMRecord) (model.VMRecord, error) { - op.stage("stop") - vm, err = d.stopVMLocked(ctx, vm) - if err != nil { - return model.VMRecord{}, err - } - image, err := d.store.GetImageByID(ctx, vm.ImageID) - if err != nil { - return model.VMRecord{}, err - } - op.stage("start", vmLogAttrs(vm)...) - return d.startVMLocked(ctx, vm, image) - }) -} - -func (d *Daemon) DeleteVM(ctx context.Context, idOrName string) (model.VMRecord, error) { - return d.withVMLockByRef(ctx, idOrName, func(vm model.VMRecord) (model.VMRecord, error) { - return d.deleteVMLocked(ctx, vm) - }) -} - -func (d *Daemon) deleteVMLocked(ctx context.Context, current model.VMRecord) (vm model.VMRecord, err error) { - vm = current - op := d.beginOperation("vm.delete", "vm_ref", vm.ID) - defer func() { - if err != nil { - op.fail(err, vmLogAttrs(vm)...) - return - } - op.done(vmLogAttrs(vm)...) 
- }() - if vm.State == model.VMStateRunning && system.ProcessRunning(vm.Runtime.PID, vm.Runtime.APISockPath) { - op.stage("kill_running_vm", "pid", vm.Runtime.PID) - _ = d.killVMProcess(ctx, vm.Runtime.PID) - } - op.stage("cleanup_runtime") - if err := d.cleanupRuntime(ctx, vm, false); err != nil { - return model.VMRecord{}, err - } - op.stage("delete_store_record") - if err := d.store.DeleteVM(ctx, vm.ID); err != nil { - return model.VMRecord{}, err - } - if vm.Runtime.VMDir != "" { - op.stage("delete_vm_dir", "vm_dir", vm.Runtime.VMDir) - if err := os.RemoveAll(vm.Runtime.VMDir); err != nil { - return model.VMRecord{}, err - } - } - return vm, nil -} - -func (d *Daemon) SetVM(ctx context.Context, params api.VMSetParams) (model.VMRecord, error) { - return d.withVMLockByRef(ctx, params.IDOrName, func(vm model.VMRecord) (model.VMRecord, error) { - return d.setVMLocked(ctx, vm, params) - }) -} - -func (d *Daemon) setVMLocked(ctx context.Context, current model.VMRecord, params api.VMSetParams) (vm model.VMRecord, err error) { - vm = current - op := d.beginOperation("vm.set", "vm_ref", vm.ID) - defer func() { - if err != nil { - op.fail(err, vmLogAttrs(vm)...) - return - } - op.done(vmLogAttrs(vm)...) 
- }() - running := vm.State == model.VMStateRunning && system.ProcessRunning(vm.Runtime.PID, vm.Runtime.APISockPath) - if params.VCPUCount != nil { - if err := validateOptionalPositiveSetting("vcpu", params.VCPUCount); err != nil { - return model.VMRecord{}, err - } - if running { - return model.VMRecord{}, errors.New("vcpu changes require the VM to be stopped") - } - op.stage("update_vcpu", "vcpu_count", *params.VCPUCount) - vm.Spec.VCPUCount = *params.VCPUCount - } - if params.MemoryMiB != nil { - if err := validateOptionalPositiveSetting("memory", params.MemoryMiB); err != nil { - return model.VMRecord{}, err - } - if running { - return model.VMRecord{}, errors.New("memory changes require the VM to be stopped") - } - op.stage("update_memory", "memory_mib", *params.MemoryMiB) - vm.Spec.MemoryMiB = *params.MemoryMiB - } - if params.WorkDiskSize != "" { - size, err := model.ParseSize(params.WorkDiskSize) - if err != nil { - return model.VMRecord{}, err - } - if running { - return model.VMRecord{}, errors.New("disk changes require the VM to be stopped") - } - if size < vm.Spec.WorkDiskSizeBytes { - return model.VMRecord{}, errors.New("disk size can only grow") - } - if size > vm.Spec.WorkDiskSizeBytes { - if exists(vm.Runtime.WorkDiskPath) { - op.stage("resize_work_disk", "from_bytes", vm.Spec.WorkDiskSizeBytes, "to_bytes", size) - if err := d.validateWorkDiskResizePrereqs(); err != nil { - return model.VMRecord{}, err - } - if err := system.ResizeExt4Image(ctx, d.runner, vm.Runtime.WorkDiskPath, size); err != nil { - return model.VMRecord{}, err - } - } - vm.Spec.WorkDiskSizeBytes = size - } - } - if params.NATEnabled != nil { - op.stage("update_nat", "nat_enabled", *params.NATEnabled) - vm.Spec.NATEnabled = *params.NATEnabled - } - if running { - if err := d.applyCapabilityConfigChanges(ctx, current, vm); err != nil { - return model.VMRecord{}, err - } - } - system.TouchNow(&vm) - if err := d.store.UpsertVM(ctx, vm); err != nil { - return model.VMRecord{}, err - } 
- return vm, nil -} - -func (d *Daemon) GetVMStats(ctx context.Context, idOrName string) (model.VMRecord, model.VMStats, error) { - vm, err := d.withVMLockByRef(ctx, idOrName, func(vm model.VMRecord) (model.VMRecord, error) { - return d.getVMStatsLocked(ctx, vm) - }) - if err != nil { - return model.VMRecord{}, model.VMStats{}, err - } - return vm, vm.Stats, nil -} - -func (d *Daemon) HealthVM(ctx context.Context, idOrName string) (result api.VMHealthResult, err error) { - _, err = d.withVMLockByRef(ctx, idOrName, func(vm model.VMRecord) (model.VMRecord, error) { - result.Name = vm.Name - if vm.State != model.VMStateRunning || !system.ProcessRunning(vm.Runtime.PID, vm.Runtime.APISockPath) { - result.Healthy = false - return vm, nil - } - if strings.TrimSpace(vm.Runtime.VSockPath) == "" { - return model.VMRecord{}, errors.New("vm has no vsock path") - } - if vm.Runtime.VSockCID == 0 { - return model.VMRecord{}, errors.New("vm has no vsock cid") - } - if err := d.ensureSocketAccess(ctx, vm.Runtime.VSockPath, "firecracker vsock socket"); err != nil { - return model.VMRecord{}, err - } - pingCtx, cancel := context.WithTimeout(ctx, 3*time.Second) - defer cancel() - if err := vsockagent.Health(pingCtx, d.logger, vm.Runtime.VSockPath); err != nil { - return model.VMRecord{}, err - } - result.Healthy = true - return vm, nil - }) - return result, err -} - -func (d *Daemon) PingVM(ctx context.Context, idOrName string) (result api.VMPingResult, err error) { - health, err := d.HealthVM(ctx, idOrName) - if err != nil { - return api.VMPingResult{}, err - } - return api.VMPingResult{Name: health.Name, Alive: health.Healthy}, nil -} - -func (d *Daemon) getVMStatsLocked(ctx context.Context, vm model.VMRecord) (model.VMRecord, error) { - stats, err := d.collectStats(ctx, vm) - if err == nil { - vm.Stats = stats - vm.UpdatedAt = model.Now() - _ = d.store.UpsertVM(ctx, vm) - if d.logger != nil { - d.logger.Debug("vm stats collected", append(vmLogAttrs(vm), "rss_bytes", stats.RSSBytes, 
"vsz_bytes", stats.VSZBytes, "cpu_percent", stats.CPUPercent)...) - } - } - return vm, nil -} - -func (d *Daemon) pollStats(ctx context.Context) error { - vms, err := d.store.ListVMs(ctx) - if err != nil { - return err - } - for _, vm := range vms { - if err := d.withVMLockByIDErr(ctx, vm.ID, func(vm model.VMRecord) error { - if vm.State != model.VMStateRunning || !system.ProcessRunning(vm.Runtime.PID, vm.Runtime.APISockPath) { - return nil - } - stats, err := d.collectStats(ctx, vm) - if err != nil { - if d.logger != nil { - d.logger.Debug("vm stats collection failed", append(vmLogAttrs(vm), "error", err.Error())...) - } - return nil - } - vm.Stats = stats - vm.UpdatedAt = model.Now() - return d.store.UpsertVM(ctx, vm) - }); err != nil { - return err - } - } - return nil -} - -func (d *Daemon) stopStaleVMs(ctx context.Context) (err error) { - if d.config.AutoStopStaleAfter <= 0 { +// rebuildDNS enumerates live VMs and republishes the DNS record set. +// Lives on VMService because "alive" is a VM-state concern that +// HostNetwork shouldn't need to reach into. VMService orchestrates: +// VM list from the store, alive filter, hand the resulting map to +// HostNetwork.replaceDNS. +func (s *VMService) rebuildDNS(ctx context.Context) error { + if s.net == nil { return nil } - op := d.beginOperation("vm.stop_stale") - defer func() { - if err != nil { - op.fail(err) - return - } - op.done() - }() - vms, err := d.store.ListVMs(ctx) - if err != nil { - return err - } - now := model.Now() - for _, vm := range vms { - if err := d.withVMLockByIDErr(ctx, vm.ID, func(vm model.VMRecord) error { - if vm.State != model.VMStateRunning || !system.ProcessRunning(vm.Runtime.PID, vm.Runtime.APISockPath) { - return nil - } - if now.Sub(vm.LastTouchedAt) < d.config.AutoStopStaleAfter { - return nil - } - op.stage("stopping_vm", vmLogAttrs(vm)...) 
- _ = d.sendCtrlAltDel(ctx, vm) - _ = d.waitForExit(ctx, vm.Runtime.PID, vm.Runtime.APISockPath, 10*time.Second) - _ = d.cleanupRuntime(ctx, vm, true) - vm.State = model.VMStateStopped - vm.Runtime.State = model.VMStateStopped - clearRuntimeHandles(&vm) - vm.UpdatedAt = model.Now() - return d.store.UpsertVM(ctx, vm) - }); err != nil { - return err - } - } - return nil -} - -func (d *Daemon) collectStats(ctx context.Context, vm model.VMRecord) (model.VMStats, error) { - stats := model.VMStats{ - CollectedAt: model.Now(), - SystemOverlayBytes: system.AllocatedBytes(vm.Runtime.SystemOverlay), - WorkDiskBytes: system.AllocatedBytes(vm.Runtime.WorkDiskPath), - MetricsRaw: system.ParseMetricsFile(vm.Runtime.MetricsPath), - } - if vm.Runtime.PID > 0 && system.ProcessRunning(vm.Runtime.PID, vm.Runtime.APISockPath) { - ps, err := system.ReadProcessStats(ctx, vm.Runtime.PID) - if err == nil { - stats.CPUPercent = ps.CPUPercent - stats.RSSBytes = ps.RSSBytes - stats.VSZBytes = ps.VSZBytes - } - } - return stats, nil -} - -func (d *Daemon) ensureSystemOverlay(ctx context.Context, vm *model.VMRecord) error { - if exists(vm.Runtime.SystemOverlay) { - return nil - } - _, err := d.runner.Run(ctx, "truncate", "-s", strconv.FormatInt(vm.Spec.SystemOverlaySizeByte, 10), vm.Runtime.SystemOverlay) - return err -} - -func (d *Daemon) patchRootOverlay(ctx context.Context, vm model.VMRecord, image model.Image) error { - resolv := []byte(fmt.Sprintf("nameserver %s\n", d.config.DefaultDNS)) - hostname := []byte(vm.Name + "\n") - hosts := []byte(fmt.Sprintf("127.0.0.1 localhost\n127.0.1.1 %s\n", vm.Name)) - sshdConfig := []byte(strings.Join([]string{ - "LogLevel DEBUG3", - "PermitRootLogin yes", - "PubkeyAuthentication yes", - "AuthorizedKeysFile /root/.ssh/authorized_keys", - "StrictModes no", - "", - }, "\n")) - fstab, err := system.ReadDebugFSText(ctx, d.runner, vm.Runtime.DMDev, "/etc/fstab") - if err != nil { - fstab = "" - } - builder := guestconfig.NewBuilder() - 
builder.WriteFile("/etc/resolv.conf", resolv) - builder.WriteFile("/etc/hostname", hostname) - builder.WriteFile("/etc/hosts", hosts) - builder.WriteFile(guestnet.ConfigPath, guestnet.ConfigFile(vm.Runtime.GuestIP, d.config.BridgeIP, d.config.DefaultDNS)) - builder.WriteFile(guestnet.GuestScriptPath, []byte(guestnet.BootstrapScript())) - builder.WriteFile("/etc/ssh/sshd_config.d/99-banger.conf", sshdConfig) - builder.DropMountTarget("/home") - builder.DropMountTarget("/var") - builder.AddMount(guestconfig.MountSpec{ - Source: "tmpfs", - Target: "/run", - FSType: "tmpfs", - Options: []string{"defaults", "nodev", "nosuid", "mode=0755"}, - Dump: 0, - Pass: 0, - }) - builder.AddMount(guestconfig.MountSpec{ - Source: "tmpfs", - Target: "/tmp", - FSType: "tmpfs", - Options: []string{"defaults", "nodev", "nosuid", "mode=1777"}, - Dump: 0, - Pass: 0, - }) - d.contributeGuestConfig(builder, vm, image) - builder.WriteFile("/etc/fstab", []byte(builder.RenderFSTab(fstab))) - files := builder.Files() - for _, guestPath := range builder.FilePaths() { - data := files[guestPath] - if guestPath == guestnet.GuestScriptPath { - if err := system.WriteExt4FileMode(ctx, d.runner, vm.Runtime.DMDev, guestPath, 0o755, data); err != nil { - return err - } - continue - } - if err := system.WriteExt4File(ctx, d.runner, vm.Runtime.DMDev, guestPath, data); err != nil { - return err - } - } - return nil -} - -type workDiskPreparation struct { - ClonedFromSeed bool -} - -func (d *Daemon) ensureWorkDisk(ctx context.Context, vm *model.VMRecord, image model.Image) (workDiskPreparation, error) { - if exists(vm.Runtime.WorkDiskPath) { - return workDiskPreparation{}, nil - } - if exists(image.WorkSeedPath) { - vmCreateStage(ctx, "prepare_work_disk", "cloning work seed") - if err := system.CopyFilePreferClone(image.WorkSeedPath, vm.Runtime.WorkDiskPath); err != nil { - return workDiskPreparation{}, err - } - seedInfo, err := os.Stat(image.WorkSeedPath) - if err != nil { - return workDiskPreparation{}, 
err - } - if vm.Spec.WorkDiskSizeBytes < seedInfo.Size() { - return workDiskPreparation{}, fmt.Errorf("requested work disk size %d is smaller than seed image %d", vm.Spec.WorkDiskSizeBytes, seedInfo.Size()) - } - if vm.Spec.WorkDiskSizeBytes > seedInfo.Size() { - vmCreateStage(ctx, "prepare_work_disk", "resizing work disk") - if err := system.ResizeExt4Image(ctx, d.runner, vm.Runtime.WorkDiskPath, vm.Spec.WorkDiskSizeBytes); err != nil { - return workDiskPreparation{}, err - } - } - return workDiskPreparation{ClonedFromSeed: true}, nil - } - vmCreateStage(ctx, "prepare_work_disk", "creating empty work disk") - if _, err := d.runner.Run(ctx, "truncate", "-s", strconv.FormatInt(vm.Spec.WorkDiskSizeBytes, 10), vm.Runtime.WorkDiskPath); err != nil { - return workDiskPreparation{}, err - } - if _, err := d.runner.Run(ctx, "mkfs.ext4", "-F", vm.Runtime.WorkDiskPath); err != nil { - return workDiskPreparation{}, err - } - rootMount, cleanupRoot, err := system.MountTempDir(ctx, d.runner, vm.Runtime.DMDev, true) - if err != nil { - return workDiskPreparation{}, err - } - defer cleanupRoot() - workMount, cleanupWork, err := system.MountTempDir(ctx, d.runner, vm.Runtime.WorkDiskPath, false) - if err != nil { - return workDiskPreparation{}, err - } - defer cleanupWork() - vmCreateStage(ctx, "prepare_work_disk", "copying /root into work disk") - if err := system.CopyDirContents(ctx, d.runner, filepath.Join(rootMount, "root"), workMount, true); err != nil { - return workDiskPreparation{}, err - } - if err := d.flattenNestedWorkHome(ctx, workMount); err != nil { - return workDiskPreparation{}, err - } - return workDiskPreparation{}, nil -} - -func (d *Daemon) ensureAuthorizedKeyOnWorkDisk(ctx context.Context, vm *model.VMRecord, image model.Image, prep workDiskPreparation) error { - fingerprint, err := guest.AuthorizedPublicKeyFingerprint(d.config.SSHKeyPath) - if err != nil { - return fmt.Errorf("derive authorized ssh key fingerprint: %w", err) - } - if prep.ClonedFromSeed && 
image.SeededSSHPublicKeyFingerprint != "" && image.SeededSSHPublicKeyFingerprint == fingerprint { - vmCreateStage(ctx, "prepare_work_disk", "using seeded SSH access") - return nil - } - publicKey, err := guest.AuthorizedPublicKey(d.config.SSHKeyPath) - if err != nil { - return fmt.Errorf("derive authorized ssh key: %w", err) - } - vmCreateStage(ctx, "prepare_work_disk", "repairing SSH access on work disk") - workMount, cleanupWork, err := system.MountTempDir(ctx, d.runner, vm.Runtime.WorkDiskPath, false) - if err != nil { - return err - } - defer cleanupWork() - - if err := d.flattenNestedWorkHome(ctx, workMount); err != nil { - return err - } - - sshDir := filepath.Join(workMount, ".ssh") - if _, err := d.runner.RunSudo(ctx, "mkdir", "-p", sshDir); err != nil { - return err - } - if _, err := d.runner.RunSudo(ctx, "chmod", "700", sshDir); err != nil { - return err - } - - authorizedKeysPath := filepath.Join(sshDir, "authorized_keys") - existing, err := d.runner.RunSudo(ctx, "cat", authorizedKeysPath) - if err != nil { - existing = nil - } - merged := mergeAuthorizedKey(existing, publicKey) - - tmpFile, err := os.CreateTemp("", "banger-authorized-keys-*") - if err != nil { - return err - } - tmpPath := tmpFile.Name() - if _, err := tmpFile.Write(merged); err != nil { - _ = tmpFile.Close() - _ = os.Remove(tmpPath) - return err - } - if err := tmpFile.Close(); err != nil { - _ = os.Remove(tmpPath) - return err - } - defer os.Remove(tmpPath) - - if _, err := d.runner.RunSudo(ctx, "install", "-m", "600", tmpPath, authorizedKeysPath); err != nil { - return err - } - if prep.ClonedFromSeed && image.Managed { - vmCreateStage(ctx, "prepare_work_disk", "refreshing managed work seed") - if err := d.refreshManagedWorkSeedFingerprint(ctx, image, fingerprint); err != nil { - return err - } - } - return nil -} - -func (d *Daemon) ensureOpencodeAuthOnWorkDisk(ctx context.Context, vm *model.VMRecord) error { - hostAuthPath, err := resolveHostOpencodeAuthPath() - if err != nil { - 
d.warnOpencodeAuthSyncSkipped(*vm, hostOpencodeAuthDefaultDisplayPath, err) - return nil - } - authData, err := os.ReadFile(hostAuthPath) - if err != nil { - d.warnOpencodeAuthSyncSkipped(*vm, hostAuthPath, err) - return nil - } - - vmCreateStage(ctx, "prepare_work_disk", "syncing opencode auth") - workMount, cleanupWork, err := system.MountTempDir(ctx, d.runner, vm.Runtime.WorkDiskPath, false) - if err != nil { - return err - } - defer cleanupWork() - - if err := d.flattenNestedWorkHome(ctx, workMount); err != nil { - return err - } - - authDir := filepath.Join(workMount, workDiskOpencodeAuthDirRelativePath) - if _, err := d.runner.RunSudo(ctx, "mkdir", "-p", authDir); err != nil { - return err - } - authPath := filepath.Join(workMount, workDiskOpencodeAuthRelativePath) - - tmpFile, err := os.CreateTemp("", "banger-opencode-auth-*") - if err != nil { - return err - } - tmpPath := tmpFile.Name() - if _, err := tmpFile.Write(authData); err != nil { - _ = tmpFile.Close() - _ = os.Remove(tmpPath) - return err - } - if err := tmpFile.Close(); err != nil { - _ = os.Remove(tmpPath) - return err - } - defer os.Remove(tmpPath) - - _, err = d.runner.RunSudo(ctx, "install", "-m", "600", tmpPath, authPath) - return err -} - -func resolveHostOpencodeAuthPath() (string, error) { - home, err := os.UserHomeDir() - if err != nil { - return "", err - } - return filepath.Join(home, workDiskOpencodeAuthRelativePath), nil -} - -func (d *Daemon) warnOpencodeAuthSyncSkipped(vm model.VMRecord, hostPath string, err error) { - if d.logger == nil || err == nil { - return - } - d.logger.Warn("guest opencode auth sync skipped", append(vmLogAttrs(vm), "host_path", hostPath, "error", err.Error())...) -} - -func mergeAuthorizedKey(existing, managed []byte) []byte { - managedLine := strings.TrimSpace(string(managed)) - if managedLine == "" { - return append([]byte(nil), existing...) 
- } - - lines := strings.Split(strings.ReplaceAll(string(existing), "\r\n", "\n"), "\n") - out := make([]string, 0, len(lines)+1) - found := false - for _, line := range lines { - line = strings.TrimRight(line, "\r") - trimmed := strings.TrimSpace(line) - if trimmed == "" { - continue - } - if trimmed == managedLine { - found = true - } - out = append(out, line) - } - if !found { - out = append(out, managedLine) - } - return []byte(strings.Join(out, "\n") + "\n") -} - -func (d *Daemon) flattenNestedWorkHome(ctx context.Context, workMount string) error { - nestedHome := filepath.Join(workMount, "root") - if !exists(nestedHome) { - return nil - } - if _, err := d.runner.RunSudo(ctx, "chmod", "755", nestedHome); err != nil { - return err - } - entries, err := os.ReadDir(nestedHome) - if err != nil { - return err - } - for _, entry := range entries { - sourcePath := filepath.Join(nestedHome, entry.Name()) - if _, err := d.runner.RunSudo(ctx, "cp", "-a", sourcePath, workMount+"/"); err != nil { - return err - } - } - _, err = d.runner.RunSudo(ctx, "rm", "-rf", nestedHome) - return err -} - -func (d *Daemon) ensureBridge(ctx context.Context) error { - if _, err := d.runner.Run(ctx, "ip", "link", "show", d.config.BridgeName); err == nil { - _, err = d.runner.RunSudo(ctx, "ip", "link", "set", d.config.BridgeName, "up") - return err - } - if _, err := d.runner.RunSudo(ctx, "ip", "link", "add", "name", d.config.BridgeName, "type", "bridge"); err != nil { - return err - } - if _, err := d.runner.RunSudo(ctx, "ip", "addr", "add", fmt.Sprintf("%s/%s", d.config.BridgeIP, d.config.CIDR), "dev", d.config.BridgeName); err != nil { - return err - } - _, err := d.runner.RunSudo(ctx, "ip", "link", "set", d.config.BridgeName, "up") - return err -} - -func (d *Daemon) ensureSocketDir() error { - return os.MkdirAll(d.layout.RuntimeDir, 0o755) -} - -func (d *Daemon) createTap(ctx context.Context, tap string) error { - if _, err := d.runner.Run(ctx, "ip", "link", "show", tap); err == nil { 
- _, _ = d.runner.RunSudo(ctx, "ip", "link", "del", tap) - } - if _, err := d.runner.RunSudo(ctx, "ip", "tuntap", "add", "dev", tap, "mode", "tap", "user", strconv.Itoa(os.Getuid()), "group", strconv.Itoa(os.Getgid())); err != nil { - return err - } - if _, err := d.runner.RunSudo(ctx, "ip", "link", "set", tap, "master", d.config.BridgeName); err != nil { - return err - } - if _, err := d.runner.RunSudo(ctx, "ip", "link", "set", tap, "up"); err != nil { - return err - } - _, err := d.runner.RunSudo(ctx, "ip", "link", "set", d.config.BridgeName, "up") - return err -} - -func (d *Daemon) firecrackerBinary() (string, error) { - if d.config.FirecrackerBin == "" { - return "", fmt.Errorf("firecracker binary not configured; install firecracker or set firecracker_bin") - } - path := d.config.FirecrackerBin - if strings.ContainsRune(path, os.PathSeparator) { - if !exists(path) { - return "", fmt.Errorf("firecracker binary not found at %s; install firecracker or set firecracker_bin", path) - } - return path, nil - } - resolved, err := system.LookupExecutable(path) - if err != nil { - return "", fmt.Errorf("firecracker binary %q not found in PATH; install firecracker or set firecracker_bin", path) - } - return resolved, nil -} - -func (d *Daemon) ensureSocketAccess(ctx context.Context, socketPath, label string) error { - if err := waitForPath(ctx, socketPath, 5*time.Second, label); err != nil { - return err - } - if _, err := d.runner.RunSudo(ctx, "chown", fmt.Sprintf("%d:%d", os.Getuid(), os.Getgid()), socketPath); err != nil { - return err - } - _, err := d.runner.RunSudo(ctx, "chmod", "600", socketPath) - return err -} - -func (d *Daemon) findFirecrackerPID(ctx context.Context, apiSock string) (int, error) { - out, err := d.runner.Run(ctx, "pgrep", "-n", "-f", apiSock) - if err != nil { - return 0, err - } - return strconv.Atoi(strings.TrimSpace(string(out))) -} - -func (d *Daemon) resolveFirecrackerPID(ctx context.Context, machine *firecracker.Machine, apiSock string) 
int { - if pid, err := d.findFirecrackerPID(ctx, apiSock); err == nil && pid > 0 { - return pid - } - if machine != nil { - if pid, err := machine.PID(); err == nil && pid > 0 { - return pid - } - } - return 0 -} - -func (d *Daemon) sendCtrlAltDel(ctx context.Context, vm model.VMRecord) error { - if err := d.ensureSocketAccess(ctx, vm.Runtime.APISockPath, "firecracker api socket"); err != nil { - return err - } - client := firecracker.New(vm.Runtime.APISockPath, d.logger) - return client.SendCtrlAltDel(ctx) -} - -func (d *Daemon) waitForExit(ctx context.Context, pid int, apiSock string, timeout time.Duration) error { - deadline := time.Now().Add(timeout) - for { - if !system.ProcessRunning(pid, apiSock) { - return nil - } - if time.Now().After(deadline) { - return errWaitForExitTimeout - } - select { - case <-ctx.Done(): - return ctx.Err() - case <-time.After(100 * time.Millisecond): - } - } -} - -func (d *Daemon) cleanupRuntime(ctx context.Context, vm model.VMRecord, preserveDisks bool) error { - if d.logger != nil { - d.logger.Debug("cleanup runtime", append(vmLogAttrs(vm), "preserve_disks", preserveDisks)...) 
- } - cleanupPID := vm.Runtime.PID - if vm.Runtime.APISockPath != "" { - if pid, err := d.findFirecrackerPID(ctx, vm.Runtime.APISockPath); err == nil && pid > 0 { - cleanupPID = pid - } - } - if cleanupPID > 0 && system.ProcessRunning(cleanupPID, vm.Runtime.APISockPath) { - _ = d.killVMProcess(ctx, cleanupPID) - if err := d.waitForExit(ctx, cleanupPID, vm.Runtime.APISockPath, 30*time.Second); err != nil { - return err - } - } - snapshotErr := d.cleanupDMSnapshot(ctx, dmSnapshotHandles{ - BaseLoop: vm.Runtime.BaseLoop, - COWLoop: vm.Runtime.COWLoop, - DMName: vm.Runtime.DMName, - DMDev: vm.Runtime.DMDev, - }) - featureErr := d.cleanupCapabilityState(ctx, vm) - var tapErr error - if vm.Runtime.TapDevice != "" { - tapErr = d.releaseTap(ctx, vm.Runtime.TapDevice) - } - if vm.Runtime.APISockPath != "" { - _ = os.Remove(vm.Runtime.APISockPath) - } - if vm.Runtime.VSockPath != "" { - _ = os.Remove(vm.Runtime.VSockPath) - } - if !preserveDisks && vm.Runtime.VMDir != "" { - return errors.Join(snapshotErr, featureErr, tapErr, os.RemoveAll(vm.Runtime.VMDir)) - } - return errors.Join(snapshotErr, featureErr, tapErr) -} - -func clearRuntimeHandles(vm *model.VMRecord) { - vm.Runtime.PID = 0 - vm.Runtime.APISockPath = "" - vm.Runtime.TapDevice = "" - vm.Runtime.BaseLoop = "" - vm.Runtime.COWLoop = "" - vm.Runtime.DMName = "" - vm.Runtime.DMDev = "" -} - -func defaultVSockPath(runtimeDir, vmID string) string { - return filepath.Join(runtimeDir, "fc-"+system.ShortID(vmID)+".vsock") -} - -func defaultVSockCID(guestIP string) (uint32, error) { - ip := net.ParseIP(strings.TrimSpace(guestIP)).To4() - if ip == nil { - return 0, fmt.Errorf("guest IP is not IPv4: %q", guestIP) - } - return 10000 + uint32(ip[3]), nil -} - -func waitForPath(ctx context.Context, path string, timeout time.Duration, label string) error { - deadline := time.Now().Add(timeout) - for { - if _, err := os.Stat(path); err == nil { - return nil - } else if err != nil && !os.IsNotExist(err) { - return err - } - if 
time.Now().After(deadline) { - return fmt.Errorf("%s not ready: %s: %w", label, path, context.DeadlineExceeded) - } - select { - case <-ctx.Done(): - return ctx.Err() - case <-time.After(100 * time.Millisecond): - } - } -} - -func waitForGuestVSockAgent(ctx context.Context, logger *slog.Logger, socketPath string, timeout time.Duration) error { - if strings.TrimSpace(socketPath) == "" { - return errors.New("vsock path is required") - } - - waitCtx, cancel := context.WithTimeout(ctx, timeout) - defer cancel() - - ticker := time.NewTicker(vsockReadyPoll) - defer ticker.Stop() - - var lastErr error - for { - pingCtx, pingCancel := context.WithTimeout(waitCtx, 3*time.Second) - err := vsockagent.Health(pingCtx, logger, socketPath) - pingCancel() - if err == nil { - return nil - } - lastErr = err - - select { - case <-waitCtx.Done(): - if lastErr != nil { - return fmt.Errorf("guest vsock agent not ready: %w", lastErr) - } - return errors.New("guest vsock agent not ready before timeout") - case <-ticker.C: - } - } -} - -func (d *Daemon) setDNS(ctx context.Context, vmName, guestIP string) error { - if d.vmDNS == nil { - return nil - } - return d.vmDNS.Set(vmdns.RecordName(vmName), guestIP) -} - -func (d *Daemon) removeDNS(ctx context.Context, dnsName string) error { - if dnsName == "" { - return nil - } - if d.vmDNS == nil { - return nil - } - return d.vmDNS.Remove(dnsName) -} - -func (d *Daemon) rebuildDNS(ctx context.Context) error { - if d.vmDNS == nil { - return nil - } - vms, err := d.store.ListVMs(ctx) + vms, err := s.store.ListVMs(ctx) if err != nil { return err } records := make(map[string]string) for _, vm := range vms { - if vm.State != model.VMStateRunning { - continue - } - if !system.ProcessRunning(vm.Runtime.PID, vm.Runtime.APISockPath) { + if !s.vmAlive(vm) { continue } if strings.TrimSpace(vm.Runtime.GuestIP) == "" { @@ -1305,15 +50,138 @@ func (d *Daemon) rebuildDNS(ctx context.Context) error { } records[vmdns.RecordName(vm.Name)] = vm.Runtime.GuestIP } - 
return d.vmDNS.Replace(records) + return s.net.replaceDNS(records) } -func (d *Daemon) killVMProcess(ctx context.Context, pid int) error { - _, err := d.runner.RunSudo(ctx, "kill", "-KILL", strconv.Itoa(pid)) - return err +func persistRuntimeTeardownState(vm *model.VMRecord, h model.VMHandles) { + if vm == nil { + return + } + vm.Runtime.TapDevice = h.TapDevice + vm.Runtime.BaseLoop = h.BaseLoop + vm.Runtime.COWLoop = h.COWLoop + vm.Runtime.DMName = h.DMName + vm.Runtime.DMDev = h.DMDev } -func (d *Daemon) generateName(ctx context.Context) (string, error) { +func clearRuntimeTeardownState(vm *model.VMRecord) { + if vm == nil { + return + } + vm.Runtime.TapDevice = "" + vm.Runtime.BaseLoop = "" + vm.Runtime.COWLoop = "" + vm.Runtime.DMName = "" + vm.Runtime.DMDev = "" +} + +func teardownHandlesForCleanup(vm model.VMRecord, live model.VMHandles) model.VMHandles { + recovered := live + if strings.TrimSpace(recovered.TapDevice) == "" { + recovered.TapDevice = strings.TrimSpace(vm.Runtime.TapDevice) + } + if strings.TrimSpace(recovered.BaseLoop) == "" { + recovered.BaseLoop = strings.TrimSpace(vm.Runtime.BaseLoop) + } + if strings.TrimSpace(recovered.COWLoop) == "" { + recovered.COWLoop = strings.TrimSpace(vm.Runtime.COWLoop) + } + if strings.TrimSpace(recovered.DMName) == "" { + recovered.DMName = strings.TrimSpace(vm.Runtime.DMName) + } + if strings.TrimSpace(recovered.DMDev) == "" { + recovered.DMDev = strings.TrimSpace(vm.Runtime.DMDev) + } + return recovered +} + +// cleanupRuntime tears down the host-side state for a VM: firecracker +// process, DM snapshot, capabilities, tap, sockets. Lives on VMService +// because it reaches into handles (VMService-owned); the capability +// teardown goes through the capHooks seam to keep Daemon out of the +// dependency chain. +// +// Idempotency contract: every step runs even when an earlier step +// fails, and the per-step errors are joined into the returned value.
+// A waitForExit timeout (firecracker refused to die) used to early- +// return, leaving DM/feature/tap state stranded on the host across +// daemon restarts. With collect-and-continue the kernel teardowns +// still attempt; in the worst case (firecracker actually still alive) +// they fail with EBUSY which is also surfaced via errors.Join — no +// damage, but the operator sees the full picture. +func (s *VMService) cleanupRuntime(ctx context.Context, vm model.VMRecord, preserveDisks bool) error { + if s.logger != nil { + s.logger.Debug("cleanup runtime", append(vmLogAttrs(vm), "preserve_disks", preserveDisks)...) + } + h := s.vmHandles(vm.ID) + cleanupPID := h.PID + if vm.Runtime.APISockPath != "" { + if pid, err := s.net.findFirecrackerPID(ctx, vm.Runtime.APISockPath); err == nil && pid > 0 { + cleanupPID = pid + } + } + var waitErr error + if cleanupPID > 0 && system.ProcessRunning(cleanupPID, vm.Runtime.APISockPath) { + _ = s.net.killVMProcess(ctx, cleanupPID) + waitErr = s.net.waitForExit(ctx, cleanupPID, vm.Runtime.APISockPath, 30*time.Second) + if waitErr != nil && s.logger != nil { + s.logger.Warn("cleanup wait_for_exit failed; continuing teardown", append(vmLogAttrs(vm), "pid", cleanupPID, "error", waitErr.Error())...) + } + } + handles := teardownHandlesForCleanup(vm, h) + snapshotErr := s.net.cleanupDMSnapshot(ctx, dmSnapshotHandles{ + BaseLoop: handles.BaseLoop, + COWLoop: handles.COWLoop, + DMName: handles.DMName, + DMDev: handles.DMDev, + }) + featureErr := s.capHooks.cleanupState(ctx, vm) + var tapErr error + // Prefer the handle cache (fresh from startVMLocked), but fall + // back to the VMRuntime mirrors so restart-time cleanup still works + // when handles.json is missing or corrupt. 
+ tap := handles.TapDevice + if tap != "" { + tapErr = s.net.releaseTap(ctx, tap) + } + if vm.Runtime.APISockPath != "" { + _ = os.Remove(vm.Runtime.APISockPath) + } + if vm.Runtime.VSockPath != "" { + _ = os.Remove(vm.Runtime.VSockPath) + } + // Remove the jailer chroot tree (kernel hard-links, mknod'd device + // nodes, the chroot root itself). Skipped silently when no chroot + // base is configured or the chroot was never created. We intentionally + // don't gate on JailerEnabled today — old VMs created before the + // flag flipped on still need their chroots removed if any exist. + jailerErr := s.cleanupJailerChroot(ctx, vm) + // The handles are only meaningful while the kernel objects exist; + // dropping them here keeps the cache in sync with reality even + // when the caller forgets to call clearVMHandles explicitly. + s.clearVMHandles(vm) + if !preserveDisks && vm.Runtime.VMDir != "" { + return errors.Join(waitErr, snapshotErr, featureErr, tapErr, jailerErr, os.RemoveAll(vm.Runtime.VMDir)) + } + return errors.Join(waitErr, snapshotErr, featureErr, tapErr, jailerErr) +} + +// cleanupJailerChroot removes the per-VM chroot tree if it exists. Returns +// nil silently when no JailerChrootBase is configured or the chroot no +// longer exists.
+func (s *VMService) cleanupJailerChroot(ctx context.Context, vm model.VMRecord) error { + base := strings.TrimSpace(s.config.JailerChrootBase) + if base == "" { + return nil + } + chrootRoot := firecracker.JailerChrootRoot(base, vm.ID) + if _, err := os.Stat(chrootRoot); os.IsNotExist(err) { + return nil + } + return s.privOps().CleanupJailerChroot(ctx, chrootRoot) +} + +func (s *VMService) generateName(ctx context.Context) (string, error) { _ = ctx if name := strings.TrimSpace(namegen.Generate()); name != "" { return name, nil diff --git a/internal/daemon/vm_authsync.go b/internal/daemon/vm_authsync.go new file mode 100644 index 0000000..117014a --- /dev/null +++ b/internal/daemon/vm_authsync.go @@ -0,0 +1,404 @@ +package daemon + +import ( + "context" + "errors" + "fmt" + "os" + "path" + "path/filepath" + "strconv" + "strings" + + "banger/internal/config" + "banger/internal/guest" + "banger/internal/model" + "banger/internal/system" +) + +const ( + workDiskGitConfigRelativePath = ".gitconfig" + hostGlobalGitIdentitySource = "git config --global" +) + +type gitIdentity struct { + Name string + Email string +} + +func (s *WorkspaceService) ensureAuthorizedKeyOnWorkDisk(ctx context.Context, vm *model.VMRecord, image model.Image, prep workDiskPreparation) error { + fingerprint, err := guest.AuthorizedPublicKeyFingerprint(s.config.SSHKeyPath) + if err != nil { + return fmt.Errorf("derive authorized ssh key fingerprint: %w", err) + } + if prep.ClonedFromSeed && image.SeededSSHPublicKeyFingerprint != "" && image.SeededSSHPublicKeyFingerprint == fingerprint { + vmCreateStage(ctx, "prepare_work_disk", "using seeded SSH access") + return nil + } + publicKey, err := guest.AuthorizedPublicKey(s.config.SSHKeyPath) + if err != nil { + return fmt.Errorf("derive authorized ssh key: %w", err) + } + vmCreateStage(ctx, "prepare_work_disk", "provisioning SSH access on work disk") + + workDisk := vm.Runtime.WorkDiskPath + if err := provisionAuthorizedKey(ctx, s.runner, workDisk, 
publicKey); err != nil { + return err + } + + if prep.ClonedFromSeed && image.Managed { + vmCreateStage(ctx, "prepare_work_disk", "refreshing managed work seed") + if err := s.imageWorkSeed(ctx, image, fingerprint); err != nil { + return err + } + } + return nil +} + +// provisionAuthorizedKey writes the managed SSH key into +// /.ssh/authorized_keys on an ext4 image via the sudoless toolkit. +// Shared between work-disk and image-seed paths — both need the same +// sequence: normalise fs-root perms, create /.ssh, merge against any +// existing authorized_keys, rewrite with root:root:0600. +// +// The fs root doubles as /root inside the guest, which sshd walks +// under StrictModes; forcing 0755 root:root here keeps a drifted +// seed image from silently rejecting the key at login time. +func provisionAuthorizedKey(ctx context.Context, runner system.CommandRunner, imagePath string, publicKey []byte) error { + if err := system.EnsureExt4RootPerms(ctx, runner, imagePath, 0o755, 0, 0); err != nil { + return err + } + if err := system.MkdirExt4(ctx, runner, imagePath, "/.ssh", 0o700, 0, 0); err != nil { + return err + } + var existing []byte + exists, err := system.Ext4PathExists(ctx, runner, imagePath, "/.ssh/authorized_keys") + if err != nil { + return err + } + if exists { + existing, err = system.ReadExt4File(ctx, runner, imagePath, "/.ssh/authorized_keys") + if err != nil { + return err + } + } + merged := mergeAuthorizedKey(existing, publicKey) + return system.WriteExt4FileOwned(ctx, runner, imagePath, "/.ssh/authorized_keys", 0o600, 0, 0, merged) +} + +// ensureHushLoginOnWorkDisk lands /root/.hushlogin in the guest by +// writing /.hushlogin at the root of the work disk (which mounts at +// /root inside the guest). pam_motd checks $HOME/.hushlogin and stays +// silent when it exists — combined with sshd's PrintMotd no / PrintLastLog no +// that suppresses the Debian-style banner on `banger vm run`. 
+func (s *WorkspaceService) ensureHushLoginOnWorkDisk(ctx context.Context, vm *model.VMRecord) error { + return system.WriteExt4FileOwned(ctx, s.runner, vm.Runtime.WorkDiskPath, "/.hushlogin", 0o644, 0, 0, nil) +} + +func (s *WorkspaceService) ensureGitIdentityOnWorkDisk(ctx context.Context, vm *model.VMRecord) error { + runner := s.runner + if runner == nil { + runner = system.NewRunner() + } + + identity, err := resolveHostGlobalGitIdentity(ctx, runner) + if err != nil { + s.warnGitIdentitySyncSkipped(*vm, hostGlobalGitIdentitySource, err) + return nil + } + + vmCreateStage(ctx, "prepare_work_disk", "syncing git identity") + return writeGitIdentity(ctx, runner, vm.Runtime.WorkDiskPath, "/"+workDiskGitConfigRelativePath, identity) +} + +// runFileSync applies every [[file_sync]] entry from the daemon config +// to the VM's work disk. Missing host paths are skipped with a warn. +// Other errors abort the VM create (since the user explicitly asked +// for the sync). +// +// Operates directly on the ext4 image via the sudoless toolkit — no +// mount, no privileged install(1). Every write lands as root:root; +// file modes come from the [[file_sync]] entry (default 0600), +// directory modes from the source on the host. 
+func (s *WorkspaceService) runFileSync(ctx context.Context, vm *model.VMRecord) error { + if len(s.config.FileSync) == 0 { + return nil + } + + runner := s.runner + if runner == nil { + runner = system.NewRunner() + } + + hostHome := strings.TrimSpace(s.config.HostHomeDir) + if hostHome == "" { + var err error + hostHome, err = os.UserHomeDir() + if err != nil { + return fmt.Errorf("resolve host user home: %w", err) + } + } + + workDisk := vm.Runtime.WorkDiskPath + + for _, entry := range s.config.FileSync { + hostPath, err := config.ResolveFileSyncHostPath(entry.Host, hostHome) + if err != nil { + return fmt.Errorf("file_sync: %w", err) + } + guestRel := guestPathRelativeToRoot(entry.Guest) + guestImagePath := "/" + guestRel + + info, err := os.Stat(hostPath) + if err != nil { + if os.IsNotExist(err) { + s.warnFileSyncSkipped(*vm, hostPath, err) + continue + } + return fmt.Errorf("file_sync: stat %s: %w", hostPath, err) + } + hostPath, err = config.ResolveExistingFileSyncHostPath(entry.Host, hostHome) + if err != nil { + return fmt.Errorf("file_sync: %w", err) + } + + vmCreateStage(ctx, "prepare_work_disk", "file sync: "+entry.Host+" → "+entry.Guest) + + parent := path.Dir(guestImagePath) + if parent != "/" && parent != "." 
{ + if err := system.MkdirAllExt4(ctx, runner, workDisk, parent, 0o755, 0, 0); err != nil { + return fmt.Errorf("file_sync: mkdir %s: %w", parent, err) + } + } + + if info.IsDir() { + if err := s.copyHostDir(ctx, *vm, runner, workDisk, hostPath, guestImagePath); err != nil { + return fmt.Errorf("file_sync: copy directory %s → %s: %w", hostPath, guestImagePath, err) + } + continue + } + + mode, err := parseFileSyncMode(entry.Mode) + if err != nil { + return fmt.Errorf("file_sync: %s: %w", entry.Host, err) + } + data, err := os.ReadFile(hostPath) + if err != nil { + return fmt.Errorf("file_sync: read %s: %w", hostPath, err) + } + if err := system.WriteExt4FileOwned(ctx, runner, workDisk, guestImagePath, mode, 0, 0, data); err != nil { + return fmt.Errorf("file_sync: write %s → %s: %w", hostPath, guestImagePath, err) + } + } + return nil +} + +// copyHostDir recursively copies hostDir into guestTarget on the +// ext4 image via the sudoless toolkit. Each file's source permissions +// are preserved; directories get 0755; ownership is forced to +// root:root. Symlinks are SKIPPED with a warning — os.Lstat identifies +// the entry itself as a link without resolving it, so a symlink +// inside ~/.aws that points at ~/secrets can't leak out of the tree +// the user named. Other special types (devices, FIFOs) are skipped +// silently. Top-level host paths go through os.Stat back in +// runFileSync and may still follow, but only when the resolved target +// stays under the configured owner home. 
+func (s *WorkspaceService) copyHostDir(ctx context.Context, vm model.VMRecord, runner system.CommandRunner, imagePath, hostDir, guestTarget string) error { + if err := system.MkdirExt4(ctx, runner, imagePath, guestTarget, 0o755, 0, 0); err != nil { + return err + } + entries, err := os.ReadDir(hostDir) + if err != nil { + return err + } + for _, entry := range entries { + hostChild := filepath.Join(hostDir, entry.Name()) + guestChild := path.Join(guestTarget, entry.Name()) + + info, err := os.Lstat(hostChild) + if err != nil { + return err + } + switch { + case info.Mode()&os.ModeSymlink != 0: + s.warnFileSyncSymlinkSkipped(vm, hostChild) + case info.IsDir(): + if err := s.copyHostDir(ctx, vm, runner, imagePath, hostChild, guestChild); err != nil { + return err + } + case info.Mode().IsRegular(): + data, err := os.ReadFile(hostChild) + if err != nil { + return err + } + if err := system.WriteExt4FileOwned(ctx, runner, imagePath, guestChild, info.Mode().Perm(), 0, 0, data); err != nil { + return err + } + } + } + return nil +} + +// parseFileSyncMode parses the [[file_sync]] mode field (octal string, +// default "0600"). Returns the parsed FileMode with only the permission +// bits set; callers OR in S_IFREG via WriteExt4FileOwned. +func parseFileSyncMode(raw string) (os.FileMode, error) { + raw = strings.TrimSpace(raw) + if raw == "" { + raw = "0600" + } + v, err := strconv.ParseUint(raw, 8, 32) + if err != nil { + return 0, fmt.Errorf("parse mode %q: %w", raw, err) + } + return os.FileMode(v) & os.ModePerm, nil +} + +// expandHostPath expands a leading "~/" against the host user's +// guestPathRelativeToRoot returns the guest path as a relative path +// under /root (banger's work disk is mounted at /root in the guest, +// so everything syncable lives there). "~/foo" and "/root/foo" both +// return "foo"; config validation rejects anything outside that +// scope, so the string prefixes are the only forms we see here. 
+func guestPathRelativeToRoot(raw string) string { + raw = strings.TrimSpace(raw) + switch { + case raw == "~" || raw == "/root": + return "" + case strings.HasPrefix(raw, "~/"): + return strings.TrimPrefix(raw, "~/") + case strings.HasPrefix(raw, "/root/"): + return strings.TrimPrefix(raw, "/root/") + } + return raw +} + +func resolveHostGlobalGitIdentity(ctx context.Context, runner system.CommandRunner) (gitIdentity, error) { + name, err := gitConfigValue(ctx, runner, nil, "user.name") + if err != nil { + return gitIdentity{}, err + } + if name == "" { + return gitIdentity{}, errors.New("host git user.name is empty") + } + + email, err := gitConfigValue(ctx, runner, nil, "user.email") + if err != nil { + return gitIdentity{}, err + } + if email == "" { + return gitIdentity{}, errors.New("host git user.email is empty") + } + + return gitIdentity{Name: name, Email: email}, nil +} + +func gitConfigValue(ctx context.Context, runner system.CommandRunner, extraArgs []string, key string) (string, error) { + args := []string{"config"} + args = append(args, extraArgs...) + args = append(args, "--default", "", "--get", key) + out, err := runner.Run(ctx, "git", args...) + if err != nil { + return "", err + } + return strings.TrimSpace(string(out)), nil +} + +// writeGitIdentity merges user.name + user.email into the on-image +// gitconfig at guestPath. Reads the existing bytes via the ext4 +// toolkit (no-op to empty if absent), edits via `git config --file` +// on a host tempfile so any pre-existing unrelated sections are +// preserved verbatim, then writes back through WriteExt4FileOwned +// at 0644 root:root. 
+func writeGitIdentity(ctx context.Context, runner system.CommandRunner, imagePath, guestPath string, identity gitIdentity) error { + var existing []byte + exists, err := system.Ext4PathExists(ctx, runner, imagePath, guestPath) + if err != nil { + return err + } + if exists { + existing, err = system.ReadExt4File(ctx, runner, imagePath, guestPath) + if err != nil { + return err + } + } + + tmpFile, err := os.CreateTemp("", "banger-gitconfig-*") + if err != nil { + return err + } + tmpPath := tmpFile.Name() + if _, err := tmpFile.Write(existing); err != nil { + _ = tmpFile.Close() + _ = os.Remove(tmpPath) + return err + } + if err := tmpFile.Close(); err != nil { + _ = os.Remove(tmpPath) + return err + } + defer os.Remove(tmpPath) + + if _, err := runner.Run(ctx, "git", "config", "--file", tmpPath, "user.name", identity.Name); err != nil { + return err + } + if _, err := runner.Run(ctx, "git", "config", "--file", tmpPath, "user.email", identity.Email); err != nil { + return err + } + merged, err := os.ReadFile(tmpPath) + if err != nil { + return err + } + return system.WriteExt4FileOwned(ctx, runner, imagePath, guestPath, 0o644, 0, 0, merged) +} + +func (s *WorkspaceService) warnFileSyncSkipped(vm model.VMRecord, hostPath string, err error) { + if s.logger == nil || err == nil { + return + } + s.logger.Warn("file_sync skipped", append(vmLogAttrs(vm), "host_path", hostPath, "error", err.Error())...) +} + +// warnFileSyncSymlinkSkipped surfaces a skipped nested symlink to the +// user through the daemon log. Skipping is deliberate — see +// copyHostDir's docstring — but invisible skips would hide a +// "why did my file not show up in the guest?" debugging trail. +func (s *WorkspaceService) warnFileSyncSymlinkSkipped(vm model.VMRecord, hostPath string) { + if s.logger == nil { + return + } + s.logger.Warn("file_sync skipped symlink (would escape the requested tree)", append(vmLogAttrs(vm), "host_path", hostPath)...) 
+} + +func (s *WorkspaceService) warnGitIdentitySyncSkipped(vm model.VMRecord, source string, err error) { + if s.logger == nil || err == nil { + return + } + s.logger.Warn("guest git identity sync skipped", append(vmLogAttrs(vm), "source", source, "error", err.Error())...) +} + +func mergeAuthorizedKey(existing, managed []byte) []byte { + managedLine := strings.TrimSpace(string(managed)) + if managedLine == "" { + return append([]byte(nil), existing...) + } + + lines := strings.Split(strings.ReplaceAll(string(existing), "\r\n", "\n"), "\n") + out := make([]string, 0, len(lines)+1) + found := false + for _, line := range lines { + line = strings.TrimRight(line, "\r") + trimmed := strings.TrimSpace(line) + if trimmed == "" { + continue + } + if trimmed == managedLine { + found = true + } + out = append(out, line) + } + if !found { + out = append(out, managedLine) + } + return []byte(strings.Join(out, "\n") + "\n") +} diff --git a/internal/daemon/vm_create.go b/internal/daemon/vm_create.go new file mode 100644 index 0000000..3ec3e34 --- /dev/null +++ b/internal/daemon/vm_create.go @@ -0,0 +1,226 @@ +package daemon + +import ( + "context" + "database/sql" + "errors" + "fmt" + "os" + "path/filepath" + "strings" + + "banger/internal/api" + "banger/internal/imagecat" + "banger/internal/model" + "banger/internal/vmdns" +) + +// CreateVM is split into three phases so the global createVMMu guards +// only the narrow name+IP reservation window, not the slow image +// resolution or the multi-second boot flow: +// +// 1. Validate + resolve image. No global lock. Image auto-pull +// self-locks via imageOpsMu (which is also now publication-only). +// 2. Reserve a row: generate id, pick next IP, claim the name, +// UpsertVM the "created" record. Held under createVMMu so two +// concurrent `vm create --name foo` calls can't both think they +// won. +// 3. Boot. Only the per-VM lock is held — parallel creates against +// different VMs fully overlap. 
+func (s *VMService) CreateVM(ctx context.Context, params api.VMCreateParams) (vm model.VMRecord, err error) { + op := s.beginOperation(ctx, "vm.create") + defer func() { + if err != nil { + op.fail(err) + return + } + op.done(vmLogAttrs(vm)...) + }() + if err := validateOptionalPositiveSetting("vcpu", params.VCPUCount); err != nil { + return model.VMRecord{}, err + } + if err := validateOptionalPositiveSetting("memory", params.MemoryMiB); err != nil { + return model.VMRecord{}, err + } + + imageName := params.ImageName + if imageName == "" { + imageName = s.config.DefaultImageName + } + vmCreateStage(ctx, "resolve_image", "resolving image") + image, err := s.findOrAutoPullImage(ctx, imageName) + if err != nil { + return model.VMRecord{}, err + } + vmCreateStage(ctx, "resolve_image", "using image "+image.Name) + op.stage("image_resolved", imageLogAttrs(image)...) + + systemOverlaySize := int64(model.DefaultSystemOverlaySize) + if params.SystemOverlaySize != "" { + systemOverlaySize, err = model.ParseSize(params.SystemOverlaySize) + if err != nil { + return model.VMRecord{}, err + } + } + workDiskSize := int64(model.DefaultWorkDiskSize) + if params.WorkDiskSize != "" { + workDiskSize, err = model.ParseSize(params.WorkDiskSize) + if err != nil { + return model.VMRecord{}, err + } + } + spec := model.VMSpec{ + VCPUCount: optionalIntOrDefault(params.VCPUCount, model.DefaultVCPUCount), + MemoryMiB: optionalIntOrDefault(params.MemoryMiB, model.DefaultMemoryMiB), + SystemOverlaySizeByte: systemOverlaySize, + WorkDiskSizeBytes: workDiskSize, + NATEnabled: params.NATEnabled, + } + + vm, err = s.reserveVM(ctx, strings.TrimSpace(params.Name), image, spec) + if err != nil { + return model.VMRecord{}, err + } + op.stage("persisted", vmLogAttrs(vm)...) 
+ vmCreateBindVM(ctx, vm) + vmCreateStage(ctx, "reserve_vm", fmt.Sprintf("allocated %s (%s)", vm.Name, vm.Runtime.GuestIP)) + + unlockVM := s.lockVMID(vm.ID) + defer unlockVM() + + if params.NoStart { + vm.State = model.VMStateStopped + vm.Runtime.State = model.VMStateStopped + if err := s.store.UpsertVM(ctx, vm); err != nil { + return model.VMRecord{}, err + } + return vm, nil + } + return s.startVMLocked(ctx, vm, image) +} + +// reserveVM holds createVMMu only long enough to verify the name is +// free, allocate a guest IP from the store, and persist the "created" +// reservation row. Everything else (image resolution upstream, boot +// downstream) runs outside this lock. +func (s *VMService) reserveVM(ctx context.Context, requestedName string, image model.Image, spec model.VMSpec) (model.VMRecord, error) { + s.createVMMu.Lock() + defer s.createVMMu.Unlock() + + name := requestedName + if name == "" { + generated, err := s.generateName(ctx) + if err != nil { + return model.VMRecord{}, err + } + name = generated + } + // Defense in depth: CLI has already validated the flag, but any + // other RPC caller (SDK, direct JSON over the socket) lands here + // without going through the CLI flag parser. The name flows into + // /etc/hostname, kernel boot args, DNS records, and file paths — + // it has to be DNS-label-safe. + if err := model.ValidateVMName(name); err != nil { + return model.VMRecord{}, err + } + // Exact-name lookup. Using FindVM here would also match a new name + // that merely prefixes some existing VM's id or another VM's name, + // falsely rejecting perfectly valid names. 
+ if _, err := s.store.GetVMByName(ctx, name); err == nil { + return model.VMRecord{}, fmt.Errorf("vm name already exists: %s", name) + } else if !errors.Is(err, sql.ErrNoRows) { + return model.VMRecord{}, err + } + + id, err := model.NewID() + if err != nil { + return model.VMRecord{}, err + } + guestIP, err := s.store.NextGuestIP(ctx, bridgePrefix(s.config.BridgeIP)) + if err != nil { + return model.VMRecord{}, err + } + vmDir := filepath.Join(s.layout.VMsDir, id) + if err := os.MkdirAll(vmDir, 0o755); err != nil { + return model.VMRecord{}, err + } + vsockCID, err := defaultVSockCID(guestIP) + if err != nil { + return model.VMRecord{}, err + } + now := model.Now() + vm := model.VMRecord{ + ID: id, + Name: name, + ImageID: image.ID, + State: model.VMStateCreated, + CreatedAt: now, + UpdatedAt: now, + LastTouchedAt: now, + Spec: spec, + Runtime: model.VMRuntime{ + State: model.VMStateCreated, + GuestIP: guestIP, + DNSName: vmdns.RecordName(name), + VMDir: vmDir, + VSockPath: defaultVSockPath(s.layout.RuntimeDir, id), + VSockCID: vsockCID, + SystemOverlay: filepath.Join(vmDir, "system.cow"), + WorkDiskPath: filepath.Join(vmDir, "root.ext4"), + LogPath: filepath.Join(vmDir, "firecracker.log"), + MetricsPath: filepath.Join(vmDir, "metrics.json"), + }, + } + if err := s.store.UpsertVM(ctx, vm); err != nil { + return model.VMRecord{}, err + } + return vm, nil +} + +// findOrAutoPullImage tries the local image store first; if the name +// isn't registered but matches an entry in the embedded imagecat +// catalog, it auto-pulls the bundle so `vm create --image foo` (and +// therefore `vm run`) works on a fresh host without the user having +// to run `image pull` first. +// +// Concurrency: parallel vm.create RPCs targeting the same missing +// image must not both run the full OCI fetch + ext4 build. 
The pull +// itself takes minutes, and the publishImage atom that closes it +// only protects the rename + upsert — by the time the second caller +// gets there, it has already done all the work, only to fail at the +// recheck with "image already exists". Hold a per-name pull lock +// around the recheck-and-pull section: the loser waits, sees the +// image already published on the post-lock recheck, and short- +// circuits with a FindImage. PullImage's own internal recheck stays +// in place as defense-in-depth for callers that bypass this path. +func (s *VMService) findOrAutoPullImage(ctx context.Context, idOrName string) (model.Image, error) { + if image, err := s.img.FindImage(ctx, idOrName); err == nil { + return image, nil + } + catalog, loadErr := imagecat.LoadEmbedded() + if loadErr != nil { + _, err := s.img.FindImage(ctx, idOrName) + return model.Image{}, err + } + entry, lookupErr := catalog.Lookup(idOrName) + if lookupErr != nil { + // Not in the catalog either — surface the original not-found. 
+ _, err := s.img.FindImage(ctx, idOrName) + return model.Image{}, err + } + + release, err := s.img.acquireImagePullLock(ctx, entry.Name) + if err != nil { + return model.Image{}, err + } + defer release() + if image, err := s.img.FindImage(ctx, idOrName); err == nil { + return image, nil + } + + vmCreateStage(ctx, "auto_pull_image", fmt.Sprintf("pulling %s from image catalog", entry.Name)) + if _, pullErr := s.img.PullImage(ctx, api.ImagePullParams{Ref: entry.Name}); pullErr != nil { + return model.Image{}, fmt.Errorf("auto-pull image %q: %w", entry.Name, pullErr) + } + return s.img.FindImage(ctx, idOrName) +} diff --git a/internal/daemon/vm_create_ops.go b/internal/daemon/vm_create_ops.go index 0b856a3..0c8afe5 100644 --- a/internal/daemon/vm_create_ops.go +++ b/internal/daemon/vm_create_ops.go @@ -11,6 +11,11 @@ import ( "banger/internal/model" ) +func (op *vmCreateOperationState) ID() string { return op.snapshot().ID } +func (op *vmCreateOperationState) IsDone() bool { return op.snapshot().Done } +func (op *vmCreateOperationState) UpdatedAt() time.Time { return op.snapshot().UpdatedAt } +func (op *vmCreateOperationState) Cancel() { op.cancelOperation() } + type vmCreateProgressKey struct{} type vmCreateOperationState struct { @@ -19,10 +24,21 @@ type vmCreateOperationState struct { op api.VMCreateOperation } -func newVMCreateOperationState() (*vmCreateOperationState, error) { - id, err := model.NewID() - if err != nil { - return nil, err +// newVMCreateOperationState constructs the async-progress record for +// a vm.create.begin RPC. When the caller's context already carries a +// dispatch-assigned op id (the normal path), we reuse it so the +// operator-visible status id and the daemon-log op_id are the same +// string. Otherwise we mint a fresh op id — keeps the same shape on +// internal call sites that don't go through dispatch (tests, future +// background creators). 
+func newVMCreateOperationState(ctx context.Context) (*vmCreateOperationState, error) { + id := OpIDFromContext(ctx) + if id == "" { + var err error + id, err = model.NewOpID() + if err != nil { + return nil, err + } } now := model.Now() return &vmCreateOperationState{ @@ -141,27 +157,24 @@ func (op *vmCreateOperationState) cancelOperation() { } } -func (d *Daemon) BeginVMCreate(_ context.Context, params api.VMCreateParams) (api.VMCreateOperation, error) { - op, err := newVMCreateOperationState() +func (s *VMService) BeginVMCreate(ctx context.Context, params api.VMCreateParams) (api.VMCreateOperation, error) { + op, err := newVMCreateOperationState(ctx) if err != nil { return api.VMCreateOperation{}, err } - createCtx, cancel := context.WithCancel(context.Background()) + // Detach from the caller's deadline (the begin RPC returns + // immediately) but preserve the op id so every log line emitted + // by the goroutine carries the same identifier the client just + // got back. + createCtx, cancel := context.WithCancel(WithOpID(context.Background(), op.op.ID)) op.setCancel(cancel) - - d.createOpsMu.Lock() - if d.createOps == nil { - d.createOps = map[string]*vmCreateOperationState{} - } - d.createOps[op.op.ID] = op - d.createOpsMu.Unlock() - - go d.runVMCreateOperation(withVMCreateProgress(createCtx, op), op, params) + s.createOps.Insert(op) + go s.runVMCreateOperation(withVMCreateProgress(createCtx, op), op, params) return op.snapshot(), nil } -func (d *Daemon) runVMCreateOperation(ctx context.Context, op *vmCreateOperationState, params api.VMCreateParams) { - vm, err := d.CreateVM(ctx, params) +func (s *VMService) runVMCreateOperation(ctx context.Context, op *vmCreateOperationState, params api.VMCreateParams) { + vm, err := s.CreateVM(ctx, params) if err != nil { op.fail(err) return @@ -169,20 +182,16 @@ func (d *Daemon) runVMCreateOperation(ctx context.Context, op *vmCreateOperation op.done(vm) } -func (d *Daemon) VMCreateStatus(_ context.Context, id string) 
(api.VMCreateOperation, error) { - d.createOpsMu.Lock() - op, ok := d.createOps[strings.TrimSpace(id)] - d.createOpsMu.Unlock() +func (s *VMService) VMCreateStatus(_ context.Context, id string) (api.VMCreateOperation, error) { + op, ok := s.createOps.Get(strings.TrimSpace(id)) if !ok { return api.VMCreateOperation{}, fmt.Errorf("vm create operation not found: %s", id) } return op.snapshot(), nil } -func (d *Daemon) CancelVMCreate(_ context.Context, id string) error { - d.createOpsMu.Lock() - op, ok := d.createOps[strings.TrimSpace(id)] - d.createOpsMu.Unlock() +func (s *VMService) CancelVMCreate(_ context.Context, id string) error { + op, ok := s.createOps.Get(strings.TrimSpace(id)) if !ok { return fmt.Errorf("vm create operation not found: %s", id) } @@ -190,16 +199,6 @@ func (d *Daemon) CancelVMCreate(_ context.Context, id string) error { return nil } -func (d *Daemon) pruneVMCreateOperations(olderThan time.Time) { - d.createOpsMu.Lock() - defer d.createOpsMu.Unlock() - for id, op := range d.createOps { - snapshot := op.snapshot() - if !snapshot.Done { - continue - } - if snapshot.UpdatedAt.Before(olderThan) { - delete(d.createOps, id) - } - } +func (s *VMService) pruneVMCreateOperations(olderThan time.Time) { + s.createOps.Prune(olderThan) } diff --git a/internal/daemon/vm_create_test.go b/internal/daemon/vm_create_test.go new file mode 100644 index 0000000..2517033 --- /dev/null +++ b/internal/daemon/vm_create_test.go @@ -0,0 +1,125 @@ +package daemon + +import ( + "context" + "path/filepath" + "strings" + "testing" + + "banger/internal/model" + "banger/internal/paths" +) + +// TestReserveVMAllowsNameThatPrefixesExistingVM is a regression for a +// correctness bug in the name-uniqueness check: reserveVM used to +// route through FindVM, which falls back to prefix-matching on both +// ids and names. That meant a perfectly valid new name like "beta" +// could be rejected simply because an existing VM's id or name +// started with "beta". 
Exact-name lookup via store.GetVMByName fixes +// it. The test seeds a VM whose id and name are long strings, then +// tries to reserve a new VM with a name that's a prefix of each — +// both must succeed. +func TestReserveVMAllowsNameThatPrefixesExistingVM(t *testing.T) { + ctx := context.Background() + tmp := t.TempDir() + d := &Daemon{ + store: openDaemonStore(t), + layout: paths.Layout{VMsDir: filepath.Join(tmp, "vms"), RuntimeDir: filepath.Join(tmp, "runtime")}, + config: model.DaemonConfig{BridgeIP: model.DefaultBridgeIP}, + } + wireServices(d) + + existing := testVM("longname-sandbox-foobar", "image-x", "172.16.0.50") + upsertDaemonVM(t, ctx, d.store, existing) + + image := testImage("image-x") + image.ID = "image-x" + image.Name = "image-x" + if err := d.store.UpsertImage(ctx, image); err != nil { + t.Fatalf("UpsertImage: %v", err) + } + + // New VM name is a prefix of the existing id (which is + // "longname-sandbox-foobar-id" per testVM). Old FindVM-based check + // would reject this. + if vm, err := d.vm.reserveVM(ctx, "longname", image, model.VMSpec{VCPUCount: 1, MemoryMiB: 128}); err != nil { + t.Fatalf("reserveVM(prefix of id): %v", err) + } else if vm.Name != "longname" { + t.Fatalf("reserveVM returned name=%q, want longname", vm.Name) + } + + // Prefix of the existing name ("longname-sandbox") must also work. + if vm, err := d.vm.reserveVM(ctx, "longname-sandbox", image, model.VMSpec{VCPUCount: 1, MemoryMiB: 128}); err != nil { + t.Fatalf("reserveVM(prefix of name): %v", err) + } else if vm.Name != "longname-sandbox" { + t.Fatalf("reserveVM returned name=%q, want longname-sandbox", vm.Name) + } +} + +// TestReserveVMRejectsExactDuplicateName confirms the uniqueness +// check still catches actual collisions after the FindVM → GetVMByName +// switch. 
+func TestReserveVMRejectsExactDuplicateName(t *testing.T) { + ctx := context.Background() + tmp := t.TempDir() + d := &Daemon{ + store: openDaemonStore(t), + layout: paths.Layout{VMsDir: filepath.Join(tmp, "vms"), RuntimeDir: filepath.Join(tmp, "runtime")}, + config: model.DaemonConfig{BridgeIP: model.DefaultBridgeIP}, + } + wireServices(d) + existing := testVM("sandbox", "image-x", "172.16.0.51") + upsertDaemonVM(t, ctx, d.store, existing) + + image := testImage("image-x") + image.ID = "image-x" + image.Name = "image-x" + if err := d.store.UpsertImage(ctx, image); err != nil { + t.Fatalf("UpsertImage: %v", err) + } + + _, err := d.vm.reserveVM(ctx, "sandbox", image, model.VMSpec{VCPUCount: 1, MemoryMiB: 128}) + if err == nil { + t.Fatal("reserveVM with duplicate name should have failed") + } + if !strings.Contains(err.Error(), "already exists") { + t.Fatalf("err = %v, want 'already exists'", err) + } +} + +// TestReserveVMRejectsInvalidName pins defense-in-depth: the CLI +// already validates, but any other RPC caller (banger SDK, direct +// JSON over the socket) lands here without going through the CLI. +// The name ends up in /etc/hostname, kernel boot args, DNS records, +// and file paths — the daemon must refuse anything that isn't a +// valid DNS label. 
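The daemon-side validator itself is outside this hunk. For illustration, a minimal check matching the "valid DNS label" contract the comment describes; the regexp and helper name are assumptions, not the real implementation:

```go
package main

import "regexp"

// dnsLabel is a lowercase RFC 1035 label: 1-63 characters,
// alphanumeric at both ends, hyphens only in the interior.
var dnsLabel = regexp.MustCompile(`^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$`)

// validVMName reports whether name is safe to use as a hostname,
// DNS record, and path component.
func validVMName(name string) bool {
	return dnsLabel.MatchString(name)
}

func main() {
	bad := []string{"MyBox", "my box", "my.box", "box\n", "-box", "box/../evil"}
	for _, n := range bad {
		if validVMName(n) {
			panic("accepted invalid name: " + n)
		}
	}
	println(validVMName("longname-sandbox")) // true
}
```

Note the anchored `^...$`: without anchors, `"box/../evil"` would still contain a matching substring, and Go's `$` (absent the `m` flag) matches only end of text, which is what rejects the trailing-newline case.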
+func TestReserveVMRejectsInvalidName(t *testing.T) { + ctx := context.Background() + tmp := t.TempDir() + d := &Daemon{ + store: openDaemonStore(t), + layout: paths.Layout{VMsDir: filepath.Join(tmp, "vms"), RuntimeDir: filepath.Join(tmp, "runtime")}, + config: model.DaemonConfig{BridgeIP: model.DefaultBridgeIP}, + } + wireServices(d) + + image := testImage("image-x") + image.ID = "image-x" + image.Name = "image-x" + if err := d.store.UpsertImage(ctx, image); err != nil { + t.Fatalf("UpsertImage: %v", err) + } + + for _, bad := range []string{ + "MyBox", // uppercase + "my box", // space + "my.box", // dot + "box\n", // newline + "-box", // leading hyphen + "box/../evil", // path separator + traversal + } { + if _, err := d.vm.reserveVM(ctx, bad, image, model.VMSpec{VCPUCount: 1, MemoryMiB: 128}); err == nil { + t.Fatalf("reserveVM(%q) = nil error, want rejection", bad) + } + } +} diff --git a/internal/daemon/vm_disk.go b/internal/daemon/vm_disk.go new file mode 100644 index 0000000..fe5db6d --- /dev/null +++ b/internal/daemon/vm_disk.go @@ -0,0 +1,183 @@ +package daemon + +import ( + "context" + "fmt" + "strconv" + "strings" + + "banger/internal/guestconfig" + "banger/internal/guestnet" + "banger/internal/model" + "banger/internal/roothelper" + "banger/internal/system" +) + +type workDiskPreparation struct { + ClonedFromSeed bool +} + +func (s *VMService) ensureSystemOverlay(ctx context.Context, vm *model.VMRecord) error { + if exists(vm.Runtime.SystemOverlay) { + return nil + } + _, err := s.runner.Run(ctx, "truncate", "-s", strconv.FormatInt(vm.Spec.SystemOverlaySizeByte, 10), vm.Runtime.SystemOverlay) + return err +} + +// patchRootOverlay writes the per-VM config files (resolv.conf, +// hostname, hosts, sshd drop-in, network bootstrap, fstab) into the +// rootfs overlay. The start flow passes the DM device path explicitly so the +// owner daemon can hand the privileged ext4 work to the root helper without +// rereading mutable process state. 
+func (s *VMService) patchRootOverlay(ctx context.Context, vm model.VMRecord, image model.Image, dmDev string) error { + if strings.TrimSpace(dmDev) == "" { + return fmt.Errorf("vm %q: DM device is required", vm.ID) + } + resolv := []byte(fmt.Sprintf("nameserver %s\n", s.config.DefaultDNS)) + hostname := []byte(vm.Name + "\n") + hosts := []byte(fmt.Sprintf("127.0.0.1 localhost\n127.0.1.1 %s\n", vm.Name)) + sshdConfig := []byte(sshdGuestConfig()) + fstabBytes, err := s.privOps().ReadExt4File(ctx, dmDev, "/etc/fstab") + fstab := string(fstabBytes) + if err != nil { + fstab = "" + } + builder := guestconfig.NewBuilder() + builder.WriteFile("/etc/resolv.conf", resolv) + builder.WriteFile("/etc/hostname", hostname) + builder.WriteFile("/etc/hosts", hosts) + builder.WriteFile(guestnet.ConfigPath, guestnet.ConfigFile(vm.Runtime.GuestIP, s.config.BridgeIP, s.config.DefaultDNS)) + builder.WriteFile(guestnet.GuestScriptPath, []byte(guestnet.BootstrapScript())) + builder.WriteFile("/etc/ssh/sshd_config.d/99-banger.conf", sshdConfig) + builder.DropMountTarget("/home") + builder.DropMountTarget("/var") + builder.AddMount(guestconfig.MountSpec{ + Source: "tmpfs", + Target: "/run", + FSType: "tmpfs", + Options: []string{"defaults", "nodev", "nosuid", "mode=0755"}, + Dump: 0, + Pass: 0, + }) + builder.AddMount(guestconfig.MountSpec{ + Source: "tmpfs", + Target: "/tmp", + FSType: "tmpfs", + Options: []string{"defaults", "nodev", "nosuid", "mode=1777"}, + Dump: 0, + Pass: 0, + }) + s.capHooks.contributeGuest(builder, vm, image) + builder.WriteFile("/etc/fstab", []byte(builder.RenderFSTab(fstab))) + files := builder.Files() + writes := make([]roothelper.Ext4Write, 0, len(files)) + for _, guestPath := range builder.FilePaths() { + mode := uint32(0o644) + if guestPath == guestnet.GuestScriptPath { + mode = 0o755 + } + writes = append(writes, roothelper.Ext4Write{ + GuestPath: guestPath, + Data: files[guestPath], + Mode: mode, + }) + } + return s.privOps().WriteExt4Files(ctx, dmDev, 
writes) +} + +func (s *VMService) ensureWorkDisk(ctx context.Context, vm *model.VMRecord, image model.Image) (workDiskPreparation, error) { + if exists(vm.Runtime.WorkDiskPath) { + return workDiskPreparation{}, nil + } + if exists(image.WorkSeedPath) { + vmCreateStage(ctx, "prepare_work_disk", "applying work seed") + // Old flow used CopyFilePreferClone + (e2fsck + resize2fs). + // On filesystems without reflink support that meant pushing + // every byte of a 512+ MiB seed through the kernel followed + // by a full fsck/resize, even though the seed itself only + // holds a few KB of dotfiles. mkfs + ingest runs in roughly + // a second regardless of seed or work-disk size. + if err := system.MaterializeWorkDisk(ctx, s.runner, image.WorkSeedPath, vm.Runtime.WorkDiskPath, vm.Spec.WorkDiskSizeBytes); err != nil { + return workDiskPreparation{}, err + } + return workDiskPreparation{ClonedFromSeed: true}, nil + } + // No seed: build an empty work disk. `-E root_owner=0:0` stamps + // inode 2 (the fs root, which becomes /root inside the guest) as + // root:root:0755 up front. sshd's StrictModes walks that dir's + // ownership and mode, so getting it right from mkfs means the + // authsync step can just write authorized_keys without any + // repair pass. + // + // Unlike the pre-refactor flow there is no "copy /root from the + // base rootfs" step. The no-seed path is the degraded fallback + // (the common case has a work-seed artifact and hits the branch + // above). Dropping the copy eliminates 4 sudo call sites — mount + // base ro, mount work rw, sudo cp -a, flattenNestedWorkHome — + // at the cost of losing default distro dotfiles on no-seed VMs. + // Users who need those should either rebuild the image with a + // work-seed (the documented path) or land them via [[file_sync]]. 
+ vmCreateStage(ctx, "prepare_work_disk", "creating empty work disk") + if _, err := s.runner.Run(ctx, "truncate", "-s", strconv.FormatInt(vm.Spec.WorkDiskSizeBytes, 10), vm.Runtime.WorkDiskPath); err != nil { + return workDiskPreparation{}, err + } + if _, err := s.runner.Run(ctx, "mkfs.ext4", "-F", "-E", system.MkfsExtraOptions, vm.Runtime.WorkDiskPath); err != nil { + return workDiskPreparation{}, err + } + return workDiskPreparation{}, nil +} + +// sshdGuestConfig is the banger-authored drop-in that lands at +// /etc/ssh/sshd_config.d/99-banger.conf inside every guest. +// +// Banger VMs are single-user root sandboxes reachable only through the +// host bridge (default 172.16.0.0/24). The drop-in sets the minimum +// needed to make that usable while keeping the posture tight enough +// that a misconfigured host bridge does not immediately hand over an +// unauthenticated root shell. +// +// Why each line is here: +// +// - PermitRootLogin prohibit-password +// The guest IS root — there's no other account. prohibit-password +// allows pubkey login and blocks password auth at the source even +// if some future config flips PasswordAuthentication on. +// +// - PubkeyAuthentication yes +// The only auth method we expect. Explicit in case a future +// Debian default or distro package flips it off. +// +// - PasswordAuthentication no +// +// - KbdInteractiveAuthentication no +// Belt-and-braces: every interactive auth path is off, not just +// the PermitRootLogin path. These are already Debian defaults but +// stating them here means the drop-in documents the intent. +// +// - AuthorizedKeysFile /root/.ssh/authorized_keys +// Pins the lookup path so the banger-written file always wins, +// regardless of distro default ($HOME/.ssh/authorized_keys) and +// regardless of any per-image weirdness. +// +// - PrintMotd no / PrintLastLog no +// Banger VMs are short-lived sandboxes. The Debian-style MOTD +// ("Linux ... 
GNU/Linux comes with ABSOLUTELY NO WARRANTY …") and +// the "Last login" line are pure noise for `vm run -- echo hi` +// style invocations. Pair this with the .hushlogin landed on the +// work disk (see ensureHushLoginOnWorkDisk) so pam_motd also stays +// silent on distros that read /etc/motd through PAM rather than +// sshd. The work disk mounts at /root, so the file has to live on +// that disk — a write to the rootfs overlay would be shadowed. +func sshdGuestConfig() string { + return strings.Join([]string{ + "PermitRootLogin prohibit-password", + "PubkeyAuthentication yes", + "PasswordAuthentication no", + "KbdInteractiveAuthentication no", + "AuthorizedKeysFile /root/.ssh/authorized_keys", + "PrintMotd no", + "PrintLastLog no", + "", + }, "\n") +} diff --git a/internal/daemon/vm_handles.go b/internal/daemon/vm_handles.go new file mode 100644 index 0000000..1362c90 --- /dev/null +++ b/internal/daemon/vm_handles.go @@ -0,0 +1,223 @@ +package daemon + +import ( + "context" + "encoding/json" + "errors" + "fmt" + "os" + "path/filepath" + "sync" + + "banger/internal/model" +) + +// handleCache is the daemon's in-memory map of per-VM transient +// handles. It is the sole runtime source of truth for PID / tap / +// loop / DM state — persistent storage (the per-VM handles.json +// scratch file) exists only so the daemon can rebuild the cache +// after a restart. +type handleCache struct { + mu sync.RWMutex + m map[string]model.VMHandles +} + +func newHandleCache() *handleCache { + return &handleCache{m: make(map[string]model.VMHandles)} +} + +// get returns the cached handles for vmID and whether an entry +// exists. A missing entry means "no live handles tracked," which is +// the correct state for stopped VMs. 
+func (c *handleCache) get(vmID string) (model.VMHandles, bool) { + c.mu.RLock() + defer c.mu.RUnlock() + h, ok := c.m[vmID] + return h, ok +} + +func (c *handleCache) set(vmID string, h model.VMHandles) { + c.mu.Lock() + defer c.mu.Unlock() + c.m[vmID] = h +} + +func (c *handleCache) clear(vmID string) { + c.mu.Lock() + defer c.mu.Unlock() + delete(c.m, vmID) +} + +// handlesFilePath returns the scratch file path inside the VM +// directory where the daemon writes the last-known handles. +func handlesFilePath(vmDir string) string { + return filepath.Join(vmDir, "handles.json") +} + +// writeHandlesFile persists h to /handles.json. Called +// whenever the daemon successfully transitions a VM to running +// (after all handles are acquired). Best-effort: a write failure is +// logged, not propagated — the in-memory cache is authoritative +// while the daemon is up. +func writeHandlesFile(vmDir string, h model.VMHandles) error { + if vmDir == "" { + return errors.New("vm dir is required") + } + if err := os.MkdirAll(vmDir, 0o755); err != nil { + return err + } + data, err := json.MarshalIndent(h, "", " ") + if err != nil { + return err + } + return os.WriteFile(handlesFilePath(vmDir), data, 0o600) +} + +// readHandlesFile loads the scratch file written at the last start. +// Returns a zero-value handles + (false, nil) if the file doesn't +// exist — that's the normal case for stopped VMs. 
+func readHandlesFile(vmDir string) (model.VMHandles, bool, error) { + if vmDir == "" { + return model.VMHandles{}, false, nil + } + data, err := os.ReadFile(handlesFilePath(vmDir)) + if os.IsNotExist(err) { + return model.VMHandles{}, false, nil + } + if err != nil { + return model.VMHandles{}, false, err + } + var h model.VMHandles + if err := json.Unmarshal(data, &h); err != nil { + return model.VMHandles{}, false, fmt.Errorf("parse handles.json: %w", err) + } + return h, true, nil +} + +func removeHandlesFile(vmDir string) { + if vmDir == "" { + return + } + _ = os.Remove(handlesFilePath(vmDir)) +} + +// ensureHandleCache lazily constructs the cache so direct +// `&Daemon{}` literals (common in tests) don't have to initialise +// it. Production code goes through Open(), which also builds it. +func (s *VMService) ensureHandleCache() { + if s.handles == nil { + s.handles = newHandleCache() + } +} + +// setVMHandlesInMemory is a test-only cache seed that skips the +// scratch-file write. Production callers should use setVMHandles so +// the filesystem survives a daemon restart. +func (s *VMService) setVMHandlesInMemory(vmID string, h model.VMHandles) { + if s == nil { + return + } + s.ensureHandleCache() + s.handles.set(vmID, h) +} + +// vmHandles returns the cached handles for vm (zero-value if no +// entry). The in-process handle cache is the authoritative source +// for PID and live kernel/network handles; VMRecord.Runtime only +// mirrors teardown-critical fields for restart recovery. +func (s *VMService) vmHandles(vmID string) model.VMHandles { + if s == nil { + return model.VMHandles{} + } + s.ensureHandleCache() + h, _ := s.handles.get(vmID) + return h +} + +// setVMHandles updates the in-memory cache, mirrors teardown-critical +// fields onto VMRuntime, and writes the per-VM scratch file. +// Scratch-file errors are logged but not returned; the cache remains +// authoritative while the daemon is alive. +// +// Write order: file first, cache second. 
A daemon crash between the +// two leaves the on-disk scratch file ahead of the in-memory cache — +// which is the recoverable direction, since reconcile re-seeds the +// cache from the file on the next start. The reverse order would let +// a crash strand handles the daemon already saw as live but never +// persisted, breaking the next-start teardown of DM/loops/tap. +func (s *VMService) setVMHandles(vm *model.VMRecord, h model.VMHandles) { + if s == nil || vm == nil { + return + } + persistRuntimeTeardownState(vm, h) + s.ensureHandleCache() + if err := writeHandlesFile(vm.Runtime.VMDir, h); err != nil && s.logger != nil { + s.logger.Warn("persist handles.json failed", "vm_id", vm.ID, "error", err.Error()) + } + s.handles.set(vm.ID, h) +} + +// clearVMHandles drops the cache entry and removes the scratch +// file. Called on stop / delete / after a failed start. +func (s *VMService) clearVMHandles(vm model.VMRecord) { + if s == nil { + return + } + s.ensureHandleCache() + s.handles.clear(vm.ID) + removeHandlesFile(vm.Runtime.VMDir) +} + +// vmAlive is the canonical "is this VM actually running?" check. +// Unlike the old `system.ProcessRunning(vm.Runtime.PID, apiSock)` +// pattern, this reads the PID from the handle cache — which is +// authoritative in-process — and verifies the PID against the api +// socket so a recycled PID can't false-positive. +func (s *VMService) vmAlive(vm model.VMRecord) bool { + if vm.State != model.VMStateRunning { + return false + } + h := s.vmHandles(vm.ID) + if h.PID <= 0 { + return false + } + running, err := s.privOps().ProcessRunning(context.Background(), h.PID, vm.Runtime.APISockPath) + return err == nil && running +} + +// rediscoverHandles loads what the last daemon start knew about a VM +// from its handles.json scratch file and verifies the firecracker +// process is still alive. Returns: +// +// - handles: the scratch-file contents (zero-value if no file). 
+// ALWAYS returned, even when alive=false, because the caller +// needs them to tear down kernel state (dm-snapshot, loops, tap) +// that the previous daemon left behind when it died. +// - alive: true iff a firecracker process matching the api sock is +// currently running. +// - err: unexpected failure (file exists but is corrupt). +// +// Strategy: pgrep by api sock path first (handles the case where +// the daemon crashed but the PID changed on respawn — unlikely for +// firecracker, but cheap insurance); fall back to verifying the +// scratch file's PID directly. +func (s *VMService) rediscoverHandles(ctx context.Context, vm model.VMRecord) (model.VMHandles, bool, error) { + saved, _, err := readHandlesFile(vm.Runtime.VMDir) + if err != nil { + return model.VMHandles{}, false, err + } + apiSock := vm.Runtime.APISockPath + if apiSock == "" { + return saved, false, nil + } + if pid, pidErr := s.net.findFirecrackerPID(ctx, apiSock); pidErr == nil && pid > 0 { + saved.PID = pid + return saved, true, nil + } + if saved.PID > 0 { + if running, runErr := s.privOps().ProcessRunning(ctx, saved.PID, apiSock); runErr == nil && running { + return saved, true, nil + } + } + return saved, false, nil +} diff --git a/internal/daemon/vm_handles_test.go b/internal/daemon/vm_handles_test.go new file mode 100644 index 0000000..a1340e8 --- /dev/null +++ b/internal/daemon/vm_handles_test.go @@ -0,0 +1,224 @@ +package daemon + +import ( + "context" + "os" + "path/filepath" + "strings" + "testing" + + "banger/internal/model" +) + +func TestHandlesFileRoundtrip(t *testing.T) { + t.Parallel() + dir := t.TempDir() + want := model.VMHandles{ + PID: 4242, + TapDevice: "tap-fc-abcd", + BaseLoop: "/dev/loop9", + COWLoop: "/dev/loop10", + DMName: "fc-rootfs-abcd", + DMDev: "/dev/mapper/fc-rootfs-abcd", + } + if err := writeHandlesFile(dir, want); err != nil { + t.Fatalf("writeHandlesFile: %v", err) + } + got, present, err := readHandlesFile(dir) + if err != nil { + 
t.Fatalf("readHandlesFile: %v", err) + } + if !present { + t.Fatal("readHandlesFile reported no file after write") + } + if got != want { + t.Fatalf("roundtrip mismatch:\n got %+v\n want %+v", got, want) + } +} + +func TestSetVMHandlesMirrorsRuntimeTeardownState(t *testing.T) { + t.Parallel() + + d := &Daemon{} + wireServices(d) + + vmDir := t.TempDir() + vm := testVM("mirror", "image-mirror", "172.16.0.77") + vm.Runtime.VMDir = vmDir + + want := model.VMHandles{ + TapDevice: "tap-fc-0077", + BaseLoop: "/dev/loop17", + COWLoop: "/dev/loop18", + DMName: "fc-rootfs-0077", + DMDev: "/dev/mapper/fc-rootfs-0077", + } + d.vm.setVMHandles(&vm, want) + + if vm.Runtime.TapDevice != want.TapDevice || vm.Runtime.BaseLoop != want.BaseLoop || vm.Runtime.COWLoop != want.COWLoop || vm.Runtime.DMName != want.DMName || vm.Runtime.DMDev != want.DMDev { + t.Fatalf("runtime teardown state not mirrored: got %+v want %+v", vm.Runtime, want) + } +} + +func TestHandlesFileMissingReturnsZero(t *testing.T) { + t.Parallel() + h, present, err := readHandlesFile(t.TempDir()) + if err != nil { + t.Fatalf("readHandlesFile (missing): %v", err) + } + if present { + t.Fatal("present = true for missing file") + } + if !h.IsZero() { + t.Fatalf("expected zero-value handles, got %+v", h) + } +} + +func TestHandlesFileCorruptReturnsError(t *testing.T) { + t.Parallel() + dir := t.TempDir() + if err := os.WriteFile(handlesFilePath(dir), []byte("{not json"), 0o600); err != nil { + t.Fatalf("WriteFile: %v", err) + } + if _, _, err := readHandlesFile(dir); err == nil { + t.Fatal("expected parse error for corrupt file") + } +} + +func TestHandleCacheConcurrent(t *testing.T) { + t.Parallel() + c := newHandleCache() + done := make(chan struct{}) + // One writer, multiple readers — prove the RWMutex usage. 
+ go func() { + for i := 0; i < 1000; i++ { + c.set("vm-1", model.VMHandles{PID: i}) + } + close(done) + }() + for i := 0; i < 1000; i++ { + _, _ = c.get("vm-1") + } + <-done + c.clear("vm-1") + if _, ok := c.get("vm-1"); ok { + t.Fatal("cache entry still present after clear") + } +} + +// TestRediscoverHandlesLoadsScratchWhenProcessDead proves the stale- +// cleanup path: the firecracker process is gone, but the scratch +// file tells us which kernel resources the previous daemon still +// owes us a teardown on. +func TestRediscoverHandlesLoadsScratchWhenProcessDead(t *testing.T) { + t.Parallel() + + vmDir := t.TempDir() + apiSock := filepath.Join(t.TempDir(), "fc.sock") + stale := model.VMHandles{ + PID: 999999, + BaseLoop: "/dev/loop99", + COWLoop: "/dev/loop100", + DMName: "fc-rootfs-gone", + DMDev: "/dev/mapper/fc-rootfs-gone", + } + if err := writeHandlesFile(vmDir, stale); err != nil { + t.Fatalf("writeHandlesFile: %v", err) + } + + // A scripted runner that reports "no such process" when reconcile + // probes via pgrep. + runner := &scriptedRunner{ + t: t, + steps: []runnerStep{ + {call: runnerCall{name: "pgrep", args: []string{"-n", "-f", apiSock}}, err: &exitErr{code: 1}}, + }, + } + d := &Daemon{runner: runner} + wireServices(d) + vm := testVM("gone", "image-gone", "172.16.0.250") + vm.Runtime.APISockPath = apiSock + vm.Runtime.VMDir = vmDir + + got, alive, err := d.vm.rediscoverHandles(context.Background(), vm) + if err != nil { + t.Fatalf("rediscoverHandles: %v", err) + } + if alive { + t.Fatal("alive = true, want false (process dead)") + } + // Even when dead, the scratch handles must be returned so + // cleanupRuntime can tear DM + loops + tap down. 
+ if got.DMName != stale.DMName || got.BaseLoop != stale.BaseLoop || got.COWLoop != stale.COWLoop { + t.Fatalf("stale handles lost: got %+v, want fields from %+v", got, stale) + } + runner.assertExhausted() +} + +// TestRediscoverHandlesPrefersLivePIDOverScratch: scratch file has an +// old PID, but pgrep finds the actual current PID via the api sock. +func TestRediscoverHandlesPrefersLivePIDOverScratch(t *testing.T) { + t.Parallel() + + vmDir := t.TempDir() + apiSock := filepath.Join(t.TempDir(), "fc.sock") + if err := writeHandlesFile(vmDir, model.VMHandles{PID: 111, DMName: "dm-x"}); err != nil { + t.Fatalf("writeHandlesFile: %v", err) + } + + runner := &scriptedRunner{ + t: t, + steps: []runnerStep{ + {call: runnerCall{name: "pgrep", args: []string{"-n", "-f", apiSock}}, out: []byte("222\n")}, + }, + } + d := &Daemon{runner: runner} + wireServices(d) + vm := testVM("moved", "image-moved", "172.16.0.251") + vm.Runtime.APISockPath = apiSock + vm.Runtime.VMDir = vmDir + + got, alive, err := d.vm.rediscoverHandles(context.Background(), vm) + if err != nil { + t.Fatalf("rediscoverHandles: %v", err) + } + if !alive { + t.Fatal("alive = false, want true (pgrep found a PID)") + } + if got.PID != 222 { + t.Fatalf("PID = %d, want 222 (from pgrep, not scratch)", got.PID) + } + if got.DMName != "dm-x" { + t.Fatalf("scratch fields dropped: %+v", got) + } + runner.assertExhausted() +} + +// TestClearVMHandlesRemovesScratchFile proves the cleanup contract. 
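Both rediscovery tests above script `pgrep -n -f <apiSock>`; the production `findFirecrackerPID` is not shown in this diff. A standalone sketch of that probe, with hypothetical helper names:

```go
package main

import (
	"errors"
	"os/exec"
	"strconv"
	"strings"
)

// parsePgrepPID turns `pgrep -n` stdout ("222\n") into a PID.
func parsePgrepPID(out []byte) (int, error) {
	return strconv.Atoi(strings.TrimSpace(string(out)))
}

// findPIDBySock asks pgrep for the newest (-n) process whose full
// command line (-f) mentions the api socket path. pgrep exits 1 when
// nothing matches; that maps to (0, nil), because "no such process"
// is a normal answer here, not an error.
func findPIDBySock(apiSock string) (int, error) {
	out, err := exec.Command("pgrep", "-n", "-f", apiSock).Output()
	if err != nil {
		var ee *exec.ExitError
		if errors.As(err, &ee) && ee.ExitCode() == 1 {
			return 0, nil
		}
		return 0, err
	}
	return parsePgrepPID(out)
}

func main() {
	pid, err := parsePgrepPID([]byte("222\n"))
	println(pid, err == nil) // 222 true
}
```

Matching on the socket path rather than the saved PID is what makes the probe robust to a respawned firecracker process, which is exactly the case TestRediscoverHandlesPrefersLivePIDOverScratch exercises.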
+func TestClearVMHandlesRemovesScratchFile(t *testing.T) { + t.Parallel() + vmDir := t.TempDir() + if err := writeHandlesFile(vmDir, model.VMHandles{PID: 42}); err != nil { + t.Fatalf("writeHandlesFile: %v", err) + } + + d := &Daemon{} + wireServices(d) + vm := testVM("sweep", "image-sweep", "172.16.0.252") + vm.Runtime.VMDir = vmDir + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: 42}) + d.vm.clearVMHandles(vm) + + if _, err := os.Stat(handlesFilePath(vmDir)); !os.IsNotExist(err) { + t.Fatalf("scratch file still present: %v", err) + } + if h, ok := d.vm.handles.get(vm.ID); ok && !h.IsZero() { + t.Fatalf("cache entry survives clear: %+v", h) + } +} + +// exitErr is a minimal stand-in for an exec-style non-zero exit. +// Used by scripted runners to simulate "pgrep found nothing", so only +// exit code 1 (pgrep's no-match status) is exercised here. +type exitErr struct{ code int } + +func (e *exitErr) Error() string { return "exit status " + strings.Repeat("1", e.code) } diff --git a/internal/daemon/vm_lifecycle.go b/internal/daemon/vm_lifecycle.go new file mode 100644 index 0000000..ca0aad7 --- /dev/null +++ b/internal/daemon/vm_lifecycle.go @@ -0,0 +1,342 @@ +package daemon + +import ( + "context" + "errors" + "io" + "net" + "os" + "path/filepath" + "strings" + "time" + + "banger/internal/api" + "banger/internal/guest" + "banger/internal/model" + "banger/internal/system" +) + +func (s *VMService) StartVM(ctx context.Context, idOrName string) (model.VMRecord, error) { + return s.withVMLockByRef(ctx, idOrName, func(vm model.VMRecord) (model.VMRecord, error) { + image, err := s.store.GetImageByID(ctx, vm.ImageID) + if err != nil { + return model.VMRecord{}, err + } + if s.vmAlive(vm) { + if s.logger != nil { + s.logger.Info("vm already running", vmLogAttrs(vm)...)
+ } + return vm, nil + } + return s.startVMLocked(ctx, vm, image) + }) +} + +func (s *VMService) startVMLocked(ctx context.Context, vm model.VMRecord, image model.Image) (_ model.VMRecord, err error) { + op := s.beginOperation(ctx, "vm.start", append(vmLogAttrs(vm), imageLogAttrs(image)...)...) + defer func() { + if err != nil { + err = annotateLogPath(err, vm.Runtime.LogPath) + op.fail(err, vmLogAttrs(vm)...) + return + } + op.done(vmLogAttrs(vm)...) + }() + + // Derive per-VM paths/names up front so every step sees the same + // values. Shortening vm.ID mirrors how the pre-refactor inline + // code did it. + shortID := system.ShortID(vm.ID) + apiSock := filepath.Join(s.layout.RuntimeDir, "fc-"+shortID+".sock") + dmName := "fc-rootfs-" + shortID + tapName := "tap-fc-" + shortID + if strings.TrimSpace(vm.Runtime.VSockPath) == "" { + vm.Runtime.VSockPath = defaultVSockPath(s.layout.RuntimeDir, vm.ID) + } + if vm.Runtime.VSockCID == 0 { + vm.Runtime.VSockCID, err = defaultVSockCID(vm.Runtime.GuestIP) + if err != nil { + return model.VMRecord{}, err + } + } + + live := model.VMHandles{} + sc := &startContext{ + vm: &vm, + image: image, + live: &live, + apiSock: apiSock, + dmName: dmName, + tapName: tapName, + } + + if runErr := s.runStartSteps(ctx, op, sc, s.buildStartSteps(op, sc)); runErr != nil { + // The step driver already ran rollback in reverse for every + // succeeded step. All that's left is to persist the ERROR + // state so operators see the failure via `vm show`. Use a + // fresh context in case the request ctx is cancelled — DB + // writes past this point are recovery, not user-driven. + // + // The store check is for tests that construct a bare Daemon + // without a DB; production always has s.store non-nil. 
+ vm.State = model.VMStateError + vm.Runtime.State = model.VMStateError + vm.Runtime.LastError = runErr.Error() + clearRuntimeTeardownState(&vm) + s.clearVMHandles(vm) + if s.store != nil { + // We're in the recovery path: the start has already + // failed, and the user will see runErr. A persist + // failure here only affects what 'banger vm show' + // reads on the next call, so we keep returning runErr + // — but a silent swallow leaves operators unable to + // debug "why does the record still say running?". Log + // at warn instead. + if persistErr := s.store.UpsertVM(context.Background(), vm); persistErr != nil && s.logger != nil { + s.logger.Warn("persist vm error state failed", append(vmLogAttrs(vm), "error", persistErr.Error())...) + } + } + return model.VMRecord{}, runErr + } + return vm, nil +} + +func (s *VMService) StopVM(ctx context.Context, idOrName string) (model.VMRecord, error) { + return s.withVMLockByRef(ctx, idOrName, func(vm model.VMRecord) (model.VMRecord, error) { + return s.stopVMLocked(ctx, vm) + }) +} + +func (s *VMService) stopVMLocked(ctx context.Context, current model.VMRecord) (vm model.VMRecord, err error) { + vm = current + op := s.beginOperation(ctx, "vm.stop", "vm_ref", vm.ID) + defer func() { + if err != nil { + op.fail(err, vmLogAttrs(vm)...) + return + } + op.done(vmLogAttrs(vm)...) + }() + if !s.vmAlive(vm) { + op.stage("cleanup_stale_runtime") + if err := s.cleanupRuntime(ctx, vm, true); err != nil { + return model.VMRecord{}, err + } + vm.State = model.VMStateStopped + vm.Runtime.State = model.VMStateStopped + clearRuntimeTeardownState(&vm) + s.clearVMHandles(vm) + if err := s.store.UpsertVM(ctx, vm); err != nil { + return model.VMRecord{}, err + } + return vm, nil + } + op.stage("graceful_shutdown") + // Reach into the guest over SSH to force a sync + queue a poweroff. 
+ // The sync is what keeps stop() from losing data: every dirty page + // the guest hasn't flushed through virtio-blk to the work disk is + // written out before this RPC returns. Once sync completes, + // root.ext4 on the host is consistent and cleanupRuntime's SIGKILL + // is safe — there is no benefit to waiting for the guest's + // poweroff.target to finish, so we skip waitForExit entirely. + // + // When SSH is unreachable (broken sshd, network down, drifted host + // key) we drop straight to SIGKILL via cleanupRuntime. The + // previous fallback was SendCtrlAltDel + a 10-second wait for FC + // to exit, but on Debian ctrl+alt+del routes to reboot.target, so + // FC never exits on it — the wait was always a wasted 10s. We pay + // the data-loss cost we already paid before (after the timeout + // expired the old code SIGKILLed too), but without the latency. + if err := s.requestGuestPoweroff(ctx, vm); err != nil { + if s.logger != nil { + s.logger.Warn("guest ssh poweroff failed; SIGKILL without sync", + append(vmLogAttrs(vm), "error", err.Error())...) + } + } + op.stage("cleanup_runtime") + if err := s.cleanupRuntime(ctx, vm, true); err != nil { + return model.VMRecord{}, err + } + vm.State = model.VMStateStopped + vm.Runtime.State = model.VMStateStopped + clearRuntimeTeardownState(&vm) + s.clearVMHandles(vm) + system.TouchNow(&vm) + if err := s.store.UpsertVM(ctx, vm); err != nil { + return model.VMRecord{}, err + } + return vm, nil +} + +// requestGuestPoweroff dials the guest over SSH and runs a sync + +// queues a poweroff job. The sync is the load-bearing piece — see the +// comment in stopVMLocked. Returns the dial / SSH error if the guest +// is unreachable; the caller treats that as a fallback signal. +// +// Bounded by a hard 2-second SSH-dial timeout. 
A reachable guest on +// the host bridge dials in single-digit milliseconds; if we haven't +// connected in 2s the guest is effectively gone, so we fail fast and +// let the caller SIGKILL rather than burning latency on a doomed dial. +func (s *VMService) requestGuestPoweroff(ctx context.Context, vm model.VMRecord) error { + guestIP := strings.TrimSpace(vm.Runtime.GuestIP) + if guestIP == "" { + return errors.New("guest IP unknown") + } + dialCtx, cancel := context.WithTimeout(ctx, 2*time.Second) + defer cancel() + address := net.JoinHostPort(guestIP, "22") + client, err := guest.Dial(dialCtx, address, s.config.SSHKeyPath, s.layout.KnownHostsPath) + if err != nil { + return err + } + defer client.Close() + // `sync` runs synchronously and blocks RunScript until every dirty + // page hits virtio-blk → root.ext4. That's the persistence + // guarantee. The `systemctl --no-block poweroff` queues a job and + // returns; whether poweroff.target completes before the SIGKILL + // fallback fires is incidental — by then sync has already done + // its work. The `|| /sbin/poweroff -f` is the last-ditch fallback + // when systemd itself is wedged. + const script = "sync; systemctl --no-block poweroff || /sbin/poweroff -f &" + return client.RunScript(ctx, script, io.Discard) +} + +func (s *VMService) KillVM(ctx context.Context, params api.VMKillParams) (model.VMRecord, error) { + return s.withVMLockByRef(ctx, params.IDOrName, func(vm model.VMRecord) (model.VMRecord, error) { + return s.killVMLocked(ctx, vm, params.Signal) + }) +} + +func (s *VMService) killVMLocked(ctx context.Context, current model.VMRecord, signalValue string) (vm model.VMRecord, err error) { + vm = current + op := s.beginOperation(ctx, "vm.kill", "vm_ref", vm.ID, "signal", signalValue) + defer func() { + if err != nil { + op.fail(err, vmLogAttrs(vm)...) + return + } + op.done(vmLogAttrs(vm)...) 
+ }() + if !s.vmAlive(vm) { + op.stage("cleanup_stale_runtime") + if err := s.cleanupRuntime(ctx, vm, true); err != nil { + return model.VMRecord{}, err + } + vm.State = model.VMStateStopped + vm.Runtime.State = model.VMStateStopped + clearRuntimeTeardownState(&vm) + s.clearVMHandles(vm) + if err := s.store.UpsertVM(ctx, vm); err != nil { + return model.VMRecord{}, err + } + return vm, nil + } + + signal := strings.TrimSpace(signalValue) + if signal == "" { + signal = "TERM" + } + pid := s.vmHandles(vm.ID).PID + op.stage("send_signal", "pid", pid, "signal", signal) + if err := s.privOps().SignalProcess(ctx, pid, signal); err != nil { + return model.VMRecord{}, err + } + op.stage("wait_for_exit", "pid", pid) + if err := s.net.waitForExit(ctx, pid, vm.Runtime.APISockPath, 30*time.Second); err != nil { + if !errors.Is(err, errWaitForExitTimeout) { + return model.VMRecord{}, err + } + op.stage("signal_timeout", "pid", pid, "signal", signal) + } + op.stage("cleanup_runtime") + if err := s.cleanupRuntime(ctx, vm, true); err != nil { + return model.VMRecord{}, err + } + vm.State = model.VMStateStopped + vm.Runtime.State = model.VMStateStopped + clearRuntimeTeardownState(&vm) + s.clearVMHandles(vm) + system.TouchNow(&vm) + if err := s.store.UpsertVM(ctx, vm); err != nil { + return model.VMRecord{}, err + } + return vm, nil +} + +func (s *VMService) RestartVM(ctx context.Context, idOrName string) (vm model.VMRecord, err error) { + op := s.beginOperation(ctx, "vm.restart", "vm_ref", idOrName) + defer func() { + if err != nil { + op.fail(err, vmLogAttrs(vm)...) + return + } + op.done(vmLogAttrs(vm)...) 
+ }() + resolved, err := s.FindVM(ctx, idOrName) + if err != nil { + return model.VMRecord{}, err + } + return s.withVMLockByID(ctx, resolved.ID, func(vm model.VMRecord) (model.VMRecord, error) { + op.stage("stop") + vm, err = s.stopVMLocked(ctx, vm) + if err != nil { + return model.VMRecord{}, err + } + image, err := s.store.GetImageByID(ctx, vm.ImageID) + if err != nil { + return model.VMRecord{}, err + } + op.stage("start", vmLogAttrs(vm)...) + return s.startVMLocked(ctx, vm, image) + }) +} + +func (s *VMService) DeleteVM(ctx context.Context, idOrName string) (model.VMRecord, error) { + return s.withVMLockByRef(ctx, idOrName, func(vm model.VMRecord) (model.VMRecord, error) { + return s.deleteVMLocked(ctx, vm) + }) +} + +func (s *VMService) deleteVMLocked(ctx context.Context, current model.VMRecord) (vm model.VMRecord, err error) { + vm = current + op := s.beginOperation(ctx, "vm.delete", "vm_ref", vm.ID) + defer func() { + if err != nil { + op.fail(err, vmLogAttrs(vm)...) + return + } + op.done(vmLogAttrs(vm)...) + }() + if s.vmAlive(vm) { + pid := s.vmHandles(vm.ID).PID + op.stage("kill_running_vm", "pid", pid) + // Best-effort: cleanupRuntime below tears the process down + // regardless. A kill failure here only matters when it + // surfaces something operators should see (permission + // denied, etc.), so promote it from a silent _ to a Warn + // without changing the control flow. + if killErr := s.net.killVMProcess(ctx, pid); killErr != nil && s.logger != nil { + s.logger.Warn("kill vm process during delete failed", append(vmLogAttrs(vm), "pid", pid, "error", killErr.Error())...) 
+ } + } + op.stage("cleanup_runtime") + if err := s.cleanupRuntime(ctx, vm, false); err != nil { + return model.VMRecord{}, err + } + clearRuntimeTeardownState(&vm) + op.stage("delete_store_record") + if err := s.store.DeleteVM(ctx, vm.ID); err != nil { + return model.VMRecord{}, err + } + if vm.Runtime.VMDir != "" { + op.stage("delete_vm_dir", "vm_dir", vm.Runtime.VMDir) + if err := os.RemoveAll(vm.Runtime.VMDir); err != nil { + return model.VMRecord{}, err + } + } + // Drop any host-key pins. A future VM reusing this IP or name + // would otherwise trip the TOFU mismatch branch in + // TOFUHostKeyCallback and fail to connect. + removeVMKnownHosts(s.layout.KnownHostsPath, vm, s.logger) + return vm, nil +} diff --git a/internal/daemon/vm_lifecycle_steps.go b/internal/daemon/vm_lifecycle_steps.go new file mode 100644 index 0000000..30f2c02 --- /dev/null +++ b/internal/daemon/vm_lifecycle_steps.go @@ -0,0 +1,442 @@ +package daemon + +import ( + "context" + "errors" + "fmt" + "os" + "strings" + + "banger/internal/firecracker" + "banger/internal/imagepull" + "banger/internal/model" + "banger/internal/roothelper" + "banger/internal/system" +) + +// jailerOpts returns the jailer launch options to bundle in the firecracker +// launch request, or nil when the jailer is disabled or misconfigured. +// nil makes the launch fall back to the legacy direct-firecracker path. +func (s *VMService) jailerOpts() *roothelper.JailerLaunchOpts { + if !s.config.JailerEnabled { + return nil + } + if strings.TrimSpace(s.config.JailerBin) == "" || strings.TrimSpace(s.config.JailerChrootBase) == "" { + return nil + } + return &roothelper.JailerLaunchOpts{ + Binary: s.config.JailerBin, + ChrootBaseDir: s.config.JailerChrootBase, + UID: os.Getuid(), + GID: os.Getgid(), + } +} + +// buildKernelArgs assembles the kernel command line for a start. 
+// Direct-boot images (no initrd) get kernel-level IP config so the +// network is up before init, plus init= pointing at the universal +// first-boot wrapper. Anything else uses the plain variant. +func buildKernelArgs(vm model.VMRecord, image model.Image, bridgeIP, defaultDNS string) string { + if strings.TrimSpace(image.InitrdPath) == "" { + return system.BuildBootArgsWithKernelIP( + vm.Name, vm.Runtime.GuestIP, bridgeIP, defaultDNS, + ) + " init=" + imagepull.FirstBootScriptPath + } + return system.BuildBootArgs(vm.Name) +} + +// startContext is the mutable state threaded through every start +// step. `vm` and `live` are pointers so steps mutate in place — +// dodges returning redundant copies and keeps step bodies readable. +// Values computed by `startVMLocked` before the driver runs +// (apiSock, dmName, tapName) live here too so each step can read +// them without rederiving. +type startContext struct { + vm *model.VMRecord + image model.Image + live *model.VMHandles + apiSock string + dmName string + tapName string + fcPath string + + // systemOverlayCreated records whether the system_overlay step + // actually created the file (vs. the file existing from a crashed + // prior attempt). The undo honours it so a leftover-but-valid + // overlay isn't deleted under us. + systemOverlayCreated bool +} + +// startStep is one phase in the start-VM pipeline. Phases with no +// rollback obligation leave `undo` nil — the driver simply skips +// them on the rollback path. `createStage` / `createDetail` are +// forwarded to `vmCreateStage` so the async-create RPC caller sees +// progress; they're "" for phases that were never part of the +// user-facing progress stream. 
+type startStep struct { + name string + attrs []any + createStage string + createDetail string + run func(ctx context.Context, sc *startContext) error + undo func(ctx context.Context, sc *startContext) error +} + +// runStartSteps walks steps in order, logging each via `op.stage` +// (and `vmCreateStage` when the step opted in). On the first +// run-err, it rolls back the prefix (including the failing step, so +// a step that acquired resources before erroring gets its undo +// fired) and returns the original err joined with any rollback err. +// +// Contract: `undo` must be safe to call even when `run` returned +// an error — check zero-value guards rather than assuming success. +// This is cheaper than a two-phase acquire/commit per step and +// matches how `cleanupPreparedCapabilities` in capabilities.go +// treats partial-success rollback. +func (s *VMService) runStartSteps(ctx context.Context, op *operationLog, sc *startContext, steps []startStep) error { + done := make([]startStep, 0, len(steps)) + for _, step := range steps { + if step.createStage != "" { + vmCreateStage(ctx, step.createStage, step.createDetail) + } + op.stage(step.name, step.attrs...) + if err := step.run(ctx, sc); err != nil { + done = append(done, step) // include the failing step — see contract above + if rollbackErr := s.rollbackStartSteps(op, sc, done); rollbackErr != nil { + err = errors.Join(err, rollbackErr) + } + return err + } + done = append(done, step) + } + return nil +} + +// rollbackStartSteps iterates completed steps in reverse, calling +// each non-nil `undo` with a detached context — the original ctx +// may already be cancelled (RPC client disconnect), but cleanup +// still needs to run. Undo errors are joined together; one step's +// failure doesn't short-circuit the rest. 
+func (s *VMService) rollbackStartSteps(op *operationLog, sc *startContext, done []startStep) error { + var err error + for i := len(done) - 1; i >= 0; i-- { + step := done[i] + if step.undo == nil { + continue + } + op.stage("rollback_" + step.name) + if undoErr := step.undo(context.Background(), sc); undoErr != nil { + err = errors.Join(err, fmt.Errorf("rollback %s: %w", step.name, undoErr)) + } + } + return err +} + +// buildStartSteps returns the ordered list of phases startVMLocked +// drives. Keeping the list as data (vs. a long linear method body) +// makes the phase inventory diff-readable and lets a test driver +// substitute its own step slice. +// +// Phase names MUST stay 1:1 with the prior inline version — they +// appear in daemon logs, smoke-log greps, and the async-create +// progress stream that clients read. +func (s *VMService) buildStartSteps(op *operationLog, sc *startContext) []startStep { + return []startStep{ + { + name: "preflight", + createStage: "preflight", + createDetail: "checking host prerequisites", + run: func(ctx context.Context, sc *startContext) error { + if err := s.validateStartPrereqs(ctx, *sc.vm, sc.image); err != nil { + return err + } + return os.MkdirAll(sc.vm.Runtime.VMDir, 0o755) + }, + }, + { + name: "cleanup_runtime", + run: func(ctx context.Context, sc *startContext) error { + if err := s.cleanupRuntime(ctx, *sc.vm, true); err != nil { + return err + } + s.clearVMHandles(*sc.vm) + return nil + }, + }, + { + name: "bridge", + run: func(ctx context.Context, _ *startContext) error { + return s.net.ensureBridge(ctx) + }, + }, + { + name: "socket_dir", + run: func(_ context.Context, _ *startContext) error { + return s.net.ensureSocketDir() + }, + }, + { + // prepare_sockets is a new op.stage label — the prior + // inline code ran these `os.RemoveAll` calls before the + // system_overlay stage without a stage marker. Keeping a + // distinct name makes the log trace and rollback (if any + // later step fails) unambiguous. 
+ name: "prepare_sockets", + run: func(_ context.Context, sc *startContext) error { + if err := os.RemoveAll(sc.apiSock); err != nil && !os.IsNotExist(err) { + return err + } + if err := os.RemoveAll(sc.vm.Runtime.VSockPath); err != nil && !os.IsNotExist(err) { + return err + } + return nil + }, + }, + { + name: "system_overlay", + attrs: []any{"overlay_path", sc.vm.Runtime.SystemOverlay}, + createStage: "prepare_rootfs", + createDetail: "preparing system overlay", + run: func(ctx context.Context, sc *startContext) error { + // Record ownership BEFORE the call so a partial-truncate + // failure still triggers cleanup of the half-created file. + if !exists(sc.vm.Runtime.SystemOverlay) { + sc.systemOverlayCreated = true + } + return s.ensureSystemOverlay(ctx, sc.vm) + }, + undo: func(_ context.Context, sc *startContext) error { + if !sc.systemOverlayCreated { + return nil + } + if err := os.Remove(sc.vm.Runtime.SystemOverlay); err != nil && !os.IsNotExist(err) { + return err + } + return nil + }, + }, + { + name: "dm_snapshot", + attrs: []any{"dm_name", sc.dmName}, + createStage: "prepare_rootfs", + createDetail: "creating root filesystem snapshot", + run: func(ctx context.Context, sc *startContext) error { + snapHandles, err := s.net.createDMSnapshot(ctx, sc.image.RootfsPath, sc.vm.Runtime.SystemOverlay, sc.dmName) + if err != nil { + // createDMSnapshot cleans up its own partial state on + // err; leave sc.live zero so the undo is a no-op. + return err + } + sc.live.BaseLoop = snapHandles.BaseLoop + sc.live.COWLoop = snapHandles.COWLoop + sc.live.DMName = snapHandles.DMName + sc.live.DMDev = snapHandles.DMDev + s.setVMHandles(sc.vm, *sc.live) + // Fields that used to land next to the (now-deleted) + // cleanupOnErr closure. They belong with the DM + // snapshot because that's the first step producing + // runtime identity the downstream code reads back. 
+ sc.vm.Runtime.APISockPath = sc.apiSock + sc.vm.Runtime.State = model.VMStateRunning + sc.vm.State = model.VMStateRunning + sc.vm.Runtime.LastError = "" + return nil + }, + undo: func(ctx context.Context, sc *startContext) error { + if sc.live.DMName == "" && sc.live.BaseLoop == "" && sc.live.COWLoop == "" { + return nil + } + return s.net.cleanupDMSnapshot(ctx, dmSnapshotHandles{ + BaseLoop: sc.live.BaseLoop, + COWLoop: sc.live.COWLoop, + DMName: sc.live.DMName, + DMDev: sc.live.DMDev, + }) + }, + }, + { + // e2fsck protects against stale bitmaps in a COW reused + // from a prior aborted start — without it, e2cp/e2rm in + // patch_root_overlay refuse to touch the snapshot. On a + // freshly-created COW (system_overlay just truncated + + // created the file this run) there are no stale bitmaps + // to repair and e2fsck is pure overhead. Skip it in that + // case. Exit codes 0 + 1 are both "ok" when we do run it. + name: "fsck_snapshot", + run: func(ctx context.Context, sc *startContext) error { + if sc.systemOverlayCreated { + return nil + } + return s.privOps().FsckSnapshot(ctx, sc.live.DMDev) + }, + }, + { + name: "patch_root_overlay", + createStage: "prepare_rootfs", + createDetail: "writing guest configuration", + run: func(ctx context.Context, sc *startContext) error { + return s.patchRootOverlay(ctx, *sc.vm, sc.image, sc.live.DMDev) + }, + }, + { + name: "prepare_host_features", + createStage: "prepare_host_features", + createDetail: "preparing host-side vm features", + run: func(ctx context.Context, sc *startContext) error { + return s.capHooks.prepareHosts(ctx, sc.vm, sc.image) + }, + // On err, prepareHosts already cleaned up the prefix that + // succeeded before the failing capability. On success, any + // LATER step failure triggers this undo, which tears down + // ALL prepared caps via their Cleanup hooks. 
+ undo: func(ctx context.Context, sc *startContext) error { + return s.capHooks.cleanupState(ctx, *sc.vm) + }, + }, + { + name: "tap", + run: func(ctx context.Context, sc *startContext) error { + tap, err := s.net.acquireTap(ctx, sc.tapName) + if err != nil { + return err + } + sc.live.TapDevice = tap + s.setVMHandles(sc.vm, *sc.live) + return nil + }, + undo: func(ctx context.Context, sc *startContext) error { + if sc.live.TapDevice == "" { + return nil + } + return s.net.releaseTap(ctx, sc.live.TapDevice) + }, + }, + { + name: "metrics_file", + attrs: []any{"metrics_path", sc.vm.Runtime.MetricsPath}, + run: func(_ context.Context, sc *startContext) error { + return os.WriteFile(sc.vm.Runtime.MetricsPath, nil, 0o644) + }, + undo: func(_ context.Context, sc *startContext) error { + if err := os.Remove(sc.vm.Runtime.MetricsPath); err != nil && !os.IsNotExist(err) { + return err + } + return nil + }, + }, + { + name: "firecracker_binary", + run: func(ctx context.Context, sc *startContext) error { + fcPath, err := s.net.firecrackerBinary(ctx) + if err != nil { + return err + } + sc.fcPath = fcPath + return nil + }, + }, + { + name: "firecracker_launch", + attrs: []any{"log_path", sc.vm.Runtime.LogPath, "metrics_path", sc.vm.Runtime.MetricsPath}, + createStage: "boot_firecracker", + createDetail: "starting firecracker", + run: func(ctx context.Context, sc *startContext) error { + kernelArgs := buildKernelArgs(*sc.vm, sc.image, s.config.BridgeIP, s.config.DefaultDNS) + launchReq := roothelper.FirecrackerLaunchRequest{ + BinaryPath: sc.fcPath, + VMID: sc.vm.ID, + SocketPath: sc.apiSock, + LogPath: sc.vm.Runtime.LogPath, + MetricsPath: sc.vm.Runtime.MetricsPath, + KernelImagePath: sc.image.KernelPath, + InitrdPath: sc.image.InitrdPath, + KernelArgs: kernelArgs, + Drives: []firecracker.DriveConfig{{ + ID: "rootfs", + Path: sc.live.DMDev, + ReadOnly: false, + IsRoot: true, + }}, + TapDevice: sc.live.TapDevice, + VSockPath: sc.vm.Runtime.VSockPath, + VSockCID: 
sc.vm.Runtime.VSockCID, + VCPUCount: sc.vm.Spec.VCPUCount, + MemoryMiB: sc.vm.Spec.MemoryMiB, + Jailer: s.jailerOpts(), + } + machineConfig := firecracker.MachineConfig{Drives: launchReq.Drives} + s.capHooks.contributeMachine(&machineConfig, *sc.vm, sc.image) + launchReq.Drives = machineConfig.Drives + pid, err := s.privOps().LaunchFirecracker(ctx, launchReq) + if err != nil { + return err + } + sc.live.PID = pid + s.setVMHandles(sc.vm, *sc.live) + op.debugStage("firecracker_started", "pid", sc.live.PID) + return nil + }, + undo: func(ctx context.Context, sc *startContext) error { + var errs []error + if sc.live.PID > 0 { + if err := s.net.killVMProcess(ctx, sc.live.PID); err != nil { + errs = append(errs, err) + } + } + if err := os.Remove(sc.apiSock); err != nil && !os.IsNotExist(err) { + errs = append(errs, err) + } + if err := os.Remove(sc.vm.Runtime.VSockPath); err != nil && !os.IsNotExist(err) { + errs = append(errs, err) + } + return errors.Join(errs...) + }, + }, + { + name: "socket_access", + attrs: []any{"api_socket", sc.apiSock}, + run: func(ctx context.Context, sc *startContext) error { + return s.net.ensureSocketAccess(ctx, sc.apiSock, "firecracker api socket") + }, + }, + { + name: "vsock_access", + attrs: []any{"vsock_path", sc.vm.Runtime.VSockPath, "vsock_cid", sc.vm.Runtime.VSockCID}, + run: func(ctx context.Context, sc *startContext) error { + return s.net.ensureSocketAccess(ctx, sc.vm.Runtime.VSockPath, "firecracker vsock socket") + }, + }, + { + name: "wait_vsock_agent", + createStage: "wait_vsock_agent", + createDetail: "waiting for guest vsock agent", + run: func(ctx context.Context, sc *startContext) error { + return s.net.waitForGuestVSockAgent(ctx, sc.vm.Runtime.VSockPath, vsockReadyWait) + }, + }, + { + name: "post_start_features", + createStage: "wait_guest_ready", + createDetail: "waiting for guest services", + run: func(ctx context.Context, sc *startContext) error { + return s.capHooks.postStart(ctx, *sc.vm, sc.image) + }, + // 
Capability Cleanup hooks are designed to be idempotent + // (check feature-enabled flag, no-op if nothing to undo), + // so calling cleanupState here is safe whether postStart + // reached every cap or bailed midway. + undo: func(ctx context.Context, sc *startContext) error { + return s.capHooks.cleanupState(ctx, *sc.vm) + }, + }, + { + name: "persist", + createStage: "finalize", + createDetail: "saving vm state", + run: func(ctx context.Context, sc *startContext) error { + system.TouchNow(sc.vm) + return s.store.UpsertVM(ctx, *sc.vm) + }, + }, + } +} diff --git a/internal/daemon/vm_lifecycle_steps_test.go b/internal/daemon/vm_lifecycle_steps_test.go new file mode 100644 index 0000000..f6998a6 --- /dev/null +++ b/internal/daemon/vm_lifecycle_steps_test.go @@ -0,0 +1,164 @@ +package daemon + +import ( + "context" + "errors" + "io" + "log/slog" + "strings" + "testing" +) + +// TestRunStartSteps_RollsBackInReverseOnFailure pins the driver +// contract at the heart of commit 1's refactor: on a step failure +// (a) every step that succeeded BEFORE the failing one gets its +// undo fired in reverse order; (b) the failing step's undo also +// fires, because steps may acquire partial state before returning +// err; (c) the final error wraps both the run error and any +// rollback errors via errors.Join. 
+func TestRunStartSteps_RollsBackInReverseOnFailure(t *testing.T) { + s := &VMService{} + op := &operationLog{logger: slog.New(slog.NewTextHandler(io.Discard, nil))} + sc := &startContext{} + + var events []string + record := func(label string) func(context.Context, *startContext) error { + return func(context.Context, *startContext) error { + events = append(events, label) + return nil + } + } + recordErr := func(label string, err error) func(context.Context, *startContext) error { + return func(context.Context, *startContext) error { + events = append(events, label) + return err + } + } + + steps := []startStep{ + {name: "first", run: record("run-first"), undo: record("undo-first")}, + {name: "second", run: record("run-second"), undo: record("undo-second")}, + {name: "third", run: recordErr("run-third", errors.New("boom")), undo: record("undo-third")}, + {name: "fourth", run: record("run-fourth"), undo: record("undo-fourth")}, + } + + err := s.runStartSteps(context.Background(), op, sc, steps) + if err == nil || !strings.Contains(err.Error(), "boom") { + t.Fatalf("runStartSteps err = %v, want containing 'boom'", err) + } + + want := []string{ + // Forward run: first, second, third (fails — fourth never runs). + "run-first", "run-second", "run-third", + // Reverse undo: third, second, first. Fourth never ran so no undo-fourth. + "undo-third", "undo-second", "undo-first", + } + if len(events) != len(want) { + t.Fatalf("events length = %d, want %d:\n got: %v\n want: %v", len(events), len(want), events, want) + } + for i := range want { + if events[i] != want[i] { + t.Fatalf("events[%d] = %q, want %q\n got: %v\n want: %v", i, events[i], want[i], events, want) + } + } +} + +// TestRunStartSteps_SkipsNilUndos proves the optional-undo contract: +// steps without teardown obligations leave `undo` nil and the driver +// must silently skip them during rollback rather than panicking. 
+func TestRunStartSteps_SkipsNilUndos(t *testing.T) { + s := &VMService{} + op := &operationLog{logger: slog.New(slog.NewTextHandler(io.Discard, nil))} + sc := &startContext{} + + var undoCalls []string + undo := func(label string) func(context.Context, *startContext) error { + return func(context.Context, *startContext) error { + undoCalls = append(undoCalls, label) + return nil + } + } + noop := func(context.Context, *startContext) error { return nil } + + steps := []startStep{ + {name: "has-undo", run: noop, undo: undo("has-undo")}, + {name: "no-undo", run: noop}, // undo nil intentionally + {name: "failing", run: func(context.Context, *startContext) error { return errors.New("x") }, undo: undo("failing")}, + } + + if err := s.runStartSteps(context.Background(), op, sc, steps); err == nil { + t.Fatal("runStartSteps err = nil, want failure") + } + + // Rollback order: failing (acquired state, so its undo runs), no-undo + // (skipped — nil), has-undo. + want := []string{"failing", "has-undo"} + if len(undoCalls) != len(want) || undoCalls[0] != want[0] || undoCalls[1] != want[1] { + t.Fatalf("undo calls = %v, want %v", undoCalls, want) + } +} + +// TestRunStartSteps_JoinsRollbackErrors asserts that undo errors are +// joined onto the original run error rather than hiding it — the +// caller must always see the root cause ("boom") even when the +// rollback path itself is messy. 
+func TestRunStartSteps_JoinsRollbackErrors(t *testing.T) { + s := &VMService{} + op := &operationLog{logger: slog.New(slog.NewTextHandler(io.Discard, nil))} + sc := &startContext{} + + rootErr := errors.New("boom") + undoErr := errors.New("undo-fail") + + steps := []startStep{ + { + name: "ok", + run: func(context.Context, *startContext) error { return nil }, + undo: func(context.Context, *startContext) error { return undoErr }, + }, + { + name: "fail", + run: func(context.Context, *startContext) error { return rootErr }, + }, + } + + err := s.runStartSteps(context.Background(), op, sc, steps) + if err == nil { + t.Fatal("err = nil, want joined error") + } + if !errors.Is(err, rootErr) { + t.Fatalf("err does not wrap rootErr; got: %v", err) + } + if !errors.Is(err, undoErr) { + t.Fatalf("err does not wrap undoErr; got: %v", err) + } +} + +// TestRunStartSteps_HappyPathNoRollback confirms that when every +// step's run returns nil, no undo fires — rollback is strictly a +// failure-path concern. 
+func TestRunStartSteps_HappyPathNoRollback(t *testing.T) { + s := &VMService{} + op := &operationLog{logger: slog.New(slog.NewTextHandler(io.Discard, nil))} + sc := &startContext{} + + var undoCalled bool + steps := []startStep{ + { + name: "a", + run: func(context.Context, *startContext) error { return nil }, + undo: func(context.Context, *startContext) error { undoCalled = true; return nil }, + }, + { + name: "b", + run: func(context.Context, *startContext) error { return nil }, + }, + } + + if err := s.runStartSteps(context.Background(), op, sc, steps); err != nil { + t.Fatalf("runStartSteps err = %v, want nil", err) + } + if undoCalled { + t.Fatal("undo fired on happy path — rollback must only run on failure") + } +} diff --git a/internal/daemon/vm_locks.go b/internal/daemon/vm_locks.go new file mode 100644 index 0000000..0c731a7 --- /dev/null +++ b/internal/daemon/vm_locks.go @@ -0,0 +1,19 @@ +package daemon + +import "sync" + +// vmLockSet maps VM IDs to per-VM mutexes. Concurrent operations on different +// VMs run in parallel; concurrent operations on the same VM serialise. +type vmLockSet struct { + byID sync.Map // map[string]*sync.Mutex +} + +// lock acquires the mutex for the given VM ID and returns its unlock func. +// LoadOrStore is atomic — exactly one *sync.Mutex wins for each ID, so there +// is no release-then-reacquire TOCTOU window. 
+func (s *vmLockSet) lock(id string) func() { + val, _ := s.byID.LoadOrStore(id, &sync.Mutex{}) + mu := val.(*sync.Mutex) + mu.Lock() + return mu.Unlock +} diff --git a/internal/daemon/vm_service.go b/internal/daemon/vm_service.go new file mode 100644 index 0000000..86908a6 --- /dev/null +++ b/internal/daemon/vm_service.go @@ -0,0 +1,239 @@ +package daemon + +import ( + "context" + "database/sql" + "errors" + "fmt" + "log/slog" + "strings" + "sync" + + "banger/internal/daemon/opstate" + "banger/internal/firecracker" + "banger/internal/guestconfig" + "banger/internal/model" + "banger/internal/paths" + "banger/internal/store" + "banger/internal/system" +) + +// VMService owns VM lifecycle — create / start / stop / restart / +// kill / delete / set — plus the handle cache, create-operation +// registry, stats polling, disk provisioning, ports query, and the +// SSH-client test seams. +// +// It holds pointers to its peer services (HostNetwork, ImageService, +// WorkspaceService) because VM lifecycle really does orchestrate +// across them (start needs bridge + tap + firecracker + auth sync + +// boot). Defining narrow function-typed interfaces for every peer +// method VMService calls would balloon the diff for no real win — +// services remain unexported within the package so nothing outside +// the daemon can see them. +// +// Capability dispatch goes through the capHooks seam rather than a +// *Daemon pointer, so VMService has no path back to the composition +// root. Daemon.buildCapabilityHooks() populates the seam at wiring +// time with the registered-capabilities loops from capabilities.go. +type VMService struct { + runner system.CommandRunner + logger *slog.Logger + config model.DaemonConfig + layout paths.Layout + store *store.Store + + // vmLocks is the per-VM mutex set. Held across entire lifecycle + // ops (start, stop, delete, set) — not just the validation window. 
+ // Workspace.prepare intentionally splits off onto its own lock + // scope; see WorkspaceService. + vmLocks vmLockSet + createVMMu sync.Mutex + createOps opstate.Registry[*vmCreateOperationState] + + // handles caches per-VM transient kernel/process state (PID, tap, + // loop devices, DM name/device). Rebuildable at daemon startup + // from a per-VM handles.json scratch file plus OS inspection. + handles *handleCache + + // Peer services. VMService orchestrates across all three during + // start/stop/delete; pointer fields keep call sites direct without + // promoting the peer API to package-level interfaces. + net *HostNetwork + img *ImageService + ws *WorkspaceService + priv privilegedOps + + // vsockHostDevice is the path preflight + doctor expect to find for + // the vhost-vsock device. Defaults to defaultVsockHostDevice; tests + // point at a tempfile so RequireFile passes without needing the + // real kernel module loaded. + vsockHostDevice string + + // Capability hook dispatch. VMService invokes capabilities via + // these seams, populated by Daemon.buildCapabilityHooks() at + // wiring time. Capability implementations themselves are + // structs with explicit service-pointer fields (see capabilities.go); + // VMService never reaches back to *Daemon. + capHooks capabilityHooks + + beginOperation func(ctx context.Context, name string, attrs ...any) *operationLog +} + +// capabilityHooks bundles the capability-dispatch entry points that +// VMService needs. Populated by Daemon.buildCapabilityHooks() at +// service construction; stubbable in tests that don't care about +// capability side effects. 
+type capabilityHooks struct { + addStartPrereqs func(ctx context.Context, checks *system.Preflight, vm model.VMRecord, image model.Image) + contributeGuest func(builder *guestconfig.Builder, vm model.VMRecord, image model.Image) + contributeMachine func(cfg *firecracker.MachineConfig, vm model.VMRecord, image model.Image) + prepareHosts func(ctx context.Context, vm *model.VMRecord, image model.Image) error + postStart func(ctx context.Context, vm model.VMRecord, image model.Image) error + cleanupState func(ctx context.Context, vm model.VMRecord) error + applyConfigChanges func(ctx context.Context, before, after model.VMRecord) error +} + +type vmServiceDeps struct { + runner system.CommandRunner + logger *slog.Logger + config model.DaemonConfig + layout paths.Layout + store *store.Store + net *HostNetwork + img *ImageService + ws *WorkspaceService + priv privilegedOps + capHooks capabilityHooks + beginOperation func(ctx context.Context, name string, attrs ...any) *operationLog + vsockHostDevice string +} + +func newVMService(deps vmServiceDeps) *VMService { + vsockPath := deps.vsockHostDevice + if vsockPath == "" { + vsockPath = defaultVsockHostDevice + } + return &VMService{ + runner: deps.runner, + logger: deps.logger, + config: deps.config, + layout: deps.layout, + store: deps.store, + net: deps.net, + img: deps.img, + ws: deps.ws, + priv: deps.priv, + capHooks: deps.capHooks, + beginOperation: deps.beginOperation, + vsockHostDevice: vsockPath, + handles: newHandleCache(), + } +} + +// buildCapabilityHooks adapts Daemon's existing capability-dispatch +// methods into the capabilityHooks bag VMService takes. Keeps the +// registry + capability types on *Daemon while letting VMService call +// into them through explicit function seams. 
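The adaptation described above relies on Go method values: a method bound to a receiver can populate a function-typed seam field directly. A tiny self-contained illustration, where `engine` and `hooks` are made-up names rather than package types:

```go
package main

import "fmt"

type engine struct{ prefix string }

func (e *engine) greet(name string) string { return e.prefix + name }

// hooks is the function-seam bag: callers see only function fields,
// never the concrete type behind them.
type hooks struct {
	greet func(name string) string
}

func main() {
	e := &engine{prefix: "hello, "}
	h := hooks{greet: e.greet} // method value captures the receiver
	fmt.Println(h.greet("vm")) // prints "hello, vm"
}
```

The same shape lets tests stub individual hooks without a mock framework.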
+func (d *Daemon) buildCapabilityHooks() capabilityHooks { + return capabilityHooks{ + addStartPrereqs: d.addCapabilityStartPrereqs, + contributeGuest: d.contributeGuestConfig, + contributeMachine: d.contributeMachineConfig, + prepareHosts: d.prepareCapabilityHosts, + postStart: d.postStartCapabilities, + cleanupState: d.cleanupCapabilityState, + applyConfigChanges: d.applyCapabilityConfigChanges, + } +} + +// FindVM resolves an ID-or-name against the store with the historical +// precedence: exact-ID / exact-name first, then unambiguous prefix +// match. Returns an error when no match is found or when a prefix +// matches more than one record. +func (s *VMService) FindVM(ctx context.Context, idOrName string) (model.VMRecord, error) { + if idOrName == "" { + return model.VMRecord{}, errors.New("vm id or name is required") + } + if vm, err := s.store.GetVM(ctx, idOrName); err == nil { + return vm, nil + } + vms, err := s.store.ListVMs(ctx) + if err != nil { + return model.VMRecord{}, err + } + matchCount := 0 + var match model.VMRecord + for _, vm := range vms { + if strings.HasPrefix(vm.ID, idOrName) || strings.HasPrefix(vm.Name, idOrName) { + match = vm + matchCount++ + } + } + if matchCount == 1 { + return match, nil + } + if matchCount > 1 { + return model.VMRecord{}, fmt.Errorf("multiple VMs match %q", idOrName) + } + return model.VMRecord{}, fmt.Errorf("vm %q not found", idOrName) +} + +// TouchVM bumps a VM's updated-at timestamp under the per-VM lock. +func (s *VMService) TouchVM(ctx context.Context, idOrName string) (model.VMRecord, error) { + return s.withVMLockByRef(ctx, idOrName, func(vm model.VMRecord) (model.VMRecord, error) { + system.TouchNow(&vm) + if err := s.store.UpsertVM(ctx, vm); err != nil { + return model.VMRecord{}, err + } + return vm, nil + }) +} + +// withVMLockByRef resolves idOrName then serialises fn under the +// per-VM lock. Every mutating VM operation funnels through here. 
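FindVM's exact-then-prefix precedence, restated as a standalone sketch. `rec` and `resolveRef` are illustrative names, not this package's types, and the store-lookup step is reduced to a linear scan:

```go
package main

import (
	"fmt"
	"strings"
)

type rec struct{ ID, Name string }

// resolveRef: an exact ID or name wins outright; otherwise a prefix of
// either field must match exactly one record, and anything else is an
// ambiguity or not-found error.
func resolveRef(recs []rec, ref string) (rec, error) {
	for _, r := range recs {
		if r.ID == ref || r.Name == ref {
			return r, nil
		}
	}
	var match rec
	count := 0
	for _, r := range recs {
		if strings.HasPrefix(r.ID, ref) || strings.HasPrefix(r.Name, ref) {
			match = r
			count++
		}
	}
	switch {
	case count == 1:
		return match, nil
	case count > 1:
		return rec{}, fmt.Errorf("multiple VMs match %q", ref)
	default:
		return rec{}, fmt.Errorf("vm %q not found", ref)
	}
}

func main() {
	recs := []rec{{ID: "abc123", Name: "web"}, {ID: "abd456", Name: "worker"}}
	r, _ := resolveRef(recs, "web") // exact name beats any prefix logic
	fmt.Println(r.ID)               // prints "abc123"
	_, err := resolveRef(recs, "ab") // prefixes both IDs: ambiguous
	fmt.Println(err != nil)          // prints "true"
}
```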
+func (s *VMService) withVMLockByRef(ctx context.Context, idOrName string, fn func(model.VMRecord) (model.VMRecord, error)) (model.VMRecord, error) { + vm, err := s.FindVM(ctx, idOrName) + if err != nil { + return model.VMRecord{}, err + } + return s.withVMLockByID(ctx, vm.ID, fn) +} + +// withVMLockByID locks on the stable VM ID (so a rename mid-flight +// doesn't drop the lock) and re-reads the record under the lock so +// fn sees the committed state. +func (s *VMService) withVMLockByID(ctx context.Context, id string, fn func(model.VMRecord) (model.VMRecord, error)) (model.VMRecord, error) { + if strings.TrimSpace(id) == "" { + return model.VMRecord{}, errors.New("vm id is required") + } + unlock := s.lockVMID(id) + defer unlock() + + vm, err := s.store.GetVMByID(ctx, id) + if err != nil { + if errors.Is(err, sql.ErrNoRows) { + return model.VMRecord{}, fmt.Errorf("vm %q not found", id) + } + return model.VMRecord{}, err + } + return fn(vm) +} + +// withVMLockByIDErr is the error-only variant of withVMLockByID for +// callers that don't need the returned record. +func (s *VMService) withVMLockByIDErr(ctx context.Context, id string, fn func(model.VMRecord) error) error { + _, err := s.withVMLockByID(ctx, id, func(vm model.VMRecord) (model.VMRecord, error) { + if err := fn(vm); err != nil { + return model.VMRecord{}, err + } + return vm, nil + }) + return err +} + +// lockVMID exposes the per-VM mutex for callers that need to hold it +// outside the usual withVMLockByRef/withVMLockByID helpers +// (workspace prepare, for example). 
+func (s *VMService) lockVMID(id string) func() { + return s.vmLocks.lock(id) +} diff --git a/internal/daemon/vm_set.go b/internal/daemon/vm_set.go new file mode 100644 index 0000000..0acf4c4 --- /dev/null +++ b/internal/daemon/vm_set.go @@ -0,0 +1,87 @@ +package daemon + +import ( + "context" + "errors" + + "banger/internal/api" + "banger/internal/model" + "banger/internal/system" +) + +func (s *VMService) SetVM(ctx context.Context, params api.VMSetParams) (model.VMRecord, error) { + return s.withVMLockByRef(ctx, params.IDOrName, func(vm model.VMRecord) (model.VMRecord, error) { + return s.setVMLocked(ctx, vm, params) + }) +} + +func (s *VMService) setVMLocked(ctx context.Context, current model.VMRecord, params api.VMSetParams) (vm model.VMRecord, err error) { + vm = current + op := s.beginOperation(ctx, "vm.set", "vm_ref", vm.ID) + defer func() { + if err != nil { + op.fail(err, vmLogAttrs(vm)...) + return + } + op.done(vmLogAttrs(vm)...) + }() + running := s.vmAlive(vm) + if params.VCPUCount != nil { + if err := validateOptionalPositiveSetting("vcpu", params.VCPUCount); err != nil { + return model.VMRecord{}, err + } + if running { + return model.VMRecord{}, errors.New("vcpu changes require the VM to be stopped") + } + op.stage("update_vcpu", "vcpu_count", *params.VCPUCount) + vm.Spec.VCPUCount = *params.VCPUCount + } + if params.MemoryMiB != nil { + if err := validateOptionalPositiveSetting("memory", params.MemoryMiB); err != nil { + return model.VMRecord{}, err + } + if running { + return model.VMRecord{}, errors.New("memory changes require the VM to be stopped") + } + op.stage("update_memory", "memory_mib", *params.MemoryMiB) + vm.Spec.MemoryMiB = *params.MemoryMiB + } + if params.WorkDiskSize != "" { + size, err := model.ParseSize(params.WorkDiskSize) + if err != nil { + return model.VMRecord{}, err + } + if running { + return model.VMRecord{}, errors.New("disk changes require the VM to be stopped") + } + if size < vm.Spec.WorkDiskSizeBytes { + return 
model.VMRecord{}, errors.New("disk size can only grow") + } + if size > vm.Spec.WorkDiskSizeBytes { + if exists(vm.Runtime.WorkDiskPath) { + op.stage("resize_work_disk", "from_bytes", vm.Spec.WorkDiskSizeBytes, "to_bytes", size) + if err := s.validateWorkDiskResizePrereqs(); err != nil { + return model.VMRecord{}, err + } + if err := system.ResizeExt4Image(ctx, s.runner, vm.Runtime.WorkDiskPath, size); err != nil { + return model.VMRecord{}, err + } + } + vm.Spec.WorkDiskSizeBytes = size + } + } + if params.NATEnabled != nil { + op.stage("update_nat", "nat_enabled", *params.NATEnabled) + vm.Spec.NATEnabled = *params.NATEnabled + } + if running { + if err := s.capHooks.applyConfigChanges(ctx, current, vm); err != nil { + return model.VMRecord{}, err + } + } + system.TouchNow(&vm) + if err := s.store.UpsertVM(ctx, vm); err != nil { + return model.VMRecord{}, err + } + return vm, nil +} diff --git a/internal/daemon/vm_test.go b/internal/daemon/vm_test.go index a3ddc76..a747104 100644 --- a/internal/daemon/vm_test.go +++ b/internal/daemon/vm_test.go @@ -35,6 +35,7 @@ func TestFindVMPrefixResolution(t *testing.T) { ctx := context.Background() db := openDaemonStore(t) d := &Daemon{store: db} + wireServices(d) for _, vm := range []model.VMRecord{ testVM("alpha", "image-alpha", "172.16.0.2"), @@ -71,6 +72,7 @@ func TestFindImagePrefixResolution(t *testing.T) { ctx := context.Background() db := openDaemonStore(t) d := &Daemon{store: db} + wireServices(d) for _, image := range []model.Image{ testImage("base"), @@ -112,21 +114,36 @@ func TestReconcileStopsStaleRunningVMAndClearsRuntimeHandles(t *testing.T) { if err := os.WriteFile(apiSock, []byte{}, 0o644); err != nil { t.Fatalf("WriteFile(api sock): %v", err) } + vmDir := t.TempDir() vm := testVM("stale", "image-stale", "172.16.0.9") vm.State = model.VMStateRunning vm.Runtime.State = model.VMStateRunning - vm.Runtime.PID = 999999 vm.Runtime.APISockPath = apiSock - vm.Runtime.DMName = "fc-rootfs-stale" - vm.Runtime.DMDev = 
"/dev/mapper/fc-rootfs-stale" - vm.Runtime.COWLoop = "/dev/loop11" - vm.Runtime.BaseLoop = "/dev/loop10" + vm.Runtime.VMDir = vmDir vm.Runtime.DNSName = "" upsertDaemonVM(t, ctx, db, vm) + // Simulate the prior daemon crashing while this VM was running: + // the handles.json scratch file survives and names a stale PID + + // DM snapshot. Reconcile should discover the PID is gone, tear + // the kernel state down via the runner, and clear the scratch. + stale := model.VMHandles{ + PID: 999999, + BaseLoop: "/dev/loop10", + COWLoop: "/dev/loop11", + DMName: "fc-rootfs-stale", + DMDev: "/dev/mapper/fc-rootfs-stale", + } + if err := writeHandlesFile(vmDir, stale); err != nil { + t.Fatalf("writeHandlesFile: %v", err) + } + runner := &scriptedRunner{ t: t, steps: []runnerStep{ + // First pgrep: rediscoverHandles tries to verify the PID. + {call: runnerCall{name: "pgrep", args: []string{"-n", "-f", apiSock}}, err: errors.New("exit status 1")}, + // Second pgrep: cleanupRuntime asks again before killing. 
{call: runnerCall{name: "pgrep", args: []string{"-n", "-f", apiSock}}, err: errors.New("exit status 1")}, sudoStep("", nil, "dmsetup", "remove", "fc-rootfs-stale"), sudoStep("", nil, "losetup", "-d", "/dev/loop11"), @@ -134,6 +151,7 @@ func TestReconcileStopsStaleRunningVMAndClearsRuntimeHandles(t *testing.T) { }, } d := &Daemon{store: db, runner: runner} + wireServices(d) if err := d.reconcile(ctx); err != nil { t.Fatalf("reconcile: %v", err) @@ -147,8 +165,73 @@ func TestReconcileStopsStaleRunningVMAndClearsRuntimeHandles(t *testing.T) { if got.State != model.VMStateStopped || got.Runtime.State != model.VMStateStopped { t.Fatalf("vm state after reconcile = %s/%s, want stopped", got.State, got.Runtime.State) } - if got.Runtime.PID != 0 || got.Runtime.APISockPath != "" || got.Runtime.DMName != "" || got.Runtime.COWLoop != "" || got.Runtime.BaseLoop != "" { - t.Fatalf("runtime handles not cleared after reconcile: %+v", got.Runtime) + // The scratch file must be gone — stopped VMs don't carry handles. + if _, err := os.Stat(handlesFilePath(vmDir)); !os.IsNotExist(err) { + t.Fatalf("handles.json still present after reconcile: %v", err) + } + // And the in-memory cache must be empty. 
+ if h, ok := d.vm.handles.get(vm.ID); ok && !h.IsZero() { + t.Fatalf("handle cache not cleared after reconcile: %+v", h) + } +} + +func TestReconcileWithCorruptHandlesFileFallsBackToPersistedRuntimeTeardownState(t *testing.T) { + t.Parallel() + + ctx := context.Background() + db := openDaemonStore(t) + apiSock := filepath.Join(t.TempDir(), "fc.sock") + if err := os.WriteFile(apiSock, []byte{}, 0o644); err != nil { + t.Fatalf("WriteFile(api sock): %v", err) + } + vmDir := t.TempDir() + vm := testVM("corrupt", "image-corrupt", "172.16.0.10") + vm.State = model.VMStateRunning + vm.Runtime.State = model.VMStateRunning + vm.Runtime.APISockPath = apiSock + vm.Runtime.VMDir = vmDir + vm.Runtime.DNSName = "" + vm.Runtime.TapDevice = "tap-fc-corrupt" + vm.Runtime.BaseLoop = "/dev/loop20" + vm.Runtime.COWLoop = "/dev/loop21" + vm.Runtime.DMName = "fc-rootfs-corrupt" + vm.Runtime.DMDev = "/dev/mapper/fc-rootfs-corrupt" + upsertDaemonVM(t, ctx, db, vm) + + if err := os.WriteFile(handlesFilePath(vmDir), []byte("{not json"), 0o600); err != nil { + t.Fatalf("WriteFile(handles.json): %v", err) + } + + runner := &scriptedRunner{ + t: t, + steps: []runnerStep{ + {call: runnerCall{name: "pgrep", args: []string{"-n", "-f", apiSock}}, err: errors.New("exit status 1")}, + sudoStep("", nil, "dmsetup", "remove", "fc-rootfs-corrupt"), + sudoStep("", nil, "losetup", "-d", "/dev/loop21"), + sudoStep("", nil, "losetup", "-d", "/dev/loop20"), + sudoStep("", nil, "ip", "link", "del", "tap-fc-corrupt"), + }, + } + d := &Daemon{store: db, runner: runner} + wireServices(d) + + if err := d.reconcile(ctx); err != nil { + t.Fatalf("reconcile: %v", err) + } + runner.assertExhausted() + + got, err := db.GetVM(ctx, vm.ID) + if err != nil { + t.Fatalf("GetVM: %v", err) + } + if got.State != model.VMStateStopped || got.Runtime.State != model.VMStateStopped { + t.Fatalf("vm state after reconcile = %s/%s, want stopped", got.State, got.Runtime.State) + } + if got.Runtime.TapDevice != "" || 
got.Runtime.BaseLoop != "" || got.Runtime.COWLoop != "" || got.Runtime.DMName != "" || got.Runtime.DMDev != "" { + t.Fatalf("runtime teardown state not cleared after reconcile: %+v", got.Runtime) + } + if _, err := os.Stat(handlesFilePath(vmDir)); !os.IsNotExist(err) { + t.Fatalf("handles.json still present after reconcile: %v", err) + } } @@ -168,13 +251,11 @@ func TestRebuildDNSIncludesOnlyLiveRunningVMs(t *testing.T) { live := testVM("live", "image-live", "172.16.0.21") live.State = model.VMStateRunning live.Runtime.State = model.VMStateRunning - live.Runtime.PID = liveCmd.Process.Pid live.Runtime.APISockPath = liveSock stale := testVM("stale", "image-stale", "172.16.0.22") stale.State = model.VMStateRunning stale.Runtime.State = model.VMStateRunning - stale.Runtime.PID = 999999 stale.Runtime.APISockPath = filepath.Join(t.TempDir(), "stale.sock") stopped := testVM("stopped", "image-stopped", "172.16.0.23") @@ -194,8 +275,14 @@ func TestRebuildDNSIncludesOnlyLiveRunningVMs(t *testing.T) { } }) - d := &Daemon{store: db, vmDNS: server} - if err := d.rebuildDNS(ctx); err != nil { + d := &Daemon{store: db, net: &HostNetwork{vmDNS: server}} + wireServices(d) + // rebuildDNS reads the alive check from the handle cache. Seed + // the live VM with its real PID; give the stale entry PID 999999, + // which no process spawned by this test ever holds.
+ d.vm.setVMHandlesInMemory(live.ID, model.VMHandles{PID: liveCmd.Process.Pid}) + d.vm.setVMHandlesInMemory(stale.ID, model.VMHandles{PID: 999999}) + if err := d.vm.rebuildDNS(ctx); err != nil { t.Fatalf("rebuildDNS: %v", err) } @@ -225,11 +312,12 @@ func TestSetVMRejectsStoppedOnlyChangesForRunningVM(t *testing.T) { vm := testVM("running", "image-run", "172.16.0.10") vm.State = model.VMStateRunning vm.Runtime.State = model.VMStateRunning - vm.Runtime.PID = cmd.Process.Pid vm.Runtime.APISockPath = apiSock upsertDaemonVM(t, ctx, db, vm) d := &Daemon{store: db} + wireServices(d) + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: cmd.Process.Pid}) tests := []struct { name string params api.VMSetParams @@ -254,7 +342,7 @@ func TestSetVMRejectsStoppedOnlyChangesForRunningVM(t *testing.T) { for _, tt := range tests { t.Run(tt.name, func(t *testing.T) { - _, err := d.SetVM(ctx, tt.params) + _, err := d.vm.SetVM(ctx, tt.params) if err == nil || !strings.Contains(err.Error(), tt.want) { t.Fatalf("SetVM(%s) error = %v, want %q", tt.name, err, tt.want) } @@ -330,21 +418,23 @@ func TestHealthVMReturnsHealthyForRunningGuest(t *testing.T) { vm := testVM("alive", "image-alive", "172.16.0.41") vm.State = model.VMStateRunning vm.Runtime.State = model.VMStateRunning - vm.Runtime.PID = fake.Process.Pid vm.Runtime.APISockPath = apiSock vm.Runtime.VSockPath = vsockSock vm.Runtime.VSockCID = 10041 upsertDaemonVM(t, ctx, db, vm) + handlePID := fake.Process.Pid runner := &scriptedRunner{ t: t, steps: []runnerStep{ - sudoStep("", nil, "chown", fmt.Sprintf("%d:%d", os.Getuid(), os.Getgid()), vsockSock), sudoStep("", nil, "chmod", "600", vsockSock), + sudoStep("", nil, "chown", "-h", fmt.Sprintf("%d:%d", os.Getuid(), os.Getgid()), vsockSock), }, } d := &Daemon{store: db, runner: runner} - result, err := d.HealthVM(ctx, vm.Name) + wireServices(d) + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: handlePID}) + result, err := d.stats.HealthVM(ctx, vm.Name) if err != nil { 
t.Fatalf("HealthVM: %v", err) } @@ -393,7 +483,6 @@ func TestPingVMAliasReturnsAliveForHealthyVM(t *testing.T) { vm := testVM("healthy-ping", "image-healthy", "172.16.0.42") vm.State = model.VMStateRunning vm.Runtime.State = model.VMStateRunning - vm.Runtime.PID = fake.Process.Pid vm.Runtime.APISockPath = apiSock vm.Runtime.VSockPath = vsockSock vm.Runtime.VSockCID = 10042 @@ -402,12 +491,14 @@ func TestPingVMAliasReturnsAliveForHealthyVM(t *testing.T) { runner := &scriptedRunner{ t: t, steps: []runnerStep{ - sudoStep("", nil, "chown", fmt.Sprintf("%d:%d", os.Getuid(), os.Getgid()), vsockSock), sudoStep("", nil, "chmod", "600", vsockSock), + sudoStep("", nil, "chown", "-h", fmt.Sprintf("%d:%d", os.Getuid(), os.Getgid()), vsockSock), }, } d := &Daemon{store: db, runner: runner} - result, err := d.PingVM(ctx, vm.Name) + wireServices(d) + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: fake.Process.Pid}) + result, err := d.stats.PingVM(ctx, vm.Name) if err != nil { t.Fatalf("PingVM: %v", err) } @@ -488,7 +579,8 @@ func TestWaitForGuestVSockAgentRetriesUntilHealthy(t *testing.T) { serverDone <- errors.New("health probe did not retry") }() - if err := waitForGuestVSockAgent(context.Background(), nil, socketPath, time.Second); err != nil { + n := &HostNetwork{} + if err := n.waitForGuestVSockAgent(context.Background(), socketPath, time.Second); err != nil { t.Fatalf("waitForGuestVSockAgent: %v", err) } if err := <-serverDone; err != nil { @@ -505,7 +597,8 @@ func TestHealthVMReturnsFalseForStoppedVM(t *testing.T) { upsertDaemonVM(t, ctx, db, vm) d := &Daemon{store: db} - result, err := d.HealthVM(ctx, vm.Name) + wireServices(d) + result, err := d.stats.HealthVM(ctx, vm.Name) if err != nil { t.Fatalf("HealthVM: %v", err) } @@ -590,7 +683,6 @@ func TestPortsVMReturnsEnrichedPortsAndWebSchemes(t *testing.T) { vm := testVM("ports", "image-ports", "127.0.0.1") vm.State = model.VMStateRunning vm.Runtime.State = model.VMStateRunning - vm.Runtime.PID = fake.Process.Pid 
vm.Runtime.APISockPath = apiSock vm.Runtime.VSockPath = vsockSock vm.Runtime.VSockCID = 10043 @@ -599,13 +691,15 @@ func TestPortsVMReturnsEnrichedPortsAndWebSchemes(t *testing.T) { runner := &scriptedRunner{ t: t, steps: []runnerStep{ - sudoStep("", nil, "chown", fmt.Sprintf("%d:%d", os.Getuid(), os.Getgid()), vsockSock), sudoStep("", nil, "chmod", "600", vsockSock), + sudoStep("", nil, "chown", "-h", fmt.Sprintf("%d:%d", os.Getuid(), os.Getgid()), vsockSock), }, } d := &Daemon{store: db, runner: runner} + wireServices(d) + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: fake.Process.Pid}) - result, err := d.PortsVM(ctx, vm.Name) + result, err := d.stats.PortsVM(ctx, vm.Name) if err != nil { t.Fatalf("PortsVM: %v", err) } @@ -652,7 +746,8 @@ func TestPortsVMReturnsErrorForStoppedVM(t *testing.T) { upsertDaemonVM(t, ctx, db, vm) d := &Daemon{store: db} - _, err := d.PortsVM(ctx, vm.Name) + wireServices(d) + _, err := d.stats.PortsVM(ctx, vm.Name) if err == nil || !strings.Contains(err.Error(), "is not running") { t.Fatalf("PortsVM error = %v, want not running", err) } @@ -715,284 +810,529 @@ func TestSetVMDiskResizeFailsPreflightWhenToolsMissing(t *testing.T) { t.Setenv("PATH", t.TempDir()) d := &Daemon{store: db} - _, err := d.SetVM(ctx, api.VMSetParams{IDOrName: vm.ID, WorkDiskSize: "16G"}) + wireServices(d) + _, err := d.vm.SetVM(ctx, api.VMSetParams{IDOrName: vm.ID, WorkDiskSize: "16G"}) if err == nil || !strings.Contains(err.Error(), "work disk resize preflight failed") { t.Fatalf("SetVM() error = %v, want preflight failure", err) } } -func TestFlattenNestedWorkHomeCopiesEntriesIndividually(t *testing.T) { - t.Parallel() - - workMount := t.TempDir() - nestedHome := filepath.Join(workMount, "root") - if err := os.MkdirAll(filepath.Join(nestedHome, ".ssh"), 0o755); err != nil { - t.Fatalf("MkdirAll(.ssh): %v", err) - } - if err := os.WriteFile(filepath.Join(nestedHome, "notes.txt"), []byte("seed"), 0o644); err != nil { - t.Fatalf("WriteFile(notes.txt): 
%v", err) +func TestEnsureGitIdentityOnWorkDiskCopiesHostGlobalIdentity(t *testing.T) { + if _, err := exec.LookPath("git"); err != nil { + t.Skip("git not installed") } - runner := &scriptedRunner{ - t: t, - steps: []runnerStep{ - sudoStep("", nil, "chmod", "755", nestedHome), - sudoStep("", nil, "cp", "-a", filepath.Join(nestedHome, ".ssh"), workMount+"/"), - sudoStep("", nil, "cp", "-a", filepath.Join(nestedHome, "notes.txt"), workMount+"/"), - sudoStep("", nil, "rm", "-rf", nestedHome), + hostConfigPath := filepath.Join(t.TempDir(), "host.gitconfig") + t.Setenv("GIT_CONFIG_GLOBAL", hostConfigPath) + testSetGitConfig(t, "user.name", "Banger Host") + testSetGitConfig(t, "user.email", "host@example.com") + + workDiskDir := t.TempDir() + d := &Daemon{runner: &filesystemRunner{t: t}} + wireServices(d) + vm := testVM("git-identity", "image-git-identity", "172.16.0.67") + vm.Runtime.WorkDiskPath = workDiskDir + + if err := d.ws.ensureGitIdentityOnWorkDisk(context.Background(), &vm); err != nil { + t.Fatalf("ensureGitIdentityOnWorkDisk: %v", err) + } + + guestConfigPath := filepath.Join(workDiskDir, workDiskGitConfigRelativePath) + if got := testGitConfigValue(t, guestConfigPath, "user.name"); got != "Banger Host" { + t.Fatalf("guest user.name = %q, want Banger Host", got) + } + if got := testGitConfigValue(t, guestConfigPath, "user.email"); got != "host@example.com" { + t.Fatalf("guest user.email = %q, want host@example.com", got) + } +} + +func TestEnsureGitIdentityOnWorkDiskPreservesExistingGuestConfig(t *testing.T) { + if _, err := exec.LookPath("git"); err != nil { + t.Skip("git not installed") + } + + hostConfigPath := filepath.Join(t.TempDir(), "host.gitconfig") + t.Setenv("GIT_CONFIG_GLOBAL", hostConfigPath) + testSetGitConfig(t, "user.name", "Fresh Name") + testSetGitConfig(t, "user.email", "fresh@example.com") + + workDiskDir := t.TempDir() + guestConfigPath := filepath.Join(workDiskDir, workDiskGitConfigRelativePath) + if err := os.WriteFile(guestConfigPath, 
[]byte("[safe]\n\tdirectory = /root/repo\n[user]\n\tname = stale\n"), 0o644); err != nil { + t.Fatalf("WriteFile(guest .gitconfig): %v", err) + } + + d := &Daemon{runner: &filesystemRunner{t: t}} + wireServices(d) + vm := testVM("git-identity-preserve", "image-git-identity", "172.16.0.68") + vm.Runtime.WorkDiskPath = workDiskDir + + if err := d.ws.ensureGitIdentityOnWorkDisk(context.Background(), &vm); err != nil { + t.Fatalf("ensureGitIdentityOnWorkDisk: %v", err) + } + + if got := testGitConfigValue(t, guestConfigPath, "user.name"); got != "Fresh Name" { + t.Fatalf("guest user.name = %q, want Fresh Name", got) + } + if got := testGitConfigValue(t, guestConfigPath, "user.email"); got != "fresh@example.com" { + t.Fatalf("guest user.email = %q, want fresh@example.com", got) + } + if got := testGitConfigValue(t, guestConfigPath, "safe.directory"); got != "/root/repo" { + t.Fatalf("guest safe.directory = %q, want /root/repo", got) + } +} + +func TestEnsureGitIdentityOnWorkDiskWarnsAndSkipsWhenHostIdentityIncomplete(t *testing.T) { + if _, err := exec.LookPath("git"); err != nil { + t.Skip("git not installed") + } + + hostConfigPath := filepath.Join(t.TempDir(), "host.gitconfig") + t.Setenv("GIT_CONFIG_GLOBAL", hostConfigPath) + testSetGitConfig(t, "user.name", "Only Name") + + workDiskDir := t.TempDir() + guestConfigPath := filepath.Join(workDiskDir, workDiskGitConfigRelativePath) + original := []byte("[user]\n\temail = keep@example.com\n") + if err := os.WriteFile(guestConfigPath, original, 0o644); err != nil { + t.Fatalf("WriteFile(guest .gitconfig): %v", err) + } + + var buf bytes.Buffer + logger, _, err := newDaemonLogger(&buf, "info") + if err != nil { + t.Fatalf("newDaemonLogger: %v", err) + } + + d := &Daemon{ + runner: &filesystemRunner{t: t}, + logger: logger, + } + wireServices(d) + vm := testVM("git-identity-missing", "image-git-identity", "172.16.0.69") + vm.Runtime.WorkDiskPath = workDiskDir + + if err := 
d.ws.ensureGitIdentityOnWorkDisk(context.Background(), &vm); err != nil { + t.Fatalf("ensureGitIdentityOnWorkDisk: %v", err) + } + + got, err := os.ReadFile(guestConfigPath) + if err != nil { + t.Fatalf("ReadFile(guest .gitconfig): %v", err) + } + if string(got) != string(original) { + t.Fatalf("guest .gitconfig = %q, want preserved %q", string(got), string(original)) + } + + entries := parseLogEntries(t, buf.Bytes()) + if !hasLogEntry(entries, map[string]string{ + "msg": "guest git identity sync skipped", + "vm_name": vm.Name, + "source": hostGlobalGitIdentitySource, + "error": "host git user.email is empty", + }) { + t.Fatalf("expected warn log, got %v", entries) + } +} + +func TestRunFileSyncNoOpWhenConfigEmpty(t *testing.T) { + d := &Daemon{runner: &filesystemRunner{t: t}} + wireServices(d) + vm := testVM("no-sync", "image", "172.16.0.70") + if err := d.ws.runFileSync(context.Background(), &vm); err != nil { + t.Fatalf("runFileSync: %v", err) + } +} + +func TestRunFileSyncCopiesFile(t *testing.T) { + homeDir := t.TempDir() + t.Setenv("HOME", homeDir) + srcPath := filepath.Join(homeDir, ".secrets", "token") + if err := os.MkdirAll(filepath.Dir(srcPath), 0o755); err != nil { + t.Fatal(err) + } + srcData := []byte(`{"token":"abc"}`) + if err := os.WriteFile(srcPath, srcData, 0o600); err != nil { + t.Fatal(err) + } + + workDisk := t.TempDir() + d := &Daemon{ + runner: &filesystemRunner{t: t}, + config: model.DaemonConfig{ + FileSync: []model.FileSyncEntry{ + {Host: "~/.secrets/token", Guest: "~/.secrets/token"}, + }, }, } - d := &Daemon{runner: runner} - - if err := d.flattenNestedWorkHome(context.Background(), workMount); err != nil { - t.Fatalf("flattenNestedWorkHome: %v", err) + wireServices(d) + vm := testVM("sync-file", "image", "172.16.0.71") + vm.Runtime.WorkDiskPath = workDisk + if err := d.ws.runFileSync(context.Background(), &vm); err != nil { + t.Fatalf("runFileSync: %v", err) + } + + dst := filepath.Join(workDisk, ".secrets", "token") + got, err := 
os.ReadFile(dst) + if err != nil { + t.Fatal(err) + } + if string(got) != string(srcData) { + t.Fatalf("dst = %q, want %q", got, srcData) + } + info, err := os.Stat(dst) + if err != nil { + t.Fatal(err) + } + if info.Mode().Perm() != 0o600 { + t.Fatalf("mode = %v, want 0600", info.Mode().Perm()) } - runner.assertExhausted() } -func TestEnsureAuthorizedKeyOnWorkDiskRepairsNestedRootLayout(t *testing.T) { - t.Parallel() - - workDiskDir := t.TempDir() - nestedHome := filepath.Join(workDiskDir, "root") - if err := os.MkdirAll(filepath.Join(nestedHome, ".ssh"), 0o700); err != nil { - t.Fatalf("MkdirAll(.ssh): %v", err) - } - if err := os.WriteFile(filepath.Join(nestedHome, ".bashrc"), []byte("export TEST_PROMPT=1\n"), 0o644); err != nil { - t.Fatalf("WriteFile(.bashrc): %v", err) - } - legacyKey := "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAILEgacykey legacy@test\n" - if err := os.WriteFile(filepath.Join(nestedHome, ".ssh", "authorized_keys"), []byte(legacyKey), 0o600); err != nil { - t.Fatalf("WriteFile(authorized_keys): %v", err) +func TestRunFileSyncRespectsCustomMode(t *testing.T) { + homeDir := t.TempDir() + t.Setenv("HOME", homeDir) + srcPath := filepath.Join(homeDir, "script") + if err := os.WriteFile(srcPath, []byte("#!/bin/sh\nexit 0\n"), 0o600); err != nil { + t.Fatal(err) } - privateKey, err := rsa.GenerateKey(rand.Reader, 1024) + workDisk := t.TempDir() + d := &Daemon{ + runner: &filesystemRunner{t: t}, + config: model.DaemonConfig{ + FileSync: []model.FileSyncEntry{ + {Host: "~/script", Guest: "~/bin/my-script", Mode: "0755"}, + }, + }, + } + wireServices(d) + vm := testVM("sync-mode", "image", "172.16.0.72") + vm.Runtime.WorkDiskPath = workDisk + if err := d.ws.runFileSync(context.Background(), &vm); err != nil { + t.Fatalf("runFileSync: %v", err) + } + + info, err := os.Stat(filepath.Join(workDisk, "bin", "my-script")) if err != nil { - t.Fatalf("GenerateKey: %v", err) + t.Fatal(err) } - privateKeyPEM := pem.EncodeToMemory(&pem.Block{ - Type: "RSA PRIVATE KEY", 
- Bytes: x509.MarshalPKCS1PrivateKey(privateKey), - }) - sshKeyPath := filepath.Join(t.TempDir(), "id_rsa") - if err := os.WriteFile(sshKeyPath, privateKeyPEM, 0o600); err != nil { - t.Fatalf("WriteFile(private key): %v", err) + if info.Mode().Perm() != 0o755 { + t.Fatalf("mode = %v, want 0755", info.Mode().Perm()) + } +} + +func TestRunFileSyncSkipsMissingHostPath(t *testing.T) { + homeDir := t.TempDir() + t.Setenv("HOME", homeDir) + + var buf bytes.Buffer + logger, _, err := newDaemonLogger(&buf, "info") + if err != nil { + t.Fatal(err) + } + + workDisk := t.TempDir() + d := &Daemon{ + runner: &filesystemRunner{t: t}, + logger: logger, + config: model.DaemonConfig{ + FileSync: []model.FileSyncEntry{ + {Host: "~/does-not-exist", Guest: "~/wherever"}, + }, + }, + } + wireServices(d) + vm := testVM("sync-missing", "image", "172.16.0.73") + vm.Runtime.WorkDiskPath = workDisk + if err := d.ws.runFileSync(context.Background(), &vm); err != nil { + t.Fatalf("runFileSync: %v", err) + } + + entries := parseLogEntries(t, buf.Bytes()) + if !hasLogEntry(entries, map[string]string{ + "msg": "file_sync skipped", + "vm_name": vm.Name, + "host_path": filepath.Join(homeDir, "does-not-exist"), + }) { + t.Fatalf("expected skipped log, got %v", entries) + } +} + +func TestRunFileSyncOverwritesExistingGuestFile(t *testing.T) { + homeDir := t.TempDir() + t.Setenv("HOME", homeDir) + srcPath := filepath.Join(homeDir, "token") + if err := os.WriteFile(srcPath, []byte("fresh"), 0o600); err != nil { + t.Fatal(err) + } + workDisk := t.TempDir() + // Work disk is mounted at /root in the guest, so the guest path + // "/root/token" maps to workDisk/token here. 
+ existing := filepath.Join(workDisk, "token") + if err := os.WriteFile(existing, []byte("stale"), 0o600); err != nil { + t.Fatal(err) } d := &Daemon{ runner: &filesystemRunner{t: t}, - config: model.DaemonConfig{SSHKeyPath: sshKeyPath}, + config: model.DaemonConfig{ + FileSync: []model.FileSyncEntry{ + {Host: "~/token", Guest: "/root/token"}, + }, + }, + } + wireServices(d) + vm := testVM("sync-overwrite", "image", "172.16.0.74") + vm.Runtime.WorkDiskPath = workDisk + if err := d.ws.runFileSync(context.Background(), &vm); err != nil { + t.Fatalf("runFileSync: %v", err) } - vm := testVM("seed-repair", "image-seed-repair", "172.16.0.61") - vm.Runtime.WorkDiskPath = workDiskDir - if err := d.ensureAuthorizedKeyOnWorkDisk(context.Background(), &vm, model.Image{}, workDiskPreparation{}); err != nil { - t.Fatalf("ensureAuthorizedKeyOnWorkDisk: %v", err) - } - if _, err := os.Stat(filepath.Join(workDiskDir, "root")); !os.IsNotExist(err) { - t.Fatalf("nested root still exists: %v", err) - } - if _, err := os.Stat(filepath.Join(workDiskDir, ".bashrc")); err != nil { - t.Fatalf(".bashrc missing at top level: %v", err) - } - data, err := os.ReadFile(filepath.Join(workDiskDir, ".ssh", "authorized_keys")) + got, err := os.ReadFile(existing) if err != nil { - t.Fatalf("ReadFile(authorized_keys): %v", err) + t.Fatal(err) } - content := string(data) - if !strings.Contains(content, strings.TrimSpace(legacyKey)) { - t.Fatalf("authorized_keys missing legacy key: %q", content) - } - if !strings.Contains(content, "ssh-rsa ") { - t.Fatalf("authorized_keys missing managed key: %q", content) + if string(got) != "fresh" { + t.Fatalf("guest file = %q, want fresh", got) } } -func TestEnsureOpencodeAuthOnWorkDiskCopiesHostAuth(t *testing.T) { +func TestRunFileSyncCopiesDirectoryRecursively(t *testing.T) { homeDir := t.TempDir() t.Setenv("HOME", homeDir) - hostAuthPath := filepath.Join(homeDir, workDiskOpencodeAuthRelativePath) - if err := os.MkdirAll(filepath.Dir(hostAuthPath), 0o755); err 
!= nil { - t.Fatalf("MkdirAll(host auth dir): %v", err) + srcDir := filepath.Join(homeDir, ".aws") + if err := os.MkdirAll(srcDir, 0o755); err != nil { + t.Fatal(err) } - hostAuth := []byte("{\"provider\":\"openai\"}\n") - if err := os.WriteFile(hostAuthPath, hostAuth, 0o600); err != nil { - t.Fatalf("WriteFile(host auth): %v", err) + if err := os.WriteFile(filepath.Join(srcDir, "credentials"), []byte("access"), 0o600); err != nil { + t.Fatal(err) + } + sub := filepath.Join(srcDir, "sso", "cache") + if err := os.MkdirAll(sub, 0o755); err != nil { + t.Fatal(err) + } + if err := os.WriteFile(filepath.Join(sub, "token.json"), []byte("sso-token"), 0o600); err != nil { + t.Fatal(err) } - workDiskDir := t.TempDir() - d := &Daemon{runner: &filesystemRunner{t: t}} - vm := testVM("auth-sync", "image-auth-sync", "172.16.0.63") - vm.Runtime.WorkDiskPath = workDiskDir - - if err := d.ensureOpencodeAuthOnWorkDisk(context.Background(), &vm); err != nil { - t.Fatalf("ensureOpencodeAuthOnWorkDisk: %v", err) + workDisk := t.TempDir() + d := &Daemon{ + runner: &filesystemRunner{t: t}, + config: model.DaemonConfig{ + FileSync: []model.FileSyncEntry{ + {Host: "~/.aws", Guest: "~/.aws"}, + }, + }, + } + wireServices(d) + vm := testVM("sync-dir", "image", "172.16.0.75") + vm.Runtime.WorkDiskPath = workDisk + if err := d.ws.runFileSync(context.Background(), &vm); err != nil { + t.Fatalf("runFileSync: %v", err) } - guestAuthPath := filepath.Join(workDiskDir, workDiskOpencodeAuthRelativePath) - got, err := os.ReadFile(guestAuthPath) + creds, err := os.ReadFile(filepath.Join(workDisk, ".aws", "credentials")) if err != nil { - t.Fatalf("ReadFile(guest auth): %v", err) + t.Fatal(err) } - if string(got) != string(hostAuth) { - t.Fatalf("guest auth = %q, want %q", string(got), string(hostAuth)) + if string(creds) != "access" { + t.Fatalf("credentials = %q, want access", creds) } - info, err := os.Stat(guestAuthPath) + ssoToken, err := os.ReadFile(filepath.Join(workDisk, ".aws", "sso", "cache", 
"token.json")) if err != nil { - t.Fatalf("Stat(guest auth): %v", err) + t.Fatal(err) } - if gotMode := info.Mode().Perm(); gotMode != 0o600 { - t.Fatalf("guest auth mode = %o, want 600", gotMode) + if string(ssoToken) != "sso-token" { + t.Fatalf("sso token = %q, want sso-token", ssoToken) } } -func TestEnsureOpencodeAuthOnWorkDiskReplacesExistingGuestAuth(t *testing.T) { +func TestRunFileSyncAllowsTopLevelSymlinkWithinHome(t *testing.T) { homeDir := t.TempDir() t.Setenv("HOME", homeDir) - hostAuthPath := filepath.Join(homeDir, workDiskOpencodeAuthRelativePath) - if err := os.MkdirAll(filepath.Dir(hostAuthPath), 0o755); err != nil { - t.Fatalf("MkdirAll(host auth dir): %v", err) + + targetDir := filepath.Join(homeDir, ".config", "gh") + if err := os.MkdirAll(targetDir, 0o755); err != nil { + t.Fatal(err) } - hostAuth := []byte("{\"token\":\"fresh\"}\n") - if err := os.WriteFile(hostAuthPath, hostAuth, 0o600); err != nil { - t.Fatalf("WriteFile(host auth): %v", err) + targetPath := filepath.Join(targetDir, "hosts.yml") + if err := os.WriteFile(targetPath, []byte("github.com"), 0o600); err != nil { + t.Fatal(err) + } + linkPath := filepath.Join(homeDir, "gh-hosts.yml") + if err := os.Symlink(targetPath, linkPath); err != nil { + t.Skipf("symlink unsupported on this filesystem: %v", err) } - workDiskDir := t.TempDir() - guestAuthPath := filepath.Join(workDiskDir, workDiskOpencodeAuthRelativePath) - if err := os.MkdirAll(filepath.Dir(guestAuthPath), 0o755); err != nil { - t.Fatalf("MkdirAll(guest auth dir): %v", err) + workDisk := t.TempDir() + d := &Daemon{ + runner: &filesystemRunner{t: t}, + config: model.DaemonConfig{ + HostHomeDir: homeDir, + FileSync: []model.FileSyncEntry{ + {Host: "~/gh-hosts.yml", Guest: "~/.config/gh/hosts.yml"}, + }, + }, } - if err := os.WriteFile(guestAuthPath, []byte("{\"token\":\"stale\"}\n"), 0o600); err != nil { - t.Fatalf("WriteFile(guest auth): %v", err) + wireServices(d) + vm := testVM("sync-top-level-symlink-ok", "image", 
"172.16.0.77") + vm.Runtime.WorkDiskPath = workDisk + if err := d.ws.runFileSync(context.Background(), &vm); err != nil { + t.Fatalf("runFileSync: %v", err) } - d := &Daemon{runner: &filesystemRunner{t: t}} - vm := testVM("auth-replace", "image-auth-replace", "172.16.0.64") - vm.Runtime.WorkDiskPath = workDiskDir - - if err := d.ensureOpencodeAuthOnWorkDisk(context.Background(), &vm); err != nil { - t.Fatalf("ensureOpencodeAuthOnWorkDisk: %v", err) - } - - got, err := os.ReadFile(guestAuthPath) + got, err := os.ReadFile(filepath.Join(workDisk, ".config", "gh", "hosts.yml")) if err != nil { - t.Fatalf("ReadFile(guest auth): %v", err) + t.Fatal(err) } - if string(got) != string(hostAuth) { - t.Fatalf("guest auth = %q, want %q", string(got), string(hostAuth)) + if string(got) != "github.com" { + t.Fatalf("guest file = %q, want github.com", got) } } -func TestEnsureOpencodeAuthOnWorkDiskWarnsAndSkipsWhenHostAuthMissing(t *testing.T) { +func TestRunFileSyncRejectsTopLevelSymlinkOutsideHome(t *testing.T) { homeDir := t.TempDir() t.Setenv("HOME", homeDir) - workDiskDir := t.TempDir() - guestAuthPath := filepath.Join(workDiskDir, workDiskOpencodeAuthRelativePath) - if err := os.MkdirAll(filepath.Dir(guestAuthPath), 0o755); err != nil { - t.Fatalf("MkdirAll(guest auth dir): %v", err) + outsideDir := t.TempDir() + targetPath := filepath.Join(outsideDir, "secret.txt") + if err := os.WriteFile(targetPath, []byte("must-stay-outside"), 0o600); err != nil { + t.Fatal(err) } - original := []byte("{\"token\":\"keep\"}\n") - if err := os.WriteFile(guestAuthPath, original, 0o600); err != nil { - t.Fatalf("WriteFile(guest auth): %v", err) + linkPath := filepath.Join(homeDir, "secret-link") + if err := os.Symlink(targetPath, linkPath); err != nil { + t.Skipf("symlink unsupported on this filesystem: %v", err) + } + + workDisk := t.TempDir() + d := &Daemon{ + runner: &filesystemRunner{t: t}, + config: model.DaemonConfig{ + HostHomeDir: homeDir, + FileSync: []model.FileSyncEntry{ + {Host: 
"~/secret-link", Guest: "~/secret.txt"}, + }, + }, + } + wireServices(d) + vm := testVM("sync-top-level-symlink-reject", "image", "172.16.0.78") + vm.Runtime.WorkDiskPath = workDisk + err := d.ws.runFileSync(context.Background(), &vm) + if err == nil || !strings.Contains(err.Error(), "owner home") { + t.Fatalf("runFileSync error = %v, want owner-home rejection", err) + } + if _, statErr := os.Stat(filepath.Join(workDisk, "secret.txt")); !os.IsNotExist(statErr) { + t.Fatalf("guest file exists after rejected sync (stat err = %v)", statErr) + } +} + +// TestRunFileSyncSkipsNestedSymlinks pins the anti-sprawl contract: +// a symlink INSIDE a synced directory is not followed, even if the +// target holds real files. Without this, a user syncing ~/.aws with +// a ~/.aws/session -> ~/other-creds symlink would copy the unrelated +// creds into the guest. Top-level entries are resolved separately: +// they may still follow, but only when the real target stays under +// the configured owner home. +func TestRunFileSyncSkipsNestedSymlinks(t *testing.T) { + homeDir := t.TempDir() + t.Setenv("HOME", homeDir) + + // Target the user DID NOT name — lives outside the synced tree. + outsideDir := filepath.Join(homeDir, "other-creds") + if err := os.MkdirAll(outsideDir, 0o700); err != nil { + t.Fatal(err) + } + if err := os.WriteFile(filepath.Join(outsideDir, "leaked.txt"), []byte("must-not-escape"), 0o600); err != nil { + t.Fatal(err) + } + + // The synced directory. + srcDir := filepath.Join(homeDir, ".aws") + if err := os.MkdirAll(srcDir, 0o700); err != nil { + t.Fatal(err) + } + if err := os.WriteFile(filepath.Join(srcDir, "credentials"), []byte("access"), 0o600); err != nil { + t.Fatal(err) + } + // File symlink inside .aws pointing OUT of the tree. 
+ if err := os.Symlink(filepath.Join(outsideDir, "leaked.txt"), filepath.Join(srcDir, "session")); err != nil { + t.Skipf("symlink unsupported on this filesystem: %v", err) + } + // Directory symlink inside .aws pointing OUT of the tree — must + // not be recursed into. + if err := os.Symlink(outsideDir, filepath.Join(srcDir, "linked-dir")); err != nil { + t.Fatal(err) } var buf bytes.Buffer logger, _, err := newDaemonLogger(&buf, "info") if err != nil { - t.Fatalf("newDaemonLogger: %v", err) + t.Fatal(err) } + workDisk := t.TempDir() d := &Daemon{ runner: &filesystemRunner{t: t}, logger: logger, + config: model.DaemonConfig{ + FileSync: []model.FileSyncEntry{ + {Host: "~/.aws", Guest: "~/.aws"}, + }, + }, } - vm := testVM("auth-missing", "image-auth-missing", "172.16.0.65") - vm.Runtime.WorkDiskPath = workDiskDir - - if err := d.ensureOpencodeAuthOnWorkDisk(context.Background(), &vm); err != nil { - t.Fatalf("ensureOpencodeAuthOnWorkDisk: %v", err) + wireServices(d) + vm := testVM("sync-symlink", "image", "172.16.0.76") + vm.Runtime.WorkDiskPath = workDisk + if err := d.ws.runFileSync(context.Background(), &vm); err != nil { + t.Fatalf("runFileSync: %v", err) } - got, err := os.ReadFile(guestAuthPath) + // The real file inside the tree must copy. + creds, err := os.ReadFile(filepath.Join(workDisk, ".aws", "credentials")) if err != nil { - t.Fatalf("ReadFile(guest auth): %v", err) + t.Fatalf("credentials not copied: %v", err) } - if string(got) != string(original) { - t.Fatalf("guest auth = %q, want preserved %q", string(got), string(original)) + if string(creds) != "access" { + t.Fatalf("credentials = %q, want access", creds) } + // Neither the file symlink nor anything reached through the + // directory symlink should have been materialised in the guest + // path. 
+ for _, shouldNotExist := range []string{ + filepath.Join(workDisk, ".aws", "session"), + filepath.Join(workDisk, ".aws", "linked-dir"), + filepath.Join(workDisk, ".aws", "linked-dir", "leaked.txt"), + } { + if _, err := os.Stat(shouldNotExist); !os.IsNotExist(err) { + t.Fatalf("symlinked path %s was materialised in guest tree (stat err = %v); secret leakage path open", shouldNotExist, err) + } + } + + // Each skipped symlink must be warned. entries := parseLogEntries(t, buf.Bytes()) - if !hasLogEntry(entries, map[string]string{ - "msg": "guest opencode auth sync skipped", - "vm_name": vm.Name, - "host_path": filepath.Join(homeDir, workDiskOpencodeAuthRelativePath), - }) { - t.Fatalf("expected warn log, got %v", entries) - } -} - -func TestEnsureOpencodeAuthOnWorkDiskWarnsAndSkipsWhenHostAuthUnreadable(t *testing.T) { - homeDir := t.TempDir() - t.Setenv("HOME", homeDir) - hostAuthPath := filepath.Join(homeDir, workDiskOpencodeAuthRelativePath) - if err := os.MkdirAll(hostAuthPath, 0o755); err != nil { - t.Fatalf("MkdirAll(host auth path as dir): %v", err) - } - - workDiskDir := t.TempDir() - guestAuthPath := filepath.Join(workDiskDir, workDiskOpencodeAuthRelativePath) - if err := os.MkdirAll(filepath.Dir(guestAuthPath), 0o755); err != nil { - t.Fatalf("MkdirAll(guest auth dir): %v", err) - } - original := []byte("{\"token\":\"keep\"}\n") - if err := os.WriteFile(guestAuthPath, original, 0o600); err != nil { - t.Fatalf("WriteFile(guest auth): %v", err) - } - - var buf bytes.Buffer - logger, _, err := newDaemonLogger(&buf, "info") - if err != nil { - t.Fatalf("newDaemonLogger: %v", err) - } - - d := &Daemon{ - runner: &filesystemRunner{t: t}, - logger: logger, - } - vm := testVM("auth-unreadable", "image-auth-unreadable", "172.16.0.66") - vm.Runtime.WorkDiskPath = workDiskDir - - if err := d.ensureOpencodeAuthOnWorkDisk(context.Background(), &vm); err != nil { - t.Fatalf("ensureOpencodeAuthOnWorkDisk: %v", err) - } - - got, err := os.ReadFile(guestAuthPath) - if err 
!= nil { - t.Fatalf("ReadFile(guest auth): %v", err) - } - if string(got) != string(original) { - t.Fatalf("guest auth = %q, want preserved %q", string(got), string(original)) - } - - entries := parseLogEntries(t, buf.Bytes()) - if !hasLogEntry(entries, map[string]string{ - "msg": "guest opencode auth sync skipped", - "vm_name": vm.Name, - "host_path": hostAuthPath, - "error": "is a directory", - }) { - t.Fatalf("expected warn log, got %v", entries) + for _, want := range []string{ + filepath.Join(srcDir, "session"), + filepath.Join(srcDir, "linked-dir"), + } { + if !hasLogEntry(entries, map[string]string{ + "msg": "file_sync skipped symlink (would escape the requested tree)", + "vm_name": vm.Name, + "host_path": want, + }) { + t.Fatalf("expected warn log for skipped symlink %s; got %v", want, entries) + } } } func TestCreateVMRejectsNonPositiveCPUAndMemory(t *testing.T) { d := &Daemon{} - if _, err := d.CreateVM(context.Background(), api.VMCreateParams{VCPUCount: ptr(0)}); err == nil || !strings.Contains(err.Error(), "vcpu must be a positive integer") { + wireServices(d) + if _, err := d.vm.CreateVM(context.Background(), api.VMCreateParams{VCPUCount: ptr(0)}); err == nil || !strings.Contains(err.Error(), "vcpu must be a positive integer") { t.Fatalf("CreateVM(vcpu=0) error = %v", err) } - if _, err := d.CreateVM(context.Background(), api.VMCreateParams{MemoryMiB: ptr(-1)}); err == nil || !strings.Contains(err.Error(), "memory must be a positive integer") { + if _, err := d.vm.CreateVM(context.Background(), api.VMCreateParams{MemoryMiB: ptr(-1)}); err == nil || !strings.Contains(err.Error(), "memory must be a positive integer") { t.Fatalf("CreateVM(memory=-1) error = %v", err) } } @@ -1019,8 +1359,9 @@ func TestBeginVMCreateCompletesAndReturnsStatus(t *testing.T) { BridgeIP: model.DefaultBridgeIP, }, } + wireServices(d) - op, err := d.BeginVMCreate(ctx, api.VMCreateParams{Name: "queued", NoStart: true}) + op, err := d.vm.BeginVMCreate(ctx, api.VMCreateParams{Name: 
"queued", NoStart: true}) if err != nil { t.Fatalf("BeginVMCreate: %v", err) } @@ -1030,7 +1371,7 @@ func TestBeginVMCreateCompletesAndReturnsStatus(t *testing.T) { deadline := time.Now().Add(2 * time.Second) for time.Now().Before(deadline) { - status, err := d.VMCreateStatus(ctx, op.ID) + status, err := d.vm.VMCreateStatus(ctx, op.ID) if err != nil { t.Fatalf("VMCreateStatus: %v", err) } @@ -1069,8 +1410,9 @@ func TestCreateVMUsesDefaultsWhenCPUAndMemoryOmitted(t *testing.T) { BridgeIP: model.DefaultBridgeIP, }, } + wireServices(d) - vm, err := d.CreateVM(ctx, api.VMCreateParams{Name: "defaults", ImageName: image.Name, NoStart: true}) + vm, err := d.vm.CreateVM(ctx, api.VMCreateParams{Name: "defaults", ImageName: image.Name, NoStart: true}) if err != nil { t.Fatalf("CreateVM: %v", err) } @@ -1088,11 +1430,12 @@ func TestSetVMRejectsNonPositiveCPUAndMemory(t *testing.T) { vm := testVM("validate", "image-validate", "172.16.0.13") upsertDaemonVM(t, ctx, db, vm) d := &Daemon{store: db} + wireServices(d) - if _, err := d.SetVM(ctx, api.VMSetParams{IDOrName: vm.ID, VCPUCount: ptr(0)}); err == nil || !strings.Contains(err.Error(), "vcpu must be a positive integer") { + if _, err := d.vm.SetVM(ctx, api.VMSetParams{IDOrName: vm.ID, VCPUCount: ptr(0)}); err == nil || !strings.Contains(err.Error(), "vcpu must be a positive integer") { t.Fatalf("SetVM(vcpu=0) error = %v", err) } - if _, err := d.SetVM(ctx, api.VMSetParams{IDOrName: vm.ID, MemoryMiB: ptr(0)}); err == nil || !strings.Contains(err.Error(), "memory must be a positive integer") { + if _, err := d.vm.SetVM(ctx, api.VMSetParams{IDOrName: vm.ID, MemoryMiB: ptr(0)}); err == nil || !strings.Contains(err.Error(), "memory must be a positive integer") { t.Fatalf("SetVM(memory=0) error = %v", err) } } @@ -1113,7 +1456,8 @@ func TestCollectStatsIgnoresMalformedMetricsFile(t *testing.T) { } d := &Daemon{} - stats, err := d.collectStats(context.Background(), model.VMRecord{ + wireServices(d) + stats, err := 
d.stats.collectStats(context.Background(), model.VMRecord{ Runtime: model.VMRuntime{ SystemOverlay: overlay, WorkDiskPath: workDisk, @@ -1162,6 +1506,7 @@ func TestValidateStartPrereqsReportsNATUplinkFailure(t *testing.T) { FirecrackerBin: firecrackerBin, }, } + wireServices(d) vm := testVM("nat", "image-nat", "172.16.0.12") vm.Spec.NATEnabled = true vm.Runtime.WorkDiskPath = filepath.Join(t.TempDir(), "missing-root.ext4") @@ -1169,7 +1514,7 @@ func TestValidateStartPrereqsReportsNATUplinkFailure(t *testing.T) { image.RootfsPath = rootfsPath image.KernelPath = kernelPath - err := d.validateStartPrereqs(ctx, vm, image) + err := d.vm.validateStartPrereqs(ctx, vm, image) if err == nil || !strings.Contains(err.Error(), "uplink interface for NAT") { t.Fatalf("validateStartPrereqs() error = %v, want NAT uplink failure", err) } @@ -1197,11 +1542,14 @@ func TestCleanupRuntimeRediscoversLiveFirecrackerPID(t *testing.T) { proc: fake, } d := &Daemon{runner: runner} + wireServices(d) vm := testVM("cleanup", "image-cleanup", "172.16.0.22") - vm.Runtime.PID = fake.Process.Pid + 999 vm.Runtime.APISockPath = apiSock + // Seed a stale PID so cleanupRuntime's findFirecrackerPID pgrep + // fallback wins — it rediscovers fake.Process.Pid from apiSock. 
+ d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: fake.Process.Pid + 999}) - if err := d.cleanupRuntime(context.Background(), vm, true); err != nil { + if err := d.vm.cleanupRuntime(context.Background(), vm, true); err != nil { t.Fatalf("cleanupRuntime returned error: %v", err) } runner.assertExhausted() @@ -1223,13 +1571,13 @@ func TestDeleteStoppedNATVMDoesNotFailWithoutTapDevice(t *testing.T) { vm := testVM("stopped-nat", "image-stopped-nat", "172.16.0.24") vm.Spec.NATEnabled = true vm.Runtime.VMDir = vmDir - vm.Runtime.TapDevice = "" vm.State = model.VMStateStopped vm.Runtime.State = model.VMStateStopped upsertDaemonVM(t, ctx, db, vm) d := &Daemon{store: db} - deleted, err := d.DeleteVM(ctx, vm.Name) + wireServices(d) + deleted, err := d.vm.DeleteVM(ctx, vm.Name) if err != nil { t.Fatalf("DeleteVM: %v", err) } @@ -1244,7 +1592,7 @@ func TestDeleteStoppedNATVMDoesNotFailWithoutTapDevice(t *testing.T) { } } -func TestStopVMFallsBackToForcedCleanupAfterGracefulTimeout(t *testing.T) { +func TestStopVMSIGKILLsWhenSSHUnreachable(t *testing.T) { ctx := context.Background() db := openDaemonStore(t) apiSock := filepath.Join(t.TempDir(), "fc.sock") @@ -1258,16 +1606,9 @@ func TestStopVMFallsBackToForcedCleanupAfterGracefulTimeout(t *testing.T) { } }) - oldGracefulWait := gracefulShutdownWait - gracefulShutdownWait = 50 * time.Millisecond - t.Cleanup(func() { - gracefulShutdownWait = oldGracefulWait - }) - vm := testVM("stubborn", "image-stubborn", "172.16.0.23") vm.State = model.VMStateRunning vm.Runtime.State = model.VMStateRunning - vm.Runtime.PID = fake.Process.Pid vm.Runtime.APISockPath = apiSock upsertDaemonVM(t, ctx, db, vm) @@ -1275,8 +1616,6 @@ func TestStopVMFallsBackToForcedCleanupAfterGracefulTimeout(t *testing.T) { scriptedRunner: &scriptedRunner{ t: t, steps: []runnerStep{ - sudoStep("", nil, "chown", fmt.Sprintf("%d:%d", os.Getuid(), os.Getgid()), apiSock), - sudoStep("", nil, "chmod", "600", apiSock), {call: runnerCall{name: "pgrep", args: 
[]string{"-n", "-f", apiSock}}, out: []byte(strconv.Itoa(fake.Process.Pid) + "\n")}, sudoStep("", nil, "kill", "-KILL", strconv.Itoa(fake.Process.Pid)), }, @@ -1284,8 +1623,10 @@ func TestStopVMFallsBackToForcedCleanupAfterGracefulTimeout(t *testing.T) { proc: fake, } d := &Daemon{store: db, runner: runner} + wireServices(d) + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: fake.Process.Pid}) - got, err := d.StopVM(ctx, vm.ID) + got, err := d.vm.StopVM(ctx, vm.ID) if err != nil { t.Fatalf("StopVM returned error: %v", err) } @@ -1293,8 +1634,11 @@ func TestStopVMFallsBackToForcedCleanupAfterGracefulTimeout(t *testing.T) { if got.State != model.VMStateStopped || got.Runtime.State != model.VMStateStopped { t.Fatalf("StopVM state = %s/%s, want stopped", got.State, got.Runtime.State) } - if got.Runtime.PID != 0 || got.Runtime.APISockPath != "" { - t.Fatalf("runtime handles not cleared: %+v", got.Runtime) + // APISockPath + VSock paths are deterministic — they stay on the + // record for debugging and next-start reuse even after stop. The + // post-stop invariant is that the in-memory cache is empty. 
+ if h, ok := d.vm.handles.get(vm.ID); ok && !h.IsZero() { + t.Fatalf("handle cache not cleared: %+v", h) } } @@ -1304,6 +1648,7 @@ func TestWithVMLockByIDSerializesSameVM(t *testing.T) { vm := testVM("serial", "image-serial", "172.16.0.30") upsertDaemonVM(t, ctx, db, vm) d := &Daemon{store: db} + wireServices(d) firstEntered := make(chan struct{}) releaseFirst := make(chan struct{}) @@ -1311,7 +1656,7 @@ func TestWithVMLockByIDSerializesSameVM(t *testing.T) { errCh := make(chan error, 2) go func() { - _, err := d.withVMLockByID(ctx, vm.ID, func(vm model.VMRecord) (model.VMRecord, error) { + _, err := d.vm.withVMLockByID(ctx, vm.ID, func(vm model.VMRecord) (model.VMRecord, error) { close(firstEntered) <-releaseFirst return vm, nil @@ -1326,7 +1671,7 @@ func TestWithVMLockByIDSerializesSameVM(t *testing.T) { } go func() { - _, err := d.withVMLockByID(ctx, vm.ID, func(vm model.VMRecord) (model.VMRecord, error) { + _, err := d.vm.withVMLockByID(ctx, vm.ID, func(vm model.VMRecord) (model.VMRecord, error) { close(secondEntered) return vm, nil }) @@ -1363,12 +1708,13 @@ func TestWithVMLockByIDAllowsDifferentVMsConcurrently(t *testing.T) { upsertDaemonVM(t, ctx, db, vm) } d := &Daemon{store: db} + wireServices(d) started := make(chan string, 2) release := make(chan struct{}) errCh := make(chan error, 2) run := func(id string) { - _, err := d.withVMLockByID(ctx, id, func(vm model.VMRecord) (model.VMRecord, error) { + _, err := d.vm.withVMLockByID(ctx, id, func(vm model.VMRecord) (model.VMRecord, error) { started <- vm.ID <-release return vm, nil @@ -1599,6 +1945,27 @@ func startHTTPSServerOnTCP4(t *testing.T, handler http.Handler) *net.TCPAddr { return listener.Addr().(*net.TCPAddr) } +func testSetGitConfig(t *testing.T, key, value string) { + t.Helper() + + cmd := exec.Command("git", "config", "--global", key, value) + output, err := cmd.CombinedOutput() + if err != nil { + t.Fatalf("git config --global %s: %v: %s", key, err, strings.TrimSpace(string(output))) + } +} + 
+func testGitConfigValue(t *testing.T, configPath, key string) string { + t.Helper() + + cmd := exec.Command("git", "config", "--file", configPath, "--get", key) + output, err := cmd.CombinedOutput() + if err != nil { + t.Fatalf("git config --file %s --get %s: %v: %s", configPath, key, err, strings.TrimSpace(string(output))) + } + return strings.TrimSpace(string(output)) +} + type processKillingRunner struct { *scriptedRunner proc *exec.Cmd @@ -1610,6 +1977,20 @@ type filesystemRunner struct { func (r *filesystemRunner) Run(ctx context.Context, name string, args ...string) ([]byte, error) { r.t.Helper() + if name == "git" { + cmd := exec.CommandContext(ctx, name, args...) + var stdout bytes.Buffer + var stderr bytes.Buffer + cmd.Stdout = &stdout + cmd.Stderr = &stderr + if err := cmd.Run(); err != nil { + if stderr.Len() > 0 { + return stdout.Bytes(), fmt.Errorf("%w: %s", err, strings.TrimSpace(stderr.String())) + } + return stdout.Bytes(), err + } + return stdout.Bytes(), nil + } return nil, fmt.Errorf("unexpected Run call: %s %v", name, args) } @@ -1663,26 +2044,204 @@ func (r *filesystemRunner) RunSudo(ctx context.Context, args ...string) ([]byte, } return os.ReadFile(args[1]) case "install": - if len(args) != 5 || args[1] != "-m" { - return nil, fmt.Errorf("unexpected install args: %v", args) - } - mode, err := strconv.ParseUint(args[2], 8, 32) + // Minimal install(1): expected forms are + // install -m MODE SRC DST (5 args) + // install -o 0 -g 0 -m MODE SRC DST (9 args, ignored owners) + src, dst, mode, err := parseInstallArgs(args) if err != nil { return nil, err } - data, err := os.ReadFile(args[3]) + data, err := os.ReadFile(src) if err != nil { return nil, err } - if err := os.MkdirAll(filepath.Dir(args[4]), 0o755); err != nil { + if err := os.MkdirAll(filepath.Dir(dst), 0o755); err != nil { return nil, err } - return nil, os.WriteFile(args[4], data, os.FileMode(mode)) + return nil, os.WriteFile(dst, data, os.FileMode(mode)) + case "chown": + // 
Recognised forms, all no-op under test (we run as the test + // user and os.Chown would need CAP_CHOWN): + // chown OWNER TARGET + // chown -R OWNER TARGET + // chown -h OWNER TARGET (symlink-no-follow; required by + // fcproc.chownChmodNoFollow) + switch { + case len(args) == 3: + return nil, nil + case len(args) == 4 && (args[1] == "-R" || args[1] == "-h"): + return nil, nil + default: + return nil, fmt.Errorf("unexpected chown args: %v", args) + } + case "debugfs": + return runFakeDebugfs(args[1:]) + case "e2cp": + // e2cp SRC IMAGE:/GUEST → plain file copy into IMAGE dir + if len(args) != 3 { + return nil, fmt.Errorf("unexpected e2cp args: %v", args) + } + image, guest, ok := splitImageColonPath(args[2]) + if !ok { + return nil, fmt.Errorf("e2cp dst missing image:path separator: %v", args) + } + target := filepath.Join(image, guest) + data, err := os.ReadFile(args[1]) + if err != nil { + return nil, err + } + if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil { + return nil, err + } + return nil, os.WriteFile(target, data, 0o600) + case "e2rm": + // e2rm IMAGE:/GUEST → plain file delete; missing is not fatal + if len(args) != 2 { + return nil, fmt.Errorf("unexpected e2rm args: %v", args) + } + image, guest, ok := splitImageColonPath(args[1]) + if !ok { + return nil, fmt.Errorf("e2rm missing image:path separator: %v", args) + } + target := filepath.Join(image, guest) + if err := os.Remove(target); err != nil && !os.IsNotExist(err) { + return nil, err + } + return nil, nil default: return nil, fmt.Errorf("unexpected sudo command: %v", args) } } +// runFakeDebugfs emulates the subset of debugfs commands the ext4 +// toolkit drives in per-line mode (the stdin-batched path doesn't run +// under filesystemRunner because it doesn't implement StdinRunner). +// Supported: stat/cat, plus -w mkdir/set_inode_field. Inode 2 <2> +// set_inode_field is a no-op — tests don't care about root-inode mode +// beyond it not exploding. 
+func runFakeDebugfs(args []string) ([]byte, error) { + // Forms: + // debugfs -R "<cmd>" <image> (read-only) + // debugfs -w -R "<cmd>" <image> (single write) + if len(args) < 3 { + return nil, fmt.Errorf("unexpected debugfs args: %v", args) + } + write := false + rest := args + if rest[0] == "-w" { + write = true + rest = rest[1:] + } + if len(rest) != 3 || rest[0] != "-R" { + return nil, fmt.Errorf("unexpected debugfs args: %v", args) + } + cmdLine := strings.TrimSpace(rest[1]) + image := rest[2] + + fields := strings.Fields(cmdLine) + if len(fields) == 0 { + return nil, fmt.Errorf("empty debugfs command") + } + switch fields[0] { + case "stat": + if len(fields) != 2 { + return nil, fmt.Errorf("unexpected debugfs stat: %q", cmdLine) + } + target := filepath.Join(image, strings.Trim(fields[1], `"`)) + if _, err := os.Stat(target); err != nil { + if os.IsNotExist(err) { + return []byte("stat: File not found by ext2_lookup while starting pathname"), nil + } + return nil, err + } + return []byte("Inode: 12 Type: directory"), nil + case "cat": + if len(fields) != 2 { + return nil, fmt.Errorf("unexpected debugfs cat: %q", cmdLine) + } + target := filepath.Join(image, strings.Trim(fields[1], `"`)) + data, err := os.ReadFile(target) + if err != nil { + return nil, err + } + return data, nil + case "mkdir": + if !write { + return nil, fmt.Errorf("debugfs mkdir requires -w: %q", cmdLine) + } + if len(fields) != 2 { + return nil, fmt.Errorf("unexpected debugfs mkdir: %q", cmdLine) + } + target := filepath.Join(image, strings.Trim(fields[1], `"`)) + return nil, os.MkdirAll(target, 0o755) + case "set_inode_field": + // set_inode_field <target> <field> <value> + // Mode changes on non-root targets: honour the perm bits so + // tests can assert file mode. Root inode <2>, uid, gid are + // no-ops — tests don't inspect them.
+ if !write { + return nil, fmt.Errorf("debugfs set_inode_field requires -w: %q", cmdLine) + } + if len(fields) != 4 { + return nil, fmt.Errorf("unexpected set_inode_field: %q", cmdLine) + } + target := strings.Trim(fields[1], `"`) + if target == "<2>" || fields[2] != "mode" { + return nil, nil + } + raw := strings.TrimPrefix(fields[3], "0") + v, err := strconv.ParseUint(raw, 8, 32) + if err != nil { + return nil, fmt.Errorf("parse set_inode_field mode %q: %w", fields[3], err) + } + return nil, os.Chmod(filepath.Join(image, target), os.FileMode(v)&os.ModePerm) + case "rdump": + // rdump <dir> <dest> + return nil, fmt.Errorf("rdump not supported in filesystemRunner") + default: + return nil, fmt.Errorf("unsupported debugfs cmd: %q", cmdLine) + } +} + +// splitImageColonPath splits an e2cp/e2rm "image:path" argument. +// Returns image, path, true on success. Only the LAST ":/" is split +// on, since image paths on disk may contain a colon (rare) and guest +// paths always start with "/". +func splitImageColonPath(arg string) (string, string, bool) { + idx := strings.LastIndex(arg, ":/") + if idx < 0 { + return "", "", false + } + return arg[:idx], arg[idx+1:], true +} + +// parseInstallArgs recognises the `install` invocations banger emits +// and returns (source, destination, parsed mode). Anything else is an +// error so the test stub stays a closed set.
+func parseInstallArgs(args []string) (string, string, os.FileMode, error) { + switch len(args) { + case 5: + if args[1] != "-m" { + return "", "", 0, fmt.Errorf("unexpected install args: %v", args) + } + mode, err := strconv.ParseUint(args[2], 8, 32) + if err != nil { + return "", "", 0, err + } + return args[3], args[4], os.FileMode(mode), nil + case 9: + if args[1] != "-o" || args[3] != "-g" || args[5] != "-m" { + return "", "", 0, fmt.Errorf("unexpected install args: %v", args) + } + mode, err := strconv.ParseUint(args[6], 8, 32) + if err != nil { + return "", "", 0, err + } + return args[7], args[8], os.FileMode(mode), nil + } + return "", "", 0, fmt.Errorf("unexpected install args: %v", args) +} + func copyIntoDir(sourcePath, targetDir string) error { targetDir = strings.TrimSuffix(targetDir, "/") info, err := os.Stat(sourcePath) diff --git a/internal/daemon/web.go b/internal/daemon/web.go deleted file mode 100644 index 11cc951..0000000 --- a/internal/daemon/web.go +++ /dev/null @@ -1,65 +0,0 @@ -package daemon - -import ( - "context" - "errors" - "fmt" - "net" - "net/http" - "strings" - "time" - - "banger/internal/model" - "banger/internal/paths" - "banger/internal/webui" -) - -func (d *Daemon) startWebServer() error { - listenAddr := strings.TrimSpace(d.config.WebListenAddr) - if listenAddr == "" { - d.webURL = "" - return nil - } - listener, err := net.Listen("tcp", listenAddr) - if err != nil { - if d.logger != nil { - d.logger.Error("web ui listen failed", "addr", listenAddr, "error", err.Error()) - } - return fmt.Errorf("web ui listen on %s: %w", listenAddr, err) - } - d.webListener = listener - d.webURL = "http://" + listener.Addr().String() - d.webServer = &http.Server{ - Handler: webui.NewHandler(d), - ReadHeaderTimeout: 5 * time.Second, - } - if d.logger != nil { - d.logger.Info("web ui serving", "addr", listener.Addr().String(), "url", d.webURL) - } - go func() { - err := d.webServer.Serve(listener) - if err == nil || errors.Is(err, 
http.ErrServerClosed) { - return - } - if d.logger != nil { - d.logger.Error("web ui serve failed", "addr", listener.Addr().String(), "error", err.Error()) - } - }() - return nil -} - -func (d *Daemon) Layout() paths.Layout { - return d.layout -} - -func (d *Daemon) Config() model.DaemonConfig { - return d.config -} - -func (d *Daemon) ListVMs(ctx context.Context) ([]model.VMRecord, error) { - return d.store.ListVMs(ctx) -} - -func (d *Daemon) ListImages(ctx context.Context) ([]model.Image, error) { - return d.store.ListImages(ctx) -} diff --git a/internal/daemon/workspace.go b/internal/daemon/workspace.go new file mode 100644 index 0000000..17a2fd1 --- /dev/null +++ b/internal/daemon/workspace.go @@ -0,0 +1,275 @@ +package daemon + +import ( + "context" + "errors" + "fmt" + "net" + "strings" + "time" + + "banger/internal/api" + ws "banger/internal/daemon/workspace" + "banger/internal/model" +) + +// workspaceInspectRepoHook + workspaceImportHook dispatch through the +// per-instance Daemon seams when set, falling back to the real +// workspace package implementations. Keeping the fallbacks here (as +// opposed to always requiring callers to populate s.workspaceInspectRepo +// in a constructor) lets tests selectively override one hook without +// having to wire both. 
+func (s *WorkspaceService) workspaceInspectRepoHook(ctx context.Context, sourcePath, branchName, fromRef string, includeUntracked bool) (ws.RepoSpec, error) { + if s != nil && s.workspaceInspectRepo != nil { + return s.workspaceInspectRepo(ctx, sourcePath, branchName, fromRef, includeUntracked) + } + return s.inspector().InspectRepo(ctx, sourcePath, branchName, fromRef, includeUntracked) +} + +func (s *WorkspaceService) workspaceImportHook(ctx context.Context, client ws.GuestClient, spec ws.RepoSpec, guestPath string, mode model.WorkspacePrepareMode) error { + if s != nil && s.workspaceImport != nil { + return s.workspaceImport(ctx, client, spec, guestPath, mode) + } + return s.inspector().ImportRepoToGuest(ctx, client, spec, guestPath, mode) +} + +// inspector returns the service's workspace Inspector, falling back to +// a fresh real-runner Inspector when callers constructed the service +// without wiring one. Keeping the fallback here lets test literals +// that don't care about the Inspector still function without a manual +// NewInspector() call. +func (s *WorkspaceService) inspector() *ws.Inspector { + if s != nil && s.repoInspector != nil { + return s.repoInspector + } + return ws.NewInspector() +} + +func (s *WorkspaceService) ExportVMWorkspace(ctx context.Context, params api.WorkspaceExportParams) (api.WorkspaceExportResult, error) { + guestPath := strings.TrimSpace(params.GuestPath) + if guestPath == "" { + guestPath = "/root/repo" + } + vm, err := s.vmResolver(ctx, params.IDOrName) + if err != nil { + return api.WorkspaceExportResult{}, err + } + if !s.aliveChecker(vm) { + return api.WorkspaceExportResult{}, fmt.Errorf("vm %q is not running", vm.Name) + } + // Serialise with any in-flight workspace.prepare on the same VM so + // we never snapshot a half-streamed tar. Does not block vm stop / + // delete / restart — those only take the VM mutex. 
+ unlock := s.workspaceLocks.lock(vm.ID) + defer unlock() + + client, err := s.dialGuest(ctx, net.JoinHostPort(vm.Runtime.GuestIP, "22")) + if err != nil { + return api.WorkspaceExportResult{}, fmt.Errorf("dial guest: %w", err) + } + defer client.Close() + + // diffRef is the git ref everything is diffed against. + // When the caller supplies BaseCommit (the HEAD at workspace.prepare time), + // we diff against that fixed point so committed guest changes are included. + // Without it we fall back to HEAD, which silently drops them. + diffRef := strings.TrimSpace(params.BaseCommit) + if diffRef == "" { + diffRef = "HEAD" + } + + // Both scripts run `git add -A` to capture the working tree + // (committed deltas + uncommitted modifications + untracked files + // minus .gitignore), but they route it through a throwaway index + // file instead of .git/index. Export is an observation step; the + // user's real staging area must stay exactly as they left it. + patchScript := exportScript(guestPath, diffRef, "--binary") + patch, err := client.RunScriptOutput(ctx, patchScript) + if err != nil { + return api.WorkspaceExportResult{}, fmt.Errorf("export workspace diff: %w", err) + } + + namesScript := exportScript(guestPath, diffRef, "--name-only") + namesOut, _ := client.RunScriptOutput(ctx, namesScript) + var changed []string + for _, line := range strings.Split(strings.TrimSpace(string(namesOut)), "\n") { + if line = strings.TrimSpace(line); line != "" { + changed = append(changed, line) + } + } + + return api.WorkspaceExportResult{ + GuestPath: guestPath, + BaseCommit: diffRef, + Patch: patch, + ChangedFiles: changed, + HasChanges: len(patch) > 0, + }, nil +} + +// exportScript emits a shell snippet that diffs the working tree at +// guestPath against diffRef (HEAD or a commit SHA) WITHOUT touching +// the repo's real index. diffFlag selects the git-diff output mode +// ("--binary" for the patch body, "--name-only" for the file list). 
+// +// Mechanics: seed a temp index from diffRef's tree via git read-tree, +// restage the working tree into that temp index with GIT_INDEX_FILE, +// then emit the diff. The temp index is rm'd on exit via trap. +// +// The temp index must live on the same filesystem as the repo's +// real .git directory. `git read-tree --index-output=PATH` uses a +// lockfile + rename + hardlink sequence that fails with "unable to +// write new index file" when PATH is on a different filesystem — +// reliably reproducible on Debian bookworm guests where /tmp is +// tmpfs and the workspace overlay is on a separate FS. mktemp'ing +// inside `$(git rev-parse --git-dir)` keeps the temp index on the +// same FS as .git/index without polluting the working tree. +func exportScript(guestPath, diffRef, diffFlag string) string { + return fmt.Sprintf( + "set -euo pipefail\n"+ + "cd %s\n"+ + "git_dir=\"$(git rev-parse --git-dir)\"\n"+ + "tmp_idx=\"$(mktemp \"$git_dir/banger-export-idx.XXXXXX\")\"\n"+ + "trap 'rm -f \"$tmp_idx\"' EXIT\n"+ + "git read-tree %s --index-output=\"$tmp_idx\"\n"+ + "GIT_INDEX_FILE=\"$tmp_idx\" git add -A\n"+ + "GIT_INDEX_FILE=\"$tmp_idx\" git diff --cached %s %s\n", + ws.ShellQuote(guestPath), + ws.ShellQuote(diffRef), + ws.ShellQuote(diffRef), + diffFlag, + ) +} + +func (s *WorkspaceService) PrepareVMWorkspace(ctx context.Context, params api.VMWorkspacePrepareParams) (model.WorkspacePrepareResult, error) { + mode, err := ws.ParsePrepareMode(params.Mode) + if err != nil { + return model.WorkspacePrepareResult{}, err + } + guestPath := strings.TrimSpace(params.GuestPath) + if guestPath == "" { + guestPath = "/root/repo" + } + branchName := strings.TrimSpace(params.Branch) + fromRef := strings.TrimSpace(params.From) + if branchName != "" && fromRef == "" { + fromRef = "HEAD" + } + if branchName == "" && strings.TrimSpace(params.From) != "" { + return model.WorkspacePrepareResult{}, errors.New("workspace from requires branch") + } + + // Phase 1: acquire the VM mutex 
ONLY long enough to verify state + // and snapshot the fields we need (IP, PID, api sock). Release it + // before any SSH or tar I/O so this slow operation cannot block + // vm stop / vm delete / vm restart on the same VM. + vm, err := s.withVMLockByRef(ctx, params.IDOrName, func(vm model.VMRecord) (model.VMRecord, error) { + if !s.aliveChecker(vm) { + return model.VMRecord{}, fmt.Errorf("vm %q is not running", vm.Name) + } + return vm, nil + }) + if err != nil { + return model.WorkspacePrepareResult{}, err + } + + // Phase 2: serialise concurrent workspace operations on THIS vm + // (so two prepares don't interleave tar streams), but do not + // block lifecycle ops. If the VM gets stopped or deleted mid- + // flight, the SSH dial or stream will fail naturally; ctx + // cancellation propagates through. + unlock := s.workspaceLocks.lock(vm.ID) + defer unlock() + + return s.prepareVMWorkspaceGuestIO(ctx, vm, strings.TrimSpace(params.SourcePath), guestPath, branchName, fromRef, mode, params.IncludeUntracked) +} + +// miseTrustGuestRepo runs `mise trust` against guestPath inside the +// guest so any .mise.toml / .tool-versions / mise.toml files in the +// imported repo become trusted without an interactive prompt. Best +// effort: a missing mise binary, a non-zero exit, or a trust that +// finds nothing all log at debug only and don't fail prepare. +// +// The guest is single-tenant root@.vm and the repo just came +// from the host user's own checkout, so auto-trust is safe in this +// context — the user has already validated the repo on the host. +func (s *WorkspaceService) miseTrustGuestRepo(ctx context.Context, client ws.GuestClient, guestPath string) { + script := miseTrustScript(guestPath) + if err := client.RunScript(ctx, script, miseTrustLogSink{}); err != nil && s.logger != nil { + s.logger.Debug("mise trust on imported workspace skipped", "guest_path", guestPath, "error", err.Error()) + } +} + +// miseTrustScript is the exact shell run inside the guest. 
Kept +// separate so a unit test can pin the string and confirm a future +// edit doesn't accidentally drop the `command -v` guard. +func miseTrustScript(guestPath string) string { + return fmt.Sprintf( + "if command -v mise >/dev/null 2>&1; then cd %s && mise trust --quiet --all 2>/dev/null || true; fi\n", + ws.ShellQuote(guestPath), + ) +} + +// miseTrustLogSink discards anything mise wrote to stdout/stderr. +// We don't care about the output — success leaves mise silent and a +// failure is already covered by the err return path. +type miseTrustLogSink struct{} + +func (miseTrustLogSink) Write(p []byte) (int, error) { return len(p), nil } + +// prepareVMWorkspaceGuestIO performs the actual guest-side work: +// inspect the local repo, dial SSH, stream the tar. Called without +// holding the VM mutex. +func (s *WorkspaceService) prepareVMWorkspaceGuestIO(ctx context.Context, vm model.VMRecord, sourcePath, guestPath, branchName, fromRef string, mode model.WorkspacePrepareMode, includeUntracked bool) (model.WorkspacePrepareResult, error) { + spec, err := s.workspaceInspectRepoHook(ctx, sourcePath, branchName, fromRef, includeUntracked) + if err != nil { + return model.WorkspacePrepareResult{}, err + } + if len(spec.Submodules) > 0 && mode != model.WorkspacePrepareModeFullCopy { + return model.WorkspacePrepareResult{}, fmt.Errorf("workspace mode %q does not support git submodules in %s (%s); use --mode full_copy", mode, spec.RepoRoot, strings.Join(spec.Submodules, ", ")) + } + address := net.JoinHostPort(vm.Runtime.GuestIP, "22") + if err := s.waitGuestSSH(ctx, address, 250*time.Millisecond); err != nil { + return model.WorkspacePrepareResult{}, fmt.Errorf("guest ssh unavailable: %w", err) + } + client, err := s.dialGuest(ctx, address) + if err != nil { + return model.WorkspacePrepareResult{}, fmt.Errorf("dial guest ssh: %w", err) + } + defer client.Close() + if err := s.workspaceImportHook(ctx, client, spec, guestPath, mode); err != nil { + return 
model.WorkspacePrepareResult{}, err + } + // Auto-trust mise configs in the imported repo. The guest is + // single-tenant (root@.vm), the repo just came from the + // host user's own checkout, and any .mise.toml landing in /root + // would otherwise prompt on the first guest command and stall a + // 'banger vm run ./repo -- ' invocation. Best-effort: a + // missing mise binary or a 'trust' that does nothing is fine. + s.miseTrustGuestRepo(ctx, client, guestPath) + preparedAt := model.Now() + // Persist workspace state so `vm exec` and dirty-checking can + // resolve guest path + HEAD commit without re-stating them. Best + // effort: a store failure here doesn't roll back the prepare. + if err := s.store.SetVMWorkspace(ctx, vm.ID, model.VMWorkspace{ + GuestPath: guestPath, + SourcePath: spec.SourcePath, + HeadCommit: spec.HeadCommit, + PreparedAt: preparedAt, + }); err != nil && s.logger != nil { + s.logger.Warn("failed to persist workspace state", "vm_id", vm.ID, "error", err) + } + return model.WorkspacePrepareResult{ + VMID: vm.ID, + SourcePath: spec.SourcePath, + RepoRoot: spec.RepoRoot, + RepoName: spec.RepoName, + GuestPath: guestPath, + Mode: mode, + HeadCommit: spec.HeadCommit, + CurrentBranch: spec.CurrentBranch, + BranchName: spec.BranchName, + BaseCommit: spec.BaseCommit, + PreparedAt: preparedAt, + }, nil +} diff --git a/internal/daemon/workspace/util.go b/internal/daemon/workspace/util.go new file mode 100644 index 0000000..9f99b2f --- /dev/null +++ b/internal/daemon/workspace/util.go @@ -0,0 +1,20 @@ +package workspace + +import ( + "fmt" + "strings" +) + +// ShellQuote returns value single-quoted for bash, escaping embedded quotes. +func ShellQuote(value string) string { + return "'" + strings.ReplaceAll(value, "'", `'"'"'`) + "'" +} + +// FormatStepError wraps err with an action label and trimmed on-guest log. 
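The `'"'"'` replacement in ShellQuote is the standard POSIX trick for embedding a single quote inside a single-quoted word: close the quote, emit a double-quoted `'`, reopen the quote — bash then concatenates the adjacent chunks back into one literal. A standalone sketch (helper lower-cased so the snippet compiles on its own, not the package's exported symbol):

```go
package main

import (
	"fmt"
	"strings"
)

// shellQuote mirrors the ShellQuote helper in util.go: wrap the value
// in single quotes and splice each embedded single quote as '"'"'
// (close quote, double-quoted quote character, reopen quote).
func shellQuote(value string) string {
	return "'" + strings.ReplaceAll(value, "'", `'"'"'`) + "'"
}

func main() {
	// "it's" becomes 'it'"'"'s'; bash joins the three adjacent
	// quoted chunks back into the literal it's.
	fmt.Println(shellQuote("it's"))
	fmt.Println(shellQuote("/root/repo"))
}
```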
+func FormatStepError(action string, err error, log string) error { + log = strings.TrimSpace(log) + if log == "" { + return fmt.Errorf("%s: %w", action, err) + } + return fmt.Errorf("%s: %w: %s", action, err, log) +} diff --git a/internal/daemon/workspace/workspace.go b/internal/daemon/workspace/workspace.go new file mode 100644 index 0000000..1f33cf4 --- /dev/null +++ b/internal/daemon/workspace/workspace.go @@ -0,0 +1,454 @@ +// Package workspace contains the pure helpers of the workspace subsystem: +// git repo inspection, shallow copy preparation, guest-side tar import, +// finalization script generation, and small utilities. +// +// Every helper that needs to run a host command (git or otherwise) +// lives as a method on *Inspector rather than a free function that +// routes through a package global. That way two tests running in +// parallel can each build their own Inspector with a stub Runner +// without fighting over shared state. +// +// The orchestrator methods (ExportVMWorkspace, PrepareVMWorkspace) stay on +// *daemon.Daemon. +package workspace + +import ( + "bytes" + "context" + "errors" + "fmt" + "io" + "net/url" + "os" + "path/filepath" + "sort" + "strings" + + "banger/internal/model" + "banger/internal/system" +) + +// ShallowFetchDepth is the default --depth for the transient shallow clone +// used by metadata / overlay prepare modes. +const ShallowFetchDepth = 10 + +// RepoSpec describes the host-side git repository we're about to import into +// a guest. It captures the pieces both InspectRepo and the prepare flow need. +type RepoSpec struct { + SourcePath string + RepoRoot string + RepoName string + HeadCommit string + CurrentBranch string + BranchName string + BaseCommit string + OriginURL string + GitUserName string + GitUserEmail string + OverlayPaths []string + Submodules []string +} + +// GuestClient is the narrow subset of guest SSH operations needed by +// ImportRepoToGuest. Satisfied by the daemon-package guestSSHClient. 
+type GuestClient interface { + RunScript(ctx context.Context, script string, log io.Writer) error + StreamTar(ctx context.Context, dir, command string, log io.Writer) error + StreamTarEntries(ctx context.Context, dir string, entries []string, command string, log io.Writer) error +} + +// RunnerFunc is the single-method surface every Inspector needs: run a +// host command with args, return combined output + error. Tests supply +// a stub that records calls and replays canned responses; production +// uses realHostRunner which wraps system.NewRunner. +type RunnerFunc func(ctx context.Context, name string, args ...string) ([]byte, error) + +// Inspector bundles the host-command seam for all git-using workspace +// helpers. Construct one at the boundary where you're reading the +// filesystem (CLI deps, WorkspaceService) and call its methods directly; +// don't reach into the struct from helper code. +type Inspector struct { + Runner RunnerFunc +} + +// NewInspector returns an Inspector backed by the real host runner. +// Production callers (CLI deps initialisation, daemon WorkspaceService +// wiring) use this; tests construct Inspector{Runner: stub} directly. +func NewInspector() *Inspector { + return &Inspector{Runner: realHostRunner} +} + +func realHostRunner(ctx context.Context, name string, args ...string) ([]byte, error) { + runner := system.NewRunner() + output, err := runner.Run(ctx, name, args...) + if err == nil { + return output, nil + } + command := strings.TrimSpace(strings.Join(append([]string{name}, args...), " ")) + detail := strings.TrimSpace(string(output)) + if detail == "" { + return output, fmt.Errorf("%s: %w", command, err) + } + return output, fmt.Errorf("%s: %w: %s", command, err, detail) +} + +// InspectRepo resolves rawPath into an absolute repo root and captures +// the HEAD, branch, optional base-from ref, git identity, origin URL, +// submodules, and overlay paths needed for a prepare. 
Overlay paths +// cover tracked files by default; untracked non-ignored files are +// included only when includeUntracked is true. +func (i *Inspector) InspectRepo(ctx context.Context, rawPath, branchName, fromRef string, includeUntracked bool) (RepoSpec, error) { + sourcePath, err := ResolveSourcePath(rawPath) + if err != nil { + return RepoSpec{}, err + } + repoRoot, err := i.GitTrimmedOutput(ctx, sourcePath, "rev-parse", "--show-toplevel") + if err != nil { + return RepoSpec{}, fmt.Errorf("%s is not inside a git repository", sourcePath) + } + isBare, err := i.GitTrimmedOutput(ctx, repoRoot, "rev-parse", "--is-bare-repository") + if err != nil { + return RepoSpec{}, fmt.Errorf("inspect git repository %s: %w", repoRoot, err) + } + if isBare == "true" { + return RepoSpec{}, fmt.Errorf("workspace prepare requires a non-bare git repository: %s", repoRoot) + } + submodules, err := i.ListSubmodules(ctx, repoRoot) + if err != nil { + return RepoSpec{}, err + } + headCommit, err := i.GitTrimmedOutput(ctx, repoRoot, "rev-parse", "HEAD^{commit}") + if err != nil { + return RepoSpec{}, fmt.Errorf("git repository %s must have at least one commit", repoRoot) + } + currentBranch, err := i.GitTrimmedOutput(ctx, repoRoot, "branch", "--show-current") + if err != nil { + return RepoSpec{}, fmt.Errorf("resolve current branch for %s: %w", repoRoot, err) + } + baseCommit := headCommit + branchName = strings.TrimSpace(branchName) + if branchName != "" { + baseCommit, err = i.GitTrimmedOutput(ctx, repoRoot, "rev-parse", fromRef+"^{commit}") + if err != nil { + return RepoSpec{}, fmt.Errorf("resolve workspace from %q: %w", fromRef, err) + } + } + gitUserName, err := i.GitResolvedConfigValue(ctx, repoRoot, "user.name") + if err != nil { + return RepoSpec{}, fmt.Errorf("resolve git user.name for %s: %w", repoRoot, err) + } + gitUserEmail, err := i.GitResolvedConfigValue(ctx, repoRoot, "user.email") + if err != nil { + return RepoSpec{}, fmt.Errorf("resolve git user.email for %s: %w", 
repoRoot, err)
+	}
+	originURL, err := i.GitResolvedConfigValue(ctx, repoRoot, "remote.origin.url")
+	if err != nil {
+		return RepoSpec{}, fmt.Errorf("resolve origin url for %s: %w", repoRoot, err)
+	}
+	overlayPaths, err := i.ListOverlayPaths(ctx, repoRoot, includeUntracked)
+	if err != nil {
+		return RepoSpec{}, err
+	}
+	return RepoSpec{
+		SourcePath:    sourcePath,
+		RepoRoot:      repoRoot,
+		RepoName:      filepath.Base(repoRoot),
+		HeadCommit:    headCommit,
+		CurrentBranch: currentBranch,
+		BranchName:    branchName,
+		BaseCommit:    baseCommit,
+		OriginURL:     originURL,
+		GitUserName:   gitUserName,
+		GitUserEmail:  gitUserEmail,
+		OverlayPaths:  overlayPaths,
+		Submodules:    submodules,
+	}, nil
+}
+
+// ImportRepoToGuest materialises spec inside the guest at guestPath. Mode
+// selects between full copy, metadata-only, or shallow metadata + overlay.
+func (i *Inspector) ImportRepoToGuest(ctx context.Context, client GuestClient, spec RepoSpec, guestPath string, mode model.WorkspacePrepareMode) error {
+	switch mode {
+	case model.WorkspacePrepareModeFullCopy:
+		var copyLog bytes.Buffer
+		command := fmt.Sprintf("rm -rf %s && mkdir -p %s && tar -o -C %s --strip-components=1 -xf -", ShellQuote(guestPath), ShellQuote(guestPath), ShellQuote(guestPath))
+		if err := client.StreamTar(ctx, spec.RepoRoot, command, &copyLog); err != nil {
+			return FormatStepError("copy full workspace", err, copyLog.String())
+		}
+		var finalizeLog bytes.Buffer
+		if err := client.RunScript(ctx, FinalizeScript(spec, guestPath, mode), &finalizeLog); err != nil {
+			return FormatStepError("finalize full workspace", err, finalizeLog.String())
+		}
+		return nil
+	case model.WorkspacePrepareModeMetadataOnly, model.WorkspacePrepareModeShallowOverlay:
+		repoCopyDir, cleanup, err := i.PrepareRepoCopy(ctx, spec)
+		if err != nil {
+			return err
+		}
+		defer cleanup()
+		var copyLog bytes.Buffer
+		command := fmt.Sprintf("rm -rf %s && mkdir -p %s && tar -o -C %s --strip-components=1 -xf -", ShellQuote(guestPath), ShellQuote(guestPath), 
ShellQuote(guestPath))
+		if err := client.StreamTar(ctx, repoCopyDir, command, &copyLog); err != nil {
+			return FormatStepError("copy guest git metadata", err, copyLog.String())
+		}
+		var scriptLog bytes.Buffer
+		if err := client.RunScript(ctx, FinalizeScript(spec, guestPath, mode), &scriptLog); err != nil {
+			return FormatStepError("prepare guest checkout", err, scriptLog.String())
+		}
+		if mode == model.WorkspacePrepareModeMetadataOnly {
+			return nil
+		}
+		var overlayLog bytes.Buffer
+		command = fmt.Sprintf("tar -o -C %s --strip-components=1 -xf -", ShellQuote(guestPath))
+		if err := client.StreamTarEntries(ctx, spec.RepoRoot, spec.OverlayPaths, command, &overlayLog); err != nil {
+			return FormatStepError("overlay workspace working tree", err, overlayLog.String())
+		}
+		return nil
+	default:
+		return fmt.Errorf("unsupported workspace mode %q", mode)
+	}
+}
+
+// FinalizeScript returns the bash script run inside the guest after the repo
+// copy lands: safe.directory, optional cleanup, branch/detached checkout,
+// and git identity config.
+func FinalizeScript(spec RepoSpec, guestPath string, mode model.WorkspacePrepareMode) string {
+	var script strings.Builder
+	script.WriteString("set -euo pipefail\n")
+	fmt.Fprintf(&script, "DIR=%s\n", ShellQuote(guestPath))
+	script.WriteString("git config --global --add safe.directory \"$DIR\"\n")
+	if mode != model.WorkspacePrepareModeFullCopy {
+		script.WriteString("find \"$DIR\" -mindepth 1 -maxdepth 1 ! 
-name .git -exec rm -rf {} +\n") + } + switch { + case strings.TrimSpace(spec.BranchName) != "": + fmt.Fprintf(&script, "git -C \"$DIR\" checkout -B %s %s\n", ShellQuote(spec.BranchName), ShellQuote(spec.BaseCommit)) + case strings.TrimSpace(spec.CurrentBranch) != "": + fmt.Fprintf(&script, "git -C \"$DIR\" checkout -B %s %s\n", ShellQuote(spec.CurrentBranch), ShellQuote(spec.HeadCommit)) + default: + fmt.Fprintf(&script, "git -C \"$DIR\" checkout --detach %s\n", ShellQuote(spec.HeadCommit)) + } + if strings.TrimSpace(spec.GitUserName) != "" && strings.TrimSpace(spec.GitUserEmail) != "" { + fmt.Fprintf(&script, "git -C \"$DIR\" config user.name %s\n", ShellQuote(spec.GitUserName)) + fmt.Fprintf(&script, "git -C \"$DIR\" config user.email %s\n", ShellQuote(spec.GitUserEmail)) + } + return script.String() +} + +// PrepareRepoCopy materialises a shallow clone of spec into a temp dir. The +// returned cleanup removes the temp root. +func (i *Inspector) PrepareRepoCopy(ctx context.Context, spec RepoSpec) (string, func(), error) { + tempRoot, err := os.MkdirTemp("", "banger-workspace-*") + if err != nil { + return "", nil, err + } + cleanup := func() { _ = os.RemoveAll(tempRoot) } + repoCopyDir := filepath.Join(tempRoot, spec.RepoName) + cloneArgs := []string{"clone", "--no-checkout", "--depth", fmt.Sprintf("%d", ShallowFetchDepth)} + if strings.TrimSpace(spec.CurrentBranch) != "" { + cloneArgs = append(cloneArgs, "--single-branch", "--branch", spec.CurrentBranch) + } + cloneArgs = append(cloneArgs, GitFileURL(spec.RepoRoot), repoCopyDir) + if err := i.RunHostCommand(ctx, "git", cloneArgs...); err != nil { + cleanup() + return "", nil, fmt.Errorf("clone shallow workspace repo copy: %w", err) + } + checkoutCommit := spec.HeadCommit + if strings.TrimSpace(spec.BranchName) != "" { + checkoutCommit = spec.BaseCommit + } + if err := i.RunHostCommand(ctx, "git", "-C", repoCopyDir, "cat-file", "-e", checkoutCommit+"^{commit}"); err != nil { + if err := i.RunHostCommand(ctx, 
"git", "-C", repoCopyDir, "fetch", "--depth", fmt.Sprintf("%d", ShallowFetchDepth), GitFileURL(spec.RepoRoot), checkoutCommit); err != nil { + cleanup() + return "", nil, fmt.Errorf("fetch shallow workspace repo commit %s: %w", checkoutCommit, err) + } + } + if strings.TrimSpace(spec.OriginURL) != "" { + if err := i.RunHostCommand(ctx, "git", "-C", repoCopyDir, "remote", "set-url", "origin", spec.OriginURL); err != nil { + cleanup() + return "", nil, fmt.Errorf("set workspace origin remote: %w", err) + } + } else { + if err := i.RunHostCommand(ctx, "git", "-C", repoCopyDir, "remote", "remove", "origin"); err != nil { + cleanup() + return "", nil, fmt.Errorf("remove workspace placeholder origin remote: %w", err) + } + } + return repoCopyDir, cleanup, nil +} + +// ResolveSourcePath expands rawPath to an absolute path and verifies it is +// an existing directory. +func ResolveSourcePath(rawPath string) (string, error) { + if strings.TrimSpace(rawPath) == "" { + return "", errors.New("workspace source path is required") + } + absPath, err := filepath.Abs(rawPath) + if err != nil { + return "", err + } + info, err := os.Stat(absPath) + if err != nil { + return "", err + } + if !info.IsDir() { + return "", fmt.Errorf("%s is not a directory", absPath) + } + return absPath, nil +} + +// ListSubmodules returns the gitlink paths in repoRoot (mode 160000 entries). 
+func (i *Inspector) ListSubmodules(ctx context.Context, repoRoot string) ([]string, error) { + output, err := i.GitOutput(ctx, repoRoot, "ls-files", "--stage", "-z") + if err != nil { + return nil, fmt.Errorf("inspect workspace git index for %s: %w", repoRoot, err) + } + var submodules []string + for _, record := range ParseNullSeparatedOutput(output) { + if !strings.HasPrefix(record, "160000 ") { + continue + } + _, path, ok := strings.Cut(record, "\t") + if !ok { + continue + } + submodules = append(submodules, strings.TrimSpace(path)) + } + sort.Strings(submodules) + return submodules, nil +} + +// ListOverlayPaths returns tracked files in repoRoot, plus (when +// includeUntracked is true) untracked non-ignored files. Missing +// tracked entries (deleted working-tree files) are skipped in both +// modes. +// +// The default is tracked-only because "untracked + not gitignored" +// silently catches local credentials, .env files, scratch notes, and +// other secrets that live in the working tree but aren't meant to +// leave the developer's machine. Callers that genuinely want the +// fuller set (scratch repos, vendored binaries the user is iterating +// on) opt in explicitly. 
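The two ls-files passes described above are easy to confirm by hand; a temp-repo sketch (filenames illustrative, `.env` standing in for the kind of secret the tracked-only default guards against):

```shell
#!/usr/bin/env bash
set -euo pipefail
repo="$(mktemp -d)"
git -C "$repo" init -q
printf 'scratch.log\n' > "$repo/.gitignore"
printf 'hello\n' > "$repo/README.md"   # tracked after the add below
printf 'TOKEN=abc\n' > "$repo/.env"    # untracked, NOT ignored
printf 'noise\n' > "$repo/scratch.log" # untracked AND ignored
git -C "$repo" add .gitignore README.md

echo "tracked (default overlay set):"
git -C "$repo" ls-files -z | tr '\0' '\n'
echo "untracked, not ignored (added only with includeUntracked):"
git -C "$repo" ls-files --others --exclude-standard -z | tr '\0' '\n'
```

`scratch.log` never appears in either pass — `--exclude-standard` applies .gitignore — while `.env` appears only in the opt-in second pass, which is exactly the leak the default avoids.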
+func (i *Inspector) ListOverlayPaths(ctx context.Context, repoRoot string, includeUntracked bool) ([]string, error) { + trackedOutput, err := i.GitOutput(ctx, repoRoot, "ls-files", "-z") + if err != nil { + return nil, fmt.Errorf("list tracked files for %s: %w", repoRoot, err) + } + paths := make([]string, 0) + seen := make(map[string]struct{}) + for _, relPath := range ParseNullSeparatedOutput(trackedOutput) { + if relPath == "" { + continue + } + if _, err := os.Lstat(filepath.Join(repoRoot, relPath)); err != nil { + if os.IsNotExist(err) { + continue + } + return nil, err + } + seen[relPath] = struct{}{} + paths = append(paths, relPath) + } + if includeUntracked { + untrackedOutput, err := i.GitOutput(ctx, repoRoot, "ls-files", "--others", "--exclude-standard", "-z") + if err != nil { + return nil, fmt.Errorf("list untracked files for %s: %w", repoRoot, err) + } + for _, relPath := range ParseNullSeparatedOutput(untrackedOutput) { + if relPath == "" { + continue + } + if _, ok := seen[relPath]; ok { + continue + } + seen[relPath] = struct{}{} + paths = append(paths, relPath) + } + } + sort.Strings(paths) + return paths, nil +} + +// CountUntrackedPaths returns the number of untracked non-ignored +// files in repoRoot. Used by the CLI to warn the user when they are +// about to ship a workspace that has local-but-unignored scratch +// files which, under the default, will be skipped. +func (i *Inspector) CountUntrackedPaths(ctx context.Context, repoRoot string) (int, error) { + untrackedOutput, err := i.GitOutput(ctx, repoRoot, "ls-files", "--others", "--exclude-standard", "-z") + if err != nil { + return 0, fmt.Errorf("list untracked files for %s: %w", repoRoot, err) + } + count := 0 + for _, relPath := range ParseNullSeparatedOutput(untrackedOutput) { + if relPath != "" { + count++ + } + } + return count, nil +} + +// ParsePrepareMode validates and canonicalises a user-supplied mode value. 
+func ParsePrepareMode(raw string) (model.WorkspacePrepareMode, error) { + switch strings.TrimSpace(raw) { + case "", string(model.WorkspacePrepareModeShallowOverlay): + return model.WorkspacePrepareModeShallowOverlay, nil + case string(model.WorkspacePrepareModeFullCopy): + return model.WorkspacePrepareModeFullCopy, nil + case string(model.WorkspacePrepareModeMetadataOnly): + return model.WorkspacePrepareModeMetadataOnly, nil + default: + return "", fmt.Errorf("unsupported workspace mode %q", raw) + } +} + +// GitOutput runs `git [-C dir] args...` and returns its raw stdout. +func (i *Inspector) GitOutput(ctx context.Context, dir string, args ...string) ([]byte, error) { + fullArgs := make([]string, 0, len(args)+2) + if strings.TrimSpace(dir) != "" { + fullArgs = append(fullArgs, "-C", dir) + } + fullArgs = append(fullArgs, args...) + return i.Runner(ctx, "git", fullArgs...) +} + +// GitTrimmedOutput returns GitOutput with surrounding whitespace trimmed. +func (i *Inspector) GitTrimmedOutput(ctx context.Context, dir string, args ...string) (string, error) { + output, err := i.GitOutput(ctx, dir, args...) + if err != nil { + return "", err + } + return strings.TrimSpace(string(output)), nil +} + +// GitResolvedConfigValue reads git config key with --default "" --get. +func (i *Inspector) GitResolvedConfigValue(ctx context.Context, dir, key string) (string, error) { + return i.GitTrimmedOutput(ctx, dir, "config", "--default", "", "--get", key) +} + +// ParseNullSeparatedOutput splits on NULs and trims, returning non-empty +// values in order. +func ParseNullSeparatedOutput(output []byte) []string { + chunks := bytes.Split(output, []byte{0}) + values := make([]string, 0, len(chunks)) + for _, chunk := range chunks { + value := strings.TrimSpace(string(chunk)) + if value == "" { + continue + } + values = append(values, value) + } + return values +} + +// RunHostCommand runs a host command via the Inspector's Runner, +// discarding its stdout. 
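PrepareRepoCopy clones through GitFileURL rather than handing git the bare directory path because git treats a plain local path as a "local clone" and ignores `--depth` (it hardlinks/copies the whole object store); the `file://` form forces the normal transport so shallowness applies. A sketch of the URL construction (helper lower-cased to keep the snippet standalone):

```go
package main

import (
	"fmt"
	"net/url"
	"path/filepath"
)

// gitFileURL mirrors GitFileURL: build a file:// URL so git uses the
// regular fetch transport for a local directory, which is what makes
// --depth-limited clones actually shallow.
func gitFileURL(path string) string {
	return (&url.URL{Scheme: "file", Path: filepath.ToSlash(path)}).String()
}

func main() {
	fmt.Println(gitFileURL("/home/user/repo"))
}
```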
+func (i *Inspector) RunHostCommand(ctx context.Context, name string, args ...string) error { + _, err := i.Runner(ctx, name, args...) + return err +} + +// GitFileURL returns a file:// URL for path, the form git requires when +// cloning from a local directory. +func GitFileURL(path string) string { + return (&url.URL{Scheme: "file", Path: filepath.ToSlash(path)}).String() +} diff --git a/internal/daemon/workspace/workspace_test.go b/internal/daemon/workspace/workspace_test.go new file mode 100644 index 0000000..6b33205 --- /dev/null +++ b/internal/daemon/workspace/workspace_test.go @@ -0,0 +1,102 @@ +package workspace + +import ( + "context" + "os" + "os/exec" + "path/filepath" + "slices" + "testing" +) + +// seedRepo creates a tiny git repo with one tracked file, one +// gitignored file, and one untracked-non-ignored file. Returns the +// repo root path. Skips the test if git isn't on PATH (unusual for +// a dev machine, but polite). +func seedRepo(t *testing.T) string { + t.Helper() + if _, err := exec.LookPath("git"); err != nil { + t.Skipf("git not on PATH: %v", err) + } + dir := t.TempDir() + run := func(args ...string) { + t.Helper() + cmd := exec.Command(args[0], args[1:]...) + cmd.Dir = dir + // Isolate from the ambient user config so commits don't need + // a global user.name/user.email. Also disable GPG signing. 
+ cmd.Env = append(os.Environ(), + "GIT_AUTHOR_NAME=t", "GIT_AUTHOR_EMAIL=t@t", + "GIT_COMMITTER_NAME=t", "GIT_COMMITTER_EMAIL=t@t", + "GIT_CONFIG_GLOBAL=/dev/null", + ) + if out, err := cmd.CombinedOutput(); err != nil { + t.Fatalf("%v: %v\n%s", args, err, out) + } + } + writeFile := func(relPath, content string) { + t.Helper() + if err := os.WriteFile(filepath.Join(dir, relPath), []byte(content), 0o644); err != nil { + t.Fatal(err) + } + } + run("git", "init", "-q", "-b", "main") + run("git", "config", "commit.gpgsign", "false") + writeFile(".gitignore", "ignored.log\n") + writeFile("README.md", "hello\n") + run("git", "add", ".gitignore", "README.md") + run("git", "commit", "-q", "-m", "init") + // A tracked file AFTER the first commit so ls-files picks it up. + // A gitignored file so --exclude-standard filters it. + // An untracked non-ignored file so the flag matters. + writeFile("src.go", "package main\n") + run("git", "add", "src.go") + run("git", "commit", "-q", "-m", "src") + writeFile("ignored.log", "noisy\n") + writeFile("SECRETS.env", "TOKEN=abc\n") + return dir +} + +func TestListOverlayPaths_TrackedOnlyByDefault(t *testing.T) { + repo := seedRepo(t) + i := NewInspector() + got, err := i.ListOverlayPaths(context.Background(), repo, false) + if err != nil { + t.Fatalf("ListOverlayPaths: %v", err) + } + want := []string{".gitignore", "README.md", "src.go"} + if !slices.Equal(got, want) { + t.Fatalf("got %v, want %v (untracked SECRETS.env must be excluded; gitignored ignored.log must always be excluded)", got, want) + } +} + +func TestListOverlayPaths_IncludeUntracked(t *testing.T) { + repo := seedRepo(t) + i := NewInspector() + got, err := i.ListOverlayPaths(context.Background(), repo, true) + if err != nil { + t.Fatalf("ListOverlayPaths: %v", err) + } + want := []string{".gitignore", "README.md", "SECRETS.env", "src.go"} + if !slices.Equal(got, want) { + t.Fatalf("got %v, want %v", got, want) + } + // gitignored files must stay out even when untracked 
is included. + for _, p := range got { + if p == "ignored.log" { + t.Fatalf("gitignored file leaked into overlay: %v", got) + } + } +} + +func TestCountUntrackedPaths(t *testing.T) { + repo := seedRepo(t) + i := NewInspector() + count, err := i.CountUntrackedPaths(context.Background(), repo) + if err != nil { + t.Fatalf("CountUntrackedPaths: %v", err) + } + if count != 1 { + t.Fatalf("count = %d, want 1 (only SECRETS.env; ignored.log is gitignored)", count) + } +} diff --git a/internal/daemon/workspace_rejection_test.go b/internal/daemon/workspace_rejection_test.go new file mode 100644 index 0000000..ca27638 --- /dev/null +++ b/internal/daemon/workspace_rejection_test.go @@ -0,0 +1,87 @@ +package daemon + +import ( + "context" + "io" + "log/slog" + "path/filepath" + "strings" + "testing" + + "banger/internal/api" + "banger/internal/model" +) + +// newWorkspaceRejectionDaemon returns a running-VM + wired daemon +// suitable for the PrepareVMWorkspace rejection tests. No real guest +// state — rejection paths return before any SSH I/O, so the fake +// firecracker infra the happy-path tests need is unnecessary here. +func newWorkspaceRejectionDaemon(t *testing.T) (*Daemon, model.VMRecord) { + t.Helper() + vm := testVM("rejectbox", "image-reject", "172.16.0.211") + vm.State = model.VMStateRunning + vm.Runtime.State = model.VMStateRunning + + d := &Daemon{ + store: openDaemonStore(t), + config: model.DaemonConfig{SSHKeyPath: filepath.Join(t.TempDir(), "id_ed25519")}, + logger: slog.New(slog.NewTextHandler(io.Discard, nil)), + } + wireServices(d) + upsertDaemonVM(t, context.Background(), d.store, vm) + // Handle cache entry with a live-looking PID so vmAlive returns + // true for the "VM is running" path; the rejection tests that want + // the not-running branch clear this override explicitly. 
+ d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: 1}) // init is always alive + return d, vm +} + +func TestPrepareVMWorkspace_RejectsMalformedMode(t *testing.T) { + d, vm := newWorkspaceRejectionDaemon(t) + _, err := d.ws.PrepareVMWorkspace(context.Background(), api.VMWorkspacePrepareParams{ + IDOrName: vm.Name, + SourcePath: "/tmp/fake", + Mode: "bogus_mode", + }) + if err == nil || !strings.Contains(err.Error(), "unsupported workspace mode") { + t.Fatalf("err = %v, want unsupported-mode rejection", err) + } +} + +func TestPrepareVMWorkspace_RejectsFromWithoutBranch(t *testing.T) { + d, vm := newWorkspaceRejectionDaemon(t) + _, err := d.ws.PrepareVMWorkspace(context.Background(), api.VMWorkspacePrepareParams{ + IDOrName: vm.Name, + SourcePath: "/tmp/fake", + From: "HEAD", + // Branch deliberately left empty. + }) + if err == nil || !strings.Contains(err.Error(), "workspace from requires branch") { + t.Fatalf("err = %v, want from-without-branch rejection", err) + } +} + +func TestPrepareVMWorkspace_RejectsNotRunningVM(t *testing.T) { + d, vm := newWorkspaceRejectionDaemon(t) + // Clear handles so vmAlive returns false — simulates a VM that's + // been stopped or never booted. 
+ d.vm.clearVMHandles(vm) + _, err := d.ws.PrepareVMWorkspace(context.Background(), api.VMWorkspacePrepareParams{ + IDOrName: vm.Name, + SourcePath: "/tmp/fake", + }) + if err == nil || !strings.Contains(err.Error(), "is not running") { + t.Fatalf("err = %v, want not-running rejection", err) + } +} + +func TestPrepareVMWorkspace_RejectsUnknownVM(t *testing.T) { + d, _ := newWorkspaceRejectionDaemon(t) + _, err := d.ws.PrepareVMWorkspace(context.Background(), api.VMWorkspacePrepareParams{ + IDOrName: "ghost-vm", + SourcePath: "/tmp/fake", + }) + if err == nil || !strings.Contains(err.Error(), "not found") { + t.Fatalf("err = %v, want VM-not-found rejection", err) + } +} diff --git a/internal/daemon/workspace_service.go b/internal/daemon/workspace_service.go new file mode 100644 index 0000000..864c293 --- /dev/null +++ b/internal/daemon/workspace_service.go @@ -0,0 +1,94 @@ +package daemon + +import ( + "context" + "log/slog" + "time" + + ws "banger/internal/daemon/workspace" + "banger/internal/model" + "banger/internal/paths" + "banger/internal/store" + "banger/internal/system" +) + +// WorkspaceService owns workspace.prepare / workspace.export plus the +// ssh-key + git-identity sync that runs as part of VM start's +// prepare_work_disk capability hook. The workspaceLocks set lives here +// so its scope (serialise concurrent tar imports on the same VM) is +// obvious at the field definition. +// +// The inspect/import test seams are per-service fields so tests inject +// fakes without mutating package-level state. +type WorkspaceService struct { + runner system.CommandRunner + logger *slog.Logger + config model.DaemonConfig + layout paths.Layout + store *store.Store + + // workspaceLocks serialises concurrent workspace.prepare / + // workspace.export on the same VM. Separate from vmLocks so slow + // guest I/O doesn't block lifecycle ops. + workspaceLocks vmLockSet + + // Peer-service access via narrow function-typed dependencies. 
+ // WorkspaceService doesn't hold pointers to the full VMService or + // HostNetwork; it only sees the exact operations it needs. + vmResolver func(ctx context.Context, idOrName string) (model.VMRecord, error) + aliveChecker func(vm model.VMRecord) bool + waitGuestSSH func(ctx context.Context, address string, interval time.Duration) error + dialGuest func(ctx context.Context, address string) (guestSSHClient, error) + imageResolver func(ctx context.Context, idOrName string) (model.Image, error) + imageWorkSeed func(ctx context.Context, image model.Image, fingerprint string) error + withVMLockByRef func(ctx context.Context, idOrName string, fn func(model.VMRecord) (model.VMRecord, error)) (model.VMRecord, error) + + beginOperation func(ctx context.Context, name string, attrs ...any) *operationLog + + // repoInspector is the Inspector used by the real InspectRepo / + // ImportRepoToGuest fallbacks when the test seams below aren't + // set. wireServices installs the production one; tests that want + // to intercept only the host-command surface (not the whole + // inspect/import hook) can assign a stub-runner Inspector here. + repoInspector *ws.Inspector + + // Test seams. 
+ workspaceInspectRepo func(ctx context.Context, sourcePath, branchName, fromRef string, includeUntracked bool) (ws.RepoSpec, error) + workspaceImport func(ctx context.Context, client ws.GuestClient, spec ws.RepoSpec, guestPath string, mode model.WorkspacePrepareMode) error +} + +type workspaceServiceDeps struct { + runner system.CommandRunner + logger *slog.Logger + config model.DaemonConfig + layout paths.Layout + store *store.Store + repoInspector *ws.Inspector + vmResolver func(ctx context.Context, idOrName string) (model.VMRecord, error) + aliveChecker func(vm model.VMRecord) bool + waitGuestSSH func(ctx context.Context, address string, interval time.Duration) error + dialGuest func(ctx context.Context, address string) (guestSSHClient, error) + imageResolver func(ctx context.Context, idOrName string) (model.Image, error) + imageWorkSeed func(ctx context.Context, image model.Image, fingerprint string) error + withVMLockByRef func(ctx context.Context, idOrName string, fn func(model.VMRecord) (model.VMRecord, error)) (model.VMRecord, error) + beginOperation func(ctx context.Context, name string, attrs ...any) *operationLog +} + +func newWorkspaceService(deps workspaceServiceDeps) *WorkspaceService { + return &WorkspaceService{ + runner: deps.runner, + logger: deps.logger, + config: deps.config, + layout: deps.layout, + store: deps.store, + repoInspector: deps.repoInspector, + vmResolver: deps.vmResolver, + aliveChecker: deps.aliveChecker, + waitGuestSSH: deps.waitGuestSSH, + dialGuest: deps.dialGuest, + imageResolver: deps.imageResolver, + imageWorkSeed: deps.imageWorkSeed, + withVMLockByRef: deps.withVMLockByRef, + beginOperation: deps.beginOperation, + } +} diff --git a/internal/daemon/workspace_test.go b/internal/daemon/workspace_test.go new file mode 100644 index 0000000..2f19996 --- /dev/null +++ b/internal/daemon/workspace_test.go @@ -0,0 +1,686 @@ +package daemon + +import ( + "context" + "io" + "log/slog" + "os" + "path/filepath" + "strings" + 
"sync/atomic" + "testing" + "time" + + "banger/internal/api" + "banger/internal/daemon/workspace" + "banger/internal/model" +) + +// exportGuestClient is a scriptable fake for RunScriptOutput used in export tests. +// Each call to RunScriptOutput returns the next response from the queue. +type exportGuestClient struct { + responses []exportGuestResponse + scripts []string + callIndex int + runScriptLog []string +} + +type exportGuestResponse struct { + output []byte + err error +} + +func (e *exportGuestClient) Close() error { return nil } + +func (e *exportGuestClient) RunScript(_ context.Context, script string, _ io.Writer) error { + e.runScriptLog = append(e.runScriptLog, script) + return nil +} + +func (e *exportGuestClient) RunScriptOutput(_ context.Context, script string) ([]byte, error) { + e.scripts = append(e.scripts, script) + if e.callIndex >= len(e.responses) { + return nil, nil + } + r := e.responses[e.callIndex] + e.callIndex++ + return r.output, r.err +} + +func (e *exportGuestClient) UploadFile(_ context.Context, _ string, _ os.FileMode, _ []byte, _ io.Writer) error { + return nil +} + +func (e *exportGuestClient) StreamTar(_ context.Context, _ string, _ string, _ io.Writer) error { + return nil +} + +func (e *exportGuestClient) StreamTarEntries(_ context.Context, _ string, _ []string, _ string, _ io.Writer) error { + return nil +} + +func newExportTestDaemonStore(t *testing.T, fake *exportGuestClient) *Daemon { + t.Helper() + db := openDaemonStore(t) + d := &Daemon{ + store: db, + config: model.DaemonConfig{SSHKeyPath: filepath.Join(t.TempDir(), "id_ed25519")}, + logger: slog.New(slog.NewTextHandler(io.Discard, nil)), + } + wireServices(d) + d.guestDial = func(_ context.Context, _ string, _ string) (guestSSHClient, error) { + return fake, nil + } + return d +} + +func TestExportVMWorkspace_HappyPath(t *testing.T) { + t.Parallel() + ctx := context.Background() + + apiSock := filepath.Join(t.TempDir(), "fc.sock") + firecracker := 
startFakeFirecracker(t, apiSock) + + vm := testVM("exportbox", "image-export", "172.16.0.100") + vm.State = model.VMStateRunning + vm.Runtime.State = model.VMStateRunning + vm.Runtime.APISockPath = apiSock + + patch := []byte("diff --git a/file.go b/file.go\nindex 0000000..1111111 100644\n") + names := []byte("file.go\n") + + fake := &exportGuestClient{ + responses: []exportGuestResponse{ + {output: patch}, + {output: names}, + }, + } + d := newExportTestDaemonStore(t, fake) + upsertDaemonVM(t, ctx, d.store, vm) + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: firecracker.Process.Pid}) + + result, err := d.ws.ExportVMWorkspace(ctx, api.WorkspaceExportParams{ + IDOrName: vm.Name, + GuestPath: "/root/repo", + }) + if err != nil { + t.Fatalf("ExportVMWorkspace: %v", err) + } + if !result.HasChanges { + t.Fatal("HasChanges = false, want true") + } + if string(result.Patch) != string(patch) { + t.Fatalf("Patch = %q, want %q", result.Patch, patch) + } + if result.GuestPath != "/root/repo" { + t.Fatalf("GuestPath = %q, want /root/repo", result.GuestPath) + } + if len(result.ChangedFiles) != 1 || result.ChangedFiles[0] != "file.go" { + t.Fatalf("ChangedFiles = %v, want [file.go]", result.ChangedFiles) + } + if fake.callIndex != 2 { + t.Fatalf("RunScriptOutput call count = %d, want 2", fake.callIndex) + } + // No base_commit provided: diff ref must be HEAD. 
+ for _, script := range fake.scripts { + if !strings.Contains(script, "HEAD") { + t.Fatalf("script missing HEAD ref: %q", script) + } + } + if result.BaseCommit != "HEAD" { + t.Fatalf("BaseCommit = %q, want HEAD", result.BaseCommit) + } +} + +func TestExportVMWorkspace_WithBaseCommit(t *testing.T) { + t.Parallel() + ctx := context.Background() + + apiSock := filepath.Join(t.TempDir(), "fc.sock") + firecracker := startFakeFirecracker(t, apiSock) + + vm := testVM("exportbox-base", "image-export", "172.16.0.105") + vm.State = model.VMStateRunning + vm.Runtime.State = model.VMStateRunning + vm.Runtime.APISockPath = apiSock + + // Simulate: worker committed inside the VM. Without base_commit the diff + // against the new HEAD would be empty. With base_commit we capture + // everything since the original checkout. + patch := []byte("diff --git a/worker.go b/worker.go\nindex 0000000..abcdef 100644\n") + names := []byte("worker.go\n") + + fake := &exportGuestClient{ + responses: []exportGuestResponse{ + {output: patch}, + {output: names}, + }, + } + d := newExportTestDaemonStore(t, fake) + upsertDaemonVM(t, ctx, d.store, vm) + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: firecracker.Process.Pid}) + + const prepareCommit = "abc1234deadbeef" + result, err := d.ws.ExportVMWorkspace(ctx, api.WorkspaceExportParams{ + IDOrName: vm.Name, + BaseCommit: prepareCommit, + }) + if err != nil { + t.Fatalf("ExportVMWorkspace: %v", err) + } + if !result.HasChanges { + t.Fatal("HasChanges = false, want true") + } + if result.BaseCommit != prepareCommit { + t.Fatalf("BaseCommit = %q, want %q", result.BaseCommit, prepareCommit) + } + // Both scripts must reference the caller-supplied commit, not HEAD. 
+ for _, script := range fake.scripts { + if strings.Contains(script, " HEAD") { + t.Fatalf("script used HEAD instead of base_commit: %q", script) + } + if !strings.Contains(script, prepareCommit) { + t.Fatalf("script missing base_commit %q: %q", prepareCommit, script) + } + } +} + +func TestExportVMWorkspace_BaseCommitFallsBackToHEAD(t *testing.T) { + t.Parallel() + ctx := context.Background() + + apiSock := filepath.Join(t.TempDir(), "fc.sock") + firecracker := startFakeFirecracker(t, apiSock) + + vm := testVM("exportbox-nobase", "image-export", "172.16.0.106") + vm.State = model.VMStateRunning + vm.Runtime.State = model.VMStateRunning + vm.Runtime.APISockPath = apiSock + + fake := &exportGuestClient{ + responses: []exportGuestResponse{ + {output: nil}, + {output: nil}, + }, + } + d := newExportTestDaemonStore(t, fake) + upsertDaemonVM(t, ctx, d.store, vm) + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: firecracker.Process.Pid}) + + result, err := d.ws.ExportVMWorkspace(ctx, api.WorkspaceExportParams{ + IDOrName: vm.Name, + BaseCommit: "", // omitted + }) + if err != nil { + t.Fatalf("ExportVMWorkspace: %v", err) + } + if result.BaseCommit != "HEAD" { + t.Fatalf("BaseCommit = %q, want HEAD when not supplied", result.BaseCommit) + } + for _, script := range fake.scripts { + if !strings.Contains(script, "HEAD") { + t.Fatalf("script missing HEAD fallback: %q", script) + } + } +} + +func TestExportVMWorkspace_NoChanges(t *testing.T) { + t.Parallel() + ctx := context.Background() + + apiSock := filepath.Join(t.TempDir(), "fc.sock") + firecracker := startFakeFirecracker(t, apiSock) + + vm := testVM("exportbox-empty", "image-export", "172.16.0.101") + vm.State = model.VMStateRunning + vm.Runtime.State = model.VMStateRunning + vm.Runtime.APISockPath = apiSock + + // Both scripts return empty output (no changes). 
+ fake := &exportGuestClient{ + responses: []exportGuestResponse{ + {output: nil}, + {output: nil}, + }, + } + d := newExportTestDaemonStore(t, fake) + upsertDaemonVM(t, ctx, d.store, vm) + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: firecracker.Process.Pid}) + + result, err := d.ws.ExportVMWorkspace(ctx, api.WorkspaceExportParams{ + IDOrName: vm.Name, + }) + if err != nil { + t.Fatalf("ExportVMWorkspace: %v", err) + } + if result.HasChanges { + t.Fatal("HasChanges = true, want false") + } + if len(result.Patch) != 0 { + t.Fatalf("Patch = %q, want empty", result.Patch) + } + if len(result.ChangedFiles) != 0 { + t.Fatalf("ChangedFiles = %v, want empty", result.ChangedFiles) + } +} + +func TestExportVMWorkspace_DefaultGuestPath(t *testing.T) { + t.Parallel() + ctx := context.Background() + + apiSock := filepath.Join(t.TempDir(), "fc.sock") + firecracker := startFakeFirecracker(t, apiSock) + + vm := testVM("exportbox-default", "image-export", "172.16.0.102") + vm.State = model.VMStateRunning + vm.Runtime.State = model.VMStateRunning + vm.Runtime.APISockPath = apiSock + + fake := &exportGuestClient{ + responses: []exportGuestResponse{ + {output: nil}, + {output: nil}, + }, + } + d := newExportTestDaemonStore(t, fake) + upsertDaemonVM(t, ctx, d.store, vm) + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: firecracker.Process.Pid}) + + // GuestPath omitted — should default to /root/repo. 
+ result, err := d.ws.ExportVMWorkspace(ctx, api.WorkspaceExportParams{ + IDOrName: vm.Name, + }) + if err != nil { + t.Fatalf("ExportVMWorkspace: %v", err) + } + if result.GuestPath != "/root/repo" { + t.Fatalf("GuestPath = %q, want /root/repo", result.GuestPath) + } +} + +func TestExportVMWorkspace_VMNotRunning(t *testing.T) { + t.Parallel() + ctx := context.Background() + + vm := testVM("exportbox-stopped", "image-export", "172.16.0.103") + vm.State = model.VMStateStopped + + fake := &exportGuestClient{} + d := newExportTestDaemonStore(t, fake) + upsertDaemonVM(t, ctx, d.store, vm) + // VM is stopped — no handle seed; vmAlive must return false. + + _, err := d.ws.ExportVMWorkspace(ctx, api.WorkspaceExportParams{ + IDOrName: vm.Name, + }) + if err == nil || !strings.Contains(err.Error(), "not running") { + t.Fatalf("error = %v, want 'not running' error", err) + } + if fake.callIndex != 0 { + t.Fatal("RunScriptOutput should not be called when VM is not running") + } +} + +func TestExportVMWorkspace_MultipleChangedFiles(t *testing.T) { + t.Parallel() + ctx := context.Background() + + apiSock := filepath.Join(t.TempDir(), "fc.sock") + firecracker := startFakeFirecracker(t, apiSock) + + vm := testVM("exportbox-multi", "image-export", "172.16.0.104") + vm.State = model.VMStateRunning + vm.Runtime.State = model.VMStateRunning + vm.Runtime.APISockPath = apiSock + + patch := []byte("diff --git a/a.go b/a.go\n--- a/a.go\n+++ b/a.go\n") + names := []byte("a.go\nb.go\nnew/file.go\n") + + fake := &exportGuestClient{ + responses: []exportGuestResponse{ + {output: patch}, + {output: names}, + }, + } + d := newExportTestDaemonStore(t, fake) + upsertDaemonVM(t, ctx, d.store, vm) + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: firecracker.Process.Pid}) + + result, err := d.ws.ExportVMWorkspace(ctx, api.WorkspaceExportParams{ + IDOrName: vm.Name, + }) + if err != nil { + t.Fatalf("ExportVMWorkspace: %v", err) + } + if len(result.ChangedFiles) != 3 { + 
t.Fatalf("ChangedFiles = %v, want 3 entries", result.ChangedFiles) + } + want := []string{"a.go", "b.go", "new/file.go"} + for i, f := range want { + if result.ChangedFiles[i] != f { + t.Fatalf("ChangedFiles[%d] = %q, want %q", i, result.ChangedFiles[i], f) + } + } +} + +// TestPrepareVMWorkspace_ReleasesVMLockDuringGuestIO is a regression +// guard for an earlier design that held the per-VM mutex across SSH +// dial, tar streaming, and remote chmod. A long import could then +// block unrelated lifecycle ops (vm stop / delete / restart) on the +// same VM until it completed. The fix switched to a dedicated +// workspaceLocks set for I/O, with vmLocks held only for the brief +// state-validation phase. This test kicks off a prepare that blocks +// inside the import step and then asserts the VM mutex is acquirable +// while the prepare is mid-flight. +func TestPrepareVMWorkspace_ReleasesVMLockDuringGuestIO(t *testing.T) { + t.Parallel() + ctx := context.Background() + + apiSock := filepath.Join(t.TempDir(), "fc.sock") + firecracker := startFakeFirecracker(t, apiSock) + + vm := testVM("lockbox", "image-x", "172.16.0.210") + vm.State = model.VMStateRunning + vm.Runtime.State = model.VMStateRunning + vm.Runtime.APISockPath = apiSock + + d := &Daemon{ + store: openDaemonStore(t), + config: model.DaemonConfig{SSHKeyPath: filepath.Join(t.TempDir(), "id_ed25519")}, + logger: slog.New(slog.NewTextHandler(io.Discard, nil)), + } + wireServices(d) + d.guestWaitForSSH = func(_ context.Context, _, _ string, _ time.Duration) error { return nil } + d.guestDial = func(_ context.Context, _, _ string) (guestSSHClient, error) { + return &exportGuestClient{}, nil + } + upsertDaemonVM(t, ctx, d.store, vm) + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: firecracker.Process.Pid}) + + // Install the workspace seams on this daemon instance. InspectRepo + // returns a trivial spec so the real filesystem isn't touched; + // Import blocks until we say go. 
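The split-lock design this regression test guards can be sketched on its own. The `service` type below is hypothetical (the daemon's real `vmLocks`/`workspaceLocks` are per-VM sets, not single mutexes); it only shows why validation happens under the lifecycle lock while slow guest I/O runs under a separate one:

```go
package main

import (
	"fmt"
	"sync"
)

type service struct {
	vmLock        sync.Mutex // guards VM lifecycle state (brief holds only)
	workspaceLock sync.Mutex // serialises slow guest I/O only
}

// prepare validates under the lifecycle lock, releases it, and only
// then performs the slow import, so stop/delete/restart never queue
// behind a long tar stream.
func (s *service) prepare(importFn func()) {
	s.vmLock.Lock()
	// ...validate VM state here (cheap, in-memory)...
	s.vmLock.Unlock()

	s.workspaceLock.Lock() // only other workspace ops wait here
	importFn()             // slow guest I/O
	s.workspaceLock.Unlock()
}

func main() {
	var s service
	started := make(chan struct{})
	release := make(chan struct{})
	go s.prepare(func() { close(started); <-release })
	<-started       // import is mid-flight...
	s.vmLock.Lock() // ...yet the lifecycle lock is immediately free
	s.vmLock.Unlock()
	close(release)
	fmt.Println("lifecycle lock acquired during guest I/O")
}
```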
+ importStarted := make(chan struct{}) + releaseImport := make(chan struct{}) + d.ws.workspaceInspectRepo = func(context.Context, string, string, string, bool) (workspace.RepoSpec, error) { + return workspace.RepoSpec{RepoName: "fake", RepoRoot: "/tmp/fake"}, nil + } + d.ws.workspaceImport = func(context.Context, workspace.GuestClient, workspace.RepoSpec, string, model.WorkspacePrepareMode) error { + close(importStarted) + <-releaseImport + return nil + } + + // Kick off prepare in a goroutine. It will block inside the import. + prepareDone := make(chan error, 1) + go func() { + _, err := d.ws.PrepareVMWorkspace(ctx, api.VMWorkspacePrepareParams{ + IDOrName: vm.Name, + SourcePath: "/tmp/fake", + }) + prepareDone <- err + }() + + // Wait for prepare to reach the guest-I/O phase (past the VM + // mutex) before testing the assertion. + select { + case <-importStarted: + case <-time.After(2 * time.Second): + t.Fatal("import never started; prepare blocked before reaching guest I/O") + } + + // With the fix in place, the VM mutex is free even though the + // import is in flight. Acquiring it must not wait. + acquired := make(chan struct{}) + go func() { + unlock := d.vm.lockVMID(vm.ID) + close(acquired) + unlock() + }() + select { + case <-acquired: + case <-time.After(500 * time.Millisecond): + close(releaseImport) // unblock the goroutine so the test can exit + <-prepareDone + t.Fatal("VM mutex held during guest I/O — lifecycle ops would block behind workspace prepare") + } + + // Now let the import finish and make sure prepare returns. 
+ close(releaseImport) + select { + case err := <-prepareDone: + if err != nil { + t.Fatalf("prepare returned error: %v", err) + } + case <-time.After(2 * time.Second): + t.Fatal("prepare did not return after import unblocked") + } +} + +// TestPrepareVMWorkspace_SerialisesConcurrentPreparesOnSameVM asserts +// the workspaceLocks scope: two concurrent prepares on the same VM +// serialise via workspaceLocks even though they don't hold the core +// VM mutex, so a lifecycle op (stop/delete) isn't blocked. +func TestPrepareVMWorkspace_SerialisesConcurrentPreparesOnSameVM(t *testing.T) { + t.Parallel() + ctx := context.Background() + + apiSock := filepath.Join(t.TempDir(), "fc.sock") + firecracker := startFakeFirecracker(t, apiSock) + + vm := testVM("serialbox", "image-x", "172.16.0.211") + vm.State = model.VMStateRunning + vm.Runtime.State = model.VMStateRunning + vm.Runtime.APISockPath = apiSock + + d := &Daemon{ + store: openDaemonStore(t), + config: model.DaemonConfig{SSHKeyPath: filepath.Join(t.TempDir(), "id_ed25519")}, + logger: slog.New(slog.NewTextHandler(io.Discard, nil)), + } + wireServices(d) + d.guestWaitForSSH = func(_ context.Context, _, _ string, _ time.Duration) error { return nil } + d.guestDial = func(_ context.Context, _, _ string) (guestSSHClient, error) { + return &exportGuestClient{}, nil + } + upsertDaemonVM(t, ctx, d.store, vm) + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: firecracker.Process.Pid}) + + d.ws.workspaceInspectRepo = func(context.Context, string, string, string, bool) (workspace.RepoSpec, error) { + return workspace.RepoSpec{RepoName: "fake", RepoRoot: "/tmp/fake"}, nil + } + + // Counter of simultaneous Import calls. Should never exceed 1. 
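The high-water-mark bookkeeping below uses a standard compare-and-swap retry loop; isolated, the pattern is (a generic sketch, not code from this repo):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// recordMax raises *max to n if n is larger, retrying when a
// concurrent writer races the CAS. The loop exits once n is no
// longer larger than the stored value or the swap lands.
func recordMax(max *int32, n int32) {
	for {
		prev := atomic.LoadInt32(max)
		if n <= prev || atomic.CompareAndSwapInt32(max, prev, n) {
			return
		}
	}
}

func main() {
	var max int32
	var wg sync.WaitGroup
	for _, n := range []int32{3, 1, 7, 5, 2} {
		wg.Add(1)
		go func(n int32) { defer wg.Done(); recordMax(&max, n) }(n)
	}
	wg.Wait()
	fmt.Println(atomic.LoadInt32(&max)) // always 7
}
```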
+ var active int32 + var maxObserved int32 + release := make(chan struct{}) + d.ws.workspaceImport = func(context.Context, workspace.GuestClient, workspace.RepoSpec, string, model.WorkspacePrepareMode) error { + n := atomic.AddInt32(&active, 1) + for { + prev := atomic.LoadInt32(&maxObserved) + if n <= prev || atomic.CompareAndSwapInt32(&maxObserved, prev, n) { + break + } + } + <-release + atomic.AddInt32(&active, -1) + return nil + } + + const n = 3 + done := make(chan error, n) + for i := 0; i < n; i++ { + go func() { + _, err := d.ws.PrepareVMWorkspace(ctx, api.VMWorkspacePrepareParams{ + IDOrName: vm.Name, + SourcePath: "/tmp/fake", + }) + done <- err + }() + } + + // Give goroutines a moment to queue up. + time.Sleep(100 * time.Millisecond) + + if got := atomic.LoadInt32(&active); got != 1 { + close(release) // unblock to avoid hang + for i := 0; i < n; i++ { + <-done + } + t.Fatalf("%d concurrent imports, want exactly 1 (workspace lock should serialise)", got) + } + + // Drain: release imports one at a time. + for i := 0; i < n; i++ { + release <- struct{}{} + } + close(release) + for i := 0; i < n; i++ { + if err := <-done; err != nil { + t.Errorf("prepare #%d error: %v", i, err) + } + } + if got := atomic.LoadInt32(&maxObserved); got != 1 { + t.Fatalf("peak concurrent imports = %d, want 1", got) + } +} + +// TestExportVMWorkspace_DoesNotMutateRealIndex is a regression guard +// for an earlier design where `git add -A` ran against the guest's +// real `.git/index`, leaving staged changes behind after what the user +// thought was a read-only observation. Every export script must now +// route `git add -A` through a throwaway index selected by +// GIT_INDEX_FILE, and every script must clean that file up. 
+func TestExportVMWorkspace_DoesNotMutateRealIndex(t *testing.T) { + t.Parallel() + ctx := context.Background() + + apiSock := filepath.Join(t.TempDir(), "fc.sock") + firecracker := startFakeFirecracker(t, apiSock) + + vm := testVM("exportbox-noindex-mutation", "image-export", "172.16.0.107") + vm.State = model.VMStateRunning + vm.Runtime.State = model.VMStateRunning + vm.Runtime.APISockPath = apiSock + + fake := &exportGuestClient{ + responses: []exportGuestResponse{ + {output: []byte("diff --git a/x b/x\n")}, + {output: []byte("x\n")}, + }, + } + d := newExportTestDaemonStore(t, fake) + upsertDaemonVM(t, ctx, d.store, vm) + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: firecracker.Process.Pid}) + + if _, err := d.ws.ExportVMWorkspace(ctx, api.WorkspaceExportParams{IDOrName: vm.Name}); err != nil { + t.Fatalf("ExportVMWorkspace: %v", err) + } + + if len(fake.scripts) == 0 { + t.Fatal("expected at least one export script to be sent") + } + for i, script := range fake.scripts { + if !strings.Contains(script, "GIT_INDEX_FILE") { + t.Errorf("script[%d] missing GIT_INDEX_FILE routing:\n%s", i, script) + } + // git add -A must ONLY appear on a line that also sets + // GIT_INDEX_FILE. A bare occurrence would mutate the real + // index. + for _, line := range strings.Split(script, "\n") { + if strings.Contains(line, "git add -A") && !strings.Contains(line, "GIT_INDEX_FILE") { + t.Errorf("script[%d] has unscoped `git add -A`:\n%s", i, script) + break + } + } + if !strings.Contains(script, "git read-tree") { + t.Errorf("script[%d] missing git read-tree (temp index seed):\n%s", i, script) + } + if !strings.Contains(script, "mktemp") { + t.Errorf("script[%d] missing mktemp for temp index:\n%s", i, script) + } + if !strings.Contains(script, "trap") || !strings.Contains(script, "rm") { + t.Errorf("script[%d] missing temp-index cleanup trap:\n%s", i, script) + } + } +} + +// TestMiseTrustScriptShape pins the exact shell run inside the +// guest by miseTrustGuestRepo. 
The two contracts other code paths +// rely on: +// +// 1. The script never fails the prepare — `mise trust` is wrapped +// in `... || true` and gated on `command -v mise`, so a guest +// image without mise simply no-ops. +// 2. The path is shell-quoted via ws.ShellQuote, so a guest_path +// containing spaces, quotes, or other oddballs doesn't break +// out of the argument. +func TestMiseTrustScriptShape(t *testing.T) { + got := miseTrustScript("/root/repo") + for _, want := range []string{ + "command -v mise", + "cd '/root/repo' && mise trust --quiet --all", + "|| true", + } { + if !strings.Contains(got, want) { + t.Errorf("script missing %q:\n%s", want, got) + } + } + + // Path with a single quote in it must come back quoted, not + // truncated. ws.ShellQuote escapes by closing/reopening the + // quoted string around each apostrophe. + exotic := miseTrustScript("/root/it's odd") + if !strings.Contains(exotic, `'/root/it'"'"'s odd'`) { + t.Errorf("path with apostrophe was not shell-quoted safely:\n%s", exotic) + } +} + +// TestPrepareVMWorkspace_RunsMiseTrustAfterImport asserts the auto- +// trust step fires once a successful import lands. Failure-path +// behaviour (no import → no trust) is covered by the existing +// rejection tests. 
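The quoting scheme pinned above is the classic POSIX close-reopen trick; a minimal sketch of the technique follows (ws.ShellQuote's actual implementation may differ):

```go
package main

import (
	"fmt"
	"strings"
)

// shellQuote wraps s in single quotes and escapes each embedded
// apostrophe by closing the quoted string, emitting a double-quoted
// apostrophe, and reopening: ' becomes '"'"'.
func shellQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'"'"'`) + "'"
}

func main() {
	fmt.Println(shellQuote("/root/it's odd")) // → '/root/it'"'"'s odd'
}
```

Single quotes suppress all shell expansion, which is why each literal apostrophe has to step outside the quoted region to be representable at all.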
+func TestPrepareVMWorkspace_RunsMiseTrustAfterImport(t *testing.T) { + t.Parallel() + ctx := context.Background() + + apiSock := filepath.Join(t.TempDir(), "fc.sock") + firecracker := startFakeFirecracker(t, apiSock) + + vm := testVM("trustbox", "image-trust", "172.16.0.211") + vm.State = model.VMStateRunning + vm.Runtime.State = model.VMStateRunning + vm.Runtime.APISockPath = apiSock + + fake := &exportGuestClient{} + d := newExportTestDaemonStore(t, fake) + d.guestWaitForSSH = func(_ context.Context, _, _ string, _ time.Duration) error { return nil } + upsertDaemonVM(t, ctx, d.store, vm) + d.vm.setVMHandlesInMemory(vm.ID, model.VMHandles{PID: firecracker.Process.Pid}) + + d.ws.workspaceInspectRepo = func(context.Context, string, string, string, bool) (workspace.RepoSpec, error) { + return workspace.RepoSpec{RepoName: "x", RepoRoot: "/tmp/x"}, nil + } + d.ws.workspaceImport = func(context.Context, workspace.GuestClient, workspace.RepoSpec, string, model.WorkspacePrepareMode) error { + return nil + } + + if _, err := d.ws.PrepareVMWorkspace(ctx, api.VMWorkspacePrepareParams{ + IDOrName: vm.Name, + SourcePath: "/tmp/x", + GuestPath: "/root/repo", + }); err != nil { + t.Fatalf("PrepareVMWorkspace: %v", err) + } + + var sawTrust bool + for _, script := range fake.runScriptLog { + if strings.Contains(script, "mise trust") { + sawTrust = true + break + } + } + if !sawTrust { + t.Fatalf("expected mise trust script after import; saw %d scripts: %v", len(fake.runScriptLog), fake.runScriptLog) + } +} diff --git a/internal/download/verified.go b/internal/download/verified.go new file mode 100644 index 0000000..7f51743 --- /dev/null +++ b/internal/download/verified.go @@ -0,0 +1,86 @@ +// Package download contains transport-level primitives shared by +// banger's catalog and update flows. Today it exposes one helper +// (FetchVerified). 
When imagecat and kernelcat are next touched, their
+// duplicate copies of the same logic could fold into this package
+// without a behaviour change.
+package download
+
+import (
+	"context"
+	"crypto/sha256"
+	"encoding/hex"
+	"fmt"
+	"io"
+	"net/http"
+	"os"
+	"strings"
+)
+
+// FetchVerified streams `url` into `dstPath`, capped at maxBytes
+// bytes, hashing the body on the fly and refusing payloads whose
+// SHA256 doesn't match expectedSHA256.
+//
+// On any failure (HTTP error, ContentLength > cap, body exceeds
+// cap mid-stream, write error, sha256 mismatch) dstPath is removed
+// before returning so the caller doesn't have to disambiguate
+// "did we leave a partial file?".
+//
+// Returns the number of bytes written. On success the caller owns
+// dstPath and is responsible for removing it when done.
+//
+// expectedSHA256 is matched case-insensitively. Pass a nil
+// client to use http.DefaultClient.
+func FetchVerified(ctx context.Context, client *http.Client, url, expectedSHA256 string, maxBytes int64, dstPath string) (int64, error) {
+	if client == nil {
+		client = http.DefaultClient
+	}
+	if maxBytes <= 0 {
+		return 0, fmt.Errorf("FetchVerified: maxBytes must be > 0, got %d", maxBytes)
+	}
+	if strings.TrimSpace(expectedSHA256) == "" {
+		return 0, fmt.Errorf("FetchVerified: expectedSHA256 is required")
+	}
+
+	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
+	if err != nil {
+		return 0, err
+	}
+	resp, err := client.Do(req)
+	if err != nil {
+		return 0, fmt.Errorf("fetch %s: %w", url, err)
+	}
+	defer resp.Body.Close()
+	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
+		return 0, fmt.Errorf("fetch %s: HTTP %s", url, resp.Status)
+	}
+	if resp.ContentLength > maxBytes {
+		return 0, fmt.Errorf("fetch %s: advertised %d bytes exceeds %d-byte cap", url, resp.ContentLength, maxBytes)
+	}
+
+	f, err := os.Create(dstPath)
+	if err != nil {
+		return 0, err
+	}
+
+	hasher := sha256.New()
+	limited := io.LimitReader(resp.Body,
maxBytes+1) + n, copyErr := io.Copy(io.MultiWriter(f, hasher), limited) + if closeErr := f.Close(); copyErr == nil && closeErr != nil { + copyErr = closeErr + } + if copyErr != nil { + _ = os.Remove(dstPath) + return 0, fmt.Errorf("download %s: %w", url, copyErr) + } + if n > maxBytes { + _ = os.Remove(dstPath) + return 0, fmt.Errorf("download %s: body exceeded %d-byte cap before sha256 check", url, maxBytes) + } + + got := hex.EncodeToString(hasher.Sum(nil)) + if !strings.EqualFold(got, expectedSHA256) { + _ = os.Remove(dstPath) + return 0, fmt.Errorf("sha256 mismatch for %s: got %s, want %s", url, got, expectedSHA256) + } + return n, nil +} diff --git a/internal/download/verified_test.go b/internal/download/verified_test.go new file mode 100644 index 0000000..5c9ab0b --- /dev/null +++ b/internal/download/verified_test.go @@ -0,0 +1,126 @@ +package download + +import ( + "bytes" + "context" + "crypto/sha256" + "encoding/hex" + "net/http" + "net/http/httptest" + "os" + "path/filepath" + "strings" + "testing" +) + +func sha256Hex(b []byte) string { + sum := sha256.Sum256(b) + return hex.EncodeToString(sum[:]) +} + +func serveBody(t *testing.T, body []byte) *httptest.Server { + t.Helper() + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + w.Header().Set("Content-Type", "application/octet-stream") + _, _ = w.Write(body) + })) + t.Cleanup(srv.Close) + return srv +} + +func TestFetchVerifiedHappyPath(t *testing.T) { + body := bytes.Repeat([]byte("ok"), 1024) + srv := serveBody(t, body) + dst := filepath.Join(t.TempDir(), "out") + + n, err := FetchVerified(context.Background(), srv.Client(), srv.URL, sha256Hex(body), 1<<20, dst) + if err != nil { + t.Fatalf("FetchVerified: %v", err) + } + if n != int64(len(body)) { + t.Fatalf("n = %d, want %d", n, len(body)) + } + got, _ := os.ReadFile(dst) + if !bytes.Equal(got, body) { + t.Fatalf("file content differs from served body") + } +} + +func TestFetchVerifiedRejectsHashMismatch(t 
*testing.T) { + body := []byte("payload") + srv := serveBody(t, body) + dst := filepath.Join(t.TempDir(), "out") + wrongHash := sha256Hex([]byte("other")) + + _, err := FetchVerified(context.Background(), srv.Client(), srv.URL, wrongHash, 1<<10, dst) + if err == nil || !strings.Contains(err.Error(), "sha256 mismatch") { + t.Fatalf("err = %v, want sha256 mismatch", err) + } + if _, statErr := os.Stat(dst); !os.IsNotExist(statErr) { + t.Fatalf("partial file should be removed; stat err = %v", statErr) + } +} + +func TestFetchVerifiedRejectsContentLengthOverCap(t *testing.T) { + body := bytes.Repeat([]byte("x"), 2048) + srv := serveBody(t, body) + dst := filepath.Join(t.TempDir(), "out") + + _, err := FetchVerified(context.Background(), srv.Client(), srv.URL, sha256Hex(body), 64, dst) + if err == nil || !strings.Contains(err.Error(), "cap") { + t.Fatalf("err = %v, want cap rejection", err) + } + if _, statErr := os.Stat(dst); !os.IsNotExist(statErr) { + t.Fatalf("dst created despite oversize Content-Length: %v", statErr) + } +} + +func TestFetchVerifiedRejectsLyingContentLength(t *testing.T) { + // Server returns no Content-Length but a body bigger than cap. + body := bytes.Repeat([]byte("y"), 2048) + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + // Force chunked: don't set Content-Length. 
+		w.WriteHeader(http.StatusOK)
+		w.(http.Flusher).Flush() // flush before writing: headers go out now, so the body is chunked with no Content-Length
+		_, _ = w.Write(body)
+	}))
+	t.Cleanup(srv.Close)
+	dst := filepath.Join(t.TempDir(), "out")
+
+	_, err := FetchVerified(context.Background(), srv.Client(), srv.URL, sha256Hex(body), 64, dst)
+	if err == nil || !strings.Contains(err.Error(), "cap") {
+		t.Fatalf("err = %v, want cap rejection on lying server", err)
+	}
+	if _, statErr := os.Stat(dst); !os.IsNotExist(statErr) {
+		t.Fatalf("partial file from lying server should be removed; stat err = %v", statErr)
+	}
+}
+
+func TestFetchVerifiedRejectsHTTPError(t *testing.T) {
+	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		http.Error(w, "missing", http.StatusNotFound)
+	}))
+	t.Cleanup(srv.Close)
+	dst := filepath.Join(t.TempDir(), "out")
+
+	_, err := FetchVerified(context.Background(), srv.Client(), srv.URL, sha256Hex([]byte{}), 1<<10, dst)
+	if err == nil || !strings.Contains(err.Error(), "404") {
+		t.Fatalf("err = %v, want 404 mention", err)
+	}
+}
+
+func TestFetchVerifiedRejectsEmptyExpectedSHA(t *testing.T) {
+	srv := serveBody(t, []byte("body"))
+	dst := filepath.Join(t.TempDir(), "out")
+	_, err := FetchVerified(context.Background(), srv.Client(), srv.URL, "", 1<<10, dst)
+	if err == nil || !strings.Contains(err.Error(), "expectedSHA256") {
+		t.Fatalf("err = %v, want empty-sha rejection", err)
+	}
+}
+
+func TestFetchVerifiedRejectsZeroMaxBytes(t *testing.T) {
+	srv := serveBody(t, []byte("body"))
+	dst := filepath.Join(t.TempDir(), "out")
+	_, err := FetchVerified(context.Background(), srv.Client(), srv.URL, sha256Hex([]byte("body")), 0, dst)
+	if err == nil || !strings.Contains(err.Error(), "maxBytes") {
+		t.Fatalf("err = %v, want maxBytes rejection", err)
+	}
+}
diff --git a/internal/firecracker/client.go b/internal/firecracker/client.go
index d0d8aec..3a96acf 100644
--- a/internal/firecracker/client.go
+++ b/internal/firecracker/client.go
@@ -6,6 +6,8 @@ import (
 	"log/slog"
 	"os"
 	"os/exec"
+	"path/filepath"
+	"strconv"
 	"strings"
 	"sync"
@@ -32,8 +34,34 @@
type MachineConfig struct { VCPUCount int MemoryMiB int Logger *slog.Logger + // Jailer, when non-nil, wraps firecracker in `jailer`. Path fields + // (SocketPath, KernelImagePath, InitrdPath, Drives[].Path, VSockPath) + // MUST be pre-translated by the caller: SocketPath/VSockPath as + // host-visible chroot paths; the rest as chroot-internal paths + // (jailer chroots before exec, so firecracker resolves them inside + // the chroot). + Jailer *JailerOpts } +// JailerOpts captures the jailer-specific knobs. The chroot tree at +// `/firecracker//root/` and the kernel/drive nodes +// inside it must be staged by the caller before NewMachine — this +// package only constructs the launch cmd. +type JailerOpts struct { + Binary string + ChrootBaseDir string + UID int + GID int +} + +// JailerSocketName is the chroot-relative API socket path passed to +// firecracker via --api-sock. Lives at the chroot root (no /run/ subdir +// required) so we don't depend on jailer creating intermediate dirs. +const JailerSocketName = "/firecracker.socket" + +// JailerVSockName mirrors JailerSocketName for the vsock UDS. +const JailerVSockName = "/vsock.sock" + type DriveConfig struct { ID string Path string @@ -74,11 +102,36 @@ func NewMachine(ctx context.Context, cfg MachineConfig) (*Machine, error) { return &Machine{machine: machine, logFile: logFile}, nil } +// JailerChrootRoot returns the host-visible path to the jailer chroot +// root for vmid under base. Mirrors the layout firecracker's jailer +// builds: /firecracker//root. +func JailerChrootRoot(base, vmid string) string { + return filepath.Join(base, "firecracker", vmid, "root") +} + func (m *Machine) Start(ctx context.Context) error { - if err := m.machine.Start(ctx); err != nil { + // The caller's ctx is INTENTIONALLY not forwarded to the SDK. 
+ // firecracker-go-sdk's startVMM (machine.go) spawns a goroutine + // that SIGTERMs firecracker the instant this ctx cancels, and + // retains it for the lifetime of the VMM — not just the boot + // phase. Plumbing an RPC request ctx through would mean + // firecracker dies the moment the daemon writes its RPC response + // (daemon.go:handleConn defers cancel). That silently breaks + // `vm start` on a stopped VM: start "succeeds", the handler + // returns, ctx cancels, firecracker is SIGTERMed, and the next + // `vm ssh` hits `vmAlive = false`. `vm.create` sidesteps the bug + // because BeginVMCreate detaches to a background ctx before + // calling startVMLocked. + // + // We own firecracker lifecycle explicitly — StopVM / KillVM / + // cleanupRuntime — so losing ctx-driven cancellation here is + // deliberate. The SDK still enforces its own boot-phase timeouts + // (socket wait, HTTP) with internal deadlines. + if err := m.machine.Start(context.Background()); err != nil { m.closeLog() return err } + _ = ctx go func() { _ = m.machine.Wait(context.Background()) @@ -123,7 +176,7 @@ func buildConfig(cfg MachineConfig) sdk.Config { } drives := drivesBuilder.Build() - return sdk.Config{ + out := sdk.Config{ SocketPath: cfg.SocketPath, LogPath: cfg.LogPath, MetricsPath: cfg.MetricsPath, @@ -143,7 +196,28 @@ func buildConfig(cfg MachineConfig) sdk.Config { Smt: sdk.Bool(false), }, VMID: cfg.VMID, + // Disable the SDK's signal-forwarding goroutine. Default + // (nil) makes the SDK install a handler that catches + // SIGTERM/SIGINT/SIGHUP/SIGQUIT/SIGABRT in the parent process + // and forwards them to the firecracker child — which means + // `systemctl stop bangerd-root.service` (sends SIGTERM to the + // helper) ends up signaling every firecracker the helper has + // launched, killing every running VM. Empty slice (not nil) + // short-circuits setupSignals at len()==0. 
+ ForwardSignals: []os.Signal{}, } + if cfg.Jailer != nil { + // The path fields above are already chroot-translated by the + // caller (see MachineConfig.Jailer doc). Skip the SDK's host-side + // existence checks — kernel/drives live inside the chroot, not + // at the paths we report. + out.DisableValidation = true + // LogPath is the host-side file used only for cmd.Stderr capture. + // Clearing it here prevents the SDK from sending PUT /logger with + // a host path that firecracker can't open from inside the chroot. + out.LogPath = "" + } + return out } func buildVsockDevices(cfg MachineConfig) []sdk.VsockDevice { @@ -183,11 +257,41 @@ func defaultDriveID(drive DriveConfig, fallback string) string { return fallback } +// buildProcessRunner constructs the *exec.Cmd the SDK will start. Args are +// passed directly — no shell, no string interpolation — so any future change +// to MachineConfig fields can't smuggle shell metacharacters into the launch. +// +// The daemon and root-helper processes set umask 077 at startup, so the +// API/vsock sockets firecracker creates inherit 0600 mode without needing a +// shell-level `umask` wrapper. +// +// When firecracker has to be launched under sudo (non-root daemon), the +// resulting sockets are root-owned. The caller (LaunchFirecracker) kicks off +// fcproc.EnsureSocketAccessForAsync immediately *before* Machine.Start so the +// chown wins the race against the SDK's HTTP probe over the API socket. That +// replaces the previous in-shell chown_watcher. +// +// When cfg.Jailer is set, the launch is wrapped by `jailer`. The chroot tree +// MUST already be staged (kernel hard-linked, drives mknod'd, dirs chowned to +// the configured UID:GID) — see fcproc.PrepareJailerChroot. The SDK's own +// JailerCfg path is intentionally bypassed: it cannot mknod block devices and +// does not expose --new-pid-ns. 
 func buildProcessRunner(cfg MachineConfig, logFile *os.File) *exec.Cmd {
-	script := "umask 000 && exec " + shellQuote(cfg.BinaryPath) +
-		" --api-sock " + shellQuote(cfg.SocketPath) +
-		" --id " + shellQuote(cfg.VMID)
-	cmd := exec.Command("sudo", "-n", "sh", "-c", script)
+	var bin string
+	var args []string
+	if cfg.Jailer != nil {
+		bin, args = jailerArgs(cfg)
+	} else {
+		bin = cfg.BinaryPath
+		args = []string{"--api-sock", cfg.SocketPath, "--id", cfg.VMID}
+	}
+	var cmd *exec.Cmd
+	switch {
+	case os.Geteuid() == 0:
+		cmd = exec.Command(bin, args...)
+	default:
+		cmd = exec.Command("sudo", append([]string{"-n", "-E", bin}, args...)...)
+	}
 	cmd.Stdin = nil
 	if logFile != nil {
 		cmd.Stdout = logFile
@@ -196,8 +300,26 @@ func buildProcessRunner(cfg MachineConfig, logFile *os.File) *exec.Cmd {
 	return cmd
 }

-func shellQuote(value string) string {
-	return "'" + strings.ReplaceAll(value, "'", `'"'"'`) + "'"
+// jailerArgs returns the (binary, args) tuple for the jailer wrapper.
+// firecracker's flags are passed after `--`. --new-pid-ns would give the
+// guest VMM its own PID namespace, but it is deliberately omitted; see the
+// inline comment below for why passing it breaks the SDK's process tracking.
+func jailerArgs(cfg MachineConfig) (string, []string) {
+	args := []string{
+		"--id", cfg.VMID,
+		"--uid", strconv.Itoa(cfg.Jailer.UID),
+		"--gid", strconv.Itoa(cfg.Jailer.GID),
+		"--exec-file", cfg.BinaryPath,
+		"--chroot-base-dir", cfg.Jailer.ChrootBaseDir,
+		// "--new-pid-ns": jailer forks when creating the PID namespace; the
+		// SDK tracks the parent's PID, which exits immediately, causing the
+		// SDK's "process exited" goroutine to tear down the API socket while
+		// firecracker is still booting in the child. Left out intentionally.
+ "--", + "--api-sock", JailerSocketName, + } + return cfg.Jailer.Binary, args } func newLogger(base *slog.Logger) *logrus.Entry { diff --git a/internal/firecracker/client_test.go b/internal/firecracker/client_test.go index e02f6a9..cf22d5c 100644 --- a/internal/firecracker/client_test.go +++ b/internal/firecracker/client_test.go @@ -5,6 +5,7 @@ import ( "context" "log/slog" "net" + "os" "path/filepath" "strings" "testing" @@ -72,29 +73,54 @@ func TestBuildConfig(t *testing.T) { } } -func TestBuildProcessRunnerUsesSudoShellWrapper(t *testing.T) { +func TestBuildProcessRunnerInvokesSudoWithDirectArgs(t *testing.T) { + cmd := buildProcessRunner(MachineConfig{ + BinaryPath: "/repo/firecracker", + SocketPath: "/tmp/fc.sock", + VSockPath: "/tmp/vsock.sock", + VMID: "vm-1", + }, nil) + + // No shell, no string interpolation: the binary path and every flag + // are independent argv entries. Even if MachineConfig ever carried an + // attacker-controlled value, there's no shell to interpret it. + wantArgs := []string{"sudo", "-n", "-E", "/repo/firecracker", "--api-sock", "/tmp/fc.sock", "--id", "vm-1"} + if !equalStrings(cmd.Args, wantArgs) { + t.Fatalf("args = %v, want %v", cmd.Args, wantArgs) + } + if cmd.Path != "/usr/bin/sudo" && cmd.Path != "sudo" { + t.Fatalf("command path = %q", cmd.Path) + } + if cmd.Cancel != nil { + t.Fatal("process runner should not be tied to a request context") + } +} + +func TestBuildProcessRunnerOmitsSudoWhenAlreadyRoot(t *testing.T) { + if os.Geteuid() != 0 { + t.Skip("requires root to exercise the no-sudo branch") + } cmd := buildProcessRunner(MachineConfig{ BinaryPath: "/repo/firecracker", SocketPath: "/tmp/fc.sock", VMID: "vm-1", }, nil) + wantArgs := []string{"/repo/firecracker", "--api-sock", "/tmp/fc.sock", "--id", "vm-1"} + if !equalStrings(cmd.Args, wantArgs) { + t.Fatalf("args = %v, want %v", cmd.Args, wantArgs) + } +} - if cmd.Path != "/usr/bin/sudo" && cmd.Path != "sudo" { - t.Fatalf("command path = %q", cmd.Path) +func 
equalStrings(a, b []string) bool { + if len(a) != len(b) { + return false } - if len(cmd.Args) != 5 { - t.Fatalf("args = %v", cmd.Args) - } - if cmd.Args[1] != "-n" || cmd.Args[2] != "sh" || cmd.Args[3] != "-c" { - t.Fatalf("args = %v", cmd.Args) - } - want := "umask 000 && exec '/repo/firecracker' --api-sock '/tmp/fc.sock' --id 'vm-1'" - if cmd.Args[4] != want { - t.Fatalf("script = %q, want %q", cmd.Args[4], want) - } - if cmd.Cancel != nil { - t.Fatal("process runner should not be tied to a request context") + for i := range a { + if a[i] != b[i] { + return false + } } + return true } func TestSDKLoggerBridgeEmitsStructuredDebugLogs(t *testing.T) { diff --git a/internal/firecracker/version.go b/internal/firecracker/version.go new file mode 100644 index 0000000..6da9a0f --- /dev/null +++ b/internal/firecracker/version.go @@ -0,0 +1,133 @@ +package firecracker + +import ( + "context" + "fmt" + "regexp" + "strconv" + "strings" +) + +// MinSupportedVersion is the lowest firecracker version banger has +// been validated against. Below this, banger refuses to launch — the +// jailer flags banger relies on (notably the `--exec-file` / +// `--chroot-base-dir` pair plus the structured chroot layout) might +// behave differently or be missing entirely. +// +// Bumping this is a deliberate decision; it should change in lockstep +// with whatever feature in the helper started requiring the newer +// firecracker. +const MinSupportedVersion = "1.5.0" + +// KnownTestedVersion is the firecracker version banger's smoke suite +// is currently exercised against. Newer versions usually work +// (firecracker keeps its API stable within a major) but they sit +// outside the tested window — `banger doctor` warns rather than fails +// when it finds a higher version. +const KnownTestedVersion = "1.14.1" + +// versionPattern matches the canonical `Firecracker v1.14.1` line +// emitted by `firecracker --version`. The pre-release suffix +// (e.g. 
`-beta`) is captured for fidelity in the reported string but +// ignored for ordering. +var versionPattern = regexp.MustCompile(`Firecracker v(\d+)\.(\d+)\.(\d+)(?:-([\w.]+))?`) + +// SemVer is a structural representation of a `MAJOR.MINOR.PATCH` +// triple plus an optional pre-release label. Comparisons use only +// the triple; pre-releases are kept for display. +type SemVer struct { + Major, Minor, Patch int + PreRelease string +} + +// String renders the SemVer back to its canonical form, with a +// leading "v" so it matches firecracker's own output. +func (s SemVer) String() string { + if s.PreRelease == "" { + return fmt.Sprintf("v%d.%d.%d", s.Major, s.Minor, s.Patch) + } + return fmt.Sprintf("v%d.%d.%d-%s", s.Major, s.Minor, s.Patch, s.PreRelease) +} + +// Compare returns -1, 0, or +1 based on the (Major, Minor, Patch) +// triple. Pre-release labels are ignored — banger doesn't +// distinguish `v1.10.0` from `v1.10.0-rc1` for compatibility purposes. +func (s SemVer) Compare(other SemVer) int { + if s.Major != other.Major { + return cmpInt(s.Major, other.Major) + } + if s.Minor != other.Minor { + return cmpInt(s.Minor, other.Minor) + } + return cmpInt(s.Patch, other.Patch) +} + +// ParseVersionOutput pulls the SemVer out of `firecracker --version` +// stdout. firecracker historically prints the version line followed +// by a freeform "exiting successfully" line; we match the first +// occurrence of the pattern anywhere in the output to be tolerant of +// future cosmetic changes. 
+func ParseVersionOutput(out string) (SemVer, error) { + m := versionPattern.FindStringSubmatch(out) + if m == nil { + return SemVer{}, fmt.Errorf("unrecognised firecracker version output: %q", strings.TrimSpace(out)) + } + major, _ := strconv.Atoi(m[1]) + minor, _ := strconv.Atoi(m[2]) + patch, _ := strconv.Atoi(m[3]) + return SemVer{Major: major, Minor: minor, Patch: patch, PreRelease: m[4]}, nil +} + +// MustParseSemVer parses a `MAJOR.MINOR.PATCH` (optionally `v`-prefixed) +// constant. Panics on malformed input — only used for the package-level +// constants above and in tests, so a malformed string is a developer +// error rather than a runtime concern. +func MustParseSemVer(s string) SemVer { + parts := strings.SplitN(strings.TrimPrefix(s, "v"), ".", 3) + if len(parts) != 3 { + panic("MustParseSemVer: " + s) + } + major, err := strconv.Atoi(parts[0]) + if err != nil { + panic("MustParseSemVer: " + s) + } + minor, err := strconv.Atoi(parts[1]) + if err != nil { + panic("MustParseSemVer: " + s) + } + patch, err := strconv.Atoi(parts[2]) + if err != nil { + panic("MustParseSemVer: " + s) + } + return SemVer{Major: major, Minor: minor, Patch: patch} +} + +// VersionRunner is the slim contract QueryVersion needs from a +// command-runner. system.Runner satisfies it; defining the interface +// inline keeps internal/firecracker free of cross-cutting imports. +type VersionRunner interface { + Run(ctx context.Context, name string, args ...string) ([]byte, error) +} + +// QueryVersion runs ` --version` and parses the result. Returns +// only the parsed SemVer — firecracker's stdout includes a trailing +// "exiting successfully" log line that we have no use for; callers +// render the result via SemVer.String() ("v1.14.1") for display. 
+func QueryVersion(ctx context.Context, runner VersionRunner, bin string) (SemVer, error) { + out, err := runner.Run(ctx, bin, "--version") + if err != nil { + return SemVer{}, err + } + return ParseVersionOutput(string(out)) +} + +func cmpInt(a, b int) int { + switch { + case a < b: + return -1 + case a > b: + return 1 + default: + return 0 + } +} diff --git a/internal/firecracker/version_test.go b/internal/firecracker/version_test.go new file mode 100644 index 0000000..e314631 --- /dev/null +++ b/internal/firecracker/version_test.go @@ -0,0 +1,96 @@ +package firecracker + +import ( + "strings" + "testing" +) + +func TestParseVersionOutput(t *testing.T) { + t.Parallel() + for _, tc := range []struct { + name string + input string + want SemVer + wantErr bool + }{ + {name: "canonical", input: "Firecracker v1.14.1\n", want: SemVer{Major: 1, Minor: 14, Patch: 1}}, + {name: "with_trailing_log", input: "Firecracker v1.14.1\n\n2026-04-28T17:38:12.392171332 [anonymous-instance:main] exit_code=0\n", want: SemVer{Major: 1, Minor: 14, Patch: 1}}, + {name: "prerelease", input: "Firecracker v1.10.0-rc1\n", want: SemVer{Major: 1, Minor: 10, Patch: 0, PreRelease: "rc1"}}, + {name: "two_digit_minor", input: "Firecracker v2.0.42\n", want: SemVer{Major: 2, Minor: 0, Patch: 42}}, + {name: "garbage", input: "not a firecracker", wantErr: true}, + {name: "empty", input: "", wantErr: true}, + {name: "missing_v", input: "Firecracker 1.14.1", wantErr: true}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + got, err := ParseVersionOutput(tc.input) + if tc.wantErr { + if err == nil { + t.Fatalf("ParseVersionOutput(%q) succeeded, want error", tc.input) + } + return + } + if err != nil { + t.Fatalf("ParseVersionOutput(%q) = %v", tc.input, err) + } + if got != tc.want { + t.Fatalf("ParseVersionOutput(%q) = %+v, want %+v", tc.input, got, tc.want) + } + }) + } +} + +func TestSemVerCompare(t *testing.T) { + t.Parallel() + for _, tc := range []struct { + name string + a, b 
string + want int + }{ + {name: "equal", a: "1.14.1", b: "1.14.1", want: 0}, + {name: "patch_lower", a: "1.14.0", b: "1.14.1", want: -1}, + {name: "patch_higher", a: "1.14.2", b: "1.14.1", want: 1}, + {name: "minor_dominates_patch", a: "1.10.999", b: "1.11.0", want: -1}, + {name: "major_dominates", a: "2.0.0", b: "1.99.99", want: 1}, + {name: "min_vs_tested_today", a: MinSupportedVersion, b: KnownTestedVersion, want: -1}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + a := MustParseSemVer(tc.a) + b := MustParseSemVer(tc.b) + if got := a.Compare(b); got != tc.want { + t.Fatalf("(%s).Compare(%s) = %d, want %d", tc.a, tc.b, got, tc.want) + } + }) + } +} + +func TestSemVerString(t *testing.T) { + t.Parallel() + if got := MustParseSemVer("1.14.1").String(); got != "v1.14.1" { + t.Fatalf("v1.14.1.String() = %q", got) + } + pre := SemVer{Major: 1, Minor: 10, Patch: 0, PreRelease: "rc1"} + if got := pre.String(); got != "v1.10.0-rc1" { + t.Fatalf("rc String() = %q", got) + } +} + +// MustParseSemVer panics on malformed input; pin a few inputs so a +// future refactor doesn't accidentally widen what counts as valid. +func TestMustParseSemVerRejectsMalformed(t *testing.T) { + t.Parallel() + for _, bad := range []string{"", "1", "1.2", "1.2.3.4", "v1.2.x", "vfoo"} { + bad := bad + t.Run(strings.ReplaceAll(bad, ".", "_"), func(t *testing.T) { + defer func() { + if r := recover(); r == nil { + t.Errorf("MustParseSemVer(%q) did not panic", bad) + } + }() + _ = MustParseSemVer(bad) + }) + } +} diff --git a/internal/guest/known_hosts.go b/internal/guest/known_hosts.go new file mode 100644 index 0000000..2dd3f90 --- /dev/null +++ b/internal/guest/known_hosts.go @@ -0,0 +1,256 @@ +package guest + +import ( + "bufio" + "encoding/base64" + "errors" + "fmt" + "net" + "os" + "strings" + "sync" + + "golang.org/x/crypto/ssh" +) + +// TOFUHostKeyCallback returns a HostKeyCallback that implements +// trust-on-first-use against a banger-owned known_hosts file. 
+//
+// Semantics:
+//   - If the file has an entry for the host (ports are stripped before
+//     lookup) → require an exact key match; a mismatch returns an error
+//     (MITM protection).
+//   - If no entry exists → append one and accept.
+//
+// The file format is compatible with OpenSSH so shell SSH clients can
+// use the same path via `UserKnownHostsFile`.
+//
+// The package guards the file with a process-wide mutex so concurrent
+// dials to different VMs don't interleave writes.
+//
+// An empty path disables host-key checking entirely — only for test
+// harnesses and tools that dial ad-hoc infrastructure; production
+// paths must supply a real file.
+func TOFUHostKeyCallback(path string) ssh.HostKeyCallback {
+	if strings.TrimSpace(path) == "" {
+		return ssh.InsecureIgnoreHostKey()
+	}
+	return func(hostname string, remote net.Addr, key ssh.PublicKey) error {
+		host := hostLookupKey(hostname, remote)
+		knownHostsMu.Lock()
+		defer knownHostsMu.Unlock()
+
+		entries, err := loadKnownHosts(path)
+		if err != nil {
+			return fmt.Errorf("read known_hosts: %w", err)
+		}
+		stored, matched := entries.match(host, key.Type())
+		if matched {
+			if keysEqual(stored.key, key) {
+				return nil
+			}
+			return fmt.Errorf("banger: host key for %s does not match pinned entry — "+
+				"possible MITM. If the VM was legitimately rebuilt, remove the old "+
+				"entry from %s and retry.", host, path)
+		}
+		if err := appendKnownHost(path, host, key); err != nil {
+			return fmt.Errorf("pin host key for %s: %w", host, err)
+		}
+		return nil
+	}
+}
+
+// RemoveKnownHosts strips every entry matching any host in `hosts`
+// from the known_hosts file. Called on VM delete so a future VM
+// reusing the same IP or name never trips the TOFU mismatch branch.
+// Missing file / missing hosts = no-op.
+func RemoveKnownHosts(path string, hosts ...string) error { + if strings.TrimSpace(path) == "" || len(hosts) == 0 { + return nil + } + knownHostsMu.Lock() + defer knownHostsMu.Unlock() + + entries, err := loadKnownHosts(path) + if err != nil { + return err + } + drop := make(map[string]struct{}, len(hosts)) + for _, h := range hosts { + h = strings.TrimSpace(h) + if h == "" { + continue + } + drop[h] = struct{}{} + } + if len(drop) == 0 { + return nil + } + filtered := entries.filter(func(e knownHostEntry) bool { + for _, h := range e.hosts { + if _, skip := drop[h]; skip { + return false + } + } + return true + }) + return filtered.write(path) +} + +var knownHostsMu sync.Mutex + +// knownHostEntry is one line in known_hosts: a set of host patterns +// (comma-separated in the file), a key type, and a key blob. +type knownHostEntry struct { + hosts []string + keyType string + key ssh.PublicKey + raw string +} + +type knownHostList []knownHostEntry + +func (l knownHostList) match(host, keyType string) (knownHostEntry, bool) { + for _, e := range l { + if e.keyType != keyType { + continue + } + for _, h := range e.hosts { + if h == host { + return e, true + } + } + } + return knownHostEntry{}, false +} + +func (l knownHostList) filter(keep func(knownHostEntry) bool) knownHostList { + out := make(knownHostList, 0, len(l)) + for _, e := range l { + if keep(e) { + out = append(out, e) + } + } + return out +} + +func (l knownHostList) write(path string) error { + if len(l) == 0 { + // If everything got filtered, truncate the file rather than + // removing it — callers may want the file to keep existing + // (with 0600 perms) for later appends. 
+		return os.WriteFile(path, nil, 0o600)
+	}
+	var buf strings.Builder
+	for _, e := range l {
+		buf.WriteString(e.raw)
+		if !strings.HasSuffix(e.raw, "\n") {
+			buf.WriteByte('\n')
+		}
+	}
+	return os.WriteFile(path, []byte(buf.String()), 0o600)
+}
+
+func loadKnownHosts(path string) (knownHostList, error) {
+	f, err := os.Open(path)
+	if err != nil {
+		if os.IsNotExist(err) {
+			return nil, nil
+		}
+		return nil, err
+	}
+	defer f.Close()
+
+	var out knownHostList
+	scanner := bufio.NewScanner(f)
+	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
+	for scanner.Scan() {
+		line := scanner.Text()
+		trimmed := strings.TrimSpace(line)
+		if trimmed == "" || strings.HasPrefix(trimmed, "#") {
+			continue
+		}
+		fields := strings.Fields(trimmed)
+		if len(fields) < 3 {
+			continue
+		}
+		keyBytes, err := base64.StdEncoding.DecodeString(fields[2])
+		if err != nil {
+			continue
+		}
+		key, err := ssh.ParsePublicKey(keyBytes)
+		if err != nil {
+			continue
+		}
+		out = append(out, knownHostEntry{
+			hosts:   strings.Split(fields[0], ","),
+			keyType: fields[1],
+			key:     key,
+			raw:     line,
+		})
+	}
+	if err := scanner.Err(); err != nil {
+		return nil, err
+	}
+	return out, nil
+}
+
+func appendKnownHost(path, host string, key ssh.PublicKey) error {
+	line := fmt.Sprintf("%s %s %s\n",
+		host,
+		key.Type(),
+		base64.StdEncoding.EncodeToString(key.Marshal()),
+	)
+	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o600)
+	if err != nil {
+		return err
+	}
+	defer f.Close()
+	_, err = f.WriteString(line)
+	return err
+}
+
+// hostLookupKey returns the canonical key under which we store host
+// entries. For a TCP dial the SSH library hands us hostname of the
+// form "host:port"; we normalise to bare "host" so the pinned entry
+// matches no matter which port a later dial uses.
+//
+// If hostname contains a port, strip it. If it's empty, fall back to
+// the remote address.
+func hostLookupKey(hostname string, remote net.Addr) string { + if h, _, err := net.SplitHostPort(hostname); err == nil { + hostname = h + } + if strings.TrimSpace(hostname) != "" { + return hostname + } + if remote != nil { + if h, _, err := net.SplitHostPort(remote.String()); err == nil { + return h + } + return remote.String() + } + return "" +} + +func keysEqual(a, b ssh.PublicKey) bool { + if a == nil || b == nil { + return a == nil && b == nil + } + ba := a.Marshal() + bb := b.Marshal() + if len(ba) != len(bb) { + return false + } + for i := range ba { + if ba[i] != bb[i] { + return false + } + } + return true +} + +// errHostKeyMismatch sentinel is currently unused but reserved for +// callers that want to distinguish MITM from other failures. +var errHostKeyMismatch = errors.New("host key mismatch") + +var _ = errHostKeyMismatch diff --git a/internal/guest/known_hosts_test.go b/internal/guest/known_hosts_test.go new file mode 100644 index 0000000..8c9e3b2 --- /dev/null +++ b/internal/guest/known_hosts_test.go @@ -0,0 +1,185 @@ +package guest + +import ( + "crypto/ed25519" + "crypto/rand" + "net" + "os" + "path/filepath" + "strings" + "testing" + + "golang.org/x/crypto/ssh" +) + +// makeTestHostKey generates a fresh ed25519 key and returns the +// ssh.PublicKey the server would present during a handshake. 
+func makeTestHostKey(t *testing.T) ssh.PublicKey { + t.Helper() + pub, _, err := ed25519.GenerateKey(rand.Reader) + if err != nil { + t.Fatalf("GenerateKey: %v", err) + } + sshPub, err := ssh.NewPublicKey(pub) + if err != nil { + t.Fatalf("NewPublicKey: %v", err) + } + return sshPub +} + +func TestTOFUHostKeyCallbackPinsOnFirstUse(t *testing.T) { + t.Parallel() + path := filepath.Join(t.TempDir(), "known_hosts") + cb := TOFUHostKeyCallback(path) + + key := makeTestHostKey(t) + addr := &net.TCPAddr{IP: net.ParseIP("172.16.0.5"), Port: 22} + + if err := cb("172.16.0.5:22", addr, key); err != nil { + t.Fatalf("first-use callback: %v", err) + } + + data, err := os.ReadFile(path) + if err != nil { + t.Fatalf("ReadFile: %v", err) + } + content := string(data) + if !strings.Contains(content, "172.16.0.5") { + t.Errorf("known_hosts missing host:\n%s", content) + } + if !strings.Contains(content, key.Type()) { + t.Errorf("known_hosts missing key type:\n%s", content) + } +} + +func TestTOFUHostKeyCallbackAcceptsMatch(t *testing.T) { + t.Parallel() + path := filepath.Join(t.TempDir(), "known_hosts") + cb := TOFUHostKeyCallback(path) + key := makeTestHostKey(t) + addr := &net.TCPAddr{IP: net.ParseIP("172.16.0.6"), Port: 22} + + if err := cb("172.16.0.6:22", addr, key); err != nil { + t.Fatalf("first-use: %v", err) + } + // Same key, second dial: must succeed. 
+ if err := cb("172.16.0.6:22", addr, key); err != nil { + t.Fatalf("second dial with matching key: %v", err) + } +} + +func TestTOFUHostKeyCallbackRejectsMismatch(t *testing.T) { + t.Parallel() + path := filepath.Join(t.TempDir(), "known_hosts") + cb := TOFUHostKeyCallback(path) + addr := &net.TCPAddr{IP: net.ParseIP("172.16.0.7"), Port: 22} + + original := makeTestHostKey(t) + if err := cb("172.16.0.7:22", addr, original); err != nil { + t.Fatalf("pin original: %v", err) + } + + impostor := makeTestHostKey(t) + err := cb("172.16.0.7:22", addr, impostor) + if err == nil { + t.Fatal("expected mismatch error, got nil") + } + if !strings.Contains(err.Error(), "does not match") { + t.Errorf("error = %v, want message about mismatch", err) + } +} + +func TestTOFUEmptyPathDisablesVerification(t *testing.T) { + t.Parallel() + // Empty path returns an Insecure callback — useful for tests / + // throwaway tools. Document behaviour so the fallback doesn't + // silently regress to "always verify but without a file". 
+ cb := TOFUHostKeyCallback("") + addr := &net.TCPAddr{IP: net.ParseIP("127.0.0.1"), Port: 22} + if err := cb("127.0.0.1:22", addr, makeTestHostKey(t)); err != nil { + t.Fatalf("empty-path callback should accept: %v", err) + } +} + +func TestRemoveKnownHostsDropsEntry(t *testing.T) { + t.Parallel() + path := filepath.Join(t.TempDir(), "known_hosts") + cb := TOFUHostKeyCallback(path) + keep := makeTestHostKey(t) + drop := makeTestHostKey(t) + + if err := cb("172.16.0.10:22", &net.TCPAddr{IP: net.ParseIP("172.16.0.10"), Port: 22}, keep); err != nil { + t.Fatalf("pin keep: %v", err) + } + if err := cb("172.16.0.11:22", &net.TCPAddr{IP: net.ParseIP("172.16.0.11"), Port: 22}, drop); err != nil { + t.Fatalf("pin drop: %v", err) + } + + if err := RemoveKnownHosts(path, "172.16.0.11"); err != nil { + t.Fatalf("RemoveKnownHosts: %v", err) + } + + data, _ := os.ReadFile(path) + content := string(data) + if !strings.Contains(content, "172.16.0.10") { + t.Errorf("kept entry missing:\n%s", content) + } + if strings.Contains(content, "172.16.0.11") { + t.Errorf("dropped entry still present:\n%s", content) + } +} + +func TestRemoveKnownHostsMissingFileIsNoOp(t *testing.T) { + t.Parallel() + missing := filepath.Join(t.TempDir(), "absent") + if err := RemoveKnownHosts(missing, "any"); err != nil { + t.Fatalf("RemoveKnownHosts on missing: %v", err) + } +} + +func TestRemoveKnownHostsEmptyPathIsNoOp(t *testing.T) { + t.Parallel() + if err := RemoveKnownHosts("", "any"); err != nil { + t.Fatalf("RemoveKnownHosts(empty): %v", err) + } +} + +// TestTOFURewritesAllowsReuseAfterRemove: after a VM is deleted and +// its pin is cleared, a future VM reusing the same IP (with a fresh +// host key) should re-pin cleanly, not fail the mismatch branch. 
+func TestTOFURewritesAllowsReuseAfterRemove(t *testing.T) { + t.Parallel() + path := filepath.Join(t.TempDir(), "known_hosts") + cb := TOFUHostKeyCallback(path) + addr := &net.TCPAddr{IP: net.ParseIP("172.16.0.15"), Port: 22} + + original := makeTestHostKey(t) + if err := cb("172.16.0.15:22", addr, original); err != nil { + t.Fatalf("pin original: %v", err) + } + + // VM deleted → pin removed. + if err := RemoveKnownHosts(path, "172.16.0.15"); err != nil { + t.Fatalf("RemoveKnownHosts: %v", err) + } + + // New VM, same IP, new host key. Must re-pin without error. + replacement := makeTestHostKey(t) + if err := cb("172.16.0.15:22", addr, replacement); err != nil { + t.Fatalf("re-pin after remove: %v", err) + } +} + +func TestHostLookupKeyStripsPort(t *testing.T) { + t.Parallel() + if got := hostLookupKey("10.0.0.1:22", nil); got != "10.0.0.1" { + t.Errorf("got %q, want 10.0.0.1", got) + } + if got := hostLookupKey("host.vm", nil); got != "host.vm" { + t.Errorf("got %q, want host.vm", got) + } + addr := &net.TCPAddr{IP: net.ParseIP("1.2.3.4"), Port: 22} + if got := hostLookupKey("", addr); got != "1.2.3.4" { + t.Errorf("fallback: got %q, want 1.2.3.4", got) + } +} diff --git a/internal/guest/ssh.go b/internal/guest/ssh.go index 2f6af93..bbf2e4b 100644 --- a/internal/guest/ssh.go +++ b/internal/guest/ssh.go @@ -15,6 +15,7 @@ import ( "path/filepath" "sort" "strings" + "sync" "time" "golang.org/x/crypto/ssh" @@ -24,12 +25,25 @@ type Client struct { client *ssh.Client } -func WaitForSSH(ctx context.Context, address, privateKeyPath string, interval time.Duration) error { +type StreamSession struct { + client *Client + session *ssh.Session + stdin io.WriteCloser + stdout io.Reader + stderr io.Reader + waitCh chan error + closeOnce sync.Once +} + +// WaitForSSH polls Dial until it succeeds or ctx cancels. The +// knownHostsPath argument is the banger-owned TOFU file; empty +// disables host-key verification (tests only). 
+func WaitForSSH(ctx context.Context, address, privateKeyPath, knownHostsPath string, interval time.Duration) error { if interval <= 0 { interval = time.Second } for { - client, err := Dial(ctx, address, privateKeyPath) + client, err := Dial(ctx, address, privateKeyPath, knownHostsPath) if err == nil { _ = client.Close() return nil @@ -42,7 +56,11 @@ func WaitForSSH(ctx context.Context, address, privateKeyPath string, interval ti } } -func Dial(ctx context.Context, address, privateKeyPath string) (*Client, error) { +// Dial opens an SSH client to address, authenticating with the key +// at privateKeyPath and verifying the remote host key against the +// TOFU known_hosts file at knownHostsPath. An empty knownHostsPath +// disables verification (tests / one-shot tools only). +func Dial(ctx context.Context, address, privateKeyPath, knownHostsPath string) (*Client, error) { signer, err := privateKeySigner(privateKeyPath) if err != nil { return nil, err @@ -50,7 +68,7 @@ func Dial(ctx context.Context, address, privateKeyPath string) (*Client, error) config := &ssh.ClientConfig{ User: "root", Auth: []ssh.AuthMethod{ssh.PublicKeys(signer)}, - HostKeyCallback: ssh.InsecureIgnoreHostKey(), + HostKeyCallback: TOFUHostKeyCallback(knownHostsPath), Timeout: 10 * time.Second, } dialer := &net.Dialer{Timeout: 10 * time.Second} @@ -78,6 +96,35 @@ func (c *Client) RunScript(ctx context.Context, script string, logWriter io.Writ return c.runSession(ctx, "bash -se", strings.NewReader(script), logWriter) } +// RunScriptOutput runs script on the guest and returns its stdout. +// Stderr is discarded. Use for capturing structured output (patches, JSON, +// file content) where mixing stderr into stdout would corrupt the result. 
+func (c *Client) RunScriptOutput(ctx context.Context, script string) ([]byte, error) {
+	if c == nil || c.client == nil {
+		return nil, fmt.Errorf("ssh client is not connected")
+	}
+	session, err := c.client.NewSession()
+	if err != nil {
+		return nil, err
+	}
+	defer session.Close()
+	session.Stdin = strings.NewReader(script)
+	var stdout bytes.Buffer
+	session.Stdout = &stdout
+	// session.Stderr left nil: stderr is intentionally discarded.
+	done := make(chan struct{})
+	go func() {
+		select {
+		case <-ctx.Done():
+			_ = c.client.Close()
+		case <-done:
+		}
+	}()
+	err = session.Run("bash -se")
+	close(done)
+	return stdout.Bytes(), err
+}
+
 func (c *Client) UploadFile(ctx context.Context, remotePath string, mode os.FileMode, data []byte, logWriter io.Writer) error {
 	command := fmt.Sprintf("install -D -m %04o /dev/stdin %s", mode.Perm(), shellQuote(remotePath))
 	return c.runSession(ctx, command, bytes.NewReader(data), logWriter)
@@ -109,6 +156,116 @@ func (c *Client) StreamTarEntries(ctx context.Context, sourceDir string, entries
 	return errors.Join(runErr, tarErr)
 }
 
+func (c *Client) StartCommand(ctx context.Context, command string) (*StreamSession, error) {
+	if c == nil || c.client == nil {
+		return nil, fmt.Errorf("ssh client is not connected")
+	}
+	session, err := c.client.NewSession()
+	if err != nil {
+		return nil, err
+	}
+	stdin, err := session.StdinPipe()
+	if err != nil {
+		_ = session.Close()
+		return nil, err
+	}
+	stdout, err := session.StdoutPipe()
+	if err != nil {
+		_ = session.Close()
+		return nil, err
+	}
+	stderr, err := session.StderrPipe()
+	if err != nil {
+		_ = session.Close()
+		return nil, err
+	}
+	done := make(chan struct{})
+	go func() {
+		select {
+		case <-ctx.Done():
+			_ = session.Close()
+			_ = c.client.Close()
+		case <-done:
+		}
+	}()
+	if err := session.Start(command); err != nil {
+		close(done)
+		_ = session.Close()
+		return nil, err
+	}
+	stream := &StreamSession{
+		client:  c,
+		session: session,
+		stdin:   stdin,
+		stdout:  stdout,
+ stderr: stderr, + waitCh: make(chan error, 1), + } + go func() { + err := session.Wait() + close(done) + stream.waitCh <- err + close(stream.waitCh) + }() + return stream, nil +} + +func (s *StreamSession) Stdin() io.WriteCloser { + if s == nil { + return nil + } + return s.stdin +} + +func (s *StreamSession) Stdout() io.Reader { + if s == nil { + return nil + } + return s.stdout +} + +func (s *StreamSession) Stderr() io.Reader { + if s == nil { + return nil + } + return s.stderr +} + +func (s *StreamSession) Wait() error { + if s == nil || s.waitCh == nil { + return nil + } + err, ok := <-s.waitCh + if !ok { + return nil + } + return err +} + +func (s *StreamSession) Close() error { + if s == nil { + return nil + } + var err error + s.closeOnce.Do(func() { + err = errors.Join( + func() error { + if s.session != nil { + return s.session.Close() + } + return nil + }(), + func() error { + if s.client != nil { + return s.client.Close() + } + return nil + }(), + ) + }) + return err +} + func (c *Client) runSession(ctx context.Context, command string, stdin io.Reader, logWriter io.Writer) error { if c == nil || c.client == nil { return fmt.Errorf("ssh client is not connected") diff --git a/internal/guest/ssh_more_test.go b/internal/guest/ssh_more_test.go new file mode 100644 index 0000000..4be594e --- /dev/null +++ b/internal/guest/ssh_more_test.go @@ -0,0 +1,293 @@ +package guest + +import ( + "archive/tar" + "bytes" + "context" + "crypto/rand" + "crypto/rsa" + "crypto/x509" + "encoding/pem" + "errors" + "io" + "net" + "os" + "path/filepath" + "regexp" + "strings" + "testing" + "time" +) + +func writeTestKey(t *testing.T) string { + t.Helper() + privateKey, err := rsa.GenerateKey(rand.Reader, 1024) + if err != nil { + t.Fatalf("GenerateKey: %v", err) + } + privateKeyPEM := pem.EncodeToMemory(&pem.Block{ + Type: "RSA PRIVATE KEY", + Bytes: x509.MarshalPKCS1PrivateKey(privateKey), + }) + keyPath := filepath.Join(t.TempDir(), "id_rsa") + if err := os.WriteFile(keyPath, 
privateKeyPEM, 0o600); err != nil { + t.Fatalf("WriteFile: %v", err) + } + return keyPath +} + +func TestAuthorizedPublicKeyFingerprint(t *testing.T) { + t.Parallel() + keyPath := writeTestKey(t) + + fp, err := AuthorizedPublicKeyFingerprint(keyPath) + if err != nil { + t.Fatalf("AuthorizedPublicKeyFingerprint: %v", err) + } + if !regexp.MustCompile(`^[0-9a-f]{64}$`).MatchString(fp) { + t.Fatalf("fingerprint = %q, want 64 hex chars", fp) + } + + fp2, err := AuthorizedPublicKeyFingerprint(keyPath) + if err != nil { + t.Fatalf("AuthorizedPublicKeyFingerprint (second): %v", err) + } + if fp != fp2 { + t.Fatalf("fingerprint not deterministic: %q vs %q", fp, fp2) + } +} + +func TestAuthorizedPublicKeyFingerprintMissingFile(t *testing.T) { + t.Parallel() + _, err := AuthorizedPublicKeyFingerprint(filepath.Join(t.TempDir(), "nope")) + if err == nil { + t.Fatal("expected error for missing key file") + } +} + +func TestAuthorizedPublicKeyBadPEM(t *testing.T) { + t.Parallel() + keyPath := filepath.Join(t.TempDir(), "bad") + if err := os.WriteFile(keyPath, []byte("not a private key"), 0o600); err != nil { + t.Fatalf("WriteFile: %v", err) + } + if _, err := AuthorizedPublicKey(keyPath); err == nil { + t.Fatal("expected ParsePrivateKey error") + } +} + +func TestShellQuote(t *testing.T) { + t.Parallel() + cases := []struct { + in, want string + }{ + {"", "''"}, + {"simple", "'simple'"}, + {"with space", "'with space'"}, + {"it's", `'it'"'"'s'`}, + {"a'b'c", `'a'"'"'b'"'"'c'`}, + {"/path/to/file", "'/path/to/file'"}, + } + for _, tc := range cases { + got := shellQuote(tc.in) + if got != tc.want { + t.Errorf("shellQuote(%q) = %q, want %q", tc.in, got, tc.want) + } + } +} + +func TestWriteTarEntriesArchiveRejectsEscape(t *testing.T) { + t.Parallel() + dir := t.TempDir() + var buf bytes.Buffer + err := writeTarEntriesArchive(&buf, dir, []string{"../escape"}) + if err == nil { + t.Fatal("expected error for escaping entry") + } + if !strings.Contains(err.Error(), "escapes source 
dir") { + t.Fatalf("unexpected error: %v", err) + } +} + +func TestWriteTarEntriesArchiveRejectsDot(t *testing.T) { + t.Parallel() + dir := t.TempDir() + var buf bytes.Buffer + for _, bad := range []string{".", ".."} { + if err := writeTarEntriesArchive(&buf, dir, []string{bad}); err == nil { + t.Errorf("expected error for entry %q", bad) + } + } +} + +func TestWriteTarEntriesArchiveDedupsAndSkipsBlank(t *testing.T) { + t.Parallel() + sourceDir := filepath.Join(t.TempDir(), "repo") + if err := os.MkdirAll(sourceDir, 0o755); err != nil { + t.Fatalf("MkdirAll: %v", err) + } + if err := os.WriteFile(filepath.Join(sourceDir, "a.txt"), []byte("A"), 0o644); err != nil { + t.Fatalf("WriteFile: %v", err) + } + + var buf bytes.Buffer + if err := writeTarEntriesArchive(&buf, sourceDir, []string{"a.txt", "a.txt", "", " "}); err != nil { + t.Fatalf("writeTarEntriesArchive: %v", err) + } + + tr := tar.NewReader(&buf) + var names []string + for { + h, err := tr.Next() + if err == io.EOF { + break + } + if err != nil { + t.Fatalf("tar.Next: %v", err) + } + names = append(names, h.Name) + } + if len(names) != 1 || names[0] != "repo/a.txt" { + t.Fatalf("names = %v, want [repo/a.txt]", names) + } +} + +func TestWriteTarEntriesArchiveSymlink(t *testing.T) { + t.Parallel() + sourceDir := filepath.Join(t.TempDir(), "repo") + if err := os.MkdirAll(sourceDir, 0o755); err != nil { + t.Fatalf("MkdirAll: %v", err) + } + if err := os.WriteFile(filepath.Join(sourceDir, "target.txt"), []byte("T"), 0o644); err != nil { + t.Fatalf("WriteFile: %v", err) + } + linkPath := filepath.Join(sourceDir, "link") + if err := os.Symlink("target.txt", linkPath); err != nil { + t.Skipf("symlink unsupported: %v", err) + } + + var buf bytes.Buffer + if err := writeTarEntriesArchive(&buf, sourceDir, []string{"link"}); err != nil { + t.Fatalf("writeTarEntriesArchive: %v", err) + } + + tr := tar.NewReader(&buf) + h, err := tr.Next() + if err != nil { + t.Fatalf("tar.Next: %v", err) + } + if h.Typeflag != 
tar.TypeSymlink { + t.Fatalf("typeflag = %v, want TypeSymlink", h.Typeflag) + } + if h.Linkname != "target.txt" { + t.Fatalf("linkname = %q, want target.txt", h.Linkname) + } +} + +func TestWriteTarEntriesArchiveMissingPath(t *testing.T) { + t.Parallel() + sourceDir := t.TempDir() + var buf bytes.Buffer + err := writeTarEntriesArchive(&buf, sourceDir, []string{"missing.txt"}) + if err == nil { + t.Fatal("expected error for missing entry") + } +} + +func TestStreamSessionNilSafe(t *testing.T) { + t.Parallel() + var s *StreamSession + if s.Stdin() != nil || s.Stdout() != nil || s.Stderr() != nil { + t.Fatal("nil StreamSession getters should return nil") + } + if err := s.Wait(); err != nil { + t.Fatalf("nil Wait error: %v", err) + } + if err := s.Close(); err != nil { + t.Fatalf("nil Close error: %v", err) + } +} + +func TestClientNilClose(t *testing.T) { + t.Parallel() + var c *Client + if err := c.Close(); err != nil { + t.Fatalf("nil Close error: %v", err) + } + c2 := &Client{} + if err := c2.Close(); err != nil { + t.Fatalf("empty Close error: %v", err) + } +} + +func TestClientRunScriptOutputNotConnected(t *testing.T) { + t.Parallel() + var c *Client + if _, err := c.RunScriptOutput(context.Background(), "true"); err == nil { + t.Fatal("expected not-connected error") + } + c2 := &Client{} + if _, err := c2.RunScriptOutput(context.Background(), "true"); err == nil { + t.Fatal("expected not-connected error") + } +} + +func TestClientStartCommandNotConnected(t *testing.T) { + t.Parallel() + var c *Client + if _, err := c.StartCommand(context.Background(), "true"); err == nil { + t.Fatal("expected not-connected error") + } +} + +func TestClientRunScriptNotConnected(t *testing.T) { + t.Parallel() + var c *Client + if err := c.RunScript(context.Background(), "true", io.Discard); err == nil { + t.Fatal("expected not-connected error") + } +} + +// freeAddr grabs a loopback port by listening briefly, then closing. 
The next
+// Dial to it should fail fast with "connection refused": nothing listens
+// there, unlike a hardcoded port that some other process might own.
+func freeAddr(t *testing.T) string {
+	t.Helper()
+	ln, err := net.Listen("tcp", "127.0.0.1:0")
+	if err != nil {
+		t.Fatalf("net.Listen: %v", err)
+	}
+	addr := ln.Addr().String()
+	if err := ln.Close(); err != nil {
+		t.Fatalf("Close listener: %v", err)
+	}
+	return addr
+}
+
+func TestWaitForSSHContextCancel(t *testing.T) {
+	t.Parallel()
+	keyPath := writeTestKey(t)
+	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
+	defer cancel()
+
+	start := time.Now()
+	err := WaitForSSH(ctx, freeAddr(t), keyPath, "", 10*time.Millisecond)
+	if !errors.Is(err, context.DeadlineExceeded) {
+		t.Fatalf("err = %v, want context.DeadlineExceeded", err)
+	}
+	if elapsed := time.Since(start); elapsed > 2*time.Second {
+		t.Fatalf("took too long: %v", elapsed)
+	}
+}
+
+func TestDialReturnsErrorForBadKey(t *testing.T) {
+	t.Parallel()
+	keyPath := filepath.Join(t.TempDir(), "bogus")
+	if err := os.WriteFile(keyPath, []byte("nope"), 0o600); err != nil {
+		t.Fatalf("WriteFile: %v", err)
+	}
+	_, err := Dial(context.Background(), freeAddr(t), keyPath, "")
+	if err == nil {
+		t.Fatal("expected error for bad key")
+	}
+}
diff --git a/internal/hostnat/runner_test.go b/internal/hostnat/runner_test.go
new file mode 100644
index 0000000..7853e53
--- /dev/null
+++ b/internal/hostnat/runner_test.go
@@ -0,0 +1,258 @@
+package hostnat
+
+import (
+	"context"
+	"errors"
+	"fmt"
+	"reflect"
+	"strings"
+	"testing"
+)
+
+type call struct {
+	sudo bool
+	name string
+	args []string
+}
+
+type fakeRunner struct {
+	calls []call
+	// runResp maps "name arg0 arg1 ..." (Run, no sudo) to a scripted
+	// (stdout, err) pair. Missing entries return an error.
+	runResp map[string]callResp
+	// sudoMatcher decides whether a RunSudo call succeeds. If nil, all
+	// RunSudo calls succeed with empty stdout.
+ sudoMatcher func(args []string) ([]byte, error) +} + +type callResp struct { + out []byte + err error +} + +func (r *fakeRunner) Run(ctx context.Context, name string, args ...string) ([]byte, error) { + c := call{name: name, args: append([]string(nil), args...)} + r.calls = append(r.calls, c) + key := name + " " + strings.Join(args, " ") + if resp, ok := r.runResp[key]; ok { + return resp.out, resp.err + } + return nil, fmt.Errorf("unexpected Run: %s", key) +} + +func (r *fakeRunner) RunSudo(ctx context.Context, args ...string) ([]byte, error) { + c := call{sudo: true, args: append([]string(nil), args...)} + r.calls = append(r.calls, c) + if r.sudoMatcher != nil { + return r.sudoMatcher(args) + } + return nil, nil +} + +func TestDefaultUplink(t *testing.T) { + t.Parallel() + r := &fakeRunner{ + runResp: map[string]callResp{ + "ip route show default": {out: []byte("default via 10.0.0.1 dev wlan0 proto dhcp\n")}, + }, + } + got, err := DefaultUplink(context.Background(), r) + if err != nil { + t.Fatalf("DefaultUplink: %v", err) + } + if got != "wlan0" { + t.Fatalf("got %q, want wlan0", got) + } +} + +func TestDefaultUplinkPropagatesRunError(t *testing.T) { + t.Parallel() + r := &fakeRunner{} + _, err := DefaultUplink(context.Background(), r) + if err == nil { + t.Fatal("expected error from DefaultUplink when Run fails") + } +} + +func TestRuleKey(t *testing.T) { + rule := Rule{Table: "nat", Chain: "POSTROUTING", Args: []string{"-s", "172.16.0.5/32"}} + key := RuleKey(rule) + if !strings.Contains(key, "nat") || !strings.Contains(key, "POSTROUTING") || !strings.Contains(key, "172.16.0.5/32") { + t.Fatalf("key missing expected parts: %q", key) + } + + // Different args → different key. 
+ other := Rule{Table: "nat", Chain: "POSTROUTING", Args: []string{"-s", "10.0.0.5/32"}} + if RuleKey(rule) == RuleKey(other) { + t.Fatal("RuleKey should differ for different args") + } +} + +func TestEnsureEnableInstallsRules(t *testing.T) { + t.Parallel() + r := &fakeRunner{ + runResp: map[string]callResp{ + "ip route show default": {out: []byte("default via 10.0.0.1 dev eth0\n")}, + }, + sudoMatcher: func(args []string) ([]byte, error) { + // The first sudo call is sysctl; every subsequent call is + // `iptables -C ...` (probe) followed by `iptables -A ...` + // because the probe should report the rule is NOT present. + if args[0] == "sysctl" { + return nil, nil + } + if args[0] != "iptables" { + return nil, fmt.Errorf("unexpected sudo prefix: %v", args) + } + // Fail -C (rule absent) so Ensure issues -A. + for _, a := range args { + if a == "-C" { + return nil, errors.New("rule absent") + } + } + return nil, nil + }, + } + + if err := Ensure(context.Background(), r, "172.16.0.5", "tap-x", true); err != nil { + t.Fatalf("Ensure: %v", err) + } + + // Expect at least: 1 ip route, 1 sysctl, and for 3 rules: -C + -A = 6 iptables calls. + if len(r.calls) < 8 { + t.Fatalf("call count = %d, want >= 8; calls=%+v", len(r.calls), r.calls) + } + // First call is ip route; second is sysctl. + if r.calls[0].name != "ip" { + t.Errorf("calls[0] = %+v, want ip route", r.calls[0]) + } + if !r.calls[1].sudo || r.calls[1].args[0] != "sysctl" { + t.Errorf("calls[1] = %+v, want sudo sysctl", r.calls[1]) + } + // Somewhere we must have an iptables -A POSTROUTING call. 
+ var sawAppend bool + for _, c := range r.calls { + if c.sudo && len(c.args) >= 3 && c.args[0] == "iptables" && contains(c.args, "-A") && contains(c.args, "POSTROUTING") { + sawAppend = true + break + } + } + if !sawAppend { + t.Fatal("no iptables -A POSTROUTING call observed") + } +} + +func TestEnsureEnableSkipsAppendWhenRulePresent(t *testing.T) { + t.Parallel() + r := &fakeRunner{ + runResp: map[string]callResp{ + "ip route show default": {out: []byte("default via 10.0.0.1 dev eth0\n")}, + }, + sudoMatcher: func(args []string) ([]byte, error) { + // Probe succeeds → Ensure should NOT follow up with -A. + return nil, nil + }, + } + if err := Ensure(context.Background(), r, "172.16.0.5", "tap-x", true); err != nil { + t.Fatalf("Ensure: %v", err) + } + + // No -A iptables calls should have been issued. + for _, c := range r.calls { + if c.sudo && contains(c.args, "iptables") && contains(c.args, "-A") { + t.Fatalf("unexpected -A call with probe success: %+v", c) + } + } +} + +func TestEnsureDisableRemovesRulesWhenPresent(t *testing.T) { + t.Parallel() + r := &fakeRunner{ + runResp: map[string]callResp{ + "ip route show default": {out: []byte("default via 10.0.0.1 dev eth0\n")}, + }, + sudoMatcher: func(args []string) ([]byte, error) { + // Every probe succeeds → rule is present → -D is issued. + return nil, nil + }, + } + if err := Ensure(context.Background(), r, "172.16.0.5", "tap-x", false); err != nil { + t.Fatalf("Ensure(disable): %v", err) + } + var sawDelete bool + for _, c := range r.calls { + if c.sudo && contains(c.args, "iptables") && contains(c.args, "-D") { + sawDelete = true + break + } + } + if !sawDelete { + t.Fatal("expected at least one iptables -D call") + } + // No sysctl on disable path. 
+ for _, c := range r.calls { + if c.sudo && len(c.args) > 0 && c.args[0] == "sysctl" { + t.Fatal("sysctl should not run on disable path") + } + } +} + +func TestEnsureDisableSkipsRemovalWhenAbsent(t *testing.T) { + t.Parallel() + r := &fakeRunner{ + runResp: map[string]callResp{ + "ip route show default": {out: []byte("default via 10.0.0.1 dev eth0\n")}, + }, + sudoMatcher: func(args []string) ([]byte, error) { + return nil, errors.New("rule not present") + }, + } + if err := Ensure(context.Background(), r, "172.16.0.5", "tap-x", false); err != nil { + t.Fatalf("Ensure(disable, absent): %v", err) + } + for _, c := range r.calls { + if c.sudo && contains(c.args, "iptables") && contains(c.args, "-D") { + t.Fatalf("unexpected -D with absent rule: %+v", c) + } + } +} + +func TestEnsurePropagatesUplinkError(t *testing.T) { + t.Parallel() + r := &fakeRunner{} // no runResp → ip route fails + err := Ensure(context.Background(), r, "172.16.0.5", "tap-x", true) + if err == nil { + t.Fatal("expected uplink error to propagate") + } +} + +func TestEnsureValidatesInputs(t *testing.T) { + t.Parallel() + r := &fakeRunner{ + runResp: map[string]callResp{ + "ip route show default": {out: []byte("default via 10.0.0.1 dev eth0\n")}, + }, + } + if err := Ensure(context.Background(), r, "", "tap-x", true); err == nil { + t.Fatal("expected error for empty guestIP") + } +} + +func TestRuleArgsWithoutTable(t *testing.T) { + // Sanity: RuleArgs should only prepend -t when Table is set. 
+ bare := Rule{Chain: "FORWARD", Args: []string{"-i", "eth0"}} + got := RuleArgs("-A", bare) + want := []string{"-A", "FORWARD", "-i", "eth0"} + if !reflect.DeepEqual(got, want) { + t.Fatalf("got %v, want %v", got, want) + } +} + +func contains(xs []string, target string) bool { + for _, x := range xs { + if x == target { + return true + } + } + return false +} diff --git a/internal/imagecat/catalog.go b/internal/imagecat/catalog.go new file mode 100644 index 0000000..b84415b --- /dev/null +++ b/internal/imagecat/catalog.go @@ -0,0 +1,88 @@ +// Package imagecat is the published catalog of banger image bundles +// (rootfs.ext4 + manifest.json, packaged as a .tar.zst). It ships +// embedded in the banger binary. Downloading a bundle is the fast +// path for pulling a curated banger image — the rootfs is already +// flattened, ownership-fixed, and has banger's guest agents injected +// at build time. +// +// This package is the metadata + fetch layer. Writing to the banger +// image store is done by higher layers (the daemon's PullImage +// orchestrator), so imagecat has no local-storage concept of its own. +package imagecat + +import ( + _ "embed" + "encoding/json" + "fmt" + "os" + "regexp" + "strings" +) + +//go:embed catalog.json +var embeddedCatalog []byte + +// Catalog is the list of pullable image bundles compiled into this +// banger binary. +type Catalog struct { + Version int `json:"version"` + Entries []CatEntry `json:"entries"` +} + +// CatEntry describes one downloadable bundle. TarballURL points at a +// .tar.zst containing rootfs.ext4 and manifest.json. 
+type CatEntry struct { + Name string `json:"name"` + Distro string `json:"distro,omitempty"` + Arch string `json:"arch,omitempty"` + KernelRef string `json:"kernel_ref,omitempty"` // kernelcat entry name to pair with + TarballURL string `json:"tarball_url"` + TarballSHA256 string `json:"tarball_sha256"` + SizeBytes int64 `json:"size_bytes,omitempty"` + Description string `json:"description,omitempty"` +} + +// LoadEmbedded returns the catalog compiled into this banger binary. +func LoadEmbedded() (Catalog, error) { + return ParseCatalog(embeddedCatalog) +} + +// ParseCatalog decodes a catalog.json payload. An empty payload is +// valid and yields a zero Catalog. +func ParseCatalog(data []byte) (Catalog, error) { + var cat Catalog + if len(data) == 0 { + return cat, nil + } + if err := json.Unmarshal(data, &cat); err != nil { + return Catalog{}, fmt.Errorf("parse catalog: %w", err) + } + return cat, nil +} + +// Lookup returns the entry matching name, or os.ErrNotExist. +func (c Catalog) Lookup(name string) (CatEntry, error) { + for _, e := range c.Entries { + if e.Name == name { + return e, nil + } + } + return CatEntry{}, os.ErrNotExist +} + +// namePattern accepts short filesystem-safe identifiers. Same rule as +// kernelcat so `--kernel-ref` and bundle-name refs share syntax. +var namePattern = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9._-]{0,63}$`) + +// ValidateName returns an error unless name is a non-empty identifier +// of alphanumerics, dots, hyphens, and underscores, starting with an +// alphanumeric and at most 64 characters long. 
+func ValidateName(name string) error { + if strings.TrimSpace(name) == "" { + return fmt.Errorf("image name is required") + } + if !namePattern.MatchString(name) { + return fmt.Errorf("invalid image name %q: use alphanumerics, dots, hyphens, underscores (<=64 chars, starts with alphanumeric)", name) + } + return nil +} diff --git a/internal/imagecat/catalog.json b/internal/imagecat/catalog.json new file mode 100644 index 0000000..f8e9fd5 --- /dev/null +++ b/internal/imagecat/catalog.json @@ -0,0 +1,14 @@ +{ + "version": 1, + "entries": [ + { + "name": "debian-bookworm", + "distro": "debian", + "arch": "x86_64", + "kernel_ref": "generic-6.12", + "tarball_url": "https://images.thaloco.com/debian-bookworm-x86_64-e5000c22ea98.tar.zst", + "tarball_sha256": "e5000c22ea9801b25425361628ea177328e0fa85181dd00775c09f77d0c5baf2", + "size_bytes": 289965264 + } + ] +} diff --git a/internal/imagecat/catalog_test.go b/internal/imagecat/catalog_test.go new file mode 100644 index 0000000..e903877 --- /dev/null +++ b/internal/imagecat/catalog_test.go @@ -0,0 +1,80 @@ +package imagecat + +import ( + "errors" + "os" + "testing" +) + +func TestLoadEmbeddedReturnsVersion1(t *testing.T) { + cat, err := LoadEmbedded() + if err != nil { + t.Fatalf("LoadEmbedded: %v", err) + } + if cat.Version != 1 { + t.Fatalf("Version = %d, want 1", cat.Version) + } +} + +func TestParseCatalogAcceptsNilAndEmpty(t *testing.T) { + for _, data := range [][]byte{nil, {}} { + cat, err := ParseCatalog(data) + if err != nil { + t.Fatalf("ParseCatalog(%q): %v", data, err) + } + if cat.Version != 0 || len(cat.Entries) != 0 { + t.Fatalf("ParseCatalog returned non-zero catalog: %+v", cat) + } + } +} + +func TestParseCatalogRejectsMalformed(t *testing.T) { + if _, err := ParseCatalog([]byte("not json")); err == nil { + t.Fatal("want parse error for malformed catalog") + } +} + +func TestLookupHitAndMiss(t *testing.T) { + cat := Catalog{ + Version: 1, + Entries: []CatEntry{ + {Name: "debian-bookworm", TarballURL: 
"https://example.com/a.tar.zst", TarballSHA256: "deadbeef"}, + }, + } + hit, err := cat.Lookup("debian-bookworm") + if err != nil { + t.Fatalf("Lookup hit: %v", err) + } + if hit.TarballURL != "https://example.com/a.tar.zst" { + t.Fatalf("unexpected entry: %+v", hit) + } + if _, err := cat.Lookup("nope"); !errors.Is(err, os.ErrNotExist) { + t.Fatalf("Lookup miss error = %v, want ErrNotExist", err) + } +} + +func TestValidateName(t *testing.T) { + cases := []struct { + name string + ok bool + }{ + {"debian-bookworm", true}, + {"alpine-3.20", true}, + {"generic-6.12", true}, + {"a", true}, + {"", false}, + {" ", false}, + {"-starts-with-hyphen", false}, + {"has spaces", false}, + {"has/slash", false}, + } + for _, tc := range cases { + err := ValidateName(tc.name) + if tc.ok && err != nil { + t.Errorf("ValidateName(%q): unexpected error %v", tc.name, err) + } + if !tc.ok && err == nil { + t.Errorf("ValidateName(%q): expected error", tc.name) + } + } +} diff --git a/internal/imagecat/fetch.go b/internal/imagecat/fetch.go new file mode 100644 index 0000000..99777d3 --- /dev/null +++ b/internal/imagecat/fetch.go @@ -0,0 +1,211 @@ +package imagecat + +import ( + "archive/tar" + "context" + "crypto/sha256" + "encoding/hex" + "encoding/json" + "fmt" + "io" + "net/http" + "os" + "path/filepath" + "strings" + + "github.com/klauspost/compress/zstd" +) + +// Bundle filenames expected at the root of the .tar.zst. +const ( + RootfsFilename = "rootfs.ext4" + ManifestFilename = "manifest.json" +) + +// MaxFetchedBundleBytes caps the compressed bundle download. The +// previous flow streamed straight into a tar+zstd extractor and only +// hashed afterwards, so a malicious or compromised source could +// consume unbounded disk before the SHA mismatch fired. We now stage +// the download to a temp file under destDir, hash it on the way in, +// and refuse to decompress if the hash is wrong — bounding worst-case +// disk use to this cap. 
Generous enough for any legitimate banger
+// rootfs bundle (a 4 GiB ext4 typically zstd-compresses to ~1-2 GiB);
+// override by setting this package-level var (process-wide, not per-call).
+var MaxFetchedBundleBytes int64 = 8 << 30 // 8 GiB
+
+// Manifest is the metadata file embedded inside a bundle. It mirrors
+// the subset of CatEntry fields that describe the bundle's content
+// (the remote URL + sha256 are catalog concerns, not bundle concerns).
+type Manifest struct {
+	Name        string `json:"name"`
+	Distro      string `json:"distro,omitempty"`
+	Arch        string `json:"arch,omitempty"`
+	KernelRef   string `json:"kernel_ref,omitempty"`
+	Description string `json:"description,omitempty"`
+}
+
+// Fetch downloads entry's tarball, verifies its SHA256, and writes
+// rootfs.ext4 + manifest.json into destDir. Returns the parsed
+// manifest. On any error the partially-written files are removed so
+// destDir is left in its pre-call state.
+//
+// destDir must already exist. Fetch does not create it, mirroring
+// kernelcat.Fetch so callers manage their own staging.
+func Fetch(ctx context.Context, client *http.Client, destDir string, entry CatEntry) (Manifest, error) { + if err := ValidateName(entry.Name); err != nil { + return Manifest{}, err + } + if strings.TrimSpace(entry.TarballURL) == "" { + return Manifest{}, fmt.Errorf("catalog entry %q has no tarball URL", entry.Name) + } + if strings.TrimSpace(entry.TarballSHA256) == "" { + return Manifest{}, fmt.Errorf("catalog entry %q has no tarball sha256", entry.Name) + } + if client == nil { + client = http.DefaultClient + } + + absDest, err := filepath.Abs(destDir) + if err != nil { + return Manifest{}, err + } + info, err := os.Stat(absDest) + if err != nil { + return Manifest{}, err + } + if !info.IsDir() { + return Manifest{}, fmt.Errorf("destDir %q is not a directory", destDir) + } + + cleanup := func() { + _ = os.Remove(filepath.Join(absDest, RootfsFilename)) + _ = os.Remove(filepath.Join(absDest, ManifestFilename)) + } + + req, err := http.NewRequestWithContext(ctx, http.MethodGet, entry.TarballURL, nil) + if err != nil { + return Manifest{}, err + } + resp, err := client.Do(req) + if err != nil { + return Manifest{}, fmt.Errorf("fetch %s: %w", entry.TarballURL, err) + } + defer resp.Body.Close() + if resp.StatusCode < 200 || resp.StatusCode >= 300 { + return Manifest{}, fmt.Errorf("fetch %s: HTTP %s", entry.TarballURL, resp.Status) + } + + if resp.ContentLength > MaxFetchedBundleBytes { + return Manifest{}, fmt.Errorf("tarball advertised %d bytes, exceeds %d-byte cap", resp.ContentLength, MaxFetchedBundleBytes) + } + + // Stage the compressed tarball on disk first so we can verify the + // SHA256 BEFORE decompressing or extracting. Cap the read at + // MaxFetchedBundleBytes+1 — anything larger is refused. 
+ tmp, err := os.CreateTemp(absDest, "banger-bundle-*.tar.zst") + if err != nil { + return Manifest{}, fmt.Errorf("create staging file: %w", err) + } + tmpPath := tmp.Name() + defer os.Remove(tmpPath) + + hasher := sha256.New() + limited := io.LimitReader(resp.Body, MaxFetchedBundleBytes+1) + n, copyErr := io.Copy(io.MultiWriter(tmp, hasher), limited) + if closeErr := tmp.Close(); copyErr == nil && closeErr != nil { + copyErr = closeErr + } + if copyErr != nil { + return Manifest{}, fmt.Errorf("download tarball: %w", copyErr) + } + if n > MaxFetchedBundleBytes { + return Manifest{}, fmt.Errorf("tarball exceeded %d-byte cap before sha256 check", MaxFetchedBundleBytes) + } + + got := hex.EncodeToString(hasher.Sum(nil)) + if !strings.EqualFold(got, entry.TarballSHA256) { + return Manifest{}, fmt.Errorf("tarball sha256 mismatch: got %s, want %s", got, entry.TarballSHA256) + } + + src, err := os.Open(tmpPath) + if err != nil { + return Manifest{}, fmt.Errorf("reopen staged tarball: %w", err) + } + defer src.Close() + zr, err := zstd.NewReader(src) + if err != nil { + return Manifest{}, fmt.Errorf("init zstd: %w", err) + } + defer zr.Close() + + if err := extractBundle(zr, absDest); err != nil { + cleanup() + return Manifest{}, err + } + + if _, err := os.Stat(filepath.Join(absDest, RootfsFilename)); err != nil { + cleanup() + return Manifest{}, fmt.Errorf("bundle missing %s: %w", RootfsFilename, err) + } + manifestData, err := os.ReadFile(filepath.Join(absDest, ManifestFilename)) + if err != nil { + cleanup() + return Manifest{}, fmt.Errorf("read manifest: %w", err) + } + var manifest Manifest + if err := json.Unmarshal(manifestData, &manifest); err != nil { + cleanup() + return Manifest{}, fmt.Errorf("parse manifest: %w", err) + } + if strings.TrimSpace(manifest.Name) == "" { + manifest.Name = entry.Name + } + return manifest, nil +} + +// extractBundle writes the bundle's two regular-file entries into +// absDest, refusing any other member type, any extra entry, and 
any +// path that escapes absDest. +func extractBundle(r io.Reader, absDest string) error { + tr := tar.NewReader(r) + seen := map[string]bool{} + for { + hdr, err := tr.Next() + if err == io.EOF { + break + } + if err != nil { + return fmt.Errorf("read bundle: %w", err) + } + rel := filepath.Clean(hdr.Name) + if rel == "." || rel == string(filepath.Separator) { + continue + } + if filepath.IsAbs(rel) || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) { + return fmt.Errorf("unsafe path in bundle: %q", hdr.Name) + } + if rel != RootfsFilename && rel != ManifestFilename { + return fmt.Errorf("unexpected bundle entry %q (expected %s or %s at the root)", hdr.Name, RootfsFilename, ManifestFilename) + } + if hdr.Typeflag != tar.TypeReg { + return fmt.Errorf("bundle entry %q is not a regular file", hdr.Name) + } + dst := filepath.Join(absDest, rel) + f, err := os.OpenFile(dst, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o644) + if err != nil { + return err + } + if _, err := io.Copy(f, tr); err != nil { + _ = f.Close() + return err + } + if err := f.Close(); err != nil { + return err + } + seen[rel] = true + } + if !seen[RootfsFilename] || !seen[ManifestFilename] { + return fmt.Errorf("bundle is missing required files: want both %s and %s", RootfsFilename, ManifestFilename) + } + return nil +} diff --git a/internal/imagecat/fetch_test.go b/internal/imagecat/fetch_test.go new file mode 100644 index 0000000..f8977d0 --- /dev/null +++ b/internal/imagecat/fetch_test.go @@ -0,0 +1,286 @@ +package imagecat + +import ( + "archive/tar" + "bytes" + "context" + "crypto/sha256" + "encoding/hex" + "encoding/json" + "io" + "net/http" + "net/http/httptest" + "os" + "path/filepath" + "strings" + "testing" + + "github.com/klauspost/compress/zstd" +) + +// makeBundle builds a valid .tar.zst bundle with the given manifest +// and rootfs bytes. Returns the bundle bytes and their sha256 hex. 
+func makeBundle(t *testing.T, manifest Manifest, rootfs []byte) ([]byte, string) { + t.Helper() + var rawTar bytes.Buffer + tw := tar.NewWriter(&rawTar) + manifestJSON, err := json.Marshal(manifest) + if err != nil { + t.Fatal(err) + } + entries := []struct { + name string + data []byte + }{ + {RootfsFilename, rootfs}, + {ManifestFilename, manifestJSON}, + } + for _, e := range entries { + if err := tw.WriteHeader(&tar.Header{ + Name: e.name, + Size: int64(len(e.data)), + Mode: 0o644, + Typeflag: tar.TypeReg, + }); err != nil { + t.Fatal(err) + } + if _, err := tw.Write(e.data); err != nil { + t.Fatal(err) + } + } + if err := tw.Close(); err != nil { + t.Fatal(err) + } + var zstBuf bytes.Buffer + zw, err := zstd.NewWriter(&zstBuf) + if err != nil { + t.Fatal(err) + } + if _, err := io.Copy(zw, &rawTar); err != nil { + t.Fatal(err) + } + if err := zw.Close(); err != nil { + t.Fatal(err) + } + sum := sha256.Sum256(zstBuf.Bytes()) + return zstBuf.Bytes(), hex.EncodeToString(sum[:]) +} + +func serveBundle(t *testing.T, payload []byte) *httptest.Server { + t.Helper() + return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + w.Header().Set("Content-Type", "application/octet-stream") + _, _ = w.Write(payload) + })) +} + +func TestFetchHappyPath(t *testing.T) { + manifest := Manifest{ + Name: "debian-bookworm", + Distro: "debian", + Arch: "x86_64", + KernelRef: "generic-6.12", + } + rootfs := []byte("not-actually-an-ext4-but-that's-fine-for-the-test") + bundle, sum := makeBundle(t, manifest, rootfs) + srv := serveBundle(t, bundle) + t.Cleanup(srv.Close) + + dest := t.TempDir() + got, err := Fetch(context.Background(), srv.Client(), dest, CatEntry{ + Name: "debian-bookworm", + TarballURL: srv.URL + "/bundle.tar.zst", + TarballSHA256: sum, + }) + if err != nil { + t.Fatalf("Fetch: %v", err) + } + if got.Name != "debian-bookworm" || got.KernelRef != "generic-6.12" || got.Distro != "debian" { + t.Fatalf("manifest = %+v", got) + } + if b, 
err := os.ReadFile(filepath.Join(dest, RootfsFilename)); err != nil || !bytes.Equal(b, rootfs) { + t.Fatalf("rootfs content mismatch: err=%v, %q", err, b) + } + if _, err := os.Stat(filepath.Join(dest, ManifestFilename)); err != nil { + t.Fatalf("manifest missing: %v", err) + } +} + +func TestFetchRejectsSHA256Mismatch(t *testing.T) { + manifest := Manifest{Name: "debian-bookworm"} + bundle, _ := makeBundle(t, manifest, []byte("abc")) + srv := serveBundle(t, bundle) + t.Cleanup(srv.Close) + + dest := t.TempDir() + _, err := Fetch(context.Background(), srv.Client(), dest, CatEntry{ + Name: "debian-bookworm", + TarballURL: srv.URL + "/bundle.tar.zst", + TarballSHA256: "deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef", + }) + if err == nil || !strings.Contains(err.Error(), "sha256 mismatch") { + t.Fatalf("want sha256 mismatch error, got %v", err) + } + // Cleanup: dest should not contain partial files. + if _, err := os.Stat(filepath.Join(dest, RootfsFilename)); !os.IsNotExist(err) { + t.Fatalf("rootfs should be cleaned up on sha256 failure, got %v", err) + } + if _, err := os.Stat(filepath.Join(dest, ManifestFilename)); !os.IsNotExist(err) { + t.Fatalf("manifest should be cleaned up on sha256 failure, got %v", err) + } +} + +// TestFetchRejectsOversizedTarballBeforeExtraction pins the new +// disk-bound cap: by setting MaxFetchedBundleBytes very low, the +// staged-tarball download must trip the limit and refuse to even +// decompress, leaving the destination dir clean. This is the +// "compromised mirror floods the host" scenario. 
+func TestFetchRejectsOversizedTarballBeforeExtraction(t *testing.T) { + manifest := Manifest{Name: "debian-bookworm"} + bundle, sum := makeBundle(t, manifest, bytes.Repeat([]byte("x"), 4096)) + srv := serveBundle(t, bundle) + t.Cleanup(srv.Close) + + prev := MaxFetchedBundleBytes + MaxFetchedBundleBytes = 64 + t.Cleanup(func() { MaxFetchedBundleBytes = prev }) + + dest := t.TempDir() + _, err := Fetch(context.Background(), srv.Client(), dest, CatEntry{ + Name: "debian-bookworm", + TarballURL: srv.URL + "/bundle.tar.zst", + TarballSHA256: sum, + }) + if err == nil { + t.Fatal("Fetch succeeded against an oversized tarball; want size-cap rejection") + } + if !strings.Contains(err.Error(), "cap") { + t.Fatalf("err = %v, want size-cap message", err) + } + // dest must be untouched: no rootfs, no manifest, no leftover tmp. + entries, _ := os.ReadDir(dest) + if len(entries) != 0 { + var names []string + for _, e := range entries { + names = append(names, e.Name()) + } + t.Fatalf("dest left dirty after size-cap rejection: %v", names) + } +} + +func TestFetchRejectsUnexpectedTarEntry(t *testing.T) { + // Hand-roll a bundle with a third, disallowed entry. 
+ var rawTar bytes.Buffer + tw := tar.NewWriter(&rawTar) + for _, e := range []struct{ name, data string }{ + {RootfsFilename, "rootfs"}, + {ManifestFilename, `{"name":"x"}`}, + {"extra", "should be rejected"}, + } { + if err := tw.WriteHeader(&tar.Header{ + Name: e.name, + Size: int64(len(e.data)), + Mode: 0o644, + Typeflag: tar.TypeReg, + }); err != nil { + t.Fatal(err) + } + if _, err := tw.Write([]byte(e.data)); err != nil { + t.Fatal(err) + } + } + if err := tw.Close(); err != nil { + t.Fatal(err) + } + var zstBuf bytes.Buffer + zw, _ := zstd.NewWriter(&zstBuf) + _, _ = io.Copy(zw, &rawTar) + _ = zw.Close() + sum := sha256.Sum256(zstBuf.Bytes()) + + srv := serveBundle(t, zstBuf.Bytes()) + t.Cleanup(srv.Close) + + _, err := Fetch(context.Background(), srv.Client(), t.TempDir(), CatEntry{ + Name: "x", + TarballURL: srv.URL + "/bundle.tar.zst", + TarballSHA256: hex.EncodeToString(sum[:]), + }) + if err == nil || !strings.Contains(err.Error(), "unexpected bundle entry") { + t.Fatalf("want unexpected entry error, got %v", err) + } +} + +func TestFetchRejectsMissingManifest(t *testing.T) { + // Bundle with only rootfs. 
+ var rawTar bytes.Buffer + tw := tar.NewWriter(&rawTar) + _ = tw.WriteHeader(&tar.Header{Name: RootfsFilename, Size: 3, Mode: 0o644, Typeflag: tar.TypeReg}) + _, _ = tw.Write([]byte("abc")) + _ = tw.Close() + var zstBuf bytes.Buffer + zw, _ := zstd.NewWriter(&zstBuf) + _, _ = io.Copy(zw, &rawTar) + _ = zw.Close() + sum := sha256.Sum256(zstBuf.Bytes()) + + srv := serveBundle(t, zstBuf.Bytes()) + t.Cleanup(srv.Close) + + _, err := Fetch(context.Background(), srv.Client(), t.TempDir(), CatEntry{ + Name: "x", + TarballURL: srv.URL + "/bundle.tar.zst", + TarballSHA256: hex.EncodeToString(sum[:]), + }) + if err == nil || !strings.Contains(err.Error(), "missing required files") { + t.Fatalf("want missing-required-files error, got %v", err) + } +} + +func TestFetchRejectsHTTPFailure(t *testing.T) { + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + http.Error(w, "not found", http.StatusNotFound) + })) + t.Cleanup(srv.Close) + + _, err := Fetch(context.Background(), srv.Client(), t.TempDir(), CatEntry{ + Name: "x", + TarballURL: srv.URL + "/missing.tar.zst", + TarballSHA256: "deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef", + }) + if err == nil || !strings.Contains(err.Error(), "HTTP") { + t.Fatalf("want HTTP error, got %v", err) + } +} + +func TestFetchRejectsEmptyURL(t *testing.T) { + _, err := Fetch(context.Background(), http.DefaultClient, t.TempDir(), CatEntry{ + Name: "x", + TarballURL: "", + TarballSHA256: "abc", + }) + if err == nil || !strings.Contains(err.Error(), "no tarball URL") { + t.Fatalf("want no-URL error, got %v", err) + } +} + +func TestFetchRejectsEmptySHA256(t *testing.T) { + _, err := Fetch(context.Background(), http.DefaultClient, t.TempDir(), CatEntry{ + Name: "x", + TarballURL: "https://example.com/x.tar.zst", + }) + if err == nil || !strings.Contains(err.Error(), "no tarball sha256") { + t.Fatalf("want no-sha error, got %v", err) + } +} + +func TestFetchRejectsInvalidName(t *testing.T) 
{ + _, err := Fetch(context.Background(), http.DefaultClient, t.TempDir(), CatEntry{ + Name: "", + TarballURL: "https://example.com/x.tar.zst", + TarballSHA256: "abc", + }) + if err == nil || !strings.Contains(err.Error(), "image name is required") { + t.Fatalf("want name-required error, got %v", err) + } +} diff --git a/internal/imagepreset/preset.go b/internal/imagepreset/preset.go deleted file mode 100644 index 3f60ba7..0000000 --- a/internal/imagepreset/preset.go +++ /dev/null @@ -1,86 +0,0 @@ -package imagepreset - -import ( - "crypto/sha256" - "fmt" - "strings" -) - -var debianBase = []string{ - "make", - "git", - "less", - "tree", - "ca-certificates", - "curl", - "wget", - "iproute2", - "vim", - "tmux", -} - -var voidBase = []string{ - "base-minimal", - "base-devel", - "bash", - "ca-certificates", - "curl", - "docker", - "docker-compose", - "e2fsprogs", - "git", - "iproute2", - "less", - "make", - "openssh", - "procps-ng", - "runit", - "shadow", - "sudo", - "tmux", - "tree", - "vim", - "wget", -} - -var alpineBase = []string{ - "alpine-base", - "bash", - "ca-certificates", - "curl", - "docker", - "docker-cli-compose", - "e2fsprogs", - "git", - "iproute2", - "less", - "libgcc", - "libstdc++", - "make", - "mkinitfs", - "openssh", - "procps-ng", - "shadow", - "sudo", - "tmux", - "tree", - "vim", - "wget", -} - -func DebianBasePackages() []string { - return append([]string(nil), debianBase...) -} - -func VoidBasePackages() []string { - return append([]string(nil), voidBase...) -} - -func AlpineBasePackages() []string { - return append([]string(nil), alpineBase...) 
-} - -func Hash(lines []string) string { - sum := sha256.Sum256([]byte(strings.Join(lines, "\n") + "\n")) - return fmt.Sprintf("%x", sum) -} diff --git a/internal/imagepull/assets/first-boot.service b/internal/imagepull/assets/first-boot.service new file mode 100644 index 0000000..fdf2967 --- /dev/null +++ b/internal/imagepull/assets/first-boot.service @@ -0,0 +1,17 @@ +[Unit] +Description=Banger first-boot provisioning +After=network-online.target banger-network.service +Wants=network-online.target +Before=sshd.service ssh.service +ConditionPathExists=/var/lib/banger/first-boot-pending + +[Service] +Type=oneshot +ExecStart=/usr/local/libexec/banger-first-boot +RemainAfterExit=yes +StandardOutput=journal +StandardError=journal +TimeoutStartSec=300s + +[Install] +WantedBy=multi-user.target diff --git a/internal/imagepull/assets/first-boot.sh b/internal/imagepull/assets/first-boot.sh new file mode 100644 index 0000000..a934e6a --- /dev/null +++ b/internal/imagepull/assets/first-boot.sh @@ -0,0 +1,142 @@ +#!/bin/sh +# banger-first-boot — universal init wrapper for banger VMs. +# +# When passed as init= on the kernel cmdline (direct-boot images without +# an initramfs), this script runs as PID 1. It: +# 1. Mounts the essential virtual filesystems (/proc, /sys, /dev, /run) +# 2. If systemd (or any init) is already installed, execs it immediately +# 3. Otherwise: brings up the network, installs systemd + openssh-server +# via the guest's native package manager, then execs systemd +# +# On subsequent boots (after systemd is installed), step 2 fires in <10ms. 
+# +# Test hooks: +# RUN_PLAN=1 echo the install command instead of executing it +# OS_RELEASE_FILE= override /etc/os-release for distro detection +# BANGER_FIRST_BOOT_MARKER= override the marker file path + +set -eu + +log() { printf '[banger-first-boot] %s\n' "$*" >&2; } + +# --- Step 1: essential mounts (only when running as PID 1) --- +if [ "$$" = "1" ]; then + mount -t proc proc /proc 2>/dev/null || true + mount -t sysfs sysfs /sys 2>/dev/null || true + mount -t devtmpfs devtmpfs /dev 2>/dev/null || true + mount -t tmpfs tmpfs /run 2>/dev/null || true + mount -t tmpfs tmpfs /tmp 2>/dev/null || true +fi + +# --- Step 2: if a real init exists, hand off immediately --- +# (RUN_PLAN mode skips this so the dispatch logic can be tested on hosts +# that have systemd installed.) +if [ "${RUN_PLAN:-0}" != "1" ]; then + for candidate_init in /usr/lib/systemd/systemd /lib/systemd/systemd /sbin/init; do + if [ -x "$candidate_init" ]; then + MARKER="${BANGER_FIRST_BOOT_MARKER:-/var/lib/banger/first-boot-pending}" + if [ -f "$MARKER" ]; then + rm -f "$MARKER" + fi + log "found init at $candidate_init; handing off" + exec "$candidate_init" "$@" + fi + done +fi + +# --- Step 3: no init found — we're on a container image, provision it --- +log "no init system found; installing systemd + openssh-server" + +# Bring up network so apt-get/apk can reach package repos. +# banger-network-bootstrap reads IP from /proc/cmdline (kernel ip= arg) +# or /etc/banger-network.conf (written by vm_disk.patchRootOverlay). +if [ -x /usr/local/libexec/banger-network-bootstrap ]; then + log "bringing up network" + /usr/local/libexec/banger-network-bootstrap || log "network bootstrap failed (continuing anyway)" +fi + +# Detect distro +DIST="" +FAMILY="" +OS_RELEASE_FILE="${OS_RELEASE_FILE:-/etc/os-release}" +if [ -r "$OS_RELEASE_FILE" ]; then + # shellcheck source=/dev/null + . 
"$OS_RELEASE_FILE"
+  DIST="${ID:-}"
+  FAMILY="${ID_LIKE:-}"
+fi
+log "detected distro: ID=$DIST ID_LIKE=$FAMILY"
+
+# Dispatch install command
+CMD=""
+case "$DIST" in
+  debian|ubuntu|kali|raspbian|linuxmint|pop)
+    CMD="apt-get update && apt-get install -y systemd-sysv openssh-server"
+    ;;
+  alpine)
+    # Alpine has no systemd package; OpenRC is its native init.
+    CMD="apk add --no-cache openrc openssh"
+    ;;
+  fedora|rhel|centos|rocky|almalinux)
+    CMD="dnf install -y systemd openssh-server"
+    ;;
+  arch|archlinux|manjaro)
+    CMD="pacman -Sy --noconfirm openssh"
+    ;;
+  opensuse*|suse)
+    CMD="zypper --non-interactive install -y systemd openssh"
+    ;;
+  *)
+    case " $FAMILY " in
+      *" debian "*)
+        CMD="apt-get update && apt-get install -y systemd-sysv openssh-server"
+        ;;
+      *" rhel "* | *" fedora "*)
+        CMD="dnf install -y systemd openssh-server"
+        ;;
+      *" arch "*)
+        CMD="pacman -Sy --noconfirm openssh"
+        ;;
+      *" suse "*)
+        CMD="zypper --non-interactive install -y systemd openssh"
+        ;;
+    esac
+    ;;
+esac
+
+if [ -z "$CMD" ]; then
+  log "FATAL: no known install command for distro '$DIST' (ID_LIKE='$FAMILY')"
+  log "dropping to emergency shell"
+  exec /bin/sh
+fi
+
+if [ "${RUN_PLAN:-0}" = "1" ]; then
+  printf '%s\n' "$CMD"
+  exit 0
+fi
+
+log "running: $CMD"
+eval "$CMD" || {
+  log "package install failed; dropping to shell"
+  exec /bin/sh
+}
+
+# Remove first-boot marker
+MARKER="${BANGER_FIRST_BOOT_MARKER:-/var/lib/banger/first-boot-pending}"
+rm -f "$MARKER"
+
+# An init system should now be installed — find and exec it
+for candidate_init in /usr/lib/systemd/systemd /lib/systemd/systemd /sbin/init; do
+  if [ -x "$candidate_init" ]; then
+    log "provisioning complete; starting $candidate_init"
+    # Unmount our temp mounts — systemd will re-mount them properly
+    umount /tmp 2>/dev/null || true
+    umount /run 2>/dev/null || true
+    umount /dev 2>/dev/null || true
+    umount /sys 2>/dev/null || true
+    umount /proc 2>/dev/null || true
+    exec "$candidate_init" "$@"
+  fi
+done
+
+log "FATAL: init not found after install; dropping to shell"
+exec

/bin/sh diff --git a/internal/imagepull/ext4.go b/internal/imagepull/ext4.go new file mode 100644 index 0000000..3fafec2 --- /dev/null +++ b/internal/imagepull/ext4.go @@ -0,0 +1,75 @@ +package imagepull + +import ( + "context" + "errors" + "fmt" + "os" + + "banger/internal/system" +) + +// MinExt4Size is the smallest ext4 image we'll create. mkfs.ext4 needs a +// few megabytes for its bookkeeping; for a real rootfs the staging tree +// will dominate anyway. +const MinExt4Size int64 = 1 << 20 * 64 // 64 MiB + +// BuildExt4 creates outFile as a sparse ext4 image of sizeBytes and +// populates it from srcDir using `mkfs.ext4 -F -d`. No mount, no sudo. +// +// sizeBytes must be at least MinExt4Size. Callers size the file with +// headroom over the staged tree (the daemon orchestrator does this; +// this function only enforces a sanity floor). +// +// The filesystem itself is root-owned via `-E root_owner=0:0`, but +// the per-file uid/gid/mode inside srcDir are the runner's — Go's +// unprivileged tar extraction can't preserve them. The pipeline's +// next step, ApplyOwnership, restores the tar-header values. 
+func BuildExt4(ctx context.Context, runner system.CommandRunner, srcDir, outFile string, sizeBytes int64) error { + if sizeBytes < MinExt4Size { + return fmt.Errorf("ext4 size %d below minimum %d", sizeBytes, MinExt4Size) + } + info, err := os.Stat(srcDir) + if err != nil { + return fmt.Errorf("stat source: %w", err) + } + if !info.IsDir() { + return fmt.Errorf("%s is not a directory", srcDir) + } + + if err := os.Remove(outFile); err != nil && !errors.Is(err, os.ErrNotExist) { + return err + } + f, err := os.OpenFile(outFile, os.O_CREATE|os.O_WRONLY|os.O_EXCL, 0o644) + if err != nil { + return err + } + if err := f.Truncate(sizeBytes); err != nil { + _ = f.Close() + _ = os.Remove(outFile) + return err + } + if err := f.Close(); err != nil { + _ = os.Remove(outFile) + return err + } + + // mkfs.ext4's positional `fs-size` is documented in 1 KiB units + // (NOT the filesystem's 4 KiB block size), so dividing by 4096 + // produces a filesystem 1/4 the intended size. Omit the positional + // entirely — the file was truncated to sizeBytes above, and mkfs + // with no fs-size arg uses the whole device. + out, runErr := runner.Run(ctx, "mkfs.ext4", + "-F", + "-q", + "-d", srcDir, + "-L", "banger-rootfs", + "-E", system.MkfsExtraOptions, + outFile, + ) + if runErr != nil { + _ = os.Remove(outFile) + return fmt.Errorf("mkfs.ext4 -d: %w: %s", runErr, string(out)) + } + return nil +} diff --git a/internal/imagepull/firstboot.go b/internal/imagepull/firstboot.go new file mode 100644 index 0000000..4a83014 --- /dev/null +++ b/internal/imagepull/firstboot.go @@ -0,0 +1,26 @@ +package imagepull + +import _ "embed" + +//go:embed assets/first-boot.sh +var firstBootScript string + +//go:embed assets/first-boot.service +var firstBootUnit string + +// FirstBootScript returns the shell script that installs openssh-server +// on first VM boot, dispatching on /etc/os-release. 
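The 1 KiB-unit pitfall that `BuildExt4`'s mkfs.ext4 comment calls out is worth a quick arithmetic check. This standalone sketch (illustrative numbers only) shows why dividing a byte count by the 4 KiB block size quarters the filesystem:

```go
package main

import "fmt"

func main() {
	// Intended filesystem size: 2 GiB.
	sizeBytes := int64(2) << 30

	// Wrong: treating mkfs.ext4's positional fs-size as 4 KiB blocks.
	wrongArg := sizeBytes / 4096 // value that would be passed to mkfs
	gotBytes := wrongArg * 1024  // but fs-size is read in 1 KiB units
	fmt.Println(gotBytes == sizeBytes/4) // true: a quarter-sized filesystem

	// Right: either pass sizeBytes/1024, or omit fs-size entirely and
	// let mkfs use the whole pre-truncated file, as BuildExt4 does.
	fmt.Println(sizeBytes / 1024) // 2097152
}
```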
+func FirstBootScript() string { return firstBootScript } + +// FirstBootUnit returns the systemd oneshot unit that runs the first-boot +// script once after network-online, before sshd. +func FirstBootUnit() string { return firstBootUnit } + +// FirstBoot guest paths — kept here so inject.go and future callers +// share one source of truth. +const ( + FirstBootScriptPath = "/usr/local/libexec/banger-first-boot" + FirstBootUnitName = "banger-first-boot.service" + FirstBootMarkerDir = "/var/lib/banger" + FirstBootMarkerPath = "/var/lib/banger/first-boot-pending" +) diff --git a/internal/imagepull/firstboot_test.go b/internal/imagepull/firstboot_test.go new file mode 100644 index 0000000..c0ebc81 --- /dev/null +++ b/internal/imagepull/firstboot_test.go @@ -0,0 +1,140 @@ +package imagepull + +import ( + "os" + "os/exec" + "path/filepath" + "strings" + "testing" +) + +// runFirstBootPlan executes first-boot.sh in planning mode (RUN_PLAN=1) +// against a synthetic /etc/os-release. Returns the planned install +// command or an error. 
+func runFirstBootPlan(t *testing.T, osReleaseContent string) string { + t.Helper() + if _, err := exec.LookPath("sh"); err != nil { + t.Skip("sh not available") + } + + dir := t.TempDir() + osRelease := filepath.Join(dir, "os-release") + if err := os.WriteFile(osRelease, []byte(osReleaseContent), 0o644); err != nil { + t.Fatal(err) + } + scriptPath := filepath.Join(dir, "banger-first-boot") + if err := os.WriteFile(scriptPath, []byte(FirstBootScript()), 0o755); err != nil { + t.Fatal(err) + } + marker := filepath.Join(dir, "first-boot-pending") + if err := os.WriteFile(marker, nil, 0o644); err != nil { + t.Fatal(err) + } + + cmd := exec.Command("sh", scriptPath) + cmd.Env = append(os.Environ(), + "RUN_PLAN=1", + "OS_RELEASE_FILE="+osRelease, + "BANGER_FIRST_BOOT_MARKER="+marker, + ) + out, err := cmd.CombinedOutput() + if err != nil { + t.Fatalf("first-boot script: %v\noutput:\n%s", err, out) + } + // Planned command is printed to stdout (no [banger-first-boot] prefix); + // log output goes to stderr. CombinedOutput merges them, so pick the + // last non-log line. 
+ lines := strings.Split(strings.TrimRight(string(out), "\n"), "\n") + for i := len(lines) - 1; i >= 0; i-- { + l := lines[i] + if strings.TrimSpace(l) == "" { + continue + } + if strings.HasPrefix(l, "[banger-first-boot]") { + continue + } + return l + } + t.Fatalf("no planned command in output:\n%s", out) + return "" +} + +func TestFirstBootScriptDispatchesByDistro(t *testing.T) { + cases := []struct { + name string + osRel string + wantRe string + }{ + {"debian", `ID=debian` + "\n" + `ID_LIKE=""`, "systemd-sysv openssh-server"}, + {"ubuntu", `ID=ubuntu`, "systemd-sysv openssh-server"}, + {"alpine", `ID=alpine`, "apk add"}, + {"fedora", `ID=fedora`, "dnf install -y systemd openssh-server"}, + {"arch", `ID=arch`, "pacman -Sy --noconfirm openssh"}, + {"opensuse-leap", `ID="opensuse-leap"`, "zypper --non-interactive install"}, + {"unknown-with-debian-like", `ID=someweirddistro` + "\n" + `ID_LIKE=debian`, "systemd-sysv openssh-server"}, + {"unknown-with-rhel-like", `ID=something` + "\n" + `ID_LIKE="rhel fedora"`, "dnf install -y systemd openssh-server"}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + got := runFirstBootPlan(t, tc.osRel) + if !strings.Contains(got, tc.wantRe) { + t.Errorf("got=%q, want contains %q", got, tc.wantRe) + } + }) + } +} + +func TestFirstBootScriptContainsDistroCases(t *testing.T) { + s := FirstBootScript() + for _, snippet := range []string{ + "debian|ubuntu|kali|raspbian", + "apt-get", + "systemd-sysv", + "openssh-server", + "alpine)", + "apk add", + "fedora|rhel|centos|rocky|almalinux", + "dnf install", + "arch|archlinux|manjaro", + "pacman -Sy", + "opensuse*|suse", + "zypper", + `ID_LIKE`, + "RUN_PLAN", + "/usr/lib/systemd/systemd", + "mount -t proc", + } { + if !strings.Contains(s, snippet) { + t.Errorf("script missing %q", snippet) + } + } +} + +func TestFirstBootScriptIsShSyntaxValid(t *testing.T) { + if _, err := exec.LookPath("sh"); err != nil { + t.Skip("sh not available") + } + dir := t.TempDir() + path 
:= filepath.Join(dir, "first-boot") + if err := os.WriteFile(path, []byte(FirstBootScript()), 0o755); err != nil { + t.Fatal(err) + } + out, err := exec.Command("sh", "-n", path).CombinedOutput() + if err != nil { + t.Fatalf("sh -n first-boot: %v: %s", err, out) + } +} + +func TestFirstBootUnitReferencesScript(t *testing.T) { + u := FirstBootUnit() + for _, want := range []string{ + FirstBootScriptPath, + "ConditionPathExists=" + FirstBootMarkerPath, + "After=network-online.target", + "Before=sshd.service", + } { + if !strings.Contains(u, want) { + t.Errorf("unit missing %q", want) + } + } +} diff --git a/internal/imagepull/flatten.go b/internal/imagepull/flatten.go new file mode 100644 index 0000000..002366a --- /dev/null +++ b/internal/imagepull/flatten.go @@ -0,0 +1,340 @@ +package imagepull + +import ( + "archive/tar" + "context" + "errors" + "fmt" + "io" + "os" + "path/filepath" + "strings" +) + +const ( + whiteoutPrefix = ".wh." + // whiteoutOpaque marks the parent directory as opaque: every entry + // from previous layers should be removed, but entries from the + // current layer (siblings of this marker) are preserved. + whiteoutOpaque = ".wh..wh..opq" +) + +// FileMeta captures the per-file metadata we need to reconstruct after +// mkfs.ext4 has placed the bytes on disk. Uid/Gid/Mode come straight +// from the tar header; mode carries the full set of permission bits +// including setuid/setgid/sticky. +type FileMeta struct { + Uid int + Gid int + Mode int64 // tar header mode (perm + setuid/sgid/sticky) + Type byte // tar typeflag (TypeReg, TypeDir, TypeSymlink, …) +} + +// Metadata records ownership/mode for every path that made it into +// destDir. Keys are relative to destDir, never starting with "/". Order +// is the final-layer order — later layers shadow earlier ones. 
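The whiteout naming rules defined at the top of this file can be exercised in isolation. The `classify` helper below is a hypothetical illustration (not part of the package) of how marker names map to actions:

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// classify mirrors the two whiteout flavors: ".wh..wh..opq" marks its
// parent directory opaque, and ".wh.<name>" deletes <name> from
// earlier layers. Anything else is an ordinary entry.
func classify(entry string) string {
	base := path.Base(entry)
	parent := path.Dir(entry)
	switch {
	case base == ".wh..wh..opq":
		return "opaque dir: " + parent
	case strings.HasPrefix(base, ".wh."):
		return "delete: " + path.Join(parent, strings.TrimPrefix(base, ".wh."))
	default:
		return "keep: " + entry
	}
}

func main() {
	for _, e := range []string{
		"etc/passwd",
		"var/cache/.wh..wh..opq",
		"usr/bin/.wh.mawk",
	} {
		fmt.Println(classify(e))
	}
	// keep: etc/passwd
	// opaque dir: var/cache
	// delete: usr/bin/mawk
}
```

Note the case order: the opaque marker also starts with `.wh.`, so it must be matched before the generic per-file whiteout, exactly as `applyEntry` does.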
+type Metadata struct { + Entries map[string]FileMeta +} + +func newMetadata() Metadata { + return Metadata{Entries: make(map[string]FileMeta)} +} + +// FlattenTar reads a single flat tar stream (e.g. the output of +// `docker export`) into destDir, returning per-file metadata. Unlike +// Flatten this does NOT treat the input as OCI-layered — there are no +// whiteouts, no previous layers. Whiteout markers, if they somehow +// appear, are still handled by applyEntry but should never be present +// in a docker-export stream. +// +// destDir must exist. Path-traversal members and symlink targets that +// escape destDir are rejected. +func FlattenTar(ctx context.Context, r io.Reader, destDir string) (Metadata, error) { + meta := newMetadata() + absDest, err := filepath.Abs(destDir) + if err != nil { + return meta, err + } + if err := ctx.Err(); err != nil { + return meta, err + } + tr := tar.NewReader(r) + for { + if err := ctx.Err(); err != nil { + return meta, err + } + hdr, err := tr.Next() + if err == io.EOF { + return meta, nil + } + if err != nil { + return meta, fmt.Errorf("read tar entry: %w", err) + } + if err := applyEntry(tr, hdr, absDest, &meta); err != nil { + return meta, err + } + } +} + +// Flatten replays the image's layers in oldest-first order into destDir +// and returns a Metadata record of each surviving file's tar-header +// ownership/mode. destDir must exist and ideally be empty. Path-traversal +// members and symlink targets that escape destDir are rejected. +// +// The returned Metadata feeds ApplyOwnership: Go's unprivileged +// extraction can't set real uids/gids on disk, but a debugfs pass over +// the final ext4 can. 
+func Flatten(ctx context.Context, img PulledImage, destDir string) (Metadata, error) { + meta := newMetadata() + absDest, err := filepath.Abs(destDir) + if err != nil { + return meta, err + } + layers, err := img.Image.Layers() + if err != nil { + return meta, fmt.Errorf("read layers: %w", err) + } + for i, layer := range layers { + if err := ctx.Err(); err != nil { + return meta, err + } + if err := applyLayer(layer, absDest, &meta); err != nil { + return meta, fmt.Errorf("apply layer %d/%d: %w", i+1, len(layers), err) + } + } + return meta, nil +} + +func applyLayer(layer interface { + Uncompressed() (io.ReadCloser, error) +}, dest string, meta *Metadata) error { + rc, err := layer.Uncompressed() + if err != nil { + return err + } + defer rc.Close() + + tr := tar.NewReader(rc) + for { + hdr, err := tr.Next() + if err == io.EOF { + return nil + } + if err != nil { + return fmt.Errorf("read tar entry: %w", err) + } + if err := applyEntry(tr, hdr, dest, meta); err != nil { + return err + } + } +} + +func applyEntry(tr *tar.Reader, hdr *tar.Header, dest string, meta *Metadata) error { + rel := filepath.Clean(hdr.Name) + if rel == "." || rel == string(filepath.Separator) { + return nil + } + if filepath.IsAbs(rel) || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) { + return fmt.Errorf("unsafe path in layer: %q", hdr.Name) + } + if err := validateDebugFSPath(rel); err != nil { + return err + } + + base := filepath.Base(rel) + parent := filepath.Dir(rel) + + // Whiteouts come in two flavors: opaque-dir markers and per-file + // deletes. Both are resolved relative to the parent directory. + // Whiteouts erase metadata for the victim path(s). + if base == whiteoutOpaque { + parentAbs, err := safeJoin(dest, parent) + if err != nil { + return err + } + // Drop metadata entries whose path is under parent. + prefix := parent + "/" + for k := range meta.Entries { + if parent == "." 
|| parent == "" || strings.HasPrefix(k, prefix) { + delete(meta.Entries, k) + } + } + return clearDirContents(parentAbs) + } + if strings.HasPrefix(base, whiteoutPrefix) { + target := strings.TrimPrefix(base, whiteoutPrefix) + victim, err := safeJoin(dest, filepath.Join(parent, target)) + if err != nil { + return err + } + victimKey := filepath.Clean(filepath.Join(parent, target)) + delete(meta.Entries, victimKey) + victimPrefix := victimKey + "/" + for k := range meta.Entries { + if strings.HasPrefix(k, victimPrefix) { + delete(meta.Entries, k) + } + } + if err := os.RemoveAll(victim); err != nil && !errors.Is(err, os.ErrNotExist) { + return fmt.Errorf("apply whiteout %s: %w", hdr.Name, err) + } + return nil + } + + abs, err := safeJoin(dest, rel) + if err != nil { + return err + } + + switch hdr.Typeflag { + case tar.TypeDir: + if err := os.MkdirAll(abs, 0o755); err != nil { + return err + } + meta.Entries[rel] = FileMeta{Uid: hdr.Uid, Gid: hdr.Gid, Mode: hdr.Mode, Type: tar.TypeDir} + return nil + case tar.TypeReg: + if err := os.MkdirAll(filepath.Dir(abs), 0o755); err != nil { + return err + } + // Replace any prior file/dir in this slot — later layers + // shadow earlier ones. 
+ if err := os.RemoveAll(abs); err != nil && !errors.Is(err, os.ErrNotExist) { + return err + } + f, err := os.OpenFile(abs, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, os.FileMode(hdr.Mode)|0o600) + if err != nil { + return err + } + if _, err := io.Copy(f, tr); err != nil { + _ = f.Close() + return err + } + if err := f.Close(); err != nil { + return err + } + meta.Entries[rel] = FileMeta{Uid: hdr.Uid, Gid: hdr.Gid, Mode: hdr.Mode, Type: tar.TypeReg} + return nil + case tar.TypeSymlink: + if err := os.MkdirAll(filepath.Dir(abs), 0o755); err != nil { + return err + } + // Container layers commonly use absolute symlink targets like + // "/usr/bin/mawk" — these are interpreted relative to the + // rootfs (`/` inside the eventual VM), so they're rooted at + // dest by construction and need no escape check. + // Relative targets, however, can escape with "../"s and must + // be checked against dest at write time (we never follow them + // during extraction, but a future caller might). + if !filepath.IsAbs(hdr.Linkname) { + resolved := filepath.Clean(filepath.Join(filepath.Dir(abs), hdr.Linkname)) + if resolved != dest && !strings.HasPrefix(resolved, dest+string(filepath.Separator)) { + return fmt.Errorf("unsafe symlink in layer: %q -> %q", hdr.Name, hdr.Linkname) + } + } + if err := os.RemoveAll(abs); err != nil && !errors.Is(err, os.ErrNotExist) { + return err + } + if err := os.Symlink(hdr.Linkname, abs); err != nil { + return err + } + meta.Entries[rel] = FileMeta{Uid: hdr.Uid, Gid: hdr.Gid, Mode: hdr.Mode, Type: tar.TypeSymlink} + return nil + case tar.TypeLink: + // Hardlink: target must already exist inside dest from this or + // a previous layer, and must not escape. 
+		linkTarget, err := safeJoin(dest, filepath.Clean(hdr.Linkname))
+		if err != nil {
+			return err
+		}
+		if _, err := os.Lstat(linkTarget); err != nil {
+			return fmt.Errorf("hardlink target %q missing: %w", hdr.Linkname, err)
+		}
+		if err := os.MkdirAll(filepath.Dir(abs), 0o755); err != nil {
+			return err
+		}
+		if err := os.RemoveAll(abs); err != nil && !errors.Is(err, os.ErrNotExist) {
+			return err
+		}
+		return os.Link(linkTarget, abs)
+	default:
+		// TypeChar / TypeBlock / TypeFifo / TypeXGlobalHeader / etc.
+		// Container layers occasionally include /dev nodes — they need
+		// privilege we don't have. Skip silently; udev/devtmpfs in the
+		// guest will create them at boot.
+		return nil
+	}
+}
+
+// safeJoin returns dest+rel after verifying:
+//
+//  1. The cleaned result lies textually under dest (catches "../escape").
+//  2. No INTERMEDIATE component of the result is a symlink (catches the
+//     OCI extraction-escape attack: a layer plants `etc -> /etc`, then a
+//     later layer writes `etc/passwd` — without this walk the kernel
+//     would dereference the symlink and the operation would land at
+//     /etc/passwd on the host, not at dest/etc/passwd).
+//
+// The leaf component is intentionally NOT Lstat'd here: it may legitimately
+// be a symlink (TypeSymlink entries), a missing file (TypeReg about to be
+// created), or an existing entry that the caller will RemoveAll before
+// re-creating. Leaf type is the caller's contract.
+//
+// Walking against the already-extracted tree is race-free in practice:
+// the only mutator is this same extraction loop, and we're processing
+// entries serially.
+func safeJoin(dest, rel string) (string, error) {
+	joined := filepath.Join(dest, rel)
+	if joined != dest && !strings.HasPrefix(joined, dest+string(filepath.Separator)) {
+		return "", fmt.Errorf("unsafe path: %q escapes %q", rel, dest)
+	}
+	if joined == dest {
+		return joined, nil
+	}
+	suffix := strings.TrimPrefix(joined, dest+string(filepath.Separator))
+	segs := strings.Split(suffix, string(filepath.Separator))
+	cur := dest
+	for i, seg := range segs {
+		if seg == "" {
+			continue
+		}
+		cur = filepath.Join(cur, seg)
+		if i == len(segs)-1 {
+			break
+		}
+		info, err := os.Lstat(cur)
+		if err != nil {
+			if os.IsNotExist(err) {
+				// Ancestor not yet materialised. Once an extraction
+				// op creates it (via this same code path), it can't
+				// be a symlink — TypeSymlink writes go through this
+				// validator too.
+				return joined, nil
+			}
+			return "", err
+		}
+		if info.Mode()&os.ModeSymlink != 0 {
+			return "", fmt.Errorf("unsafe path: ancestor %q of %q is a symlink", cur, rel)
+		}
+	}
+	return joined, nil
+}
+
+// clearDirContents removes every entry under dir but leaves dir itself.
+// Used for opaque-whiteout markers.
+func clearDirContents(dir string) error {
+	entries, err := os.ReadDir(dir)
+	if err != nil {
+		if errors.Is(err, os.ErrNotExist) {
+			return os.MkdirAll(dir, 0o755)
+		}
+		return err
+	}
+	for _, entry := range entries {
+		if err := os.RemoveAll(filepath.Join(dir, entry.Name())); err != nil {
+			return err
+		}
+	}
+	return nil
+}
diff --git a/internal/imagepull/imagepull.go b/internal/imagepull/imagepull.go
new file mode 100644
index 0000000..8aa4d14
--- /dev/null
+++ b/internal/imagepull/imagepull.go
@@ -0,0 +1,102 @@
+// Package imagepull pulls OCI container images from registries and lays
+// them down as banger-ready, directly-bootable ext4 rootfs files. The
+// package is a primitive: each step does one thing and returns. The
+// daemon's PullImage orchestrator (internal/daemon/images_pull.go)
+// drives the pipeline and decides where the output lands.
+//
+// Pipeline, in call order:
+//
+//   - Pull resolves an OCI reference, selects the linux/amd64 platform,
+//     and returns a v1.Image whose layer blobs are cached on disk under
+//     cacheDir/blobs/sha256/ so re-pulls are local.
+//   - Flatten replays the layers in order into a staging directory,
+//     applies whiteouts, rejects unsafe paths/symlinks plus filenames
+//     that debugfs can't represent safely, and returns Metadata
+//     capturing the original tar-header uid/gid/mode for every entry.
+//   - BuildExt4 turns the staging directory into an ext4 file via
+//     `mkfs.ext4 -F -d` (no mount, no sudo). Root-owns the filesystem
+//     via `-E root_owner=0:0`.
+//   - ApplyOwnership streams a debugfs `set_inode_field` script to
+//     rewrite per-file uid/gid/mode from the captured Metadata —
+//     restores setuid bits, root-owned configs, etc. that `mkfs.ext4
+//     -d` would have left as the runner's uid/gid.
+//   - InjectGuestAgents writes banger's guest-side assets (vsock
+//     agent binary + systemd unit, network bootstrap script + unit,
+//     vsock module load) into the image in a single debugfs -w batch.
+//
+// The result is a bootable rootfs. The daemon registers it with the
+// image store; from then on, `vm run` uses it like any other image.
+//
+// Limitations:
+//   - Anonymous registry pulls only. Auth is deferred.
+//   - Hardcoded linux/amd64. Other platforms reject at Pull time.
+package imagepull
+
+import (
+	"context"
+	"fmt"
+	"os"
+	"path/filepath"
+
+	v1 "github.com/google/go-containerregistry/pkg/v1"
+	"github.com/google/go-containerregistry/pkg/v1/cache"
+	"github.com/google/go-containerregistry/pkg/v1/remote"
+
+	"github.com/google/go-containerregistry/pkg/name"
+)
+
+// Platform is the only platform Phase A produces. Adding arm64 later is a
+// matter of letting callers override this.
+var Platform = v1.Platform{OS: "linux", Architecture: "amd64"}
+
+// PulledImage is what Pull returns: the resolved OCI image plus enough
+// reference metadata to identify it later (digest for cache keys,
+// canonical name for logs).
+type PulledImage struct {
+	Reference string   // user-supplied reference, parsed and re-stringified
+	Digest    string   // image manifest digest (sha256:...)
+	Platform  string   // "linux/amd64"
+	Image     v1.Image // go-containerregistry handle; layers, manifest, etc.
+}
+
+// Pull resolves ref against the public registry, selects the linux/amd64
+// platform from any manifest list, and ensures the layer blobs are cached
+// on disk under cacheDir/blobs/sha256/. Subsequent Pulls of the same
+// digest are local-only.
+func Pull(ctx context.Context, ref, cacheDir string) (PulledImage, error) {
+	parsed, err := name.ParseReference(ref)
+	if err != nil {
+		return PulledImage{}, fmt.Errorf("parse oci ref %q: %w", ref, err)
+	}
+	if err := os.MkdirAll(cacheDir, 0o755); err != nil {
+		return PulledImage{}, err
+	}
+
+	img, err := remote.Image(parsed,
+		remote.WithContext(ctx),
+		remote.WithPlatform(Platform),
+	)
+	if err != nil {
+		return PulledImage{}, fmt.Errorf("fetch %q: %w", ref, err)
+	}
+
+	cached := cache.Image(img, cache.NewFilesystemCache(filepath.Join(cacheDir, "blobs")))
+
+	digest, err := cached.Digest()
+	if err != nil {
+		return PulledImage{}, fmt.Errorf("resolve digest for %q: %w", ref, err)
+	}
+
+	// The filesystem cache populates lazily: blobs only land on disk once
+	// Flatten drains them via layer.Uncompressed() / Compressed(). We
+	// deliberately do NOT eagerly open layers here — opening without
+	// draining writes a zero-byte blob to the cache, which then poisons
+	// every subsequent pull of the same digest.
+
+	return PulledImage{
+		Reference: parsed.String(),
+		Digest:    digest.String(),
+		Platform:  Platform.OS + "/" + Platform.Architecture,
+		Image:     cached,
+	}, nil
+}
diff --git a/internal/imagepull/imagepull_test.go b/internal/imagepull/imagepull_test.go
new file mode 100644
index 0000000..5ac33fc
--- /dev/null
+++ b/internal/imagepull/imagepull_test.go
@@ -0,0 +1,592 @@
+package imagepull
+
+import (
+	"archive/tar"
+	"bytes"
+	"context"
+	"errors"
+	"io"
+	"log"
+	"net/http/httptest"
+	"net/url"
+	"os"
+	"os/exec"
+	"path/filepath"
+	"strings"
+	"testing"
+
+	"banger/internal/system"
+
+	"github.com/google/go-containerregistry/pkg/name"
+	"github.com/google/go-containerregistry/pkg/registry"
+	v1 "github.com/google/go-containerregistry/pkg/v1"
+	"github.com/google/go-containerregistry/pkg/v1/empty"
+	"github.com/google/go-containerregistry/pkg/v1/mutate"
+	"github.com/google/go-containerregistry/pkg/v1/remote"
+	"github.com/google/go-containerregistry/pkg/v1/tarball"
+)
+
+// ensure log import stays used even when registry-logging is silenced.
+var _ = log.New
+
+// tarMember is a single entry to put into a fake layer tarball.
+type tarMember struct {
+	name     string
+	mode     int64
+	body     []byte
+	link     string // for symlinks / hardlinks
+	dir      bool
+	symlink  bool
+	hardlink bool
+}
+
+func buildTar(t *testing.T, members []tarMember) []byte {
+	t.Helper()
+	var buf bytes.Buffer
+	tw := tar.NewWriter(&buf)
+	for _, m := range members {
+		hdr := &tar.Header{Name: m.name, Mode: m.mode}
+		switch {
+		case m.dir:
+			hdr.Typeflag = tar.TypeDir
+			if hdr.Mode == 0 {
+				hdr.Mode = 0o755
+			}
+		case m.symlink:
+			hdr.Typeflag = tar.TypeSymlink
+			hdr.Linkname = m.link
+		case m.hardlink:
+			hdr.Typeflag = tar.TypeLink
+			hdr.Linkname = m.link
+		default:
+			hdr.Typeflag = tar.TypeReg
+			hdr.Size = int64(len(m.body))
+			if hdr.Mode == 0 {
+				hdr.Mode = 0o644
+			}
+		}
+		if err := tw.WriteHeader(hdr); err != nil {
+			t.Fatalf("tar header: %v", err)
+		}
+		if hdr.Typeflag == tar.TypeReg && len(m.body) > 0 {
+			if _, err := tw.Write(m.body); err != nil {
+				t.Fatalf("tar write: %v", err)
+			}
+		}
+	}
+	if err := tw.Close(); err != nil {
+		t.Fatalf("tar close: %v", err)
+	}
+	return buf.Bytes()
+}
+
+func startRegistry(t *testing.T) string {
+	t.Helper()
+	srv := httptest.NewServer(registry.New(registry.Logger(log.New(io.Discard, "", 0))))
+	t.Cleanup(srv.Close)
+	u, err := url.Parse(srv.URL)
+	if err != nil {
+		t.Fatal(err)
+	}
+	return u.Host
+}
+
+func makeLayer(t *testing.T, members []tarMember) v1.Layer {
+	t.Helper()
+	body := buildTar(t, members)
+	layer, err := tarball.LayerFromOpener(func() (io.ReadCloser, error) {
+		return io.NopCloser(bytes.NewReader(body)), nil
+	})
+	if err != nil {
+		t.Fatalf("LayerFromOpener: %v", err)
+	}
+	return layer
+}
+
+// pushImage assembles a multi-layer image with linux/amd64 platform and
+// pushes it under repo:tag. Returns the canonical reference.
+func pushImage(t *testing.T, host, repo, tag string, layers ...v1.Layer) string {
+	t.Helper()
+	img, err := mutate.AppendLayers(empty.Image, layers...)
+	if err != nil {
+		t.Fatalf("AppendLayers: %v", err)
+	}
+	cfg, err := img.ConfigFile()
+	if err != nil {
+		t.Fatalf("ConfigFile: %v", err)
+	}
+	cfg.Architecture = "amd64"
+	cfg.OS = "linux"
+	img, err = mutate.ConfigFile(img, cfg)
+	if err != nil {
+		t.Fatalf("ConfigFile mutate: %v", err)
+	}
+	ref, err := name.NewTag(host + "/" + repo + ":" + tag)
+	if err != nil {
+		t.Fatalf("NewTag: %v", err)
+	}
+	if err := remote.Write(ref, img); err != nil {
+		t.Fatalf("remote.Write: %v", err)
+	}
+	return ref.String()
+}
+
+func TestPullResolvesImageAndFlattenPopulatesCache(t *testing.T) {
+	host := startRegistry(t)
+	ref := pushImage(t, host, "banger/test", "v1",
+		makeLayer(t, []tarMember{
+			{name: "etc/", dir: true},
+			{name: "etc/hello", body: []byte("world")},
+		}),
+	)
+
+	cacheDir := t.TempDir()
+	pulled, err := Pull(context.Background(), ref, cacheDir)
+	if err != nil {
+		t.Fatalf("Pull: %v", err)
+	}
+	if pulled.Digest == "" {
+		t.Fatalf("Digest empty")
+	}
+	if pulled.Platform != "linux/amd64" {
+		t.Fatalf("Platform = %q", pulled.Platform)
+	}
+
+	// Pull itself does NOT populate the cache — it defers to Flatten
+	// (which drains the layer streams). This is load-bearing: eagerly
+	// opening+closing layer readers in Pull leaves zero-byte blobs that
+	// poison subsequent pulls of the same digest.
+	dest := t.TempDir()
+	if _, err := Flatten(context.Background(), pulled, dest); err != nil {
+		t.Fatalf("Flatten: %v", err)
+	}
+
+	// Cache now holds at least one non-empty blob.
+	blobsRoot := filepath.Join(cacheDir, "blobs")
+	nonEmpty := 0
+	_ = filepath.WalkDir(blobsRoot, func(p string, d os.DirEntry, _ error) error {
+		if d == nil || d.IsDir() {
+			return nil
+		}
+		info, err := d.Info()
+		if err == nil && info.Size() > 0 {
+			nonEmpty++
+		}
+		return nil
+	})
+	if nonEmpty == 0 {
+		t.Fatalf("no non-empty blobs cached under %s after Flatten", blobsRoot)
+	}
+}
+
+func TestFlattenAppliesLayersAndWhiteouts(t *testing.T) {
+	host := startRegistry(t)
+	ref := pushImage(t, host, "banger/test", "wh",
+		makeLayer(t, []tarMember{
+			{name: "etc/", dir: true},
+			{name: "etc/keep", body: []byte("keep")},
+			{name: "etc/old", body: []byte("old")},
+		}),
+		makeLayer(t, []tarMember{
+			{name: "etc/.wh.old"},                  // delete etc/old
+			{name: "etc/new", body: []byte("new")}, // add etc/new
+			{name: "var/", dir: true},
+			{name: "var/log/", dir: true},
+			{name: "var/log/file", body: []byte("log")},
+		}),
+		makeLayer(t, []tarMember{
+			{name: "var/log/.wh..wh..opq"}, // wipe var/log contents from prior layers
+			{name: "var/log/fresh", body: []byte("fresh")},
+		}),
+	)
+
+	pulled, err := Pull(context.Background(), ref, t.TempDir())
+	if err != nil {
+		t.Fatalf("Pull: %v", err)
+	}
+	dest := t.TempDir()
+	if _, err := Flatten(context.Background(), pulled, dest); err != nil {
+		t.Fatalf("Flatten: %v", err)
+	}
+
+	checkFile := func(rel, want string) {
+		t.Helper()
+		data, err := os.ReadFile(filepath.Join(dest, rel))
+		if err != nil {
+			t.Errorf("read %s: %v", rel, err)
+			return
+		}
+		if string(data) != want {
+			t.Errorf("%s = %q, want %q", rel, string(data), want)
+		}
+	}
+	checkFile("etc/keep", "keep")
+	checkFile("etc/new", "new")
+	checkFile("var/log/fresh", "fresh")
+
+	if _, err := os.Stat(filepath.Join(dest, "etc/old")); !errors.Is(err, os.ErrNotExist) {
+		t.Errorf("etc/old should have been whited out: stat err=%v", err)
+	}
+	if _, err := os.Stat(filepath.Join(dest, "var/log/file")); !errors.Is(err, os.ErrNotExist) {
+		t.Errorf("var/log/file should have been wiped by opaque marker: stat err=%v", err)
+	}
+}
+
+func TestFlattenRejectsPathTraversal(t *testing.T) {
+	host := startRegistry(t)
+	ref := pushImage(t, host, "banger/test", "evil",
+		makeLayer(t, []tarMember{
+			{name: "../escape", body: []byte("bad")},
+		}),
+	)
+	pulled, err := Pull(context.Background(), ref, t.TempDir())
+	if err != nil {
+		t.Fatalf("Pull: %v", err)
+	}
+	dest := t.TempDir()
+	_, err = Flatten(context.Background(), pulled, dest)
+	if err == nil || !strings.Contains(err.Error(), "unsafe path") {
+		t.Fatalf("Flatten escape: err=%v, want unsafe path", err)
+	}
+	escape := filepath.Join(filepath.Dir(dest), "escape")
+	if _, statErr := os.Stat(escape); !errors.Is(statErr, os.ErrNotExist) {
+		t.Errorf("escape file should not exist: %v", statErr)
+	}
+}
+
+func TestFlattenRejectsDebugFSHostilePath(t *testing.T) {
+	img, err := mutate.AppendLayers(empty.Image,
+		makeLayer(t, []tarMember{
+			{name: `etc/bad"name`, body: []byte("bad")},
+		}),
+	)
+	if err != nil {
+		t.Fatalf("AppendLayers: %v", err)
+	}
+	pulled := PulledImage{
+		Reference: "test/debugfs-hostile",
+		Digest:    "sha256:test",
+		Platform:  "linux/amd64",
+		Image:     img,
+	}
+	_, err = Flatten(context.Background(), pulled, t.TempDir())
+	if !errors.Is(err, errUnsafeDebugFSPath) {
+		t.Fatalf("Flatten hostile path: err=%v, want %v", err, errUnsafeDebugFSPath)
+	}
+	if !strings.Contains(err.Error(), `etc/bad\"name`) {
+		t.Fatalf("Flatten hostile path: err=%v, want offending path", err)
+	}
+}
+
+func TestFlattenAcceptsAbsoluteSymlink(t *testing.T) {
+	// Container layers regularly contain absolute symlinks like
+	// /usr/bin/mawk — they're interpreted relative to the rootfs at
+	// boot time, not against the host filesystem. They must extract
+	// cleanly.
+	host := startRegistry(t)
+	ref := pushImage(t, host, "banger/test", "abs-sym",
+		makeLayer(t, []tarMember{
+			{name: "etc/alternatives/awk", symlink: true, link: "/usr/bin/mawk"},
+		}),
+	)
+	pulled, err := Pull(context.Background(), ref, t.TempDir())
+	if err != nil {
+		t.Fatalf("Pull: %v", err)
+	}
+	dest := t.TempDir()
+	if _, err := Flatten(context.Background(), pulled, dest); err != nil {
+		t.Fatalf("Flatten: %v", err)
+	}
+	link := filepath.Join(dest, "etc/alternatives/awk")
+	target, err := os.Readlink(link)
+	if err != nil {
+		t.Fatalf("readlink: %v", err)
+	}
+	if target != "/usr/bin/mawk" {
+		t.Errorf("link target = %q, want /usr/bin/mawk", target)
+	}
+}
+
+func TestFlattenRejectsRelativeSymlinkEscape(t *testing.T) {
+	// Relative symlinks with .. must still be rejected: the resolved
+	// path can escape dest at the host level even if the in-VM
+	// resolution would be safe.
+	host := startRegistry(t)
+	ref := pushImage(t, host, "banger/test", "rel-escape",
+		makeLayer(t, []tarMember{
+			{name: "etc/evil", symlink: true, link: "../../../../etc/passwd"},
+		}),
+	)
+	pulled, err := Pull(context.Background(), ref, t.TempDir())
+	if err != nil {
+		t.Fatalf("Pull: %v", err)
+	}
+	_, err = Flatten(context.Background(), pulled, t.TempDir())
+	if err == nil || !strings.Contains(err.Error(), "unsafe symlink") {
+		t.Fatalf("Flatten relative escape: err=%v", err)
+	}
+}
+
+// TestFlattenRejectsWriteThroughSymlinkAncestor exercises the OCI
+// extraction-escape attack: layer 1 plants `etc` as a symlink to a
+// host directory the daemon can write to, layer 2 writes `etc/probe`.
+// Without the ancestor walk in safeJoin the write would land in that
+// host directory. With it, the second layer's write is refused.
+func TestFlattenRejectsWriteThroughSymlinkAncestor(t *testing.T) {
+	host := startRegistry(t)
+	probeDir := t.TempDir() // a path the daemon user can write to
+	ref := pushImage(t, host, "banger/test", "sym-ancestor",
+		makeLayer(t, []tarMember{
+			{name: "etc", symlink: true, link: probeDir},
+		}),
+		makeLayer(t, []tarMember{
+			{name: "etc/probe", body: []byte("escaped")},
+		}),
+	)
+	pulled, err := Pull(context.Background(), ref, t.TempDir())
+	if err != nil {
+		t.Fatalf("Pull: %v", err)
+	}
+	dest := t.TempDir()
+	_, err = Flatten(context.Background(), pulled, dest)
+	if err == nil || !strings.Contains(err.Error(), "symlink") {
+		t.Fatalf("Flatten: err=%v, want symlink-ancestor rejection", err)
+	}
+	// The escape file must NOT have been written outside dest.
+	if _, statErr := os.Stat(filepath.Join(probeDir, "probe")); !errors.Is(statErr, os.ErrNotExist) {
+		t.Fatalf("escape file at %s should not exist; got %v", filepath.Join(probeDir, "probe"), statErr)
+	}
+}
+
+// TestFlattenRejectsWhiteoutThroughSymlinkAncestor pins the same
+// guarantee for the whiteout path: a symlinked ancestor must not let
+// the extractor RemoveAll a host file outside dest.
+func TestFlattenRejectsWhiteoutThroughSymlinkAncestor(t *testing.T) {
+	host := startRegistry(t)
+	probeDir := t.TempDir()
+	probeFile := filepath.Join(probeDir, "victim")
+	if err := os.WriteFile(probeFile, []byte("preserved"), 0o644); err != nil {
+		t.Fatalf("write probe: %v", err)
+	}
+	ref := pushImage(t, host, "banger/test", "wh-sym-ancestor",
+		makeLayer(t, []tarMember{
+			{name: "etc", symlink: true, link: probeDir},
+		}),
+		makeLayer(t, []tarMember{
+			{name: "etc/.wh.victim"},
+		}),
+	)
+	pulled, err := Pull(context.Background(), ref, t.TempDir())
+	if err != nil {
+		t.Fatalf("Pull: %v", err)
+	}
+	dest := t.TempDir()
+	_, err = Flatten(context.Background(), pulled, dest)
+	if err == nil || !strings.Contains(err.Error(), "symlink") {
+		t.Fatalf("Flatten: err=%v, want symlink-ancestor rejection on whiteout", err)
+	}
+	if _, statErr := os.Stat(probeFile); statErr != nil {
+		t.Fatalf("probe file %s removed via whiteout escape: %v", probeFile, statErr)
+	}
+}
+
+// TestFlattenRejectsHardlinkTargetThroughSymlinkAncestor covers the
+// hardlink-target validator: a symlinked ancestor on the link source
+// must not let `os.Link` resolve through it and hard-link a host file
+// (e.g. /etc/passwd) into the extraction tree.
+func TestFlattenRejectsHardlinkTargetThroughSymlinkAncestor(t *testing.T) {
+	host := startRegistry(t)
+	probeDir := t.TempDir()
+	probeFile := filepath.Join(probeDir, "secret")
+	if err := os.WriteFile(probeFile, []byte("hands off"), 0o644); err != nil {
+		t.Fatalf("write probe: %v", err)
+	}
+	ref := pushImage(t, host, "banger/test", "ln-sym-ancestor",
+		makeLayer(t, []tarMember{
+			{name: "etc", symlink: true, link: probeDir},
+		}),
+		makeLayer(t, []tarMember{
+			{name: "leaked", hardlink: true, link: "etc/secret"},
+		}),
+	)
+	pulled, err := Pull(context.Background(), ref, t.TempDir())
+	if err != nil {
+		t.Fatalf("Pull: %v", err)
+	}
+	dest := t.TempDir()
+	_, err = Flatten(context.Background(), pulled, dest)
+	if err == nil || !strings.Contains(err.Error(), "symlink") {
+		t.Fatalf("Flatten: err=%v, want symlink-ancestor rejection on hardlink target", err)
+	}
+	// dest must not contain a hardlink to the host secret.
+	if _, statErr := os.Lstat(filepath.Join(dest, "leaked")); !errors.Is(statErr, os.ErrNotExist) {
+		t.Fatalf("hardlink leaked file should not exist in dest; got %v", statErr)
+	}
+}
+
+func TestFlattenTarRejectsDebugFSHostilePath(t *testing.T) {
+	tarData := buildTar(t, []tarMember{
+		{name: "etc/bad\tname", body: []byte("bad")},
+	})
+	_, err := FlattenTar(context.Background(), bytes.NewReader(tarData), t.TempDir())
+	if !errors.Is(err, errUnsafeDebugFSPath) {
+		t.Fatalf("FlattenTar hostile path: err=%v, want %v", err, errUnsafeDebugFSPath)
+	}
+	if !strings.Contains(err.Error(), `etc/bad\tname`) {
+		t.Fatalf("FlattenTar hostile path: err=%v, want offending path", err)
+	}
+}
+
+func TestBuildExt4ProducesValidImage(t *testing.T) {
+	if _, err := exec.LookPath("mkfs.ext4"); err != nil {
+		t.Skip("mkfs.ext4 not available; skipping")
+	}
+	src := t.TempDir()
+	if err := os.MkdirAll(filepath.Join(src, "etc"), 0o755); err != nil {
+		t.Fatal(err)
+	}
+	if err := os.WriteFile(filepath.Join(src, "etc", "hello"), []byte("hi"), 0o644); err != nil {
+		t.Fatal(err)
+	}
+	out := filepath.Join(t.TempDir(), "rootfs.ext4")
+	if err := BuildExt4(context.Background(), system.NewRunner(), src, out, MinExt4Size); err != nil {
+		t.Fatalf("BuildExt4: %v", err)
+	}
+	info, err := os.Stat(out)
+	if err != nil {
+		t.Fatalf("stat output: %v", err)
+	}
+	if info.Size() != MinExt4Size {
+		t.Errorf("ext4 size = %d, want %d", info.Size(), MinExt4Size)
+	}
+	// Quick sanity via file(1) — the ext4 superblock should be detectable.
+	if _, err := exec.LookPath("file"); err == nil {
+		fileOut, _ := exec.Command("file", "-b", out).Output()
+		if !bytes.Contains(fileOut, []byte("ext")) {
+			t.Errorf("file(1) does not see an ext filesystem: %s", fileOut)
+		}
+	}
+}
+
+func TestFlattenCapturesHeaderMetadata(t *testing.T) {
+	host := startRegistry(t)
+	ref := pushImage(t, host, "banger/test", "meta",
+		makeLayer(t, []tarMember{
+			{name: "usr/bin/sudo", mode: 0o4755, body: []byte("setuid-bin")},
+			{name: "etc/", dir: true, mode: 0o755},
+			{name: "etc/link", symlink: true, link: "/usr/bin/sudo"},
+		}),
+	)
+
+	pulled, err := Pull(context.Background(), ref, t.TempDir())
+	if err != nil {
+		t.Fatalf("Pull: %v", err)
+	}
+	meta, err := Flatten(context.Background(), pulled, t.TempDir())
+	if err != nil {
+		t.Fatalf("Flatten: %v", err)
+	}
+
+	sudo, ok := meta.Entries["usr/bin/sudo"]
+	if !ok {
+		t.Fatalf("missing usr/bin/sudo entry: %+v", meta.Entries)
+	}
+	if sudo.Mode&0o4000 == 0 {
+		t.Errorf("setuid bit lost: mode=0%o", sudo.Mode)
+	}
+	if sudo.Mode&0o777 != 0o755 {
+		t.Errorf("perm bits = 0%o, want 0o755", sudo.Mode&0o777)
+	}
+
+	if _, ok := meta.Entries["etc"]; !ok {
+		t.Errorf("missing etc dir entry")
+	}
+	if _, ok := meta.Entries["etc/link"]; !ok {
+		t.Errorf("missing symlink entry")
+	}
+}
+
+func TestApplyOwnershipRewritesUidGidMode(t *testing.T) {
+	if _, err := exec.LookPath("mkfs.ext4"); err != nil {
+		t.Skip("mkfs.ext4 not available; skipping")
+	}
+	if _, err := exec.LookPath("debugfs"); err != nil {
+		t.Skip("debugfs not available; skipping")
+	}
+
+	// Stage a tiny source tree and build an ext4 with mkfs.ext4 -d.
+	src := t.TempDir()
+	if err := os.WriteFile(filepath.Join(src, "setuid-bin"), []byte("x"), 0o644); err != nil {
+		t.Fatal(err)
+	}
+	out := filepath.Join(t.TempDir(), "rootfs.ext4")
+	if err := BuildExt4(context.Background(), system.NewRunner(), src, out, MinExt4Size); err != nil {
+		t.Fatalf("BuildExt4: %v", err)
+	}
+
+	// Apply synthetic metadata: set uid=0 gid=0 mode=0o4755 on setuid-bin.
+	meta := Metadata{Entries: map[string]FileMeta{
+		"setuid-bin": {Uid: 0, Gid: 0, Mode: 0o4755, Type: tar.TypeReg},
+	}}
+	if err := ApplyOwnership(context.Background(), system.NewRunner(), out, meta); err != nil {
+		t.Fatalf("ApplyOwnership: %v", err)
+	}
+
+	// Read back the inode via debugfs.
+	statOut, err := exec.Command("debugfs", "-R", "stat /setuid-bin", out).CombinedOutput()
+	if err != nil {
+		t.Fatalf("debugfs stat: %v: %s", err, statOut)
+	}
+	s := string(statOut)
+	if !strings.Contains(s, "User: 0") {
+		t.Errorf("uid not 0 after fixup. output:\n%s", s)
+	}
+	if !strings.Contains(s, "Mode: 04755") && !strings.Contains(s, "Mode: 4755") {
+		t.Errorf("setuid mode not applied. output:\n%s", s)
+	}
+}
+
+func TestApplyOwnershipRejectsUnsafeMetadataPath(t *testing.T) {
+	meta := Metadata{Entries: map[string]FileMeta{
+		"bad\nname": {Uid: 0, Gid: 0, Mode: 0o644, Type: tar.TypeReg},
+	}}
+	err := ApplyOwnership(context.Background(), system.NewRunner(), filepath.Join(t.TempDir(), "rootfs.ext4"), meta)
+	if !errors.Is(err, errUnsafeDebugFSPath) {
+		t.Fatalf("ApplyOwnership hostile path: err=%v, want %v", err, errUnsafeDebugFSPath)
+	}
+	if !strings.Contains(err.Error(), `bad\nname`) {
+		t.Fatalf("ApplyOwnership hostile path: err=%v, want offending path", err)
+	}
+}
+
+func TestBuildOwnershipScriptDeterministic(t *testing.T) {
+	meta := Metadata{Entries: map[string]FileMeta{
+		"b":   {Uid: 0, Gid: 0, Mode: 0o755, Type: tar.TypeReg},
+		"a":   {Uid: 0, Gid: 0, Mode: 0o755, Type: tar.TypeReg},
+		"a/x": {Uid: 0, Gid: 0, Mode: 0o644, Type: tar.TypeReg},
+	}}
+	gotBuf, err := buildOwnershipScript(meta)
+	if err != nil {
+		t.Fatalf("buildOwnershipScript: %v", err)
+	}
+	got := gotBuf.String()
+	// sorted: a, a/x, b
+	want := "set_inode_field /a uid 0\nset_inode_field /a gid 0\nset_inode_field /a mode 0100755\n" +
+		"set_inode_field /a/x uid 0\nset_inode_field /a/x gid 0\nset_inode_field /a/x mode 0100644\n" +
+		"set_inode_field /b uid 0\nset_inode_field /b gid 0\nset_inode_field /b mode 0100755\n"
+	if got != want {
+		t.Errorf("script mismatch\ngot:\n%s\nwant:\n%s", got, want)
+	}
+}
+
+func TestBuildExt4RejectsTinySize(t *testing.T) {
+	src := t.TempDir()
+	out := filepath.Join(t.TempDir(), "rootfs.ext4")
+	err := BuildExt4(context.Background(), system.NewRunner(), src, out, 1024)
+	if err == nil || !strings.Contains(err.Error(), "below minimum") {
+		t.Fatalf("BuildExt4 tiny: err=%v", err)
+	}
+	if _, statErr := os.Stat(out); !errors.Is(statErr, os.ErrNotExist) {
+		t.Errorf("output file should not exist on rejection: %v", statErr)
+	}
+}
diff --git a/internal/imagepull/inject.go b/internal/imagepull/inject.go
new file mode 100644
index 0000000..20116c6
--- /dev/null
+++ b/internal/imagepull/inject.go
@@ -0,0 +1,248 @@
+package imagepull
+
+import (
+	"bytes"
+	"context"
+	"fmt"
+	"os"
+	"path/filepath"
+	"sort"
+	"strings"
+
+	"banger/internal/guestnet"
+	"banger/internal/system"
+	"banger/internal/vsockagent"
+)
+
+// GuestAgentAssets bundles everything the guest side of banger needs in a
+// rootfs that doesn't already have it. Callers (the daemon's PullImage)
+// resolve the vsock-agent binary path via paths.CompanionBinaryPath and
+// hand it in; the rest comes from the respective asset packages.
+type GuestAgentAssets struct {
+	VsockAgentBin string // absolute path on the host, copied verbatim
+}
+
+// InjectGuestAgents writes banger's guest-side assets (vsock agent
+// binary + systemd unit, network bootstrap script + unit, vsock modules-
+// load config, symlinks that enable the units at boot) into ext4File.
+// All entries land with uid=0, gid=0 and appropriate modes.
+//
+// Runs in one debugfs -w invocation: dirs, files, sif (uid/gid/mode),
+// and symlinks all in one scripted batch. No sudo required because the
+// ext4 is owned by the runner.
+func InjectGuestAgents(ctx context.Context, runner system.CommandRunner, ext4File string, assets GuestAgentAssets) error {
+	if assets.VsockAgentBin == "" {
+		return fmt.Errorf("vsock-agent binary path is required")
+	}
+	if _, err := os.Stat(assets.VsockAgentBin); err != nil {
+		return fmt.Errorf("vsock-agent binary %q missing: %w", assets.VsockAgentBin, err)
+	}
+
+	// Stage content blobs as temp files so debugfs `write` can pick
+	// them up. All other commands (mkdir/sif/symlink) are inline.
+	stage, err := os.MkdirTemp("", "banger-inject-")
+	if err != nil {
+		return err
+	}
+	defer os.RemoveAll(stage)
+
+	steps := []injectFile{
+		{
+			hostSrc:   assets.VsockAgentBin,
+			guestPath: vsockagent.GuestInstallPath, // /usr/local/bin/banger-vsock-agent
+			mode:      0o755,
+		},
+		{
+			content:   []byte(guestnet.BootstrapScript()),
+			guestPath: guestnet.GuestScriptPath, // /usr/local/libexec/banger-network-bootstrap
+			mode:      0o755,
+		},
+		{
+			content:   []byte(guestnet.SystemdServiceUnit()),
+			guestPath: "/etc/systemd/system/" + guestnet.SystemdServiceName, // banger-network.service
+			mode:      0o644,
+		},
+		{
+			content:   []byte(vsockagent.ServiceUnit()),
+			guestPath: "/etc/systemd/system/" + vsockagent.ServiceName, // banger-vsock-agent.service
+			mode:      0o644,
+		},
+		{
+			content:   []byte(vsockagent.ModulesLoadConfig()),
+			guestPath: "/etc/modules-load.d/banger-vsock.conf",
+			mode:      0o644,
+		},
+		{
+			content:   []byte(FirstBootScript()),
+			guestPath: FirstBootScriptPath, // /usr/local/libexec/banger-first-boot
+			mode:      0o755,
+		},
+		{
+			content:   []byte(FirstBootUnit()),
+			guestPath: "/etc/systemd/system/" + FirstBootUnitName,
+			mode:      0o644,
+		},
+		{
+			content:   nil, // empty marker file — its existence triggers the service
+			guestPath: FirstBootMarkerPath,
+			mode:      0o644,
+		},
+	}
+
+	// Resolve content-backed steps to on-disk temp files.
+	for i := range steps {
+		if steps[i].hostSrc != "" {
+			continue
+		}
+		tmp := filepath.Join(stage, fmt.Sprintf("blob-%d", i))
+		if err := os.WriteFile(tmp, steps[i].content, 0o644); err != nil {
+			return err
+		}
+		steps[i].hostSrc = tmp
+	}
+
+	symlinks := []injectSymlink{
+		{
+			target: "/etc/systemd/system/" + guestnet.SystemdServiceName,
+			link:   "/etc/systemd/system/multi-user.target.wants/" + guestnet.SystemdServiceName,
+		},
+		{
+			target: "/etc/systemd/system/" + vsockagent.ServiceName,
+			link:   "/etc/systemd/system/multi-user.target.wants/" + vsockagent.ServiceName,
+		},
+		{
+			target: "/etc/systemd/system/" + FirstBootUnitName,
+			link:   "/etc/systemd/system/multi-user.target.wants/" + FirstBootUnitName,
+		},
+	}
+
+	script := buildInjectScript(steps, symlinks)
+
+	stdinRunner, ok := runner.(system.StdinRunner)
+	if !ok {
+		return fmt.Errorf("inject requires a runner that supports stdin (got %T)", runner)
+	}
+	out, err := stdinRunner.RunStdin(ctx, script, "debugfs", "-w", "-f", "-", ext4File)
+	if err != nil {
+		return fmt.Errorf("debugfs inject: %w: %s", err, string(out))
+	}
+	// Scan output for hard errors — debugfs keeps going past errors
+	// with -f, so we need to look at stdout/stderr-as-stdout for bad
+	// signs. mkdir errors on already-present dirs are expected; we
+	// ignore "File exists" and "Is a directory". Other errors bubble.
+	if bad := scanInjectOutput(out); bad != "" {
+		return fmt.Errorf("debugfs inject: %s", bad)
+	}
+	return nil
+}
+
+type injectFile struct {
+	content   []byte
+	hostSrc   string // set by InjectGuestAgents after staging
+	guestPath string
+	mode      uint32 // perm bits; type bits added by buildInjectScript
+}
+
+type injectSymlink struct {
+	target string
+	link   string
+}
+
+// buildInjectScript emits the debugfs command stream.
+func buildInjectScript(files []injectFile, symlinks []injectSymlink) *bytes.Buffer {
+	var buf bytes.Buffer
+
+	// Create every ancestor directory of every file/symlink path. mkdir
+	// on an already-existing dir is benign (debugfs continues past the
+	// error), but we prune duplicates to keep the script clean.
+	dirs := collectAncestors(files, symlinks)
+	for _, d := range dirs {
+		fmt.Fprintf(&buf, "mkdir %s\n", d)
+	}
+
+	// Write each file content.
+	for _, f := range files {
+		fmt.Fprintf(&buf, "write %s %s\n", f.hostSrc, f.guestPath)
+	}
+
+	// Fix ownership + mode on every written file (uid=0, gid=0).
+	for _, f := range files {
+		fmt.Fprintf(&buf, "set_inode_field %s uid 0\n", f.guestPath)
+		fmt.Fprintf(&buf, "set_inode_field %s gid 0\n", f.guestPath)
+		fmt.Fprintf(&buf, "set_inode_field %s mode 0%o\n", f.guestPath, 0o100000|f.mode)
+	}
+
+	// Fix dir ownership. Don't touch modes — mkdir's default 0755 is fine.
+	for _, d := range dirs {
+		fmt.Fprintf(&buf, "set_inode_field %s uid 0\n", d)
+		fmt.Fprintf(&buf, "set_inode_field %s gid 0\n", d)
+	}
+
+	// Finally, create the enable-at-boot symlinks.
+	for _, s := range symlinks {
+		fmt.Fprintf(&buf, "symlink %s %s\n", s.link, s.target)
+	}
+
+	return &buf
+}
+
+// collectAncestors walks every file + symlink path and returns the unique
+// set of parent directories, sorted shallowest first so mkdir ordering
+// is valid.
+func collectAncestors(files []injectFile, symlinks []injectSymlink) []string {
+	set := map[string]struct{}{}
+	add := func(p string) {
+		dir := filepath.Dir(p)
+		for dir != "" && dir != "/" {
+			set[dir] = struct{}{}
+			dir = filepath.Dir(dir)
+		}
+	}
+	for _, f := range files {
+		add(f.guestPath)
+	}
+	for _, s := range symlinks {
+		add(s.link)
+	}
+	out := make([]string, 0, len(set))
+	for d := range set {
+		out = append(out, d)
+	}
+	// Shallow-first by depth, then lexicographic.
+	sort.Slice(out, func(i, j int) bool {
+		di := strings.Count(out[i], "/")
+		dj := strings.Count(out[j], "/")
+		if di != dj {
+			return di < dj
+		}
+		return out[i] < out[j]
+	})
+	return out
+}
+
+// scanInjectOutput returns a non-empty string if debugfs reported an
+// error that's not a benign mkdir-on-existing-dir complaint. Debugfs
+// emits errors on stderr AND stdout (which we capture together); we
+// look for known failure signatures.
+func scanInjectOutput(out []byte) string {
+	lines := strings.Split(string(out), "\n")
+	for _, line := range lines {
+		line = strings.TrimSpace(line)
+		if line == "" {
+			continue
+		}
+		// Benign: mkdir on an existing dir.
+		if strings.Contains(line, "File exists") || strings.Contains(line, "Is a directory") {
+			continue
+		}
+		// Failure signatures we care about.
+		if strings.Contains(line, "error writing file") ||
+			strings.Contains(line, "couldn't find") ||
+			strings.Contains(line, "No such file") ||
+			strings.Contains(line, "Unrecognized command") ||
+			strings.Contains(line, "symlink:") {
+			return line
+		}
+	}
+	return ""
+}
diff --git a/internal/imagepull/inject_test.go b/internal/imagepull/inject_test.go
new file mode 100644
index 0000000..03354a5
--- /dev/null
+++ b/internal/imagepull/inject_test.go
@@ -0,0 +1,116 @@
+package imagepull
+
+import (
+	"context"
+	"os"
+	"os/exec"
+	"path/filepath"
+	"strings"
+	"testing"
+
+	"banger/internal/system"
+)
+
+func TestInjectGuestAgentsWritesExpectedFiles(t *testing.T) {
+	if _, err := exec.LookPath("mkfs.ext4"); err != nil {
+		t.Skip("mkfs.ext4 not available; skipping")
+	}
+	if _, err := exec.LookPath("debugfs"); err != nil {
+		t.Skip("debugfs not available; skipping")
+	}
+
+	// Build a bare ext4 from a minimal (but not empty) source tree so
+	// debugfs has a valid filesystem to inject into. mkfs.ext4 -d
+	// wants the source dir to contain at least one entry.
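Annotation: debugfs's `mkdir` has no `-p` equivalent, so emission order is what guarantees a parent exists before its children are created. A minimal standalone sketch of the same dedupe-then-shallowest-first idea (`ancestors` is an illustrative name, not the package's API):

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
	"strings"
)

// ancestors returns every parent directory of the given absolute
// paths, deduplicated and sorted shallowest-first, so a `mkdir` line
// emitted per entry can rely on its own parent already existing.
func ancestors(paths []string) []string {
	set := map[string]struct{}{}
	for _, p := range paths {
		for dir := filepath.Dir(p); dir != "" && dir != "/"; dir = filepath.Dir(dir) {
			set[dir] = struct{}{}
		}
	}
	out := make([]string, 0, len(set))
	for d := range set {
		out = append(out, d)
	}
	// Depth first (fewer separators = shallower), then lexicographic
	// for a deterministic script.
	sort.Slice(out, func(i, j int) bool {
		di, dj := strings.Count(out[i], "/"), strings.Count(out[j], "/")
		if di != dj {
			return di < dj
		}
		return out[i] < out[j]
	})
	return out
}

func main() {
	fmt.Println(ancestors([]string{"/usr/local/bin/agent", "/etc/systemd/system/x.service"}))
}
```

Sorting by separator count works here because all inputs are cleaned absolute paths, mirroring the invariant the inject code maintains.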
+ src := t.TempDir() + if err := os.MkdirAll(filepath.Join(src, "usr"), 0o755); err != nil { + t.Fatal(err) + } + if err := os.MkdirAll(filepath.Join(src, "etc"), 0o755); err != nil { + t.Fatal(err) + } + ext4 := filepath.Join(t.TempDir(), "rootfs.ext4") + if err := BuildExt4(context.Background(), system.NewRunner(), src, ext4, MinExt4Size); err != nil { + t.Fatalf("BuildExt4: %v", err) + } + + // Fake vsock-agent binary content — InjectGuestAgents copies bytes + // verbatim so any file passes as a stand-in. + fakeAgent := filepath.Join(t.TempDir(), "banger-vsock-agent") + if err := os.WriteFile(fakeAgent, []byte("#!/bin/true\n"), 0o755); err != nil { + t.Fatal(err) + } + + if err := InjectGuestAgents(context.Background(), system.NewRunner(), ext4, GuestAgentAssets{ + VsockAgentBin: fakeAgent, + }); err != nil { + t.Fatalf("InjectGuestAgents: %v", err) + } + + // Verify each expected path is present via debugfs stat. + expectPaths := []string{ + "/usr/local/bin/banger-vsock-agent", + "/usr/local/libexec/banger-network-bootstrap", + "/etc/systemd/system/banger-network.service", + "/etc/systemd/system/banger-vsock-agent.service", + "/etc/modules-load.d/banger-vsock.conf", + "/etc/systemd/system/multi-user.target.wants/banger-network.service", + "/etc/systemd/system/multi-user.target.wants/banger-vsock-agent.service", + // Phase B-3 first-boot bits: + FirstBootScriptPath, + "/etc/systemd/system/" + FirstBootUnitName, + "/etc/systemd/system/multi-user.target.wants/" + FirstBootUnitName, + FirstBootMarkerPath, + } + for _, p := range expectPaths { + out, err := exec.Command("debugfs", "-R", "stat "+p, ext4).CombinedOutput() + if err != nil { + t.Errorf("debugfs stat %s: %v: %s", p, err, out) + continue + } + if strings.Contains(string(out), "couldn't find file") || strings.Contains(string(out), "File not found") { + t.Errorf("path missing: %s\noutput:\n%s", p, out) + } + } + + // Verify ownership on one file (uid=0). 
+	statOut, err := exec.Command("debugfs", "-R", "stat /usr/local/bin/banger-vsock-agent", ext4).CombinedOutput()
+	if err != nil {
+		t.Fatalf("debugfs stat agent: %v: %s", err, statOut)
+	}
+	s := strings.Join(strings.Fields(string(statOut)), " ")
+	if !strings.Contains(s, "User: 0") {
+		t.Errorf("vsock-agent binary not uid=0:\n%s", statOut)
+	}
+	if !strings.Contains(s, "Mode: 0755") && !strings.Contains(s, "Mode: 100755") {
+		t.Errorf("vsock-agent binary mode not 0755:\n%s", statOut)
+	}
+}
+
+func TestInjectGuestAgentsRequiresVsockAgentBinary(t *testing.T) {
+	err := InjectGuestAgents(context.Background(), system.NewRunner(), "/tmp/nonexistent.ext4", GuestAgentAssets{
+		VsockAgentBin: "",
+	})
+	if err == nil || !strings.Contains(err.Error(), "required") {
+		t.Fatalf("expected missing-binary error, got %v", err)
+	}
+}
+
+func TestCollectAncestorsIsShallowFirst(t *testing.T) {
+	files := []injectFile{
+		{guestPath: "/a/b/c/file"},
+	}
+	symlinks := []injectSymlink{
+		{link: "/x/y/z/link"},
+	}
+	got := collectAncestors(files, symlinks)
+	want := []string{"/a", "/x", "/a/b", "/x/y", "/a/b/c", "/x/y/z"}
+	if len(got) != len(want) {
+		t.Fatalf("len got=%d want=%d: %v", len(got), len(want), got)
+	}
+	for i, g := range got {
+		if g != want[i] {
+			t.Errorf("index %d: got %q want %q", i, g, want[i])
+		}
+	}
+}
diff --git a/internal/imagepull/ownership.go b/internal/imagepull/ownership.go
new file mode 100644
index 0000000..ac7bd78
--- /dev/null
+++ b/internal/imagepull/ownership.go
@@ -0,0 +1,124 @@
+package imagepull
+
+import (
+	"archive/tar"
+	"bytes"
+	"context"
+	"errors"
+	"fmt"
+	"sort"
+	"strings"
+
+	"banger/internal/system"
+)
+
+// ApplyOwnership rewrites the ext4 image's per-file uid/gid/mode to match
+// the tar-header values Flatten captured.
`mkfs.ext4 -d` preserves the +// on-disk ownership of the source tree — which is the runner's uid/gid, +// since we extracted as a regular user — so without this pass setuid +// binaries become setuid-nonroot and root-owned config files are +// readable by the runner's group. +// +// Implementation: stream a "set_inode_field" script to `debugfs -w`. +// One invocation handles tens of thousands of files; the bottleneck is +// debugfs's one-inode-at-a-time disk I/O, not process startup. +func ApplyOwnership(ctx context.Context, runner system.CommandRunner, ext4File string, meta Metadata) error { + if len(meta.Entries) == 0 { + return nil + } + script, err := buildOwnershipScript(meta) + if err != nil { + return err + } + if script.Len() == 0 { + return nil + } + stdinRunner, ok := runner.(system.StdinRunner) + if !ok { + return fmt.Errorf("ownership fixup requires a runner that supports stdin (got %T)", runner) + } + out, err := stdinRunner.RunStdin(ctx, script, "debugfs", "-w", "-f", "-", ext4File) + if err != nil { + return fmt.Errorf("debugfs ownership fixup: %w: %s", err, string(out)) + } + return nil +} + +// buildOwnershipScript emits one `set_inode_field` block per entry. +// Paths are prefixed with "/" so debugfs resolves them from the ext4 +// root. Entries are sorted for deterministic output (helps testing and +// makes debugfs's internal caching slightly more cache-friendly). 
+func buildOwnershipScript(meta Metadata) (*bytes.Buffer, error) { + var buf bytes.Buffer + paths := make([]string, 0, len(meta.Entries)) + for p := range meta.Entries { + paths = append(paths, p) + } + sort.Strings(paths) + for _, p := range paths { + m := meta.Entries[p] + mode := debugfsMode(m.Type, m.Mode) + if mode == 0 { + continue // hardlinks or unsupported types (skip) + } + if err := validateDebugFSPath(p); err != nil { + return nil, err + } + escaped := escapeDebugfsPath(p) + fmt.Fprintf(&buf, "set_inode_field %s uid %d\n", escaped, m.Uid) + fmt.Fprintf(&buf, "set_inode_field %s gid %d\n", escaped, m.Gid) + fmt.Fprintf(&buf, "set_inode_field %s mode 0%o\n", escaped, mode) + } + return &buf, nil +} + +// debugfsMode composes the full i_mode word (file-type bits + +// permission bits) that debugfs' `set_inode_field ... mode` expects. +// Returns 0 for types we don't set (hardlinks, unknown). +func debugfsMode(typ byte, hdrMode int64) uint32 { + perm := uint32(hdrMode) & 0o7777 + switch typ { + case tar.TypeReg: + return 0o100000 | perm + case tar.TypeDir: + return 0o040000 | perm + case tar.TypeSymlink: + return 0o120000 | perm + case tar.TypeChar: + return 0o020000 | perm + case tar.TypeBlock: + return 0o060000 | perm + case tar.TypeFifo: + return 0o010000 | perm + default: + return 0 + } +} + +var errUnsafeDebugFSPath = errors.New("unsafe path for debugfs ownership script") + +func validateDebugFSPath(rel string) error { + for i := 0; i < len(rel); i++ { + switch c := rel[i]; { + case c == '"': + return fmt.Errorf("%w: %q contains '\"'", errUnsafeDebugFSPath, rel) + case c == '\\': + return fmt.Errorf("%w: %q contains '\\\\'", errUnsafeDebugFSPath, rel) + case c < 0x20 || c == 0x7f: + return fmt.Errorf("%w: %q contains control byte 0x%02x", errUnsafeDebugFSPath, rel, c) + } + } + return nil +} + +// escapeDebugfsPath prepends "/" and wraps in double quotes if the path +// contains spaces. 
validateDebugFSPath rejects debugfs-hostile bytes +// before this runs, so the only quoting we need is the simple +// whitespace case debugfs already handles. +func escapeDebugfsPath(rel string) string { + abs := "/" + rel + if strings.ContainsRune(abs, ' ') { + return `"` + abs + `"` + } + return abs +} diff --git a/internal/installmeta/installmeta.go b/internal/installmeta/installmeta.go new file mode 100644 index 0000000..b7566bb --- /dev/null +++ b/internal/installmeta/installmeta.go @@ -0,0 +1,138 @@ +package installmeta + +import ( + "fmt" + "os" + "os/user" + "path/filepath" + "strconv" + "strings" + "time" + + toml "github.com/pelletier/go-toml" +) + +const ( + DefaultDir = "/etc/banger" + DefaultPath = DefaultDir + "/install.toml" + DefaultService = "bangerd.service" + DefaultRootHelperService = "bangerd-root.service" + DefaultSocketPath = "/run/banger/bangerd.sock" + DefaultRootHelperRuntimeDir = "/run/banger-root" + DefaultRootHelperSocketPath = DefaultRootHelperRuntimeDir + "/bangerd-root.sock" +) + +type Metadata struct { + OwnerUser string `toml:"owner_user"` + OwnerUID int `toml:"owner_uid"` + OwnerGID int `toml:"owner_gid"` + OwnerHome string `toml:"owner_home"` + InstalledAt time.Time `toml:"installed_at"` + Version string `toml:"version,omitempty"` + Commit string `toml:"commit,omitempty"` + BuiltAt string `toml:"built_at,omitempty"` +} + +func LookupOwner(name string) (Metadata, error) { + name = strings.TrimSpace(name) + if name == "" { + return Metadata{}, fmt.Errorf("owner username is required") + } + entry, err := user.Lookup(name) + if err != nil { + return Metadata{}, err + } + uid, err := strconv.Atoi(entry.Uid) + if err != nil { + return Metadata{}, fmt.Errorf("parse owner uid %q: %w", entry.Uid, err) + } + gid, err := strconv.Atoi(entry.Gid) + if err != nil { + return Metadata{}, fmt.Errorf("parse owner gid %q: %w", entry.Gid, err) + } + home := strings.TrimSpace(entry.HomeDir) + if home == "" || !filepath.IsAbs(home) { + return 
Metadata{}, fmt.Errorf("owner %q has invalid home directory %q", name, entry.HomeDir) + } + return Metadata{ + OwnerUser: name, + OwnerUID: uid, + OwnerGID: gid, + OwnerHome: home, + }, nil +} + +func Load(path string) (Metadata, error) { + if strings.TrimSpace(path) == "" { + path = DefaultPath + } + data, err := os.ReadFile(path) + if err != nil { + return Metadata{}, err + } + var meta Metadata + if err := toml.Unmarshal(data, &meta); err != nil { + return Metadata{}, err + } + if err := meta.Validate(); err != nil { + return Metadata{}, err + } + return meta, nil +} + +func Save(path string, meta Metadata) error { + if strings.TrimSpace(path) == "" { + path = DefaultPath + } + if err := meta.Validate(); err != nil { + return err + } + if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil { + return err + } + data, err := toml.Marshal(meta) + if err != nil { + return err + } + return os.WriteFile(path, data, 0o644) +} + +// UpdateBuildInfo refreshes only the Version / Commit / BuiltAt +// fields on the install metadata, preserving everything else +// (OwnerUser/UID/GID/Home and the original InstalledAt timestamp). +// Used by `banger update` to record what's running after a +// successful binary swap; the install identity is unchanged so +// re-running `banger system install` is not required. +// +// Errors when path doesn't exist or can't be parsed — `banger +// update` runs in system mode where install.toml IS the source of +// truth; a missing file means we shouldn't be updating at all. 
+func UpdateBuildInfo(path, version, commit, builtAt string) error { + if strings.TrimSpace(path) == "" { + path = DefaultPath + } + meta, err := Load(path) + if err != nil { + return err + } + meta.Version = strings.TrimSpace(version) + meta.Commit = strings.TrimSpace(commit) + meta.BuiltAt = strings.TrimSpace(builtAt) + return Save(path, meta) +} + +func (m Metadata) Validate() error { + if strings.TrimSpace(m.OwnerUser) == "" { + return fmt.Errorf("install metadata missing owner_user") + } + if m.OwnerUID < 0 { + return fmt.Errorf("install metadata has invalid owner_uid %d", m.OwnerUID) + } + if m.OwnerGID < 0 { + return fmt.Errorf("install metadata has invalid owner_gid %d", m.OwnerGID) + } + if strings.TrimSpace(m.OwnerHome) == "" || !filepath.IsAbs(m.OwnerHome) { + return fmt.Errorf("install metadata has invalid owner_home %q", m.OwnerHome) + } + return nil +} diff --git a/internal/installmeta/installmeta_test.go b/internal/installmeta/installmeta_test.go new file mode 100644 index 0000000..1b9044c --- /dev/null +++ b/internal/installmeta/installmeta_test.go @@ -0,0 +1,194 @@ +package installmeta + +import ( + "errors" + "os" + "os/user" + "path/filepath" + "strconv" + "testing" + "time" +) + +func TestSaveLoadRoundTrip(t *testing.T) { + path := filepath.Join(t.TempDir(), "install.toml") + want := Metadata{ + OwnerUser: "dev", + OwnerUID: 1000, + OwnerGID: 1000, + OwnerHome: "/home/dev", + InstalledAt: time.Unix(1710000000, 0).UTC(), + Version: "v1.2.3", + Commit: "abc123", + BuiltAt: "2026-04-23T00:00:00Z", + } + + if err := Save(path, want); err != nil { + t.Fatalf("Save: %v", err) + } + got, err := Load(path) + if err != nil { + t.Fatalf("Load: %v", err) + } + if got != want { + t.Fatalf("Load() = %+v, want %+v", got, want) + } +} + +func TestSaveCreatesParentDir(t *testing.T) { + path := filepath.Join(t.TempDir(), "nested", "dir", "install.toml") + meta := Metadata{OwnerUser: "dev", OwnerUID: 1, OwnerGID: 1, OwnerHome: "/home/dev"} + if err := Save(path, 
meta); err != nil { + t.Fatalf("Save: %v", err) + } + if _, err := os.Stat(path); err != nil { + t.Fatalf("file not written: %v", err) + } +} + +func TestSaveRejectsInvalidMetadata(t *testing.T) { + path := filepath.Join(t.TempDir(), "install.toml") + if err := Save(path, Metadata{OwnerUID: 1, OwnerGID: 1, OwnerHome: "/home/dev"}); err == nil { + t.Fatal("Save() = nil, want validation error") + } + if _, err := os.Stat(path); !errors.Is(err, os.ErrNotExist) { + t.Fatalf("Save wrote a file despite validation error: stat err = %v", err) + } +} + +func TestLoadMissingFile(t *testing.T) { + _, err := Load(filepath.Join(t.TempDir(), "missing.toml")) + if !errors.Is(err, os.ErrNotExist) { + t.Fatalf("Load() = %v, want os.ErrNotExist", err) + } +} + +func TestLoadInvalidTOML(t *testing.T) { + path := filepath.Join(t.TempDir(), "install.toml") + if err := os.WriteFile(path, []byte("not = valid = toml\n"), 0o644); err != nil { + t.Fatal(err) + } + if _, err := Load(path); err == nil { + t.Fatal("Load() = nil, want TOML parse error") + } +} + +func TestLoadRejectsInvalidPersistedMetadata(t *testing.T) { + // File parses but fails Validate (no owner_user) — Load must surface + // the validation error rather than returning a zero-value Metadata. 
+ path := filepath.Join(t.TempDir(), "install.toml") + if err := os.WriteFile(path, []byte("owner_uid = 1\nowner_gid = 1\nowner_home = \"/home/dev\"\n"), 0o644); err != nil { + t.Fatal(err) + } + if _, err := Load(path); err == nil { + t.Fatal("Load() = nil, want validation error") + } +} + +func TestValidate(t *testing.T) { + tests := []struct { + name string + m Metadata + ok bool + }{ + {"valid", Metadata{OwnerUser: "dev", OwnerUID: 1, OwnerGID: 1, OwnerHome: "/home/dev"}, true}, + {"missing owner_user", Metadata{OwnerUID: 1, OwnerGID: 1, OwnerHome: "/home/dev"}, false}, + {"whitespace owner_user", Metadata{OwnerUser: " ", OwnerUID: 1, OwnerGID: 1, OwnerHome: "/home/dev"}, false}, + {"negative uid", Metadata{OwnerUser: "dev", OwnerUID: -1, OwnerGID: 1, OwnerHome: "/home/dev"}, false}, + {"negative gid", Metadata{OwnerUser: "dev", OwnerUID: 1, OwnerGID: -1, OwnerHome: "/home/dev"}, false}, + {"empty home", Metadata{OwnerUser: "dev", OwnerUID: 1, OwnerGID: 1, OwnerHome: ""}, false}, + {"relative home", Metadata{OwnerUser: "dev", OwnerUID: 1, OwnerGID: 1, OwnerHome: "home/dev"}, false}, + } + for _, tc := range tests { + t.Run(tc.name, func(t *testing.T) { + err := tc.m.Validate() + if tc.ok && err != nil { + t.Fatalf("Validate() = %v, want nil", err) + } + if !tc.ok && err == nil { + t.Fatal("Validate() = nil, want error") + } + }) + } +} + +func TestLookupOwnerEmpty(t *testing.T) { + if _, err := LookupOwner(""); err == nil { + t.Fatal("LookupOwner(\"\") = nil, want error") + } + if _, err := LookupOwner(" "); err == nil { + t.Fatal("LookupOwner(\" \") = nil, want error") + } +} + +func TestLookupOwnerMissing(t *testing.T) { + if _, err := LookupOwner("definitely-no-such-user-banger-test"); err == nil { + t.Fatal("LookupOwner(missing) = nil, want error") + } +} + +func TestLookupOwnerCurrentUser(t *testing.T) { + cur, err := user.Current() + if err != nil { + t.Skipf("user.Current: %v", err) + } + got, err := LookupOwner(cur.Username) + if err != nil { + 
t.Fatalf("LookupOwner(%q): %v", cur.Username, err) + } + wantUID, _ := strconv.Atoi(cur.Uid) + wantGID, _ := strconv.Atoi(cur.Gid) + if got.OwnerUser != cur.Username || got.OwnerUID != wantUID || got.OwnerGID != wantGID || got.OwnerHome != cur.HomeDir { + t.Fatalf("LookupOwner = %+v, want user=%s uid=%d gid=%d home=%s", + got, cur.Username, wantUID, wantGID, cur.HomeDir) + } +} + +func TestUpdateBuildInfo(t *testing.T) { + path := filepath.Join(t.TempDir(), "install.toml") + original := Metadata{ + OwnerUser: "dev", + OwnerUID: 1000, + OwnerGID: 1000, + OwnerHome: "/home/dev", + InstalledAt: time.Unix(1710000000, 0).UTC(), + Version: "v0.1.0", + Commit: "old", + BuiltAt: "2026-01-01T00:00:00Z", + } + if err := Save(path, original); err != nil { + t.Fatalf("Save: %v", err) + } + + if err := UpdateBuildInfo(path, " v0.2.0 ", " new ", " 2026-04-30T00:00:00Z "); err != nil { + t.Fatalf("UpdateBuildInfo: %v", err) + } + + got, err := Load(path) + if err != nil { + t.Fatalf("Load: %v", err) + } + if got.Version != "v0.2.0" || got.Commit != "new" || got.BuiltAt != "2026-04-30T00:00:00Z" { + t.Fatalf("build fields = %q/%q/%q, want trimmed values", got.Version, got.Commit, got.BuiltAt) + } + // Identity must be preserved. 
+ if got.OwnerUser != original.OwnerUser || got.OwnerUID != original.OwnerUID || + got.OwnerGID != original.OwnerGID || got.OwnerHome != original.OwnerHome || + !got.InstalledAt.Equal(original.InstalledAt) { + t.Fatalf("identity changed: got %+v, want %+v", got, original) + } +} + +func TestUpdateBuildInfoMissingFile(t *testing.T) { + err := UpdateBuildInfo(filepath.Join(t.TempDir(), "missing.toml"), "v1", "c", "t") + if !errors.Is(err, os.ErrNotExist) { + t.Fatalf("UpdateBuildInfo() = %v, want os.ErrNotExist", err) + } +} + +func TestValidateRejectsMissingOwner(t *testing.T) { + err := Metadata{OwnerUID: 1000, OwnerGID: 1000, OwnerHome: "/home/dev"}.Validate() + if err == nil { + t.Fatal("Validate() = nil, want missing owner_user error") + } +} diff --git a/internal/kernelcat/catalog.go b/internal/kernelcat/catalog.go new file mode 100644 index 0000000..d703451 --- /dev/null +++ b/internal/kernelcat/catalog.go @@ -0,0 +1,59 @@ +package kernelcat + +import ( + _ "embed" + "encoding/json" + "fmt" + "os" +) + +//go:embed catalog.json +var embeddedCatalog []byte + +// Catalog is the published list of kernel bundles banger can pull. It ships +// embedded in the banger binary and is updated across releases; Phase 5 +// wires CI to regenerate it. +type Catalog struct { + Version int `json:"version"` + Entries []CatEntry `json:"entries"` +} + +// CatEntry describes one downloadable kernel bundle. +type CatEntry struct { + Name string `json:"name"` + Distro string `json:"distro,omitempty"` + Arch string `json:"arch,omitempty"` + KernelVersion string `json:"kernel_version,omitempty"` + TarballURL string `json:"tarball_url"` + TarballSHA256 string `json:"tarball_sha256"` + SizeBytes int64 `json:"size_bytes,omitempty"` + Description string `json:"description,omitempty"` +} + +// LoadEmbedded returns the catalog compiled into this banger binary. +func LoadEmbedded() (Catalog, error) { + return ParseCatalog(embeddedCatalog) +} + +// ParseCatalog decodes a catalog.json payload. 
An empty payload is valid +// and returns a zero Catalog. +func ParseCatalog(data []byte) (Catalog, error) { + var cat Catalog + if len(data) == 0 { + return cat, nil + } + if err := json.Unmarshal(data, &cat); err != nil { + return Catalog{}, fmt.Errorf("parse catalog: %w", err) + } + return cat, nil +} + +// Lookup returns the catalog entry matching name, or os.ErrNotExist. +func (c Catalog) Lookup(name string) (CatEntry, error) { + for _, e := range c.Entries { + if e.Name == name { + return e, nil + } + } + return CatEntry{}, os.ErrNotExist +} diff --git a/internal/kernelcat/catalog.json b/internal/kernelcat/catalog.json new file mode 100644 index 0000000..dea4cd1 --- /dev/null +++ b/internal/kernelcat/catalog.json @@ -0,0 +1,15 @@ +{ + "version": 1, + "entries": [ + { + "name": "generic-6.12", + "distro": "generic", + "arch": "x86_64", + "kernel_version": "6.12.8", + "tarball_url": "https://kernels.thaloco.com/generic-6.12-x86_64.tar.zst", + "tarball_sha256": "d6f9ba2a957260063241cf9d79ae538d0c349107d37f0bfccc33281d29bd0901", + "size_bytes": 9098722, + "description": "Generic Firecracker kernel 6.12.8 (all drivers built-in, no initrd needed)" + } + ] +} diff --git a/internal/kernelcat/catalog_test.go b/internal/kernelcat/catalog_test.go new file mode 100644 index 0000000..2f26463 --- /dev/null +++ b/internal/kernelcat/catalog_test.go @@ -0,0 +1,52 @@ +package kernelcat + +import ( + "errors" + "os" + "testing" +) + +func TestParseCatalogEmpty(t *testing.T) { + t.Parallel() + cat, err := ParseCatalog(nil) + if err != nil { + t.Fatalf("ParseCatalog(nil): %v", err) + } + if len(cat.Entries) != 0 { + t.Fatalf("entries = %d, want 0", len(cat.Entries)) + } +} + +func TestParseCatalogValid(t *testing.T) { + t.Parallel() + cat, err := ParseCatalog([]byte(`{"version":1,"entries":[{"name":"void-6.12","distro":"void","tarball_url":"https://example/v.tar.zst","tarball_sha256":"abc"}]}`)) + if err != nil { + t.Fatalf("ParseCatalog: %v", err) + } + if cat.Version != 1 || 
len(cat.Entries) != 1 || cat.Entries[0].Name != "void-6.12" {
+		t.Fatalf("catalog = %+v", cat)
+	}
+}
+
+func TestCatalogLookup(t *testing.T) {
+	t.Parallel()
+	cat := Catalog{Entries: []CatEntry{{Name: "a"}, {Name: "b"}}}
+	if entry, err := cat.Lookup("b"); err != nil || entry.Name != "b" {
+		t.Fatalf("Lookup(b) = %+v, %v", entry, err)
+	}
+	if _, err := cat.Lookup("c"); !errors.Is(err, os.ErrNotExist) {
+		t.Fatalf("Lookup(missing) err = %v, want ErrNotExist", err)
+	}
+}
+
+func TestLoadEmbeddedReturnsValidCatalog(t *testing.T) {
+	t.Parallel()
+	cat, err := LoadEmbedded()
+	if err != nil {
+		t.Fatalf("LoadEmbedded: %v", err)
+	}
+	if cat.Version != 1 {
+		t.Fatalf("embedded catalog.Version = %d, want 1", cat.Version)
+	}
+	// The embedded catalog ships at least the default kernel entry.
+}
diff --git a/internal/kernelcat/fetch.go b/internal/kernelcat/fetch.go
new file mode 100644
index 0000000..3a9fe7a
--- /dev/null
+++ b/internal/kernelcat/fetch.go
@@ -0,0 +1,227 @@
+package kernelcat
+
+import (
+	"archive/tar"
+	"context"
+	"crypto/sha256"
+	"encoding/hex"
+	"fmt"
+	"io"
+	"net/http"
+	"os"
+	"path/filepath"
+	"strings"
+	"time"
+
+	"github.com/klauspost/compress/zstd"
+)
+
+// MaxFetchedKernelBytes caps the compressed kernel-tarball download.
+// Without this the previous flow streamed straight into the tar+zstd
+// extractor and only verified SHA256 afterwards, so a malicious or
+// compromised mirror could fill the host disk before the hash check
+// fired. Now we stage to a temp file under targetDir, hash on the
+// way in, and refuse to decompress on a hash mismatch — worst-case
+// disk use is bounded by this cap. Tests may override by setting
+// this var before invoking Fetch.
+var MaxFetchedKernelBytes int64 = 8 << 30 // 8 GiB
+
+// Fetch downloads the tarball for entry, verifies its SHA256, extracts it
+// into <kernelsDir>/<entry-name>/, and writes a manifest. On failure it
+// removes the partially-populated target directory.
+// +// The tarball is expected to be a tar+zstd archive whose root contains +// vmlinux and optionally initrd.img and/or a modules/ directory. Path +// traversal entries (..) and absolute-path members are rejected. +func Fetch(ctx context.Context, client *http.Client, kernelsDir string, entry CatEntry) (Entry, error) { + if err := ValidateName(entry.Name); err != nil { + return Entry{}, err + } + if strings.TrimSpace(entry.TarballURL) == "" { + return Entry{}, fmt.Errorf("catalog entry %q has no tarball URL", entry.Name) + } + if strings.TrimSpace(entry.TarballSHA256) == "" { + return Entry{}, fmt.Errorf("catalog entry %q has no tarball sha256", entry.Name) + } + if client == nil { + client = http.DefaultClient + } + + if err := DeleteLocal(kernelsDir, entry.Name); err != nil { + return Entry{}, fmt.Errorf("clear prior catalog entry: %w", err) + } + targetDir := EntryDir(kernelsDir, entry.Name) + if err := os.MkdirAll(targetDir, 0o755); err != nil { + return Entry{}, err + } + + cleanup := func() { _ = os.RemoveAll(targetDir) } + + req, err := http.NewRequestWithContext(ctx, http.MethodGet, entry.TarballURL, nil) + if err != nil { + cleanup() + return Entry{}, err + } + resp, err := client.Do(req) + if err != nil { + cleanup() + return Entry{}, fmt.Errorf("fetch %s: %w", entry.TarballURL, err) + } + defer resp.Body.Close() + if resp.StatusCode < 200 || resp.StatusCode >= 300 { + cleanup() + return Entry{}, fmt.Errorf("fetch %s: HTTP %s", entry.TarballURL, resp.Status) + } + + if resp.ContentLength > MaxFetchedKernelBytes { + cleanup() + return Entry{}, fmt.Errorf("tarball advertised %d bytes, exceeds %d-byte cap", resp.ContentLength, MaxFetchedKernelBytes) + } + + // Stage compressed download to a temp file first so we can verify + // SHA256 BEFORE decompressing or extracting. Cap reads to + // MaxFetchedKernelBytes+1 — anything larger is refused. 
+ tmp, err := os.CreateTemp(targetDir, "banger-kernel-*.tar.zst") + if err != nil { + cleanup() + return Entry{}, fmt.Errorf("create staging file: %w", err) + } + tmpPath := tmp.Name() + defer os.Remove(tmpPath) + + hasher := sha256.New() + limited := io.LimitReader(resp.Body, MaxFetchedKernelBytes+1) + n, copyErr := io.Copy(io.MultiWriter(tmp, hasher), limited) + if closeErr := tmp.Close(); copyErr == nil && closeErr != nil { + copyErr = closeErr + } + if copyErr != nil { + cleanup() + return Entry{}, fmt.Errorf("download tarball: %w", copyErr) + } + if n > MaxFetchedKernelBytes { + cleanup() + return Entry{}, fmt.Errorf("tarball exceeded %d-byte cap before sha256 check", MaxFetchedKernelBytes) + } + + got := hex.EncodeToString(hasher.Sum(nil)) + if !strings.EqualFold(got, entry.TarballSHA256) { + cleanup() + return Entry{}, fmt.Errorf("tarball sha256 mismatch: got %s, want %s", got, entry.TarballSHA256) + } + + src, err := os.Open(tmpPath) + if err != nil { + cleanup() + return Entry{}, fmt.Errorf("reopen staged tarball: %w", err) + } + defer src.Close() + zr, err := zstd.NewReader(src) + if err != nil { + cleanup() + return Entry{}, fmt.Errorf("init zstd: %w", err) + } + defer zr.Close() + + if err := extractTar(zr, targetDir); err != nil { + cleanup() + return Entry{}, err + } + + kernelPath := filepath.Join(targetDir, kernelFilename) + if _, err := os.Stat(kernelPath); err != nil { + cleanup() + return Entry{}, fmt.Errorf("tarball missing %s: %w", kernelFilename, err) + } + kernelSum, err := SumFile(kernelPath) + if err != nil { + cleanup() + return Entry{}, err + } + + stored := Entry{ + Name: entry.Name, + Distro: entry.Distro, + Arch: entry.Arch, + KernelVersion: entry.KernelVersion, + SHA256: kernelSum, + Source: "pull:" + entry.TarballURL, + ImportedAt: time.Now().UTC(), + } + if err := WriteLocal(kernelsDir, stored); err != nil { + cleanup() + return Entry{}, err + } + return ReadLocal(kernelsDir, entry.Name) +} + +// extractTar writes each regular file 
/ dir / safe symlink from r into +// target, refusing any member whose normalised path would escape target. +func extractTar(r io.Reader, target string) error { + absTarget, err := filepath.Abs(target) + if err != nil { + return err + } + tr := tar.NewReader(r) + for { + hdr, err := tr.Next() + if err == io.EOF { + return nil + } + if err != nil { + return fmt.Errorf("read tarball: %w", err) + } + rel := filepath.Clean(hdr.Name) + if rel == "." || rel == string(filepath.Separator) { + continue + } + if filepath.IsAbs(rel) || strings.HasPrefix(rel, ".."+string(filepath.Separator)) || rel == ".." { + return fmt.Errorf("unsafe path in tarball: %q", hdr.Name) + } + dst := filepath.Join(absTarget, rel) + if dst != absTarget && !strings.HasPrefix(dst, absTarget+string(filepath.Separator)) { + return fmt.Errorf("unsafe path in tarball: %q", hdr.Name) + } + switch hdr.Typeflag { + case tar.TypeDir: + if err := os.MkdirAll(dst, os.FileMode(hdr.Mode)|0o755); err != nil { + return err + } + case tar.TypeReg: + if err := os.MkdirAll(filepath.Dir(dst), 0o755); err != nil { + return err + } + f, err := os.OpenFile(dst, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, os.FileMode(hdr.Mode)|0o600) + if err != nil { + return err + } + if _, err := io.Copy(f, tr); err != nil { + _ = f.Close() + return err + } + if err := f.Close(); err != nil { + return err + } + case tar.TypeSymlink: + if err := os.MkdirAll(filepath.Dir(dst), 0o755); err != nil { + return err + } + // Absolute targets are interpreted at runtime against the + // eventual rootfs (`/` inside the VM), so they're rooted + // inside absTarget by construction. Only relative targets + // need an escape check at write time. 
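Annotation: the traversal checks in `extractTar` can be isolated into a small join-and-verify helper. A sketch under the same rules, rejecting absolute members and any `..` escape both before and after joining (`safeJoin` is an illustrative name):

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// safeJoin resolves a tar member name under root and rejects any
// result that would escape it: absolute names, a bare "..", or a
// cleaned path that still starts with "../". The post-join prefix
// check is a belt-and-braces guard against anything Clean missed.
func safeJoin(root, name string) (string, error) {
	rel := filepath.Clean(name)
	if filepath.IsAbs(rel) || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("unsafe path: %q", name)
	}
	dst := filepath.Join(root, rel)
	if dst != root && !strings.HasPrefix(dst, root+string(filepath.Separator)) {
		return "", fmt.Errorf("unsafe path: %q", name)
	}
	return dst, nil
}

func main() {
	fmt.Println(safeJoin("/srv/kernels/x", "modules/modules.dep"))
	fmt.Println(safeJoin("/srv/kernels/x", "../../etc/passwd"))
}
```

Cleaning before the prefix test matters: `a/../../etc` only reveals its escape once the `..` segments have been collapsed.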
+ if !filepath.IsAbs(hdr.Linkname) { + resolved := filepath.Clean(filepath.Join(filepath.Dir(dst), hdr.Linkname)) + if resolved != absTarget && !strings.HasPrefix(resolved, absTarget+string(filepath.Separator)) { + return fmt.Errorf("unsafe symlink in tarball: %q -> %q", hdr.Name, hdr.Linkname) + } + } + if err := os.Symlink(hdr.Linkname, dst); err != nil { + return err + } + default: + // Hardlinks / device nodes / fifos: skip silently. Kernel + // module trees shouldn't need them. + } + } +} diff --git a/internal/kernelcat/fetch_test.go b/internal/kernelcat/fetch_test.go new file mode 100644 index 0000000..3f7db1c --- /dev/null +++ b/internal/kernelcat/fetch_test.go @@ -0,0 +1,231 @@ +package kernelcat + +import ( + "archive/tar" + "bytes" + "context" + "crypto/sha256" + "encoding/hex" + "net/http" + "net/http/httptest" + "os" + "path/filepath" + "strings" + "testing" + + "github.com/klauspost/compress/zstd" +) + +// tarballFile describes one member of the test tarball. +type tarballFile struct { + name string + mode int64 + data []byte + link string // for symlinks + dir bool +} + +func buildTestTarball(t *testing.T, files []tarballFile) ([]byte, string) { + t.Helper() + var tarBuf bytes.Buffer + tw := tar.NewWriter(&tarBuf) + for _, f := range files { + hdr := &tar.Header{Name: f.name, Mode: f.mode} + switch { + case f.dir: + hdr.Typeflag = tar.TypeDir + hdr.Mode = 0o755 + case f.link != "": + hdr.Typeflag = tar.TypeSymlink + hdr.Linkname = f.link + default: + hdr.Typeflag = tar.TypeReg + hdr.Size = int64(len(f.data)) + if hdr.Mode == 0 { + hdr.Mode = 0o644 + } + } + if err := tw.WriteHeader(hdr); err != nil { + t.Fatalf("tar WriteHeader: %v", err) + } + if hdr.Typeflag == tar.TypeReg { + if _, err := tw.Write(f.data); err != nil { + t.Fatalf("tar Write: %v", err) + } + } + } + if err := tw.Close(); err != nil { + t.Fatalf("tar Close: %v", err) + } + + var compressed bytes.Buffer + zw, err := zstd.NewWriter(&compressed) + if err != nil { + t.Fatalf("zstd 
NewWriter: %v", err) + } + if _, err := zw.Write(tarBuf.Bytes()); err != nil { + t.Fatalf("zstd Write: %v", err) + } + if err := zw.Close(); err != nil { + t.Fatalf("zstd Close: %v", err) + } + sum := sha256.Sum256(compressed.Bytes()) + return compressed.Bytes(), hex.EncodeToString(sum[:]) +} + +func serveTarball(t *testing.T, body []byte) *httptest.Server { + t.Helper() + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + w.Header().Set("Content-Type", "application/octet-stream") + _, _ = w.Write(body) + })) + t.Cleanup(srv.Close) + return srv +} + +func TestFetchExtractsTarballAndWritesManifest(t *testing.T) { + t.Parallel() + body, sum := buildTestTarball(t, []tarballFile{ + {name: "vmlinux", data: []byte("kernel-bytes")}, + {name: "initrd.img", data: []byte("initrd-bytes")}, + {name: "modules", dir: true}, + {name: "modules/modules.dep", data: []byte("dep")}, + }) + srv := serveTarball(t, body) + + kernelsDir := t.TempDir() + stored, err := Fetch(context.Background(), nil, kernelsDir, CatEntry{ + Name: "void-6.12", + Distro: "void", + Arch: "x86_64", + KernelVersion: "6.12.79_1", + TarballURL: srv.URL + "/pkg.tar.zst", + TarballSHA256: sum, + }) + if err != nil { + t.Fatalf("Fetch: %v", err) + } + if stored.Name != "void-6.12" || stored.Distro != "void" { + t.Fatalf("stored = %+v", stored) + } + if stored.SHA256 == "" { + t.Errorf("SHA256 not populated") + } + + for _, rel := range []string{"vmlinux", "initrd.img", "modules/modules.dep", "manifest.json"} { + if _, err := os.Stat(filepath.Join(kernelsDir, "void-6.12", rel)); err != nil { + t.Errorf("expected %s in catalog: %v", rel, err) + } + } +} + +func TestFetchRejectsShaMismatch(t *testing.T) { + t.Parallel() + body, _ := buildTestTarball(t, []tarballFile{ + {name: "vmlinux", data: []byte("k")}, + }) + srv := serveTarball(t, body) + + kernelsDir := t.TempDir() + _, err := Fetch(context.Background(), nil, kernelsDir, CatEntry{ + Name: "void-6.12", + TarballURL: 
srv.URL + "/pkg.tar.zst", + TarballSHA256: "000000000000000000000000000000000000000000000000000000000000beef", + }) + if err == nil || !strings.Contains(err.Error(), "sha256 mismatch") { + t.Fatalf("expected sha256 mismatch, got %v", err) + } + if _, statErr := os.Stat(filepath.Join(kernelsDir, "void-6.12")); !os.IsNotExist(statErr) { + t.Fatalf("target dir should be cleaned up on mismatch: %v", statErr) + } +} + +// TestFetchRejectsOversizedTarballBeforeExtraction pins the new +// disk-bound cap: with MaxFetchedKernelBytes set artificially low the +// staged download trips the limit and refuses to decompress, so a +// compromised mirror can't fill the host disk before the SHA check +// fires. +func TestFetchRejectsOversizedTarballBeforeExtraction(t *testing.T) { + body, sum := buildTestTarball(t, []tarballFile{ + {name: "vmlinux", data: bytes.Repeat([]byte("k"), 4096)}, + }) + srv := serveTarball(t, body) + + prev := MaxFetchedKernelBytes + MaxFetchedKernelBytes = 64 + t.Cleanup(func() { MaxFetchedKernelBytes = prev }) + + kernelsDir := t.TempDir() + _, err := Fetch(context.Background(), nil, kernelsDir, CatEntry{ + Name: "void-6.12", + TarballURL: srv.URL + "/pkg.tar.zst", + TarballSHA256: sum, + }) + if err == nil { + t.Fatal("Fetch succeeded against oversized tarball; want size-cap rejection") + } + if !strings.Contains(err.Error(), "cap") { + t.Fatalf("err = %v, want size-cap message", err) + } + // targetDir should be cleaned up by the existing cleanup() path. 
+ if _, statErr := os.Stat(filepath.Join(kernelsDir, "void-6.12")); !os.IsNotExist(statErr) { + t.Fatalf("target dir should be removed on size-cap rejection: %v", statErr) + } +} + +func TestFetchRejectsMissingKernel(t *testing.T) { + t.Parallel() + body, sum := buildTestTarball(t, []tarballFile{ + {name: "initrd.img", data: []byte("i")}, // no vmlinux + }) + srv := serveTarball(t, body) + kernelsDir := t.TempDir() + _, err := Fetch(context.Background(), nil, kernelsDir, CatEntry{ + Name: "broken", + TarballURL: srv.URL + "/pkg.tar.zst", + TarballSHA256: sum, + }) + if err == nil || !strings.Contains(err.Error(), "missing vmlinux") { + t.Fatalf("expected missing vmlinux, got %v", err) + } +} + +func TestFetchRejectsPathTraversal(t *testing.T) { + t.Parallel() + body, sum := buildTestTarball(t, []tarballFile{ + {name: "vmlinux", data: []byte("k")}, + {name: "../escape", data: []byte("bad")}, + }) + srv := serveTarball(t, body) + kernelsDir := t.TempDir() + _, err := Fetch(context.Background(), nil, kernelsDir, CatEntry{ + Name: "bad-tarball", + TarballURL: srv.URL + "/pkg.tar.zst", + TarballSHA256: sum, + }) + if err == nil || !strings.Contains(err.Error(), "unsafe path") { + t.Fatalf("expected unsafe path error, got %v", err) + } + escapePath := filepath.Join(filepath.Dir(kernelsDir), "escape") + if _, statErr := os.Stat(escapePath); !os.IsNotExist(statErr) { + t.Fatalf("traversal escape file should not exist: %v", statErr) + } +} + +func TestFetchRejectsHTTPError(t *testing.T) { + t.Parallel() + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) { + http.Error(w, "nope", http.StatusNotFound) + })) + t.Cleanup(srv.Close) + + kernelsDir := t.TempDir() + _, err := Fetch(context.Background(), nil, kernelsDir, CatEntry{ + Name: "missing", + TarballURL: srv.URL + "/pkg.tar.zst", + TarballSHA256: "deadbeef", + }) + if err == nil || !strings.Contains(err.Error(), "404") { + t.Fatalf("expected HTTP 404, got %v", err) + } +} diff --git 
a/internal/kernelcat/import.go b/internal/kernelcat/import.go new file mode 100644 index 0000000..92fc970 --- /dev/null +++ b/internal/kernelcat/import.go @@ -0,0 +1,169 @@ +package kernelcat + +import ( + "encoding/json" + "errors" + "fmt" + "os" + "path/filepath" + "sort" +) + +// DiscoveredArtifacts is what DiscoverPaths returns: absolute paths to a +// kernel, an optional initrd, and an optional modules directory located +// under the staged output of make-*-kernel.sh (or an equivalent layout). +type DiscoveredArtifacts struct { + KernelPath string + InitrdPath string + ModulesDir string +} + +// metadataFile is the optional JSON a kernel-build script can drop +// alongside its staged output to point ReadLocal at specific filenames +// without guessing. +type metadataFile struct { + KernelPath string `json:"kernel_path"` + InitrdPath string `json:"initrd_path"` + ModulesDir string `json:"modules_dir"` +} + +// DiscoverPaths locates kernel / initrd / modules artifacts under fromDir. +// It prefers a metadata.json emitted by make-*-kernel.sh; otherwise it +// falls back to globbing boot/vmlinux-*, boot/vmlinuz-* (for Alpine), +// boot/initramfs-*, and the newest subdir under lib/modules/. 
+func DiscoverPaths(fromDir string) (DiscoveredArtifacts, error) { + info, err := os.Stat(fromDir) + if err != nil { + return DiscoveredArtifacts{}, err + } + if !info.IsDir() { + return DiscoveredArtifacts{}, fmt.Errorf("%s is not a directory", fromDir) + } + + if paths, ok, err := discoverFromMetadata(fromDir); err != nil { + return DiscoveredArtifacts{}, err + } else if ok { + return paths, nil + } + + bootDir := filepath.Join(fromDir, "boot") + kernel, err := latestMatch(bootDir, []string{"vmlinux-*", "vmlinuz-*"}) + if err != nil { + return DiscoveredArtifacts{}, fmt.Errorf("locate kernel under %s: %w", bootDir, err) + } + initrd, err := latestMatch(bootDir, []string{"initramfs-*"}) + if err != nil && !errors.Is(err, os.ErrNotExist) { + return DiscoveredArtifacts{}, fmt.Errorf("locate initrd under %s: %w", bootDir, err) + } + modules, err := latestSubdir(filepath.Join(fromDir, "lib", "modules")) + if err != nil && !errors.Is(err, os.ErrNotExist) { + return DiscoveredArtifacts{}, fmt.Errorf("locate modules under %s: %w", fromDir, err) + } + return DiscoveredArtifacts{ + KernelPath: kernel, + InitrdPath: initrd, + ModulesDir: modules, + }, nil +} + +func discoverFromMetadata(fromDir string) (DiscoveredArtifacts, bool, error) { + data, err := os.ReadFile(filepath.Join(fromDir, "metadata.json")) + if err != nil { + if errors.Is(err, os.ErrNotExist) { + return DiscoveredArtifacts{}, false, nil + } + return DiscoveredArtifacts{}, false, err + } + var meta metadataFile + if err := json.Unmarshal(data, &meta); err != nil { + return DiscoveredArtifacts{}, false, fmt.Errorf("parse metadata.json in %s: %w", fromDir, err) + } + kernel := absoluteOrAnchored(fromDir, meta.KernelPath) + if kernel == "" { + return DiscoveredArtifacts{}, false, nil + } + if _, err := os.Stat(kernel); err != nil { + return DiscoveredArtifacts{}, false, fmt.Errorf("metadata.json references missing kernel %s: %w", kernel, err) + } + out := DiscoveredArtifacts{KernelPath: kernel} + if 
meta.InitrdPath != "" { + candidate := absoluteOrAnchored(fromDir, meta.InitrdPath) + if _, err := os.Stat(candidate); err == nil { + out.InitrdPath = candidate + } + } + if meta.ModulesDir != "" { + candidate := absoluteOrAnchored(fromDir, meta.ModulesDir) + if info, err := os.Stat(candidate); err == nil && info.IsDir() { + out.ModulesDir = candidate + } + } + return out, true, nil +} + +// absoluteOrAnchored returns path as-is if absolute; otherwise joins it to +// anchor. Empty input returns "". +func absoluteOrAnchored(anchor, path string) string { + path = filepath.Clean(path) + if path == "" || path == "." { + return "" + } + if filepath.IsAbs(path) { + return path + } + return filepath.Join(anchor, path) +} + +// latestMatch returns the lexicographically latest file in dir matching any +// of patterns (filename globs, not full paths). Returns os.ErrNotExist if no +// match. +func latestMatch(dir string, patterns []string) (string, error) { + if _, err := os.Stat(dir); err != nil { + return "", err + } + entries, err := os.ReadDir(dir) + if err != nil { + return "", err + } + var matches []string + for _, entry := range entries { + if entry.IsDir() { + continue + } + for _, pattern := range patterns { + ok, _ := filepath.Match(pattern, entry.Name()) + if ok { + matches = append(matches, entry.Name()) + break + } + } + } + if len(matches) == 0 { + return "", os.ErrNotExist + } + sort.Strings(matches) + return filepath.Join(dir, matches[len(matches)-1]), nil +} + +// latestSubdir returns the lexicographically latest subdirectory of root. +// Returns os.ErrNotExist if root is missing or has no subdirs. 
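One caveat worth keeping in mind with `latestMatch` / `latestSubdir`: "lexicographically latest" is not version order once a version component reaches two digits. A minimal sketch of the same selection rule shows the edge:

```go
package main

import (
	"fmt"
	"sort"
)

// pickLatest applies the same rule as latestMatch/latestSubdir:
// plain sort.Strings, last element wins.
func pickLatest(names []string) string {
	sorted := append([]string(nil), names...)
	sort.Strings(sorted)
	return sorted[len(sorted)-1]
}

func main() {
	// Fine for the common case:
	fmt.Println(pickLatest([]string{"vmlinux-6.12.0", "vmlinux-6.12.1"}))
	// But lexicographic order is not version order: "9" sorts after "10".
	fmt.Println(pickLatest([]string{"vmlinux-6.12.10", "vmlinux-6.12.9"}))
}
```

So a staged tree holding both 6.12.9 and 6.12.10 resolves to 6.12.9; pinning exact filenames via metadata.json sidesteps the ambiguity.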
+func latestSubdir(root string) (string, error) { + if _, err := os.Stat(root); err != nil { + return "", err + } + entries, err := os.ReadDir(root) + if err != nil { + return "", err + } + var dirs []string + for _, entry := range entries { + if entry.IsDir() { + dirs = append(dirs, entry.Name()) + } + } + if len(dirs) == 0 { + return "", os.ErrNotExist + } + sort.Strings(dirs) + return filepath.Join(root, dirs[len(dirs)-1]), nil +} diff --git a/internal/kernelcat/import_test.go b/internal/kernelcat/import_test.go new file mode 100644 index 0000000..5147f6d --- /dev/null +++ b/internal/kernelcat/import_test.go @@ -0,0 +1,133 @@ +package kernelcat + +import ( + "errors" + "os" + "path/filepath" + "testing" +) + +func writeFile(t *testing.T, path string, data string) { + t.Helper() + if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil { + t.Fatal(err) + } + if err := os.WriteFile(path, []byte(data), 0o644); err != nil { + t.Fatal(err) + } +} + +func TestDiscoverPathsPrefersMetadataJSON(t *testing.T) { + t.Parallel() + dir := t.TempDir() + writeFile(t, filepath.Join(dir, "boot", "vmlinux-custom"), "ignored") + writeFile(t, filepath.Join(dir, "boot", "initramfs-custom"), "ignored") + writeFile(t, filepath.Join(dir, "boot", "vmlinux-pick-me"), "kernel") + writeFile(t, filepath.Join(dir, "boot", "initramfs-pick-me"), "initrd") + if err := os.MkdirAll(filepath.Join(dir, "lib", "modules", "6.12.79_1"), 0o755); err != nil { + t.Fatal(err) + } + metadata := `{ +"kernel_path": "boot/vmlinux-pick-me", +"initrd_path": "boot/initramfs-pick-me", +"modules_dir": "lib/modules/6.12.79_1" +}` + writeFile(t, filepath.Join(dir, "metadata.json"), metadata) + + got, err := DiscoverPaths(dir) + if err != nil { + t.Fatalf("DiscoverPaths: %v", err) + } + if got.KernelPath != filepath.Join(dir, "boot", "vmlinux-pick-me") { + t.Errorf("KernelPath = %q", got.KernelPath) + } + if got.InitrdPath != filepath.Join(dir, "boot", "initramfs-pick-me") { + t.Errorf("InitrdPath = %q", 
got.InitrdPath) + } + if got.ModulesDir != filepath.Join(dir, "lib", "modules", "6.12.79_1") { + t.Errorf("ModulesDir = %q", got.ModulesDir) + } +} + +func TestDiscoverPathsFallsBackToGlobbing(t *testing.T) { + t.Parallel() + dir := t.TempDir() + writeFile(t, filepath.Join(dir, "boot", "vmlinux-6.12.0"), "k") + writeFile(t, filepath.Join(dir, "boot", "vmlinux-6.12.1"), "newer") + writeFile(t, filepath.Join(dir, "boot", "initramfs-6.12.1"), "i") + if err := os.MkdirAll(filepath.Join(dir, "lib", "modules", "6.12.0"), 0o755); err != nil { + t.Fatal(err) + } + if err := os.MkdirAll(filepath.Join(dir, "lib", "modules", "6.12.1"), 0o755); err != nil { + t.Fatal(err) + } + + got, err := DiscoverPaths(dir) + if err != nil { + t.Fatalf("DiscoverPaths: %v", err) + } + if got.KernelPath != filepath.Join(dir, "boot", "vmlinux-6.12.1") { + t.Errorf("KernelPath = %q, want latest", got.KernelPath) + } + if got.InitrdPath != filepath.Join(dir, "boot", "initramfs-6.12.1") { + t.Errorf("InitrdPath = %q", got.InitrdPath) + } + if got.ModulesDir != filepath.Join(dir, "lib", "modules", "6.12.1") { + t.Errorf("ModulesDir = %q, want latest subdir", got.ModulesDir) + } +} + +func TestDiscoverPathsAlpineVmlinuzFallback(t *testing.T) { + t.Parallel() + dir := t.TempDir() + // Alpine older layouts may only ship vmlinuz-virt. 
+ writeFile(t, filepath.Join(dir, "boot", "vmlinuz-virt"), "k") + writeFile(t, filepath.Join(dir, "boot", "initramfs-virt"), "i") + + got, err := DiscoverPaths(dir) + if err != nil { + t.Fatalf("DiscoverPaths: %v", err) + } + if got.KernelPath != filepath.Join(dir, "boot", "vmlinuz-virt") { + t.Errorf("KernelPath = %q, want vmlinuz-virt fallback", got.KernelPath) + } +} + +func TestDiscoverPathsMissingKernelIsError(t *testing.T) { + t.Parallel() + dir := t.TempDir() + // boot/ exists but contains no kernel + if err := os.MkdirAll(filepath.Join(dir, "boot"), 0o755); err != nil { + t.Fatal(err) + } + _, err := DiscoverPaths(dir) + if err == nil { + t.Fatal("expected error when no kernel present") + } + if !errors.Is(err, os.ErrNotExist) && !containsErr(err, "locate kernel") { + t.Fatalf("error shape: %v", err) + } +} + +func TestDiscoverPathsNotADirectory(t *testing.T) { + t.Parallel() + path := filepath.Join(t.TempDir(), "file") + writeFile(t, path, "") + _, err := DiscoverPaths(path) + if err == nil { + t.Fatal("expected error when fromDir is a file") + } +} + +func containsErr(err error, substr string) bool { + return err != nil && (err.Error() == substr || len(err.Error()) >= len(substr) && errContains(err.Error(), substr)) +} + +func errContains(s, substr string) bool { + for i := 0; i+len(substr) <= len(s); i++ { + if s[i:i+len(substr)] == substr { + return true + } + } + return false +} diff --git a/internal/kernelcat/kernelcat.go b/internal/kernelcat/kernelcat.go new file mode 100644 index 0000000..d203134 --- /dev/null +++ b/internal/kernelcat/kernelcat.go @@ -0,0 +1,184 @@ +// Package kernelcat is the on-disk catalog of Firecracker-ready kernel +// bundles. Each entry lives at // and contains a +// manifest.json alongside the vmlinux, optional initrd.img, and optional +// modules/ tree. The package owns the layout, manifest read/write, and +// validation; it does not talk to the network (remote pulls are layered on +// later). 
+package kernelcat + +import ( + "crypto/sha256" + "encoding/hex" + "encoding/json" + "errors" + "fmt" + "io" + "os" + "path/filepath" + "regexp" + "sort" + "strings" + "time" +) + +// Filenames used inside an entry directory. +const ( + manifestFilename = "manifest.json" + kernelFilename = "vmlinux" + initrdFilename = "initrd.img" + modulesDirName = "modules" +) + +// Entry describes a cataloged kernel bundle. Paths are absolute and +// populated from the entry's on-disk layout when read via ReadLocal / +// ListLocal; they are never written into the manifest itself. +type Entry struct { + Name string `json:"name"` + Distro string `json:"distro,omitempty"` + Arch string `json:"arch,omitempty"` + KernelVersion string `json:"kernel_version,omitempty"` + SHA256 string `json:"sha256,omitempty"` + Source string `json:"source,omitempty"` + ImportedAt time.Time `json:"imported_at"` + + // Populated on read, not persisted: + KernelPath string `json:"-"` + InitrdPath string `json:"-"` + ModulesDir string `json:"-"` +} + +// namePattern matches names that are safe as single filesystem components. +// Intentionally strict so entry names stay short and script-friendly. +var namePattern = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9._-]{0,63}$`) + +// ValidateName returns an error unless name is a non-empty identifier made +// of alphanumerics, dots, hyphens, and underscores, starting with an +// alphanumeric and at most 64 characters long. +func ValidateName(name string) error { + if strings.TrimSpace(name) == "" { + return errors.New("kernel name is required") + } + if !namePattern.MatchString(name) { + return fmt.Errorf("invalid kernel name %q: use alphanumerics, dots, hyphens, underscores (<=64 chars, starts with alphanumeric)", name) + } + return nil +} + +// EntryDir returns the absolute directory path for name under kernelsDir. 
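The naming rule can be exercised against the same expression; mixed case is accepted, while anything that could change the path shape is rejected:

```go
package main

import (
	"fmt"
	"regexp"
)

// Same expression as namePattern above: leading alphanumeric, then up
// to 63 of [a-zA-Z0-9._-], keeping names single filesystem components.
var namePattern = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9._-]{0,63}$`)

func main() {
	for _, name := range []string{"void-6.12", "alpine_3.20", "../escape", "-dash", "has space"} {
		fmt.Printf("%-12q %v\n", name, namePattern.MatchString(name))
	}
}
```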
+func EntryDir(kernelsDir, name string) string { + return filepath.Join(kernelsDir, name) +} + +// ReadLocal reads the manifest for name and resolves per-artifact paths. +// Returns os.ErrNotExist-compatible error if the entry is missing. +func ReadLocal(kernelsDir, name string) (Entry, error) { + if err := ValidateName(name); err != nil { + return Entry{}, err + } + dir := EntryDir(kernelsDir, name) + data, err := os.ReadFile(filepath.Join(dir, manifestFilename)) + if err != nil { + return Entry{}, err + } + var entry Entry + if err := json.Unmarshal(data, &entry); err != nil { + return Entry{}, fmt.Errorf("parse manifest for %q: %w", name, err) + } + if entry.Name == "" { + entry.Name = name + } + if entry.Name != name { + return Entry{}, fmt.Errorf("manifest name %q does not match directory %q", entry.Name, name) + } + entry.KernelPath = filepath.Join(dir, kernelFilename) + if fi, err := os.Stat(filepath.Join(dir, initrdFilename)); err == nil && !fi.IsDir() { + entry.InitrdPath = filepath.Join(dir, initrdFilename) + } + if fi, err := os.Stat(filepath.Join(dir, modulesDirName)); err == nil && fi.IsDir() { + entry.ModulesDir = filepath.Join(dir, modulesDirName) + } + return entry, nil +} + +// ListLocal returns every entry under kernelsDir with a readable manifest, +// sorted by name. Directories without a manifest are skipped silently so +// partial imports don't break the list. 
+func ListLocal(kernelsDir string) ([]Entry, error) { + dirEntries, err := os.ReadDir(kernelsDir) + if err != nil { + if os.IsNotExist(err) { + return nil, nil + } + return nil, err + } + entries := make([]Entry, 0, len(dirEntries)) + for _, de := range dirEntries { + if !de.IsDir() { + continue + } + name := de.Name() + if err := ValidateName(name); err != nil { + continue + } + entry, err := ReadLocal(kernelsDir, name) + if err != nil { + if os.IsNotExist(err) { + continue + } + return nil, err + } + entries = append(entries, entry) + } + sort.Slice(entries, func(i, j int) bool { return entries[i].Name < entries[j].Name }) + return entries, nil +} + +// WriteLocal persists entry's manifest.json. The caller is responsible for +// placing vmlinux / initrd.img / modules/ under the entry dir first. +func WriteLocal(kernelsDir string, entry Entry) error { + if err := ValidateName(entry.Name); err != nil { + return err + } + dir := EntryDir(kernelsDir, entry.Name) + if err := os.MkdirAll(dir, 0o755); err != nil { + return err + } + if entry.ImportedAt.IsZero() { + entry.ImportedAt = time.Now().UTC() + } + data, err := json.MarshalIndent(entry, "", " ") + if err != nil { + return err + } + return os.WriteFile(filepath.Join(dir, manifestFilename), append(data, '\n'), 0o644) +} + +// DeleteLocal removes the entry directory entirely. Missing entries are a +// no-op so callers can idempotently clean up. +func DeleteLocal(kernelsDir, name string) error { + if err := ValidateName(name); err != nil { + return err + } + dir := EntryDir(kernelsDir, name) + if _, err := os.Stat(dir); err != nil { + if os.IsNotExist(err) { + return nil + } + return err + } + return os.RemoveAll(dir) +} + +// SumFile returns the hex-encoded SHA256 of the file at path. 
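SumFile streams through sha256 without loading the file; its in-memory analogue, shown here for clarity, produces the same 64-character hex digest:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sumBytes is the in-memory analogue of SumFile: hex-encoded SHA256.
func sumBytes(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(sumBytes([]byte("banger")))
}
```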
+func SumFile(path string) (string, error) { + f, err := os.Open(path) + if err != nil { + return "", err + } + defer f.Close() + hasher := sha256.New() + if _, err := io.Copy(hasher, f); err != nil { + return "", err + } + return hex.EncodeToString(hasher.Sum(nil)), nil +} diff --git a/internal/kernelcat/kernelcat_test.go b/internal/kernelcat/kernelcat_test.go new file mode 100644 index 0000000..ee935d5 --- /dev/null +++ b/internal/kernelcat/kernelcat_test.go @@ -0,0 +1,171 @@ +package kernelcat + +import ( + "errors" + "os" + "path/filepath" + "testing" + "time" +) + +func TestValidateName(t *testing.T) { + t.Parallel() + cases := []struct { + name string + wantErr bool + }{ + {"void-6.12", false}, + {"alpine_3.20", false}, + {"a", false}, + {"Void-6.12", false}, + {"", true}, + {"-leading-dash", true}, + {".leading-dot", true}, + {"has space", true}, + {"has/slash", true}, + {"../escape", true}, + } + for _, tc := range cases { + err := ValidateName(tc.name) + if tc.wantErr && err == nil { + t.Errorf("ValidateName(%q) err=nil, want error", tc.name) + } + if !tc.wantErr && err != nil { + t.Errorf("ValidateName(%q) err=%v, want nil", tc.name, err) + } + } +} + +func TestWriteAndReadLocalRoundTrip(t *testing.T) { + t.Parallel() + dir := t.TempDir() + entry := Entry{ + Name: "void-6.12", + Distro: "void", + Arch: "x86_64", + KernelVersion: "6.12.79_1", + SHA256: "deadbeef", + Source: "import:testdata", + } + if err := WriteLocal(dir, entry); err != nil { + t.Fatalf("WriteLocal: %v", err) + } + + kernelPath := filepath.Join(dir, entry.Name, "vmlinux") + if err := os.WriteFile(kernelPath, []byte("kernel-bytes"), 0o644); err != nil { + t.Fatalf("write kernel: %v", err) + } + modulesPath := filepath.Join(dir, entry.Name, "modules", "6.12.79_1", "modules.dep") + if err := os.MkdirAll(filepath.Dir(modulesPath), 0o755); err != nil { + t.Fatalf("mkdir modules: %v", err) + } + if err := os.WriteFile(modulesPath, []byte(""), 0o644); err != nil { + t.Fatalf("write modules 
stub: %v", err) + } + + got, err := ReadLocal(dir, entry.Name) + if err != nil { + t.Fatalf("ReadLocal: %v", err) + } + if got.Name != entry.Name || got.Distro != "void" || got.KernelVersion != "6.12.79_1" { + t.Fatalf("ReadLocal round-trip mismatch: %+v", got) + } + if got.KernelPath != kernelPath { + t.Errorf("KernelPath = %q, want %q", got.KernelPath, kernelPath) + } + if got.InitrdPath != "" { + t.Errorf("InitrdPath = %q, want empty (no initrd on disk)", got.InitrdPath) + } + if got.ModulesDir != filepath.Join(dir, entry.Name, "modules") { + t.Errorf("ModulesDir = %q", got.ModulesDir) + } + if got.ImportedAt.IsZero() { + t.Errorf("ImportedAt not populated by WriteLocal") + } + if time.Since(got.ImportedAt) > time.Minute { + t.Errorf("ImportedAt too far in the past: %v", got.ImportedAt) + } +} + +func TestReadLocalRejectsMismatchedName(t *testing.T) { + t.Parallel() + dir := t.TempDir() + entryDir := filepath.Join(dir, "void-6.12") + if err := os.MkdirAll(entryDir, 0o755); err != nil { + t.Fatal(err) + } + if err := os.WriteFile(filepath.Join(entryDir, manifestFilename), []byte(`{"name":"other"}`), 0o644); err != nil { + t.Fatal(err) + } + if _, err := ReadLocal(dir, "void-6.12"); err == nil { + t.Fatal("ReadLocal should reject manifest with mismatched name") + } +} + +func TestListLocalSkipsManifestless(t *testing.T) { + t.Parallel() + dir := t.TempDir() + if err := WriteLocal(dir, Entry{Name: "alpine-3.20"}); err != nil { + t.Fatal(err) + } + if err := os.MkdirAll(filepath.Join(dir, "orphan"), 0o755); err != nil { + t.Fatal(err) + } + entries, err := ListLocal(dir) + if err != nil { + t.Fatalf("ListLocal: %v", err) + } + if len(entries) != 1 || entries[0].Name != "alpine-3.20" { + t.Fatalf("ListLocal = %+v, want one alpine-3.20", entries) + } +} + +func TestListLocalReturnsEmptyForMissingDir(t *testing.T) { + t.Parallel() + entries, err := ListLocal(filepath.Join(t.TempDir(), "nope")) + if err != nil { + t.Fatalf("ListLocal: %v", err) + } + if len(entries) != 
0 { + t.Fatalf("ListLocal = %v, want empty", entries) + } +} + +func TestDeleteLocalRemovesEntry(t *testing.T) { + t.Parallel() + dir := t.TempDir() + if err := WriteLocal(dir, Entry{Name: "void-6.12"}); err != nil { + t.Fatal(err) + } + if err := DeleteLocal(dir, "void-6.12"); err != nil { + t.Fatalf("DeleteLocal: %v", err) + } + if _, err := os.Stat(EntryDir(dir, "void-6.12")); !errors.Is(err, os.ErrNotExist) { + t.Fatalf("expected entry dir removed, stat err=%v", err) + } +} + +func TestDeleteLocalIdempotent(t *testing.T) { + t.Parallel() + if err := DeleteLocal(t.TempDir(), "never-existed"); err != nil { + t.Fatalf("DeleteLocal on missing entry: %v", err) + } +} + +func TestSumFileMatchesSHA256(t *testing.T) { + t.Parallel() + dir := t.TempDir() + path := filepath.Join(dir, "data") + if err := os.WriteFile(path, []byte("banger"), 0o644); err != nil { + t.Fatal(err) + } + sum, err := SumFile(path) + if err != nil { + t.Fatalf("SumFile: %v", err) + } + // precomputed sha256("banger") + const want = "e0c69eae8afb38872fa425c2cdba794176f3b9d97e8eefb7b0e7c831f566458f" + if sum != want { + t.Fatalf("SumFile = %q, want %q", sum, want) + } +} diff --git a/internal/model/types.go b/internal/model/types.go index 0cfb904..e533602 100644 --- a/internal/model/types.go +++ b/internal/model/types.go @@ -11,18 +11,18 @@ import ( ) const ( - DefaultBridgeName = "br-fc" - DefaultBridgeIP = "172.16.0.1" - DefaultCIDR = "24" - DefaultDNS = "1.1.1.1" - DefaultSystemOverlaySize = 8 * 1024 * 1024 * 1024 - DefaultWorkDiskSize = 8 * 1024 * 1024 * 1024 - DefaultMemoryMiB = 2048 - DefaultVCPUCount = 2 - DefaultStatsPollInterval = 10 * time.Second - DefaultStaleSweepInterval = 1 * time.Minute - DefaultMetricsPollInterval = 15 * time.Second - MaxDiskBytes int64 = 128 * 1024 * 1024 * 1024 + DefaultBridgeName = "br-fc" + DefaultBridgeIP = "172.16.0.1" + DefaultCIDR = "24" + DefaultDNS = "1.1.1.1" + DefaultSystemOverlaySize = 8 * 1024 * 1024 * 1024 + DefaultWorkDiskSize = 8 * 1024 * 1024 * 
1024 + DefaultMemoryMiB = 2048 + DefaultVCPUCount = 2 + DefaultStatsPollInterval = 10 * time.Second + DefaultStaleSweepInterval = 1 * time.Minute + MaxDiskBytes int64 = 128 * 1024 * 1024 * 1024 + DefaultJailerBinary = "/usr/bin/jailer" ) type VMState string @@ -35,19 +35,37 @@ const ( ) type DaemonConfig struct { - LogLevel string - WebListenAddr string - FirecrackerBin string - SSHKeyPath string - AutoStopStaleAfter time.Duration - StatsPollInterval time.Duration - MetricsPollInterval time.Duration - BridgeName string - BridgeIP string - CIDR string - TapPoolSize int - DefaultDNS string - DefaultImageName string + LogLevel string + FirecrackerBin string + JailerBin string + JailerEnabled bool + JailerChrootBase string + SSHKeyPath string + HostHomeDir string + AutoStopStaleAfter time.Duration + StatsPollInterval time.Duration + BridgeName string + BridgeIP string + CIDR string + TapPoolSize int + DefaultDNS string + DefaultImageName string + FileSync []FileSyncEntry + VMDefaults VMDefaultsOverride +} + +// FileSyncEntry is a user-declared host→guest file or directory copy +// applied to each VM's work disk at vm create time. Host is expanded +// against the configured owner home for "~/..." and must stay within +// that home; Guest is expanded against /root (banger VMs are +// single-user root). If the host path is a directory, it's copied +// recursively; if it's a file, it's copied as a file. Missing host +// paths are a soft skip (warned, not fatal). Mode defaults to 0600 +// for files and 0755 for directories. 
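A sketch of the host-side "~/..." expansion plus containment rule described in the FileSync comment; the helper name and error strings are illustrative, not the daemon's actual code:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// expandHostPath anchors a "~/..." host path at home and rejects any
// result that escapes it, mirroring the FileSync host-path rule.
func expandHostPath(home, raw string) (string, error) {
	if raw == "~" {
		return home, nil
	}
	if !strings.HasPrefix(raw, "~/") {
		return "", fmt.Errorf("host path %q must be relative to ~", raw)
	}
	p := filepath.Clean(filepath.Join(home, raw[2:]))
	if p != home && !strings.HasPrefix(p, home+string(filepath.Separator)) {
		return "", fmt.Errorf("host path %q escapes %s", raw, home)
	}
	return p, nil
}

func main() {
	fmt.Println(expandHostPath("/home/dev", "~/.config/git/config"))
	fmt.Println(expandHostPath("/home/dev", "~/../../etc/shadow")) // rejected
}
```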
+type FileSyncEntry struct { + Host string + Guest string + Mode string } type Image struct { @@ -62,7 +80,6 @@ type Image struct { ModulesDir string `json:"modules_dir,omitempty"` BuildSize string `json:"build_size,omitempty"` SeededSSHPublicKeyFingerprint string `json:"seeded_ssh_public_key_fingerprint,omitempty"` - Docker bool `json:"docker"` CreatedAt time.Time `json:"created_at"` UpdatedAt time.Time `json:"updated_at"` } @@ -75,25 +92,40 @@ type VMSpec struct { NATEnabled bool `json:"nat_enabled"` } +// VMRuntime holds the durable runtime state that the daemon needs +// to reach a VM: identity, declared state, and deterministic derived +// paths. The authoritative live handle set still lives on VMHandles, +// but teardown-critical storage/network identifiers are mirrored here +// as recovery fallbacks so restart-time cleanup still works when +// handles.json is missing or corrupt. +// +// Everything in VMRuntime is safe to persist: the paths are +// deterministic from (VM ID, layout) and survive restart unchanged; +// GuestIP and DNSName are assigned at create time and never move; +// LastError carries the last failure message for debugging. State +// mirrors VMRecord.State. 
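The "handles authoritative, runtime mirror as fallback" rule can be sketched with hypothetical minimal types; real recovery would cover every teardown field, not just the tap:

```go
package main

import "fmt"

// Hypothetical stand-ins for the live handle cache and the persisted
// VMRuntime mirror; field names follow the VMRuntime struct.
type liveHandles struct{ TapDevice, DMName string }
type runtimeMirror struct{ TapDevice, DMName string }

// tapForTeardown prefers the live handle set and only falls back to
// the persisted mirror, matching the recovery rule described above.
func tapForTeardown(h *liveHandles, rt runtimeMirror) string {
	if h != nil && h.TapDevice != "" {
		return h.TapDevice
	}
	return rt.TapDevice
}

func main() {
	rt := runtimeMirror{TapDevice: "tap-vm01"}
	fmt.Println(tapForTeardown(nil, rt))                                // restart path: handles.json lost
	fmt.Println(tapForTeardown(&liveHandles{TapDevice: "tap-live"}, rt)) // normal path
}
```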
type VMRuntime struct { - State VMState `json:"state"` - PID int `json:"pid,omitempty"` - GuestIP string `json:"guest_ip"` - TapDevice string `json:"tap_device,omitempty"` - APISockPath string `json:"api_sock_path,omitempty"` - VSockPath string `json:"vsock_path,omitempty"` - VSockCID uint32 `json:"vsock_cid,omitempty"` - LogPath string `json:"log_path,omitempty"` - MetricsPath string `json:"metrics_path,omitempty"` - DNSName string `json:"dns_name,omitempty"` - VMDir string `json:"vm_dir"` - SystemOverlay string `json:"system_overlay_path"` - WorkDiskPath string `json:"work_disk_path"` - BaseLoop string `json:"base_loop,omitempty"` - COWLoop string `json:"cow_loop,omitempty"` - DMName string `json:"dm_name,omitempty"` - DMDev string `json:"dm_dev,omitempty"` - LastError string `json:"last_error,omitempty"` + State VMState `json:"state"` + GuestIP string `json:"guest_ip"` + APISockPath string `json:"api_sock_path,omitempty"` + VSockPath string `json:"vsock_path,omitempty"` + VSockCID uint32 `json:"vsock_cid,omitempty"` + LogPath string `json:"log_path,omitempty"` + MetricsPath string `json:"metrics_path,omitempty"` + DNSName string `json:"dns_name,omitempty"` + VMDir string `json:"vm_dir"` + // Teardown fallback fields mirror the handle cache onto the VM row. + // They are recovery-only: while the daemon is alive, VMHandles stays + // authoritative. On restart, cleanup can fall back to these values if + // handles.json is missing or corrupt. 
+ TapDevice string `json:"tap_device,omitempty"` + BaseLoop string `json:"base_loop,omitempty"` + COWLoop string `json:"cow_loop,omitempty"` + DMName string `json:"dm_name,omitempty"` + DMDev string `json:"dm_dev,omitempty"` + SystemOverlay string `json:"system_overlay_path"` + WorkDiskPath string `json:"work_disk_path"` + LastError string `json:"last_error,omitempty"` } type VMStats struct { @@ -107,16 +139,17 @@ type VMStats struct { } type VMRecord struct { - ID string `json:"id"` - Name string `json:"name"` - ImageID string `json:"image_id"` - State VMState `json:"state"` - CreatedAt time.Time `json:"created_at"` - UpdatedAt time.Time `json:"updated_at"` - LastTouchedAt time.Time `json:"last_touched_at"` - Spec VMSpec `json:"spec"` - Runtime VMRuntime `json:"runtime"` - Stats VMStats `json:"stats"` + ID string `json:"id"` + Name string `json:"name"` + ImageID string `json:"image_id"` + State VMState `json:"state"` + CreatedAt time.Time `json:"created_at"` + UpdatedAt time.Time `json:"updated_at"` + LastTouchedAt time.Time `json:"last_touched_at"` + Spec VMSpec `json:"spec"` + Runtime VMRuntime `json:"runtime"` + Stats VMStats `json:"stats"` + Workspace VMWorkspace `json:"workspace"` } type VMCreateRequest struct { @@ -138,14 +171,38 @@ type VMSetRequest struct { NATEnabled *bool } -type ImageBuildRequest struct { - Name string - FromImage string - Size string - KernelPath string - InitrdPath string - ModulesDir string - Docker bool +// VMWorkspace records the last successful workspace.prepare result on +// a VM so callers can skip re-stating the source path on every exec +// and so banger can detect drift between the guest copy and the host +// repo. Stored as workspace_json in the vms table; zero value means +// no workspace has been prepared on this VM yet. 
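On the caller side, the "skip re-stating the source path / detect drift" use the VMWorkspace comment describes might look like this; the field subset and helper are illustrative, not the actual banger code:

```go
package main

import "fmt"

// Subset of the VMWorkspace record, enough for the check.
type workspace struct {
	GuestPath  string
	HeadCommit string
}

// needsPrepare: a zero-value workspace means nothing was prepared
// yet; a HEAD that moved on the host means the guest copy drifted.
func needsPrepare(ws workspace, hostHead string) bool {
	if ws.GuestPath == "" {
		return true
	}
	return ws.HeadCommit != hostHead
}

func main() {
	fmt.Println(needsPrepare(workspace{}, "abc123"))
	ws := workspace{GuestPath: "/root/repo", HeadCommit: "abc123"}
	fmt.Println(needsPrepare(ws, "abc123"), needsPrepare(ws, "def456"))
}
```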
+type VMWorkspace struct { + GuestPath string `json:"guest_path,omitempty"` + SourcePath string `json:"source_path,omitempty"` + HeadCommit string `json:"head_commit,omitempty"` + PreparedAt time.Time `json:"prepared_at,omitempty"` +} + +type WorkspacePrepareMode string + +const ( + WorkspacePrepareModeShallowOverlay WorkspacePrepareMode = "shallow_overlay" + WorkspacePrepareModeFullCopy WorkspacePrepareMode = "full_copy" + WorkspacePrepareModeMetadataOnly WorkspacePrepareMode = "metadata_only" +) + +type WorkspacePrepareResult struct { + VMID string `json:"vm_id"` + SourcePath string `json:"source_path"` + RepoRoot string `json:"repo_root"` + RepoName string `json:"repo_name"` + GuestPath string `json:"guest_path"` + Mode WorkspacePrepareMode `json:"mode"` + HeadCommit string `json:"head_commit,omitempty"` + CurrentBranch string `json:"current_branch,omitempty"` + BranchName string `json:"branch_name,omitempty"` + BaseCommit string `json:"base_commit,omitempty"` + PreparedAt time.Time `json:"prepared_at"` } func Now() time.Time { @@ -160,6 +217,21 @@ func NewID() (string, error) { return hex.EncodeToString(buf), nil } +// NewOpID returns a short identifier for tracing a single RPC +// operation across the daemon, the root helper, and the user-visible +// CLI error string. Format: "op-" + 12 hex chars (48 bits of entropy +// — collisions inside one daemon session are vanishingly unlikely +// and don't matter beyond it). Short enough to copy-paste from a +// CLI error into a journalctl --grep, long enough to actually +// disambiguate. 
+func NewOpID() (string, error) {
+	buf := make([]byte, 6)
+	if _, err := rand.Read(buf); err != nil {
+		return "", err
+	}
+	return "op-" + hex.EncodeToString(buf), nil
+}
+
 func ParseSize(raw string) (int64, error) {
 	if raw == "" {
 		return 0, errors.New("size is required")
 	}
@@ -168,23 +240,29 @@ func ParseSize(raw string) (int64, error) {
 	if raw == "" {
 		return 0, errors.New("size is required")
 	}
-	unit := raw[len(raw)-1]
+	// Strip an optional "IB" suffix so that "GiB", "MiB", "KiB" work the
+	// same as "G", "M", "K" (case-insensitive after ToUpper).
+	number := strings.TrimSuffix(raw, "IB")
+	if number == "" {
+		// Input was exactly "IB"/"ib"; without this guard the
+		// index below would panic on an empty string.
+		return 0, fmt.Errorf("invalid size: %q", raw)
+	}
+	unit := number[len(number)-1]
 	multiplier := int64(1024 * 1024)
-	number := raw
 	switch unit {
 	case 'K':
 		multiplier = 1024
-		number = raw[:len(raw)-1]
+		number = number[:len(number)-1]
 	case 'M':
 		multiplier = 1024 * 1024
-		number = raw[:len(raw)-1]
+		number = number[:len(number)-1]
 	case 'G':
 		multiplier = 1024 * 1024 * 1024
-		number = raw[:len(raw)-1]
+		number = number[:len(number)-1]
 	default:
 		if unit < '0' || unit > '9' {
 			return 0, fmt.Errorf("unsupported size suffix: %q", string(unit))
 		}
+		number = raw // no suffix stripped — keep original digits-only string
 	}
 	value, err := strconv.ParseInt(number, 10, 64)
 	if err != nil {
diff --git a/internal/model/types_test.go b/internal/model/types_test.go
new file mode 100644
index 0000000..d2c0de4
--- /dev/null
+++ b/internal/model/types_test.go
@@ -0,0 +1,135 @@
+package model
+
+import (
+	"strings"
+	"testing"
+)
+
+func TestParseSize(t *testing.T) {
+	const (
+		kib = int64(1024)
+		mib = int64(1024 * 1024)
+		gib = int64(1024 * 1024 * 1024)
+	)
+
+	cases := []struct {
+		name       string
+		input      string
+		want       int64
+		wantErrSub string
+	}{
+		// Happy path — short suffixes.
+		{"1G", "1G", gib, ""},
+		{"512M", "512M", 512 * mib, ""},
+		{"4K", "4K", 4 * kib, ""},
+		{"4G", "4G", 4 * gib, ""},
+
+		// GiB/MiB/KiB suffixes — parser now accepts these.
+ {"4GiB", "4GiB", 4 * gib, ""}, + {"512MiB", "512MiB", 512 * mib, ""}, + {"4KiB", "4KiB", 4 * kib, ""}, + + // Lowercase — ToUpper normalises; should work like uppercase. + {"lowercase 1g", "1g", gib, ""}, + {"lowercase 512m", "512m", 512 * mib, ""}, + {"lowercase 4gib", "4gib", 4 * gib, ""}, + + // No-suffix — treated as MiB (the parser's default multiplier is 1 MiB). + // "1024" → 1024 MiB, "1" → 1 MiB. + {"no-suffix 1024", "1024", 1024 * mib, ""}, + {"no-suffix 1", "1", mib, ""}, + + // Whitespace trimming. + {"leading space", " 2G", 2 * gib, ""}, + {"trailing space", "2G ", 2 * gib, ""}, + {"both spaces", " 2G ", 2 * gib, ""}, + + // Error cases. + {"empty string", "", 0, "required"}, + {"whitespace only", " ", 0, "required"}, + {"unknown suffix B", "512B", 0, "unsupported size suffix"}, + {"negative", "-1G", 0, "positive"}, + {"zero", "0G", 0, "positive"}, + {"overflow MaxDiskBytes", "129G", 0, "exceeds max"}, + {"non-numeric", "xG", 0, "parse size"}, + } + + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + got, err := ParseSize(tc.input) + if tc.wantErrSub != "" { + if err == nil { + t.Fatalf("ParseSize(%q) = %d, want error containing %q", tc.input, got, tc.wantErrSub) + } + if !strings.Contains(err.Error(), tc.wantErrSub) { + t.Fatalf("ParseSize(%q) error = %q, want substring %q", tc.input, err.Error(), tc.wantErrSub) + } + return + } + if err != nil { + t.Fatalf("ParseSize(%q) unexpected error: %v", tc.input, err) + } + if got != tc.want { + t.Fatalf("ParseSize(%q) = %d, want %d", tc.input, got, tc.want) + } + }) + } +} + +func TestFormatSizeBytes(t *testing.T) { + const ( + kib = int64(1024) + mib = int64(1024 * 1024) + gib = int64(1024 * 1024 * 1024) + ) + + cases := []struct { + name string + input int64 + want string + }{ + // FormatSizeBytes(0): 0 is divisible by GiB so it formats as "0G". 
+ {"0", 0, "0G"}, + {"1 byte", 1, "1"}, + {"1 KiB", kib, "1K"}, + {"4 KiB", 4 * kib, "4K"}, + {"1 MiB", mib, "1M"}, + {"512 MiB", 512 * mib, "512M"}, + {"1 GiB", gib, "1G"}, + {"4 GiB", 4 * gib, "4G"}, + {"128 GiB (max disk)", 128 * gib, "128G"}, + // Non-round: falls through to raw bytes. + {"non-round bytes", 1500, "1500"}, + {"non-round MiB", 3*mib + 1, "3145729"}, + } + + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + got := FormatSizeBytes(tc.input) + if got != tc.want { + t.Fatalf("FormatSizeBytes(%d) = %q, want %q", tc.input, got, tc.want) + } + }) + } +} + +func TestParseSizeFormatRoundTrip(t *testing.T) { + const ( + kib = int64(1024) + mib = int64(1024 * 1024) + gib = int64(1024 * 1024 * 1024) + ) + + boundaries := []int64{kib, 4 * kib, mib, 512 * mib, gib, 4 * gib, 8 * gib} + for _, n := range boundaries { + formatted := FormatSizeBytes(n) + parsed, err := ParseSize(formatted) + if err != nil { + t.Errorf("ParseSize(FormatSizeBytes(%d) = %q): %v", n, formatted, err) + continue + } + if parsed != n { + t.Errorf("round-trip(%d): FormatSizeBytes → %q → ParseSize → %d", n, formatted, parsed) + } + } +} diff --git a/internal/model/vm_defaults.go b/internal/model/vm_defaults.go new file mode 100644 index 0000000..24e783e --- /dev/null +++ b/internal/model/vm_defaults.go @@ -0,0 +1,134 @@ +package model + +import "fmt" + +// VMDefaults captures the baseline sizing applied to a new VM when the +// user omits the corresponding --vcpu / --memory / --disk-size flags. +// Each field carries a Source tag explaining where the number came +// from so UI layers can surface provenance ("auto" vs "config" vs +// "built-in default"). 
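The tests above pin down the size grammar. Re-stated standalone (outside the package, so names and the surrounding trim/upper-case step are illustrative assumptions), the parse/format pair looks like:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSize mirrors the suffix handling exercised above: trim space,
// upper-case, strip an optional "IB", then apply K/M/G multipliers.
// Digits-only input defaults to MiB. Illustrative sketch only.
func parseSize(raw string) (int64, error) {
	raw = strings.ToUpper(strings.TrimSpace(raw))
	if raw == "" {
		return 0, fmt.Errorf("size is required")
	}
	number := strings.TrimSuffix(raw, "IB")
	if number == "" {
		return 0, fmt.Errorf("invalid size: %q", raw)
	}
	multiplier := int64(1 << 20) // default unit: MiB
	switch number[len(number)-1] {
	case 'K':
		multiplier, number = 1<<10, number[:len(number)-1]
	case 'M':
		multiplier, number = 1<<20, number[:len(number)-1]
	case 'G':
		multiplier, number = 1<<30, number[:len(number)-1]
	}
	value, err := strconv.ParseInt(number, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("parse size: %w", err)
	}
	return value * multiplier, nil
}

// formatSize picks the largest unit that divides evenly, falling back
// to raw bytes. That is why 0 renders as "0G" and 1500 as "1500".
func formatSize(n int64) string {
	switch {
	case n%(1<<30) == 0:
		return fmt.Sprintf("%dG", n/(1<<30))
	case n%(1<<20) == 0:
		return fmt.Sprintf("%dM", n/(1<<20))
	case n%(1<<10) == 0:
		return fmt.Sprintf("%dK", n/(1<<10))
	default:
		return strconv.FormatInt(n, 10)
	}
}

func main() {
	v, _ := parseSize("4GiB")
	fmt.Println(v, formatSize(v)) // 4294967296 4G
}
```

The round-trip test above holds for exactly the values `formatSize` can emit with a suffix; byte-exact values like 1500 survive too, since the digits-only form parses back at a 1-byte granularity only when no rounding occurred.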
+type VMDefaults struct {
+	VCPUCount             int
+	MemoryMiB             int
+	WorkDiskSizeBytes     int64
+	SystemOverlaySizeByte int64
+
+	// Source describes which layer won for each field, one of:
+	//   "config"  — user set it in config.toml
+	//   "auto"    — computed from host resources
+	//   "builtin" — hardcoded fallback
+	VCPUSource          string
+	MemorySource        string
+	WorkDiskSource      string
+	SystemOverlaySource string
+}
+
+// VMDefaultsOverride is the optional block users can place in
+// config.toml's [vm_defaults]. Zero-value fields mean "not set, let
+// banger decide."
+type VMDefaultsOverride struct {
+	VCPUCount             int
+	MemoryMiB             int
+	WorkDiskSizeBytes     int64
+	SystemOverlaySizeByte int64
+}
+
+// ResolveVMDefaults picks effective VM defaults from (in order) the
+// user's config overrides, then host-derived heuristics, then baked-in
+// constants. hostCPUs and hostMemoryBytes are what
+// `system.ReadHostResources` reports; 0 on either is treated as
+// "unknown" and skipped, which pushes that field down to the builtin
+// fallback.
+func ResolveVMDefaults(override VMDefaultsOverride, hostCPUs int, hostMemoryBytes int64) VMDefaults {
+	d := VMDefaults{
+		VCPUCount:             DefaultVCPUCount,
+		MemoryMiB:             DefaultMemoryMiB,
+		WorkDiskSizeBytes:     DefaultWorkDiskSize,
+		SystemOverlaySizeByte: DefaultSystemOverlaySize,
+		VCPUSource:            "builtin",
+		MemorySource:          "builtin",
+		WorkDiskSource:        "builtin",
+		SystemOverlaySource:   "builtin",
+	}
+
+	// vCPU: config > auto > builtin.
+	switch {
+	case override.VCPUCount > 0:
+		d.VCPUCount = override.VCPUCount
+		d.VCPUSource = "config"
+	case hostCPUs > 0:
+		d.VCPUCount = autoVCPU(hostCPUs)
+		d.VCPUSource = "auto"
+	}
+
+	// Memory MiB: config > auto > builtin.
+	switch {
+	case override.MemoryMiB > 0:
+		d.MemoryMiB = override.MemoryMiB
+		d.MemorySource = "config"
+	case hostMemoryBytes > 0:
+		d.MemoryMiB = autoMemoryMiB(hostMemoryBytes)
+		d.MemorySource = "auto"
+	}
+
+	// Work disk: config > builtin. 
Disk is a COW overlay — growing + // the allocation with host RAM gives nothing useful, so no auto. + if override.WorkDiskSizeBytes > 0 { + d.WorkDiskSizeBytes = override.WorkDiskSizeBytes + d.WorkDiskSource = "config" + } + + // System overlay: config > builtin. + if override.SystemOverlaySizeByte > 0 { + d.SystemOverlaySizeByte = override.SystemOverlaySizeByte + d.SystemOverlaySource = "config" + } + + return d +} + +// autoVCPU clamps cpus/4 into [1, 4]. A 2-vcpu sandbox is the sweet +// spot for most dev loops; going higher rarely helps interactive use +// and starves the host of cores. +func autoVCPU(hostCPUs int) int { + candidate := hostCPUs / 4 + if candidate < 1 { + candidate = 1 + } + if candidate > 4 { + candidate = 4 + } + return candidate +} + +// autoMemoryMiB caps at host/8, floor 1 GiB, ceiling 8 GiB. 1/8 leaves +// plenty of headroom for the host even if several VMs run +// concurrently; 8 GiB is enough for most language toolchains without +// being hostile on 32 GiB laptops. +func autoMemoryMiB(hostMemoryBytes int64) int { + const ( + mib = int64(1024 * 1024) + gib = 1024 * mib + floorMiB = 1024 // 1 GiB + cappedMiB = 8 * 1024 // 8 GiB + ) + candidate := hostMemoryBytes / 8 / mib + if candidate < floorMiB { + candidate = floorMiB + } + if candidate > cappedMiB { + candidate = cappedMiB + } + // Round down to 256 MiB multiples for tidier output. + candidate -= candidate % 256 + if candidate < floorMiB { + candidate = floorMiB + } + return int(candidate) +} + +// FormatSpecLine renders a one-line summary of VM sizing suitable for +// progress output or doctor display. 
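The config > auto > builtin chain and the clamps described above can be demonstrated standalone. `builtinVCPU` here is an illustrative stand-in for `DefaultVCPUCount` (whose real value is not shown in this diff):

```go
package main

import "fmt"

// builtinVCPU stands in for the package's DefaultVCPUCount constant.
const builtinVCPU = 2

// autoVCPU clamps hostCPUs/4 into [1, 4], as the heuristic above does.
func autoVCPU(hostCPUs int) int {
	c := hostCPUs / 4
	if c < 1 {
		c = 1
	}
	if c > 4 {
		c = 4
	}
	return c
}

// resolveVCPU mirrors the precedence chain for a single field and
// returns the winning value plus its provenance tag.
func resolveVCPU(override, hostCPUs int) (int, string) {
	switch {
	case override > 0:
		return override, "config"
	case hostCPUs > 0:
		return autoVCPU(hostCPUs), "auto"
	default:
		return builtinVCPU, "builtin"
	}
}

func main() {
	for _, tc := range []struct{ override, host int }{
		{6, 8},  // config wins even when auto is available
		{0, 32}, // auto: 32/4 = 8, clamped to the ceiling of 4
		{0, 0},  // host unknown: builtin fallback
	} {
		n, src := resolveVCPU(tc.override, tc.host)
		fmt.Printf("%d (%s)\n", n, src)
	}
}
```

This prints `6 (config)`, `4 (auto)`, `2 (builtin)` in order, matching the provenance tags the Source fields expose to UI layers.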
+func (d VMDefaults) FormatSpecLine() string { + return fmt.Sprintf("%d vcpu · %d MiB · %s disk", + d.VCPUCount, d.MemoryMiB, FormatSizeBytes(d.WorkDiskSizeBytes)) +} diff --git a/internal/model/vm_defaults_test.go b/internal/model/vm_defaults_test.go new file mode 100644 index 0000000..f7f47d8 --- /dev/null +++ b/internal/model/vm_defaults_test.go @@ -0,0 +1,107 @@ +package model + +import ( + "strings" + "testing" +) + +func TestResolveVMDefaultsBuiltinFallback(t *testing.T) { + // No config override, no host info → every field is "builtin". + d := ResolveVMDefaults(VMDefaultsOverride{}, 0, 0) + + if d.VCPUCount != DefaultVCPUCount || d.VCPUSource != "builtin" { + t.Errorf("vcpu = %d (%s), want %d (builtin)", d.VCPUCount, d.VCPUSource, DefaultVCPUCount) + } + if d.MemoryMiB != DefaultMemoryMiB || d.MemorySource != "builtin" { + t.Errorf("memory = %d (%s), want %d (builtin)", d.MemoryMiB, d.MemorySource, DefaultMemoryMiB) + } + if d.WorkDiskSizeBytes != DefaultWorkDiskSize || d.WorkDiskSource != "builtin" { + t.Errorf("disk = %d (%s), want %d (builtin)", d.WorkDiskSizeBytes, d.WorkDiskSource, DefaultWorkDiskSize) + } +} + +func TestResolveVMDefaultsAutoFromHost(t *testing.T) { + // 8 host cores, 16 GiB RAM → 2 vcpus, 2 GiB memory. + d := ResolveVMDefaults(VMDefaultsOverride{}, 8, 16*gib) + + if d.VCPUCount != 2 || d.VCPUSource != "auto" { + t.Errorf("vcpu = %d (%s), want 2 (auto)", d.VCPUCount, d.VCPUSource) + } + if d.MemoryMiB != 2048 || d.MemorySource != "auto" { + t.Errorf("memory = %d (%s), want 2048 (auto)", d.MemoryMiB, d.MemorySource) + } + // Disk has no auto policy — still builtin. 
+ if d.WorkDiskSource != "builtin" { + t.Errorf("disk source = %s, want builtin", d.WorkDiskSource) + } +} + +func TestResolveVMDefaultsConfigWinsOverAuto(t *testing.T) { + override := VMDefaultsOverride{VCPUCount: 6, MemoryMiB: 4096, WorkDiskSizeBytes: 16 * gib} + d := ResolveVMDefaults(override, 8, 16*gib) + + if d.VCPUCount != 6 || d.VCPUSource != "config" { + t.Errorf("vcpu = %d (%s), want 6 (config)", d.VCPUCount, d.VCPUSource) + } + if d.MemoryMiB != 4096 || d.MemorySource != "config" { + t.Errorf("memory = %d (%s), want 4096 (config)", d.MemoryMiB, d.MemorySource) + } + if d.WorkDiskSizeBytes != 16*gib || d.WorkDiskSource != "config" { + t.Errorf("disk = %d (%s), want 16*gib (config)", d.WorkDiskSizeBytes, d.WorkDiskSource) + } +} + +func TestAutoVCPUClamps(t *testing.T) { + cases := []struct { + host, want int + }{ + {1, 1}, // floor + {2, 1}, + {4, 1}, + {5, 1}, + {7, 1}, + {8, 2}, + {16, 4}, + {32, 4}, // ceiling + {128, 4}, // ceiling sticks + } + for _, tc := range cases { + if got := autoVCPU(tc.host); got != tc.want { + t.Errorf("autoVCPU(%d) = %d, want %d", tc.host, got, tc.want) + } + } +} + +func TestAutoMemoryCappedAndFloor(t *testing.T) { + // 4 GiB host → floor 1024 MiB. + if got := autoMemoryMiB(4 * gib); got != 1024 { + t.Errorf("4 GiB → got %d, want 1024", got) + } + // 32 GiB host → 32/8 = 4 GiB = 4096 MiB. + if got := autoMemoryMiB(32 * gib); got != 4096 { + t.Errorf("32 GiB → got %d, want 4096", got) + } + // 128 GiB host → 128/8 = 16 GiB, capped at 8 GiB = 8192 MiB. + if got := autoMemoryMiB(128 * gib); got != 8192 { + t.Errorf("128 GiB → got %d, want 8192", got) + } +} + +func TestAutoMemoryRoundsTo256MiB(t *testing.T) { + // 17 GiB host → 17/8 = 2.125 GiB ≈ 2176 MiB → rounded to 2048. 
+	if got := autoMemoryMiB(17 * gib); got != 2048 {
+		t.Errorf("autoMemoryMiB(17 GiB) = %d MiB, want 2048", got)
+	}
+}
+
+func TestFormatSpecLine(t *testing.T) {
+	d := VMDefaults{VCPUCount: 2, MemoryMiB: 2048, WorkDiskSizeBytes: 8 * gib}
+	line := d.FormatSpecLine()
+	for _, want := range []string{"2 vcpu", "2048 MiB", "disk"} {
+		if !strings.Contains(line, want) {
+			t.Errorf("line %q missing %q", line, want)
+		}
+	}
+}
+
+const gib = int64(1024 * 1024 * 1024)
diff --git a/internal/model/vm_handles.go b/internal/model/vm_handles.go
new file mode 100644
index 0000000..1eb5708
--- /dev/null
+++ b/internal/model/vm_handles.go
@@ -0,0 +1,50 @@
+package model
+
+// VMHandles captures the transient, per-boot kernel/process handles
+// that banger obtains while starting a VM and releases when stopping
+// it. Unlike VMRuntime (durable spec + identity + derived paths),
+// VMHandles is the authoritative live-handle view while the daemon is
+// up. On restart, the daemon rebuilds it from the OS plus the per-VM
+// scratch file; teardown-critical fields are also mirrored onto
+// VMRuntime so cleanup can still proceed if that scratch file is
+// missing or corrupt.
+//
+// The daemon keeps an in-memory cache keyed by VM ID. Lifecycle
+// transitions update the cache and a small `handles.json` scratch
+// file in the VM's state directory; daemon startup reconciles
+// by loading that file and verifying each handle against the live
+// OS state. If anything is stale the VM is marked stopped and the
+// cache entry is dropped.
+//
+// VMHandles itself never appears in the `vms` SQLite rows. Some fields
+// are mirrored onto VMRuntime as crash-recovery fallback state, but the
+// cache + scratch file remain the canonical live source.
+type VMHandles struct {
+	// PID is the firecracker process PID. Zero means "not running
+	// (from our perspective)". Always verifiable via
+	// /proc/<pid>/cmdline matching the api socket path. 
+	PID int `json:"pid,omitempty"`
+
+	// TapDevice is the kernel tap interface name (e.g. "tap-fc-0001")
+	// bound to the VM's virtio-net. Released on stop.
+	TapDevice string `json:"tap_device,omitempty"`
+
+	// BaseLoop and COWLoop are the two loop devices backing the
+	// dm-snapshot layer (read-only base = rootfs; read-write overlay
+	// = per-VM COW file). Released via losetup -d on stop.
+	BaseLoop string `json:"base_loop,omitempty"`
+	COWLoop  string `json:"cow_loop,omitempty"`
+
+	// DMName is the device-mapper target name; deterministic from the
+	// VM ID (see dmsnap.SnapshotName). DMDev is the corresponding
+	// /dev/mapper/<name> path. Torn down by `dmsetup remove` on stop.
+	DMName string `json:"dm_name,omitempty"`
+	DMDev  string `json:"dm_dev,omitempty"`
+}
+
+// IsZero reports whether every handle field is unset. Useful as a
+// cheap "this VM has no kernel/process resources held on our behalf"
+// check.
+func (h VMHandles) IsZero() bool {
+	return h.PID == 0 && h.TapDevice == "" && h.BaseLoop == "" && h.COWLoop == "" && h.DMName == "" && h.DMDev == ""
+}
diff --git a/internal/model/vm_name.go b/internal/model/vm_name.go
new file mode 100644
index 0000000..c45a43d
--- /dev/null
+++ b/internal/model/vm_name.go
@@ -0,0 +1,45 @@
+package model
+
+import (
+	"errors"
+	"fmt"
+)
+
+// MaxVMNameLen is the upper bound on a user-provided VM name. DNS
+// labels (RFC 1123) allow up to 63 octets; the name ends up as the
+// first label of `<name>.vm` records served by banger's vmdns, and
+// also as the guest's /etc/hostname — so fitting both invariants in
+// a single ceiling keeps the model simple.
+const MaxVMNameLen = 63
+
+// ValidateVMName rejects names that aren't safe to use as a DNS
+// label, a Linux hostname, a kernel-command-line token, or a
+// file-path component. Concretely: lowercase ASCII letters, digits,
+// and '-', 1..MaxVMNameLen chars, no leading or trailing hyphen. 
+//
+// No normalization (trimming, case folding) — the VM name becomes
+// the user-visible identifier (store lookup key, `ssh <name>.vm`,
+// `vm show <name>`), and a silent rewrite would hand the user back
+// a different name than they typed. Reject early with an explicit
+// message instead.
+func ValidateVMName(name string) error {
+	if name == "" {
+		return errors.New("vm name is required")
+	}
+	if len(name) > MaxVMNameLen {
+		return fmt.Errorf("vm name %q is %d characters; max is %d (DNS label limit)", name, len(name), MaxVMNameLen)
+	}
+	if name[0] == '-' || name[len(name)-1] == '-' {
+		return fmt.Errorf("vm name %q cannot start or end with '-'", name)
+	}
+	for i, r := range name {
+		switch {
+		case r >= 'a' && r <= 'z':
+		case r >= '0' && r <= '9':
+		case r == '-':
+		default:
+			return fmt.Errorf("vm name %q has invalid character %q at position %d (allowed: lowercase a-z, 0-9, '-')", name, r, i)
+		}
+	}
+	return nil
+}
diff --git a/internal/model/vm_name_test.go b/internal/model/vm_name_test.go
new file mode 100644
index 0000000..656837e
--- /dev/null
+++ b/internal/model/vm_name_test.go
@@ -0,0 +1,68 @@
+package model
+
+import (
+	"strings"
+	"testing"
+)
+
+func TestValidateVMName(t *testing.T) {
+	cases := []struct {
+		name       string
+		input      string
+		wantOK     bool
+		wantErrSub string
+	}{
+		// Happy path.
+		{"simple", "mybox", true, ""},
+		{"with-hyphen", "my-box", true, ""},
+		{"digits", "box-123", true, ""},
+		{"digits-only", "1234", true, ""},
+		{"single-char", "a", true, ""},
+		{"max length", strings.Repeat("a", MaxVMNameLen), true, ""},
+		{"namegen style", "ace-fox", true, ""},
+
+		// Empty / length.
+		{"empty", "", false, "required"},
+		{"over max length", strings.Repeat("a", MaxVMNameLen+1), false, "max is"},
+
+		// Hyphen position.
+		{"leading hyphen", "-box", false, "cannot start or end with '-'"},
+		{"trailing hyphen", "box-", false, "cannot start or end with '-'"},
+		{"lone hyphen", "-", false, "cannot start or end with '-'"},
+
+		// Character class. 
+ {"uppercase", "MyBox", false, "invalid character"}, + {"space", "my box", false, "invalid character"}, + {"newline", "my\nbox", false, "invalid character"}, + {"tab", "my\tbox", false, "invalid character"}, + {"dot", "my.box", false, "invalid character"}, + {"dot-vm suffix", "box.vm", false, "invalid character"}, + {"slash", "my/box", false, "invalid character"}, + {"underscore", "my_box", false, "invalid character"}, + {"at sign", "user@box", false, "invalid character"}, + {"colon (kernel cmdline separator)", "my:box", false, "invalid character"}, + {"equals (kernel cmdline)", "a=b", false, "invalid character"}, + {"quote", "my\"box", false, "invalid character"}, + {"unicode letter", "box-α", false, "invalid character"}, + {"leading space", " box", false, "invalid character"}, + {"trailing space", "box ", false, "invalid character"}, + {"control char NUL", "my\x00box", false, "invalid character"}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + err := ValidateVMName(tc.input) + if tc.wantOK { + if err != nil { + t.Fatalf("ValidateVMName(%q) = %v, want nil", tc.input, err) + } + return + } + if err == nil { + t.Fatalf("ValidateVMName(%q) = nil, want error containing %q", tc.input, tc.wantErrSub) + } + if !strings.Contains(err.Error(), tc.wantErrSub) { + t.Fatalf("ValidateVMName(%q) = %v, want error containing %q", tc.input, err, tc.wantErrSub) + } + }) + } +} diff --git a/internal/namegen/namegen_test.go b/internal/namegen/namegen_test.go new file mode 100644 index 0000000..8e7e9e8 --- /dev/null +++ b/internal/namegen/namegen_test.go @@ -0,0 +1,54 @@ +package namegen + +import ( + "strings" + "testing" +) + +func TestGenerate(t *testing.T) { + adjSet := make(map[string]struct{}, len(adjectives)) + for _, a := range adjectives { + adjSet[a] = struct{}{} + } + subSet := make(map[string]struct{}, len(substantives)) + for _, s := range substantives { + subSet[s] = struct{}{} + } + + seen := make(map[string]int) + for i := 0; i < 200; i++ { + 
name := Generate() + parts := strings.Split(name, "-") + if len(parts) != 2 { + t.Fatalf("expected adj-noun form, got %q", name) + } + if _, ok := adjSet[parts[0]]; !ok { + t.Fatalf("unknown adjective %q in %q", parts[0], name) + } + if _, ok := subSet[parts[1]]; !ok { + t.Fatalf("unknown substantive %q in %q", parts[1], name) + } + seen[name]++ + } + + // Minimal variety check: adj-noun cartesian product is thousands of + // combinations; 200 draws should hit more than a couple. + if len(seen) < 10 { + t.Fatalf("expected varied output, only saw %d distinct names", len(seen)) + } +} + +func TestRandomIndex(t *testing.T) { + if got := randomIndex(0); got != 0 { + t.Fatalf("randomIndex(0) = %d, want 0", got) + } + if got := randomIndex(1); got != 0 { + t.Fatalf("randomIndex(1) = %d, want 0", got) + } + for i := 0; i < 100; i++ { + n := randomIndex(7) + if n < 0 || n >= 7 { + t.Fatalf("randomIndex(7) = %d, out of range", n) + } + } +} diff --git a/internal/opencode/opencode.go b/internal/opencode/opencode.go deleted file mode 100644 index 7a2af47..0000000 --- a/internal/opencode/opencode.go +++ /dev/null @@ -1,104 +0,0 @@ -package opencode - -import ( - "context" - "fmt" - "log/slog" - "strings" - "time" - - "banger/internal/vsockagent" -) - -const ( - Port = 4096 - Host = "0.0.0.0" - GuestBinaryPath = "/usr/local/bin/opencode" - ShimPath = "/root/.local/share/mise/shims/opencode" - ServiceName = "banger-opencode.service" - RunitServiceName = "banger-opencode" - ReadyTimeout = 45 * time.Second - pollInterval = 200 * time.Millisecond -) - -func ServiceUnit() string { - return fmt.Sprintf(`[Unit] -Description=Banger opencode server -After=network.target -RequiresMountsFor=/root - -[Service] -Type=simple -Environment=HOME=/root -WorkingDirectory=/root -ExecStart=%s serve --hostname %s --port %d -Restart=on-failure -RestartSec=1 - -[Install] -WantedBy=multi-user.target -`, GuestBinaryPath, Host, Port) -} - -func RunitRunScript() string { - return fmt.Sprintf(`#!/bin/sh 
-set -e -export HOME=/root -cd /root -exec %s serve --hostname %s --port %d -`, GuestBinaryPath, Host, Port) -} - -func Ready(listeners []vsockagent.PortListener) bool { - for _, listener := range listeners { - if strings.ToLower(strings.TrimSpace(listener.Proto)) != "tcp" { - continue - } - if listener.Port == Port { - return true - } - } - return false -} - -func WaitReady(ctx context.Context, logger *slog.Logger, socketPath string, report func(stage, detail string)) error { - return waitReady(ctx, logger, socketPath, ReadyTimeout, report) -} - -func waitReady(ctx context.Context, logger *slog.Logger, socketPath string, timeout time.Duration, report func(stage, detail string)) error { - waitCtx, cancel := context.WithTimeout(ctx, timeout) - defer cancel() - - ticker := time.NewTicker(pollInterval) - defer ticker.Stop() - - var lastErr error - for { - portsCtx, portsCancel := context.WithTimeout(waitCtx, 3*time.Second) - listeners, err := vsockagent.Ports(portsCtx, logger, socketPath) - portsCancel() - if err == nil { - if Ready(listeners) { - return nil - } - if report != nil { - report("wait_opencode", fmt.Sprintf("waiting for opencode on guest port %d", Port)) - } - lastErr = fmt.Errorf("guest port %d is not listening yet", Port) - } else { - if report != nil { - report("wait_guest_ready", "waiting for guest services") - } - lastErr = err - } - - select { - case <-waitCtx.Done(): - if lastErr != nil { - return fmt.Errorf("opencode server did not become ready on guest port %d: %w", Port, lastErr) - } - return fmt.Errorf("opencode server did not become ready on guest port %d before timeout", Port) - case <-ticker.C: - } - } -} diff --git a/internal/opencode/opencode_test.go b/internal/opencode/opencode_test.go deleted file mode 100644 index 8855960..0000000 --- a/internal/opencode/opencode_test.go +++ /dev/null @@ -1,151 +0,0 @@ -package opencode - -import ( - "context" - "fmt" - "net" - "os" - "path/filepath" - "strings" - "testing" - "time" - - 
"banger/internal/vsockagent" -) - -func TestServiceUnitContainsExpectedExecStart(t *testing.T) { - unit := ServiceUnit() - for _, snippet := range []string{ - "RequiresMountsFor=/root", - "WorkingDirectory=/root", - "Environment=HOME=/root", - "ExecStart=/usr/local/bin/opencode serve --hostname 0.0.0.0 --port 4096", - "WantedBy=multi-user.target", - } { - if !strings.Contains(unit, snippet) { - t.Fatalf("service unit missing snippet %q\nunit:\n%s", snippet, unit) - } - } -} - -func TestRunitRunScriptContainsExpectedExec(t *testing.T) { - script := RunitRunScript() - for _, snippet := range []string{ - "export HOME=/root", - "cd /root", - "exec /usr/local/bin/opencode serve --hostname 0.0.0.0 --port 4096", - } { - if !strings.Contains(script, snippet) { - t.Fatalf("runit script missing snippet %q\nscript:\n%s", snippet, script) - } - } -} - -func TestReadyMatchesTCPPort(t *testing.T) { - if Ready([]vsockagent.PortListener{{Proto: "udp", Port: Port}}) { - t.Fatal("udp listener should not satisfy readiness") - } - if Ready([]vsockagent.PortListener{{Proto: "tcp", Port: 8080}}) { - t.Fatal("wrong tcp port should not satisfy readiness") - } - if !Ready([]vsockagent.PortListener{{Proto: "tcp", Port: Port}}) { - t.Fatal("tcp listener on opencode port should satisfy readiness") - } -} - -func TestWaitReadyReturnsWhenPortIsListening(t *testing.T) { - socketPath := filepath.Join(t.TempDir(), "opencode.vsock") - listener, err := net.Listen("unix", socketPath) - if err != nil { - skipIfSocketRestricted(t, err) - t.Fatalf("listen: %v", err) - } - t.Cleanup(func() { - _ = listener.Close() - _ = os.Remove(socketPath) - }) - - serverDone := make(chan error, 1) - go func() { - conn, err := listener.Accept() - if err != nil { - serverDone <- err - return - } - defer conn.Close() - buf := make([]byte, 512) - n, err := conn.Read(buf) - if err != nil { - serverDone <- err - return - } - if got := string(buf[:n]); got != "CONNECT 42070\n" { - serverDone <- fmt.Errorf("unexpected connect 
message %q", got) - return - } - if _, err := conn.Write([]byte("OK 1\n")); err != nil { - serverDone <- err - return - } - reqBuf := make([]byte, 0, 512) - for { - n, err = conn.Read(buf) - if err != nil { - serverDone <- err - return - } - reqBuf = append(reqBuf, buf[:n]...) - if strings.Contains(string(reqBuf), "\r\n\r\n") { - break - } - } - if !strings.Contains(string(reqBuf), "GET /ports HTTP/1.1\r\n") { - serverDone <- fmt.Errorf("unexpected ports payload %q", string(reqBuf)) - return - } - body := []byte(`{"listeners":[{"proto":"tcp","bind_address":"0.0.0.0","port":4096}]}`) - _, err = conn.Write([]byte(fmt.Sprintf("HTTP/1.1 200 OK\r\nContent-Type: application/json\r\nContent-Length: %d\r\n\r\n%s", len(body), body))) - serverDone <- err - }() - - if err := waitReady(context.Background(), nil, socketPath, time.Second, nil); err != nil { - t.Fatalf("waitReady: %v", err) - } - if err := <-serverDone; err != nil { - t.Fatalf("server: %v", err) - } -} - -func TestWaitReadyReportsGuestServicesWhenPortsUnavailable(t *testing.T) { - t.Parallel() - - var reports []string - err := waitReady( - context.Background(), - nil, - filepath.Join(t.TempDir(), "missing.vsock"), - 50*time.Millisecond, - func(stage, detail string) { - reports = append(reports, stage+":"+detail) - }, - ) - if err == nil { - t.Fatal("waitReady() error = nil, want timeout") - } - if len(reports) == 0 { - t.Fatal("waitReady() did not report progress") - } - if got := reports[0]; got != "wait_guest_ready:waiting for guest services" { - t.Fatalf("first report = %q, want guest services wait", got) - } -} - -func skipIfSocketRestricted(t *testing.T, err error) { - t.Helper() - if err == nil { - return - } - if strings.Contains(strings.ToLower(err.Error()), "operation not permitted") { - t.Skipf("socket creation is restricted in this environment: %v", err) - } -} diff --git a/internal/paths/layout_test.go b/internal/paths/layout_test.go new file mode 100644 index 0000000..9a15b5d --- /dev/null +++ 
b/internal/paths/layout_test.go @@ -0,0 +1,193 @@ +package paths + +import ( + "os" + "path/filepath" + "strings" + "testing" +) + +func TestResolveUsesXDGOverrides(t *testing.T) { + dir := t.TempDir() + t.Setenv("XDG_CONFIG_HOME", filepath.Join(dir, "config")) + t.Setenv("XDG_STATE_HOME", filepath.Join(dir, "state")) + t.Setenv("XDG_CACHE_HOME", filepath.Join(dir, "cache")) + t.Setenv("XDG_RUNTIME_DIR", filepath.Join(dir, "run")) + + layout, err := Resolve() + if err != nil { + t.Fatalf("Resolve: %v", err) + } + if layout.ConfigDir != filepath.Join(dir, "config", "banger") { + t.Errorf("ConfigDir = %q", layout.ConfigDir) + } + if layout.StateDir != filepath.Join(dir, "state", "banger") { + t.Errorf("StateDir = %q", layout.StateDir) + } + if layout.CacheDir != filepath.Join(dir, "cache", "banger") { + t.Errorf("CacheDir = %q", layout.CacheDir) + } + if layout.RuntimeDir != filepath.Join(dir, "run", "banger") { + t.Errorf("RuntimeDir = %q", layout.RuntimeDir) + } + if !strings.HasSuffix(layout.SocketPath, "bangerd.sock") { + t.Errorf("SocketPath = %q", layout.SocketPath) + } + if !strings.HasSuffix(layout.DBPath, "state.db") { + t.Errorf("DBPath = %q", layout.DBPath) + } +} + +func TestResolveUserForHomeUsesProvidedHome(t *testing.T) { + home := filepath.Join(t.TempDir(), "owner") + layout, err := ResolveUserForHome(home) + if err != nil { + t.Fatalf("ResolveUserForHome: %v", err) + } + if layout.ConfigDir != filepath.Join(home, ".config", "banger") { + t.Fatalf("ConfigDir = %q", layout.ConfigDir) + } + if layout.StateDir != filepath.Join(home, ".local", "state", "banger") { + t.Fatalf("StateDir = %q", layout.StateDir) + } + if layout.KnownHostsPath != filepath.Join(home, ".local", "state", "banger", "ssh", "known_hosts") { + t.Fatalf("KnownHostsPath = %q", layout.KnownHostsPath) + } +} + +func TestResolveSystemUsesFixedPaths(t *testing.T) { + layout := ResolveSystem() + if layout.SocketPath != "/run/banger/bangerd.sock" { + t.Fatalf("SocketPath = %q", 
layout.SocketPath) + } + if layout.StateDir != "/var/lib/banger" { + t.Fatalf("StateDir = %q", layout.StateDir) + } + if layout.KnownHostsPath != "/var/lib/banger/ssh/known_hosts" { + t.Fatalf("KnownHostsPath = %q", layout.KnownHostsPath) + } +} + +func TestResolveFallsBackWhenRuntimeUnset(t *testing.T) { + t.Setenv("XDG_RUNTIME_DIR", "") + layout, err := Resolve() + if err != nil { + t.Fatalf("Resolve: %v", err) + } + if !strings.Contains(layout.RuntimeDir, "banger-runtime-") { + t.Errorf("expected fallback runtime dir, got %q", layout.RuntimeDir) + } +} + +func TestEnsureCreatesAllDirs(t *testing.T) { + base := t.TempDir() + layout := Layout{ + ConfigDir: filepath.Join(base, "config"), + StateDir: filepath.Join(base, "state"), + CacheDir: filepath.Join(base, "cache"), + RuntimeDir: filepath.Join(base, "runtime"), + VMsDir: filepath.Join(base, "state/vms"), + ImagesDir: filepath.Join(base, "state/images"), + KernelsDir: filepath.Join(base, "state/kernels"), + OCICacheDir: filepath.Join(base, "cache/oci"), + } + if err := Ensure(layout); err != nil { + t.Fatalf("Ensure: %v", err) + } + for _, dir := range []string{ + layout.ConfigDir, + layout.StateDir, + layout.CacheDir, + layout.RuntimeDir, + layout.VMsDir, + layout.ImagesDir, + layout.KernelsDir, + layout.OCICacheDir, + } { + info, err := os.Stat(dir) + if err != nil { + t.Errorf("stat %q: %v", dir, err) + continue + } + if !info.IsDir() { + t.Errorf("%q is not a directory", dir) + } + } + + // RuntimeDir holds sockets; must be 0700. + info, err := os.Stat(layout.RuntimeDir) + if err != nil { + t.Fatalf("stat runtime: %v", err) + } + if perm := info.Mode().Perm(); perm != 0o700 { + t.Errorf("RuntimeDir mode = %#o, want 0700", perm) + } + + // Idempotent. 
+ if err := Ensure(layout); err != nil { + t.Fatalf("Ensure (second run): %v", err) + } +} + +func TestEnsureTightensStaleRuntimeDirMode(t *testing.T) { + base := t.TempDir() + runtime := filepath.Join(base, "runtime") + if err := os.MkdirAll(runtime, 0o755); err != nil { + t.Fatalf("MkdirAll: %v", err) + } + if err := Ensure(Layout{RuntimeDir: runtime}); err != nil { + t.Fatalf("Ensure: %v", err) + } + info, err := os.Stat(runtime) + if err != nil { + t.Fatalf("stat: %v", err) + } + if perm := info.Mode().Perm(); perm != 0o700 { + t.Errorf("mode = %#o, want 0700 after Ensure", perm) + } +} + +func TestBangerdPathEnvOverride(t *testing.T) { + t.Setenv("BANGER_DAEMON_BIN", "/tmp/custom-bangerd") + got, err := BangerdPath() + if err != nil { + t.Fatalf("BangerdPath: %v", err) + } + if got != "/tmp/custom-bangerd" { + t.Errorf("got %q, want /tmp/custom-bangerd", got) + } +} + +func TestBangerdPathFindsSiblingBinary(t *testing.T) { + t.Setenv("BANGER_DAEMON_BIN", "") + + root := t.TempDir() + sibling := filepath.Join(root, "bangerd") + if err := os.WriteFile(sibling, []byte("#!/bin/sh\n"), 0o755); err != nil { + t.Fatalf("WriteFile: %v", err) + } + original := executablePath + executablePath = func() (string, error) { return filepath.Join(root, "banger"), nil } + t.Cleanup(func() { executablePath = original }) + + got, err := BangerdPath() + if err != nil { + t.Fatalf("BangerdPath: %v", err) + } + if got != sibling { + t.Errorf("got %q, want %q", got, sibling) + } +} + +func TestBangerdPathNotFound(t *testing.T) { + t.Setenv("BANGER_DAEMON_BIN", "") + + root := t.TempDir() + original := executablePath + executablePath = func() (string, error) { return filepath.Join(root, "banger"), nil } + t.Cleanup(func() { executablePath = original }) + + if _, err := BangerdPath(); err == nil { + t.Fatal("expected error when no sibling bangerd exists") + } +} diff --git a/internal/paths/paths.go b/internal/paths/paths.go index 0eeacba..25afbdc 100644 --- a/internal/paths/paths.go 
+++ b/internal/paths/paths.go @@ -4,27 +4,46 @@ import ( "errors" "fmt" "os" + "os/user" "path/filepath" "strings" + "syscall" + + "banger/internal/installmeta" ) type Layout struct { - ConfigHome string - StateHome string - CacheHome string - RuntimeHome string - ConfigDir string - StateDir string - CacheDir string - RuntimeDir string - SocketPath string - DBPath string - DaemonLog string - VMsDir string - ImagesDir string + ConfigHome string + StateHome string + CacheHome string + RuntimeHome string + ConfigDir string + StateDir string + CacheDir string + RuntimeDir string + SocketPath string + DBPath string + DaemonLog string + VMsDir string + ImagesDir string + KernelsDir string + OCICacheDir string + SSHDir string + KnownHostsPath string + + // runtimeHomeFallback is true when we fabricated the RuntimeHome path + // under /tmp because XDG_RUNTIME_DIR was unset. Ensure() uses the flag + // to apply strict ownership + mode checks on the fallback parent (a + // world-writable /tmp needs us to own and lock the subtree ourselves; + // a systemd-provisioned /run/user/ is already 0700 and trusted). 
+ runtimeHomeFallback bool } func Resolve() (Layout, error) { + return ResolveUser() +} + +func ResolveUser() (Layout, error) { home, err := os.UserHomeDir() if err != nil { return Layout{}, err @@ -33,34 +52,198 @@ func Resolve() (Layout, error) { stateHome := getenvDefault("XDG_STATE_HOME", filepath.Join(home, ".local", "state")) cacheHome := getenvDefault("XDG_CACHE_HOME", filepath.Join(home, ".cache")) runtimeHome := os.Getenv("XDG_RUNTIME_DIR") + runtimeFallback := false if runtimeHome == "" { runtimeHome = filepath.Join(os.TempDir(), fmt.Sprintf("banger-runtime-%d", os.Getuid())) + runtimeFallback = true } layout := Layout{ - ConfigHome: configHome, - StateHome: stateHome, - CacheHome: cacheHome, - RuntimeHome: runtimeHome, - ConfigDir: filepath.Join(configHome, "banger"), - StateDir: filepath.Join(stateHome, "banger"), - CacheDir: filepath.Join(cacheHome, "banger"), - RuntimeDir: filepath.Join(runtimeHome, "banger"), + ConfigHome: configHome, + StateHome: stateHome, + CacheHome: cacheHome, + RuntimeHome: runtimeHome, + runtimeHomeFallback: runtimeFallback, + ConfigDir: filepath.Join(configHome, "banger"), + StateDir: filepath.Join(stateHome, "banger"), + CacheDir: filepath.Join(cacheHome, "banger"), + RuntimeDir: filepath.Join(runtimeHome, "banger"), } layout.SocketPath = filepath.Join(layout.RuntimeDir, "bangerd.sock") layout.DBPath = filepath.Join(layout.StateDir, "state.db") layout.DaemonLog = filepath.Join(layout.StateDir, "bangerd.log") layout.VMsDir = filepath.Join(layout.StateDir, "vms") layout.ImagesDir = filepath.Join(layout.StateDir, "images") + layout.KernelsDir = filepath.Join(layout.StateDir, "kernels") + layout.OCICacheDir = filepath.Join(layout.CacheDir, "oci") + layout.SSHDir = filepath.Join(layout.StateDir, "ssh") + layout.KnownHostsPath = filepath.Join(layout.SSHDir, "known_hosts") return layout, nil } +func ResolveUserForHome(home string) (Layout, error) { + home = strings.TrimSpace(home) + if home == "" { + return Layout{}, 
errors.New("home directory is required") + } + if !filepath.IsAbs(home) { + return Layout{}, fmt.Errorf("home directory %q must be absolute", home) + } + configHome := filepath.Join(home, ".config") + stateHome := filepath.Join(home, ".local", "state") + cacheHome := filepath.Join(home, ".cache") + layout := Layout{ + ConfigHome: configHome, + StateHome: stateHome, + CacheHome: cacheHome, + ConfigDir: filepath.Join(configHome, "banger"), + StateDir: filepath.Join(stateHome, "banger"), + CacheDir: filepath.Join(cacheHome, "banger"), + SSHDir: filepath.Join(stateHome, "banger", "ssh"), + } + layout.KnownHostsPath = filepath.Join(layout.SSHDir, "known_hosts") + return layout, nil +} + +func ResolveSystem() Layout { + layout := Layout{ + ConfigHome: "/etc", + StateHome: "/var/lib", + CacheHome: "/var/cache", + RuntimeHome: "/run", + ConfigDir: installmeta.DefaultDir, + StateDir: "/var/lib/banger", + CacheDir: "/var/cache/banger", + RuntimeDir: "/run/banger", + } + layout.SocketPath = installmeta.DefaultSocketPath + layout.DBPath = filepath.Join(layout.StateDir, "state.db") + layout.VMsDir = filepath.Join(layout.StateDir, "vms") + layout.ImagesDir = filepath.Join(layout.StateDir, "images") + layout.KernelsDir = filepath.Join(layout.StateDir, "kernels") + layout.OCICacheDir = filepath.Join(layout.CacheDir, "oci") + layout.SSHDir = filepath.Join(layout.StateDir, "ssh") + layout.KnownHostsPath = filepath.Join(layout.SSHDir, "known_hosts") + return layout +} + func Ensure(layout Layout) error { - for _, dir := range []string{layout.ConfigDir, layout.StateDir, layout.CacheDir, layout.RuntimeDir, layout.VMsDir, layout.ImagesDir} { + // When we're using the /tmp fallback, we must create and own the + // runtime-home parent ourselves and reject any pre-existing directory + // that isn't 0700 + owned by the current uid. Otherwise a local + // attacker could pre-create that path and have banger's control + // sockets land inside a directory they control. 
+ if layout.runtimeHomeFallback && strings.TrimSpace(layout.RuntimeHome) != "" { + if err := ensureSafeRuntimeHome(layout.RuntimeHome); err != nil { + return err + } + } + // RuntimeDir holds bangerd.sock + per-VM firecracker API + vsock + // sockets. Lock it to 0700 unconditionally so even if the parent + // runtime-home is traversable by others, none of our sockets are + // reachable. + if strings.TrimSpace(layout.RuntimeDir) != "" { + if err := os.MkdirAll(layout.RuntimeDir, 0o700); err != nil { + return err + } + if err := os.Chmod(layout.RuntimeDir, 0o700); err != nil { + return err + } + } + for _, dir := range []string{layout.ConfigDir, layout.StateDir, layout.CacheDir, layout.VMsDir, layout.ImagesDir, layout.KernelsDir, layout.OCICacheDir} { + if strings.TrimSpace(dir) == "" { + continue + } if err := os.MkdirAll(dir, 0o755); err != nil { return err } } + // SSH material (private key, known_hosts) — 0700 like ~/.ssh so + // strict SSH clients don't complain and no other host user can + // read it. Empty SSHDir means the caller built a Layout by hand + // (tests) and doesn't need the subdir; skip silently. 
+ if strings.TrimSpace(layout.SSHDir) != "" { + if err := os.MkdirAll(layout.SSHDir, 0o700); err != nil { + return err + } + } + return nil +} + +func EnsureSystem(layout Layout) error { + if strings.TrimSpace(layout.ConfigDir) != "" { + if err := os.MkdirAll(layout.ConfigDir, 0o755); err != nil { + return err + } + } + for _, dir := range []string{layout.StateDir, layout.CacheDir, layout.VMsDir, layout.ImagesDir, layout.KernelsDir, layout.OCICacheDir, layout.SSHDir} { + if strings.TrimSpace(dir) == "" { + continue + } + if err := os.MkdirAll(dir, 0o700); err != nil { + return err + } + if err := os.Chmod(dir, 0o700); err != nil { + return err + } + } + if strings.TrimSpace(layout.RuntimeDir) != "" { + if err := os.MkdirAll(layout.RuntimeDir, 0o711); err != nil { + return err + } + if err := os.Chmod(layout.RuntimeDir, 0o711); err != nil { + return err + } + } + return nil +} + +// EnsureSystemOwned prepares the systemd-managed directories the +// owner-user daemon needs once systemd has already created the top-level +// state/cache/runtime roots on its behalf. Unlike EnsureSystem, it does +// not touch /etc/banger and it never assumes root ownership. +func EnsureSystemOwned(layout Layout) error { + for _, dir := range []string{layout.StateDir, layout.CacheDir, layout.RuntimeDir, layout.VMsDir, layout.ImagesDir, layout.KernelsDir, layout.OCICacheDir, layout.SSHDir} { + if strings.TrimSpace(dir) == "" { + continue + } + if err := os.MkdirAll(dir, 0o700); err != nil { + return err + } + if err := os.Chmod(dir, 0o700); err != nil { + return err + } + } + return nil +} + +// ensureSafeRuntimeHome creates path at 0700 if missing, or validates +// existing ownership + mode. Returns an error describing how to remediate +// when the existing directory doesn't meet the bar. 
+func ensureSafeRuntimeHome(path string) error { + if err := os.MkdirAll(path, 0o700); err != nil { + return err + } + info, err := os.Lstat(path) + if err != nil { + return err + } + // Must be a real directory, not a symlink an attacker could swap. + if info.Mode()&os.ModeSymlink != 0 { + return fmt.Errorf("runtime dir %s is a symlink; refusing to place sockets there — remove it or set XDG_RUNTIME_DIR", path) + } + if !info.IsDir() { + return fmt.Errorf("runtime dir %s exists but is not a directory", path) + } + sys, ok := info.Sys().(*syscall.Stat_t) + if ok && int(sys.Uid) != os.Getuid() { + return fmt.Errorf("runtime dir %s is owned by uid %d, not %d; remove it or set XDG_RUNTIME_DIR", path, sys.Uid, os.Getuid()) + } + if info.Mode().Perm() != 0o700 { + if err := os.Chmod(path, 0o700); err != nil { + return fmt.Errorf("runtime dir %s has insecure mode %#o and chmod failed: %w", path, info.Mode().Perm(), err) + } + } return nil } @@ -86,6 +269,21 @@ func BangerdPath() (string, error) { return "", errors.New("bangerd binary not found next to banger; run `make build`") } +func BangerPath() (string, error) { + if env := os.Getenv("BANGER_BIN"); env != "" { + return env, nil + } + return executablePath() +} + +func CurrentUsername() (string, error) { + entry, err := user.Current() + if err != nil { + return "", err + } + return entry.Username, nil +} + func CompanionBinaryPath(name string) (string, error) { envNames := []string{ "BANGER_" + strings.ToUpper(strings.NewReplacer("-", "_", ".", "_").Replace(name)) + "_BIN", diff --git a/internal/roothelper/roothelper.go b/internal/roothelper/roothelper.go new file mode 100644 index 0000000..3aec14e --- /dev/null +++ b/internal/roothelper/roothelper.go @@ -0,0 +1,1537 @@ +package roothelper + +import ( + "bufio" + "context" + "encoding/json" + "errors" + "fmt" + "log/slog" + "net" + "os" + "path/filepath" + "strconv" + "strings" + "time" + + "golang.org/x/sys/unix" + + "banger/internal/daemon/dmsnap" + 
"banger/internal/daemon/fcproc" + "banger/internal/firecracker" + "banger/internal/hostnat" + "banger/internal/installmeta" + "banger/internal/paths" + "banger/internal/rpc" + "banger/internal/system" +) + +const ( + methodEnsureBridge = "priv.ensure_bridge" + methodCreateTap = "priv.create_tap" + methodDeleteTap = "priv.delete_tap" + methodSyncResolverRouting = "priv.sync_resolver_routing" + methodClearResolverRouting = "priv.clear_resolver_routing" + methodEnsureNAT = "priv.ensure_nat" + methodCreateDMSnapshot = "priv.create_dm_snapshot" + methodCleanupDMSnapshot = "priv.cleanup_dm_snapshot" + methodRemoveDMSnapshot = "priv.remove_dm_snapshot" + methodFsckSnapshot = "priv.fsck_snapshot" + methodReadExt4File = "priv.read_ext4_file" + methodWriteExt4Files = "priv.write_ext4_files" + methodResolveFirecrackerBin = "priv.resolve_firecracker_binary" + methodLaunchFirecracker = "priv.launch_firecracker" + methodEnsureSocketAccess = "priv.ensure_socket_access" + methodFindFirecrackerPID = "priv.find_firecracker_pid" + methodKillProcess = "priv.kill_process" + methodSignalProcess = "priv.signal_process" + methodProcessRunning = "priv.process_running" + methodCleanupJailerChroot = "priv.cleanup_jailer_chroot" + rootfsDMNamePrefix = "fc-rootfs-" + vmTapPrefix = "tap-fc-" + tapPoolPrefix = "tap-pool-" + vmResolverRouteDomain = "~vm" + defaultFirecrackerBinaryName = "firecracker" +) + +type NetworkConfig struct { + BridgeName string `json:"bridge_name"` + BridgeIP string `json:"bridge_ip"` + CIDR string `json:"cidr"` +} + +type Ext4Write struct { + GuestPath string `json:"guest_path"` + Data []byte `json:"data"` + Mode uint32 `json:"mode"` +} + +type FirecrackerLaunchRequest struct { + BinaryPath string `json:"binary_path"` + VMID string `json:"vm_id"` + SocketPath string `json:"socket_path"` + LogPath string `json:"log_path"` + MetricsPath string `json:"metrics_path"` + KernelImagePath string `json:"kernel_image_path"` + InitrdPath string `json:"initrd_path,omitempty"` + 
KernelArgs string `json:"kernel_args"` + Drives []firecracker.DriveConfig `json:"drives"` + TapDevice string `json:"tap_device"` + VSockPath string `json:"vsock_path"` + VSockCID uint32 `json:"vsock_cid"` + VCPUCount int `json:"vcpu_count"` + MemoryMiB int `json:"memory_mib"` + Network NetworkConfig `json:"network"` + Jailer *JailerLaunchOpts `json:"jailer,omitempty"` +} + +// JailerLaunchOpts mirrors firecracker.JailerOpts for the RPC wire. UID +// and GID are the (un)privileged target the jailer drops to; the helper +// enforces they match the registered owner so the daemon can't ask the +// helper to run firecracker as an arbitrary user. +type JailerLaunchOpts struct { + Binary string `json:"binary"` + ChrootBaseDir string `json:"chroot_base_dir"` + UID int `json:"uid"` + GID int `json:"gid"` +} + +type findPIDResult struct { + PID int `json:"pid"` +} + +type processRunningResult struct { + Running bool `json:"running"` +} + +type readExt4FileResult struct { + Data []byte `json:"data"` +} + +type resolveFirecrackerResult struct { + Path string `json:"path"` +} + +type launchFirecrackerResult struct { + PID int `json:"pid"` +} + +type Client struct { + socketPath string +} + +func NewClient(socketPath string) *Client { + return &Client{socketPath: strings.TrimSpace(socketPath)} +} + +func (c *Client) EnsureBridge(ctx context.Context, cfg NetworkConfig) error { + _, err := rpc.Call[struct{}](ctx, c.socketPath, methodEnsureBridge, cfg) + return err +} + +func (c *Client) CreateTap(ctx context.Context, cfg NetworkConfig, tapName string) error { + _, err := rpc.Call[struct{}](ctx, c.socketPath, methodCreateTap, struct { + NetworkConfig + TapName string `json:"tap_name"` + }{NetworkConfig: cfg, TapName: tapName}) + return err +} + +func (c *Client) DeleteTap(ctx context.Context, tapName string) error { + _, err := rpc.Call[struct{}](ctx, c.socketPath, methodDeleteTap, struct { + TapName string `json:"tap_name"` + }{TapName: tapName}) + return err +} + +func (c 
*Client) SyncResolverRouting(ctx context.Context, bridgeName, serverAddr string) error { + _, err := rpc.Call[struct{}](ctx, c.socketPath, methodSyncResolverRouting, struct { + BridgeName string `json:"bridge_name"` + ServerAddr string `json:"server_addr"` + }{BridgeName: bridgeName, ServerAddr: serverAddr}) + return err +} + +func (c *Client) ClearResolverRouting(ctx context.Context, bridgeName string) error { + _, err := rpc.Call[struct{}](ctx, c.socketPath, methodClearResolverRouting, struct { + BridgeName string `json:"bridge_name"` + }{BridgeName: bridgeName}) + return err +} + +func (c *Client) EnsureNAT(ctx context.Context, guestIP, tap string, enable bool) error { + _, err := rpc.Call[struct{}](ctx, c.socketPath, methodEnsureNAT, struct { + GuestIP string `json:"guest_ip"` + Tap string `json:"tap"` + Enable bool `json:"enable"` + }{GuestIP: guestIP, Tap: tap, Enable: enable}) + return err +} + +func (c *Client) CreateDMSnapshot(ctx context.Context, rootfsPath, cowPath, dmName string) (dmsnap.Handles, error) { + return rpc.Call[dmsnap.Handles](ctx, c.socketPath, methodCreateDMSnapshot, struct { + RootfsPath string `json:"rootfs_path"` + COWPath string `json:"cow_path"` + DMName string `json:"dm_name"` + }{RootfsPath: rootfsPath, COWPath: cowPath, DMName: dmName}) +} + +func (c *Client) CleanupDMSnapshot(ctx context.Context, handles dmsnap.Handles) error { + _, err := rpc.Call[struct{}](ctx, c.socketPath, methodCleanupDMSnapshot, handles) + return err +} + +func (c *Client) RemoveDMSnapshot(ctx context.Context, target string) error { + _, err := rpc.Call[struct{}](ctx, c.socketPath, methodRemoveDMSnapshot, struct { + Target string `json:"target"` + }{Target: target}) + return err +} + +func (c *Client) FsckSnapshot(ctx context.Context, dmDev string) error { + _, err := rpc.Call[struct{}](ctx, c.socketPath, methodFsckSnapshot, struct { + DMDev string `json:"dm_dev"` + }{DMDev: dmDev}) + return err +} + +func (c *Client) ReadExt4File(ctx context.Context, 
imagePath, guestPath string) ([]byte, error) { + result, err := rpc.Call[readExt4FileResult](ctx, c.socketPath, methodReadExt4File, struct { + ImagePath string `json:"image_path"` + GuestPath string `json:"guest_path"` + }{ImagePath: imagePath, GuestPath: guestPath}) + if err != nil { + return nil, err + } + return result.Data, nil +} + +func (c *Client) WriteExt4Files(ctx context.Context, imagePath string, files []Ext4Write) error { + _, err := rpc.Call[struct{}](ctx, c.socketPath, methodWriteExt4Files, struct { + ImagePath string `json:"image_path"` + Files []Ext4Write `json:"files"` + }{ImagePath: imagePath, Files: files}) + return err +} + +func (c *Client) ResolveFirecrackerBinary(ctx context.Context, requested string) (string, error) { + result, err := rpc.Call[resolveFirecrackerResult](ctx, c.socketPath, methodResolveFirecrackerBin, struct { + Requested string `json:"requested"` + }{Requested: requested}) + if err != nil { + return "", err + } + return result.Path, nil +} + +func (c *Client) LaunchFirecracker(ctx context.Context, req FirecrackerLaunchRequest) (int, error) { + result, err := rpc.Call[launchFirecrackerResult](ctx, c.socketPath, methodLaunchFirecracker, req) + if err != nil { + return 0, err + } + return result.PID, nil +} + +func (c *Client) CleanupJailerChroot(ctx context.Context, chrootRoot string) error { + _, err := rpc.Call[struct{}](ctx, c.socketPath, methodCleanupJailerChroot, struct { + ChrootRoot string `json:"chroot_root"` + }{ChrootRoot: chrootRoot}) + return err +} + +func (c *Client) EnsureSocketAccess(ctx context.Context, socketPath, label string) error { + _, err := rpc.Call[struct{}](ctx, c.socketPath, methodEnsureSocketAccess, struct { + SocketPath string `json:"socket_path"` + Label string `json:"label"` + }{SocketPath: socketPath, Label: label}) + return err +} + +func (c *Client) FindFirecrackerPID(ctx context.Context, apiSock string) (int, error) { + result, err := rpc.Call[findPIDResult](ctx, c.socketPath, 
methodFindFirecrackerPID, struct { + APISock string `json:"api_sock"` + }{APISock: apiSock}) + if err != nil { + return 0, err + } + return result.PID, nil +} + +func (c *Client) KillProcess(ctx context.Context, pid int) error { + _, err := rpc.Call[struct{}](ctx, c.socketPath, methodKillProcess, struct { + PID int `json:"pid"` + }{PID: pid}) + return err +} + +func (c *Client) SignalProcess(ctx context.Context, pid int, signal string) error { + _, err := rpc.Call[struct{}](ctx, c.socketPath, methodSignalProcess, struct { + PID int `json:"pid"` + Signal string `json:"signal"` + }{PID: pid, Signal: signal}) + return err +} + +func (c *Client) ProcessRunning(ctx context.Context, pid int, apiSock string) (bool, error) { + result, err := rpc.Call[processRunningResult](ctx, c.socketPath, methodProcessRunning, struct { + PID int `json:"pid"` + APISock string `json:"api_sock"` + }{PID: pid, APISock: apiSock}) + if err != nil { + return false, err + } + return result.Running, nil +} + +type Server struct { + meta installmeta.Metadata + runner system.CommandRunner + logger *slog.Logger + listener net.Listener +} + +func Open() (*Server, error) { + meta, err := installmeta.Load(installmeta.DefaultPath) + if err != nil { + return nil, err + } + if err := os.MkdirAll(installmeta.DefaultRootHelperRuntimeDir, 0o711); err != nil { + return nil, err + } + if err := os.Chmod(installmeta.DefaultRootHelperRuntimeDir, 0o711); err != nil { + return nil, err + } + return &Server{ + meta: meta, + runner: system.NewRunner(), + // JSON to match bangerd. Mixed text/JSON streams in the + // merged journalctl made the daemon side painful to grep; + // this aligns the helper so a single greppable shape spans + // both units. 
+ logger: slog.New(slog.NewJSONHandler(os.Stderr, &slog.HandlerOptions{Level: slog.LevelInfo})), + }, nil +} + +func (s *Server) Close() error { + if s == nil || s.listener == nil { + return nil + } + return s.listener.Close() +} + +func (s *Server) Serve(ctx context.Context) error { + _ = os.Remove(installmeta.DefaultRootHelperSocketPath) + listener, err := net.Listen("unix", installmeta.DefaultRootHelperSocketPath) + if err != nil { + return err + } + s.listener = listener + defer listener.Close() + defer os.Remove(installmeta.DefaultRootHelperSocketPath) + if err := os.Chmod(installmeta.DefaultRootHelperSocketPath, 0o600); err != nil { + return err + } + if err := os.Chown(installmeta.DefaultRootHelperSocketPath, s.meta.OwnerUID, s.meta.OwnerGID); err != nil { + return err + } + + done := make(chan struct{}) + defer close(done) + go func() { + select { + case <-ctx.Done(): + _ = listener.Close() + case <-done: + } + }() + + for { + conn, err := listener.Accept() + if err != nil { + select { + case <-ctx.Done(): + return nil + default: + } + var netErr net.Error + if errors.As(err, &netErr) && netErr.Temporary() { + time.Sleep(100 * time.Millisecond) + continue + } + return err + } + go s.handleConn(conn) + } +} + +func (s *Server) handleConn(conn net.Conn) { + defer conn.Close() + if err := s.authorizeConn(conn); err != nil { + _ = json.NewEncoder(conn).Encode(rpc.NewError("unauthorized", err.Error())) + return + } + var req rpc.Request + if err := json.NewDecoder(bufio.NewReader(conn)).Decode(&req); err != nil { + _ = json.NewEncoder(conn).Encode(rpc.NewError("bad_request", err.Error())) + return + } + // Adopt the daemon's op id so a single greppable id covers the + // whole call chain (CLI → daemon → helper). Entry log at debug + // level keeps production quiet; the completion log fires at + // info-on-success / error-on-failure with duration so an + // operator can see at a glance how long each privileged op + // took. 
+ ctx := rpc.WithOpID(context.Background(), req.OpID)
+ start := time.Now()
+ if s.logger != nil {
+ s.logger.Debug("helper rpc", "method", req.Method, "op_id", req.OpID)
+ }
+ resp := s.dispatch(ctx, req)
+ if !resp.OK && resp.Error != nil && resp.Error.OpID == "" && req.OpID != "" {
+ resp.Error.OpID = req.OpID
+ }
+ if s.logger != nil {
+ duration := time.Since(start).Milliseconds()
+ if !resp.OK && resp.Error != nil {
+ s.logger.Error("helper rpc failed", "method", req.Method, "op_id", req.OpID, "duration_ms", duration, "code", resp.Error.Code, "message", resp.Error.Message)
+ } else {
+ s.logger.Info("helper rpc completed", "method", req.Method, "op_id", req.OpID, "duration_ms", duration)
+ }
+ }
+ _ = json.NewEncoder(conn).Encode(resp)
+}
+
+func (s *Server) authorizeConn(conn net.Conn) error {
+ unixConn, ok := conn.(*net.UnixConn)
+ if !ok {
+ return errors.New("root helper requires unix connections")
+ }
+ rawConn, err := unixConn.SyscallConn()
+ if err != nil {
+ return err
+ }
+ var cred *unix.Ucred
+ var controlErr error
+ if err := rawConn.Control(func(fd uintptr) {
+ cred, controlErr = unix.GetsockoptUcred(int(fd), unix.SOL_SOCKET, unix.SO_PEERCRED)
+ }); err != nil {
+ return err
+ }
+ if controlErr != nil {
+ return controlErr
+ }
+ if cred == nil {
+ return errors.New("missing peer credentials")
+ }
+ if int(cred.Uid) == 0 || int(cred.Uid) == s.meta.OwnerUID {
+ return nil
+ }
+ return fmt.Errorf("uid %d is not allowed to use the root helper", cred.Uid)
+}
+
+func (s *Server) dispatch(ctx context.Context, req rpc.Request) rpc.Response {
+ switch req.Method {
+ case methodEnsureBridge:
+ params, err := rpc.DecodeParams[NetworkConfig](req)
+ if err != nil {
+ return rpc.NewError("bad_params", err.Error())
+ }
+ // Without these the helper would happily run `ip link add`
+ // against arbitrary names, `ip addr add` with arbitrary
+ // IP/CIDR, and `ip link set up` against any host
+ // iface a compromised daemon might pick.
+ if err := validateNetworkConfig(params); err != nil {
+ return rpc.NewError("bad_params", err.Error())
+ }
+ return marshalResultOrError(struct{}{}, s.ensureBridge(ctx, params))
+ case methodCreateTap:
+ params, err := rpc.DecodeParams[struct {
+ NetworkConfig
+ TapName string `json:"tap_name"`
+ }](req)
+ if err != nil {
+ return rpc.NewError("bad_params", err.Error())
+ }
+ // Pin both the bridge config (so the new TAP can't be
+ // attached to e.g. eth0 via `ip link set master`) and
+ // the tap name itself.
+ if err := validateNetworkConfig(params.NetworkConfig); err != nil {
+ return rpc.NewError("bad_params", err.Error())
+ }
+ return marshalResultOrError(struct{}{}, s.createTap(ctx, params.NetworkConfig, params.TapName))
+ case methodDeleteTap:
+ params, err := rpc.DecodeParams[struct {
+ TapName string `json:"tap_name"`
+ }](req)
+ if err != nil {
+ return rpc.NewError("bad_params", err.Error())
+ }
+ return marshalResultOrError(struct{}{}, s.deleteTap(ctx, params.TapName))
+ case methodSyncResolverRouting:
+ params, err := rpc.DecodeParams[struct {
+ BridgeName string `json:"bridge_name"`
+ ServerAddr string `json:"server_addr"`
+ }](req)
+ if err != nil {
+ return rpc.NewError("bad_params", err.Error())
+ }
+ // syncResolverRouting short-circuits on empty input; only
+ // validate when actually doing something.
+ // validateBangerBridgeName is stricter than the previous
+ // validateLinuxIfaceName: it stops a compromised daemon
+ // from pointing resolvectl at any host interface, not just
+ // refusing obviously-malformed names.
+ if strings.TrimSpace(params.BridgeName) != "" || strings.TrimSpace(params.ServerAddr) != "" { + if err := validateBangerBridgeName(params.BridgeName); err != nil { + return rpc.NewError("bad_params", err.Error()) + } + if err := validateResolverAddr(params.ServerAddr); err != nil { + return rpc.NewError("bad_params", err.Error()) + } + } + return marshalResultOrError(struct{}{}, s.syncResolverRouting(ctx, params.BridgeName, params.ServerAddr)) + case methodClearResolverRouting: + params, err := rpc.DecodeParams[struct { + BridgeName string `json:"bridge_name"` + }](req) + if err != nil { + return rpc.NewError("bad_params", err.Error()) + } + if strings.TrimSpace(params.BridgeName) != "" { + if err := validateBangerBridgeName(params.BridgeName); err != nil { + return rpc.NewError("bad_params", err.Error()) + } + } + return marshalResultOrError(struct{}{}, s.clearResolverRouting(ctx, params.BridgeName)) + case methodEnsureNAT: + params, err := rpc.DecodeParams[struct { + GuestIP string `json:"guest_ip"` + Tap string `json:"tap"` + Enable bool `json:"enable"` + }](req) + if err != nil { + return rpc.NewError("bad_params", err.Error()) + } + // Without these the helper installs iptables rules with + // daemon-supplied identifiers; argv-style exec rules out + // command injection, but a compromised daemon could still + // install MASQUERADE rules tied to arbitrary IPs/interfaces. 
+ if err := validateIPv4(params.GuestIP); err != nil { + return rpc.NewError("bad_params", err.Error()) + } + if err := validateTapName(params.Tap); err != nil { + return rpc.NewError("bad_params", err.Error()) + } + return marshalResultOrError(struct{}{}, hostnat.Ensure(ctx, s.runner, params.GuestIP, params.Tap, params.Enable)) + case methodCreateDMSnapshot: + params, err := rpc.DecodeParams[struct { + RootfsPath string `json:"rootfs_path"` + COWPath string `json:"cow_path"` + DMName string `json:"dm_name"` + }](req) + if err != nil { + return rpc.NewError("bad_params", err.Error()) + } + if err := s.validateManagedPath(params.RootfsPath, paths.ResolveSystem().StateDir); err != nil { + return rpc.NewError("bad_params", err.Error()) + } + if err := s.validateManagedPath(params.COWPath, paths.ResolveSystem().StateDir); err != nil { + return rpc.NewError("bad_params", err.Error()) + } + if err := validateDMName(params.DMName); err != nil { + return rpc.NewError("bad_params", err.Error()) + } + result, err := dmsnap.Create(ctx, s.runner, params.RootfsPath, params.COWPath, params.DMName) + return marshalResultOrError(result, err) + case methodCleanupDMSnapshot: + params, err := rpc.DecodeParams[dmsnap.Handles](req) + if err != nil { + return rpc.NewError("bad_params", err.Error()) + } + // Each Handles field flows into a `dmsetup remove` / + // `losetup -d` shell-out as root. Without these checks a + // compromised daemon could ask the helper to detach + // arbitrary loop devices or remove unrelated DM targets. 
+ if err := validateDMSnapshotHandles(params); err != nil { + return rpc.NewError("bad_params", err.Error()) + } + return marshalResultOrError(struct{}{}, dmsnap.Cleanup(ctx, s.runner, params)) + case methodRemoveDMSnapshot: + params, err := rpc.DecodeParams[struct { + Target string `json:"target"` + }](req) + if err != nil { + return rpc.NewError("bad_params", err.Error()) + } + if err := validateDMRemoveTarget(params.Target); err != nil { + return rpc.NewError("bad_params", err.Error()) + } + return marshalResultOrError(struct{}{}, dmsnap.Remove(ctx, s.runner, params.Target)) + case methodFsckSnapshot: + params, err := rpc.DecodeParams[struct { + DMDev string `json:"dm_dev"` + }](req) + if err != nil { + return rpc.NewError("bad_params", err.Error()) + } + return marshalResultOrError(struct{}{}, s.fsckSnapshot(ctx, params.DMDev)) + case methodReadExt4File: + params, err := rpc.DecodeParams[struct { + ImagePath string `json:"image_path"` + GuestPath string `json:"guest_path"` + }](req) + if err != nil { + return rpc.NewError("bad_params", err.Error()) + } + // Without this validation a compromised daemon can drive + // debugfs as root against any path on the host; it would have + // to be a real ext4 image to leak data, but the constraint is + // trivially expressed and adds no operational cost. 
+ if err := s.validateExt4ImagePath(params.ImagePath); err != nil { + return rpc.NewError("bad_params", err.Error()) + } + data, readErr := system.ReadExt4File(ctx, s.runner, params.ImagePath, params.GuestPath) + return marshalResultOrError(readExt4FileResult{Data: data}, readErr) + case methodWriteExt4Files: + params, err := rpc.DecodeParams[struct { + ImagePath string `json:"image_path"` + Files []Ext4Write `json:"files"` + }](req) + if err != nil { + return rpc.NewError("bad_params", err.Error()) + } + if err := s.validateExt4ImagePath(params.ImagePath); err != nil { + return rpc.NewError("bad_params", err.Error()) + } + return marshalResultOrError(struct{}{}, s.writeExt4Files(ctx, params.ImagePath, params.Files)) + case methodResolveFirecrackerBin: + params, err := rpc.DecodeParams[struct { + Requested string `json:"requested"` + }](req) + if err != nil { + return rpc.NewError("bad_params", err.Error()) + } + path, resolveErr := s.resolveFirecrackerBinary(params.Requested) + return marshalResultOrError(resolveFirecrackerResult{Path: path}, resolveErr) + case methodLaunchFirecracker: + params, err := rpc.DecodeParams[FirecrackerLaunchRequest](req) + if err != nil { + return rpc.NewError("bad_params", err.Error()) + } + pid, launchErr := s.launchFirecracker(ctx, params) + return marshalResultOrError(launchFirecrackerResult{PID: pid}, launchErr) + case methodEnsureSocketAccess: + params, err := rpc.DecodeParams[struct { + SocketPath string `json:"socket_path"` + Label string `json:"label"` + }](req) + if err != nil { + return rpc.NewError("bad_params", err.Error()) + } + // Without these checks the helper's chown/chmod becomes an + // arbitrary file-ownership primitive: a daemon-uid attacker + // could plant a symlink at any path under RuntimeDir (or just + // pass /etc/shadow) and have the helper transfer ownership to + // the daemon UID. 
The fcproc layer also chowns/chmods via + // O_PATH|O_NOFOLLOW so the leaf can't be a symlink at the time + // of the syscall — these checks are belt + braces and give a + // clear error before we even open the path. + if err := s.validateManagedPath(params.SocketPath, paths.ResolveSystem().RuntimeDir); err != nil { + return rpc.NewError("invalid_path", err.Error()) + } + if err := validateNotSymlink(params.SocketPath); err != nil { + return rpc.NewError("invalid_path", err.Error()) + } + return marshalResultOrError(struct{}{}, s.ensureSocketAccess(ctx, params.SocketPath, params.Label)) + case methodFindFirecrackerPID: + params, err := rpc.DecodeParams[struct { + APISock string `json:"api_sock"` + }](req) + if err != nil { + return rpc.NewError("bad_params", err.Error()) + } + pid, findErr := fcproc.New(s.runner, fcproc.Config{}, s.logger).FindPID(ctx, params.APISock) + return marshalResultOrError(findPIDResult{PID: pid}, findErr) + case methodKillProcess: + params, err := rpc.DecodeParams[struct { + PID int `json:"pid"` + }](req) + if err != nil { + return rpc.NewError("bad_params", err.Error()) + } + if err := validateFirecrackerPID(params.PID); err != nil { + return rpc.NewError("invalid_pid", err.Error()) + } + _, killErr := s.runner.Run(ctx, "kill", "-KILL", strconv.Itoa(params.PID)) + return marshalResultOrError(struct{}{}, killErr) + case methodSignalProcess: + params, err := rpc.DecodeParams[struct { + PID int `json:"pid"` + Signal string `json:"signal"` + }](req) + if err != nil { + return rpc.NewError("bad_params", err.Error()) + } + if err := validateFirecrackerPID(params.PID); err != nil { + return rpc.NewError("invalid_pid", err.Error()) + } + signal := strings.TrimSpace(params.Signal) + if signal == "" { + signal = "TERM" + } + if err := validateSignalName(signal); err != nil { + return rpc.NewError("bad_params", err.Error()) + } + _, signalErr := s.runner.Run(ctx, "kill", "-"+signal, strconv.Itoa(params.PID)) + return marshalResultOrError(struct{}{}, 
signalErr) + case methodProcessRunning: + params, err := rpc.DecodeParams[struct { + PID int `json:"pid"` + APISock string `json:"api_sock"` + }](req) + if err != nil { + return rpc.NewError("bad_params", err.Error()) + } + return marshalResultOrError(processRunningResult{Running: system.ProcessRunning(params.PID, params.APISock)}, nil) + case methodCleanupJailerChroot: + params, err := rpc.DecodeParams[struct { + ChrootRoot string `json:"chroot_root"` + }](req) + if err != nil { + return rpc.NewError("bad_params", err.Error()) + } + systemLayout := paths.ResolveSystem() + if err := s.validateManagedPath(params.ChrootRoot, systemLayout.StateDir, systemLayout.RuntimeDir); err != nil { + return rpc.NewError("invalid_path", err.Error()) + } + // validateManagedPath only does textual prefix matching. A + // symlink at e.g. /var/lib/banger/jail/x → / would pass the + // prefix check, and the subsequent `umount --recursive --lazy` + // would detach real host mounts. Reject leaf symlinks before + // we go anywhere near unmount/rm. 
+ if err := validateNotSymlink(params.ChrootRoot); err != nil { + return rpc.NewError("invalid_path", err.Error()) + } + err = fcproc.New(s.runner, fcproc.Config{}, s.logger).CleanupJailerChroot(ctx, params.ChrootRoot) + return marshalResultOrError(struct{}{}, err) + default: + return rpc.NewError("unknown_method", req.Method) + } +} + +func (s *Server) ensureBridge(ctx context.Context, cfg NetworkConfig) error { + return fcproc.New(s.runner, fcproc.Config{ + BridgeName: cfg.BridgeName, + BridgeIP: cfg.BridgeIP, + CIDR: cfg.CIDR, + }, s.logger).EnsureBridge(ctx) +} + +func (s *Server) createTap(ctx context.Context, cfg NetworkConfig, tapName string) error { + if err := validateTapName(tapName); err != nil { + return err + } + return fcproc.New(s.runner, fcproc.Config{ + BridgeName: cfg.BridgeName, + BridgeIP: cfg.BridgeIP, + CIDR: cfg.CIDR, + }, s.logger).CreateTapOwned(ctx, tapName, s.meta.OwnerUID, s.meta.OwnerGID) +} + +func (s *Server) deleteTap(ctx context.Context, tapName string) error { + if err := validateTapName(tapName); err != nil { + return err + } + _, err := s.runner.Run(ctx, "ip", "link", "del", tapName) + return err +} + +func (s *Server) syncResolverRouting(ctx context.Context, bridgeName, serverAddr string) error { + if strings.TrimSpace(bridgeName) == "" || strings.TrimSpace(serverAddr) == "" { + return nil + } + if _, err := system.LookupExecutable("resolvectl"); err != nil { + return nil + } + if _, err := s.runner.Run(ctx, "resolvectl", "dns", bridgeName, serverAddr); err != nil { + return err + } + if _, err := s.runner.Run(ctx, "resolvectl", "domain", bridgeName, vmResolverRouteDomain); err != nil { + return err + } + _, err := s.runner.Run(ctx, "resolvectl", "default-route", bridgeName, "no") + return err +} + +func (s *Server) clearResolverRouting(ctx context.Context, bridgeName string) error { + if strings.TrimSpace(bridgeName) == "" { + return nil + } + if _, err := system.LookupExecutable("resolvectl"); err != nil { + return nil + } + 
_, err := s.runner.Run(ctx, "resolvectl", "revert", bridgeName) + return err +} + +func (s *Server) fsckSnapshot(ctx context.Context, dmDev string) error { + // Helper runs as root with -fy (auto-yes); without the prefix check + // a compromised daemon could fsck arbitrary block devices like + // /dev/sda1 and corrupt the host filesystem. + if err := validateDMDevicePath(dmDev); err != nil { + return err + } + if _, err := s.runner.Run(ctx, "e2fsck", "-fy", dmDev); err != nil { + if code := system.ExitCode(err); code < 0 || code > 1 { + return fmt.Errorf("fsck snapshot: %w", err) + } + } + return nil +} + +func (s *Server) writeExt4Files(ctx context.Context, imagePath string, files []Ext4Write) error { + for _, file := range files { + mode := os.FileMode(file.Mode) + if mode == 0 { + mode = 0o644 + } + if err := system.WriteExt4FileOwned(ctx, s.runner, imagePath, file.GuestPath, mode, 0, 0, file.Data); err != nil { + return err + } + } + return nil +} + +func (s *Server) resolveFirecrackerBinary(requested string) (string, error) { + requested = strings.TrimSpace(requested) + if requested == "" { + requested = defaultFirecrackerBinaryName + } + cfg := fcproc.Config{FirecrackerBin: requested} + resolved, err := fcproc.New(s.runner, cfg, s.logger).ResolveBinary() + if err != nil { + return "", err + } + if err := validateRootExecutable(resolved); err != nil { + return "", err + } + return resolved, nil +} + +func (s *Server) launchFirecracker(ctx context.Context, req FirecrackerLaunchRequest) (int, error) { + systemLayout := paths.ResolveSystem() + for _, path := range []string{req.SocketPath, req.VSockPath} { + if err := s.validateManagedPath(path, systemLayout.RuntimeDir); err != nil { + return 0, err + } + } + for _, path := range []string{req.LogPath, req.MetricsPath, req.KernelImagePath} { + if err := s.validateManagedPath(path, systemLayout.StateDir); err != nil { + return 0, err + } + } + if strings.TrimSpace(req.InitrdPath) != "" { + if err := 
s.validateManagedPath(req.InitrdPath, systemLayout.StateDir); err != nil { + return 0, err + } + } + if err := validateTapName(req.TapDevice); err != nil { + return 0, err + } + if err := validateRootExecutable(req.BinaryPath); err != nil { + return 0, err + } + for _, drive := range req.Drives { + if err := s.validateLaunchDrivePath(drive, systemLayout.StateDir); err != nil { + return 0, err + } + } + mgr := fcproc.New(s.runner, fcproc.Config{BridgeName: req.Network.BridgeName, BridgeIP: req.Network.BridgeIP, CIDR: req.Network.CIDR}, s.logger) + mc, err := s.buildLaunchMachineConfig(ctx, req, systemLayout, mgr) + if err != nil { + return 0, err + } + // Pre-Start symlink: see localPrivilegedOps.LaunchFirecracker for + // the AF_UNIX sun_path-length rationale. + if err := s.exposeJailerSockets(req); err != nil { + return 0, fmt.Errorf("expose jailer sockets: %w", err) + } + machine, err := firecracker.NewMachine(ctx, mc) + if err != nil { + return 0, err + } + if err := machine.Start(ctx); err != nil { + if pid := mgr.ResolvePID(context.Background(), machine, mc.SocketPath); pid > 0 { + _, _ = s.runner.Run(context.Background(), "kill", "-KILL", strconv.Itoa(pid)) + } + return 0, err + } + if req.Jailer == nil { + // Belt-and-suspenders only on the legacy direct-firecracker path; + // the jailer drops to the configured uid before creating the + // socket, so its perms are correct by construction. 
+ if err := mgr.EnsureSocketAccessFor(ctx, mc.SocketPath, "firecracker api socket", s.meta.OwnerUID, s.meta.OwnerGID); err != nil { + return 0, err + } + if strings.TrimSpace(mc.VSockPath) != "" { + if err := mgr.EnsureSocketAccessFor(ctx, mc.VSockPath, "firecracker vsock socket", s.meta.OwnerUID, s.meta.OwnerGID); err != nil { + return 0, err + } + } + } + pid := mgr.ResolvePID(context.Background(), machine, mc.SocketPath) + if pid <= 0 { + return 0, errors.New("firecracker started but pid could not be resolved") + } + return pid, nil +} + +// buildLaunchMachineConfig assembles the firecracker.MachineConfig used by +// launchFirecracker, performing the chroot staging when jailer is enabled. +// In the non-jailer case it's a straight field copy from the request. +// +// In the jailer case it: +// - validates JailerLaunchOpts (binary executable, chroot under RuntimeDir, +// uid/gid match the registered owner — the daemon can't ask the helper to +// drop firecracker into an arbitrary uid) +// - calls fcproc.PrepareJailerChroot to build the chroot tree +// - rewrites SocketPath and VSockPath to host-visible chroot paths and +// KernelImagePath/InitrdPath/Drives[].Path to chroot-internal names +func (s *Server) buildLaunchMachineConfig(ctx context.Context, req FirecrackerLaunchRequest, layout paths.Layout, mgr *fcproc.Manager) (firecracker.MachineConfig, error) { + mc := firecracker.MachineConfig{ + BinaryPath: req.BinaryPath, + VMID: req.VMID, + SocketPath: req.SocketPath, + LogPath: req.LogPath, + MetricsPath: req.MetricsPath, + KernelImagePath: req.KernelImagePath, + InitrdPath: req.InitrdPath, + KernelArgs: req.KernelArgs, + Drives: req.Drives, + TapDevice: req.TapDevice, + VSockPath: req.VSockPath, + VSockCID: req.VSockCID, + VCPUCount: req.VCPUCount, + MemoryMiB: req.MemoryMiB, + Logger: s.logger, + } + if req.Jailer == nil { + return mc, nil + } + if err := s.validateJailerOpts(*req.Jailer, layout); err != nil { + return firecracker.MachineConfig{}, err + } + 
chrootRoot := firecracker.JailerChrootRoot(req.Jailer.ChrootBaseDir, req.VMID) + driveSpecs := make([]fcproc.ChrootDriveSpec, 0, len(req.Drives)) + chrootDrives := make([]firecracker.DriveConfig, 0, len(req.Drives)) + for _, d := range req.Drives { + name := chrootDriveName(d) + driveSpecs = append(driveSpecs, fcproc.ChrootDriveSpec{ChrootName: name, HostPath: d.Path}) + chrootDrives = append(chrootDrives, firecracker.DriveConfig{ + ID: d.ID, + Path: "/" + name, + ReadOnly: d.ReadOnly, + IsRoot: d.IsRoot, + }) + } + wantVSock := strings.TrimSpace(req.VSockPath) != "" + if err := mgr.PrepareJailerChroot(ctx, chrootRoot, + req.Jailer.UID, req.Jailer.GID, + req.BinaryPath, + req.KernelImagePath, "vmlinux", + req.InitrdPath, "initrd", + driveSpecs, wantVSock, + ); err != nil { + return firecracker.MachineConfig{}, fmt.Errorf("prepare jailer chroot: %w", err) + } + // See localPrivilegedOps.buildLaunchMachineConfig for why SocketPath + // stays the short req path but VSockPath becomes chroot-internal. + _ = chrootRoot + if wantVSock { + mc.VSockPath = firecracker.JailerVSockName + } + mc.KernelImagePath = "/vmlinux" + if strings.TrimSpace(req.InitrdPath) != "" { + mc.InitrdPath = "/initrd" + } else { + mc.InitrdPath = "" + } + mc.Drives = chrootDrives + // LogPath stays set so buildProcessRunner's openLogFile captures firecracker + // stderr via cmd.Stderr. buildConfig clears sdk.Config.LogPath for jailer + // mode to avoid PUT /logger with a host path firecracker can't open. 
+ mc.MetricsPath = "" + mc.Jailer = &firecracker.JailerOpts{ + Binary: req.Jailer.Binary, + ChrootBaseDir: req.Jailer.ChrootBaseDir, + UID: req.Jailer.UID, + GID: req.Jailer.GID, + } + return mc, nil +} + +func (s *Server) validateJailerOpts(opts JailerLaunchOpts, layout paths.Layout) error { + if err := validateRootExecutable(opts.Binary); err != nil { + return fmt.Errorf("jailer binary: %w", err) + } + // Chroot base must live under StateDir so hard-links into the chroot + // share a filesystem with the image cache (RuntimeDir is tmpfs and + // would EXDEV on os.Link). RuntimeDir is also accepted because the + // jailer is happy on tmpfs when the kernel/drives happen to colocate + // (e.g. tests). + if err := s.validateManagedPath(opts.ChrootBaseDir, layout.StateDir, layout.RuntimeDir); err != nil { + return fmt.Errorf("jailer chroot base: %w", err) + } + if opts.UID != s.meta.OwnerUID || opts.GID != s.meta.OwnerGID { + return fmt.Errorf("jailer uid/gid (%d:%d) must match registered owner (%d:%d)", opts.UID, opts.GID, s.meta.OwnerUID, s.meta.OwnerGID) + } + return nil +} + +// exposeJailerSockets makes the chroot-internal sockets reachable at the +// host paths the daemon already references (sc.apiSock, vm.Runtime.VSockPath). +// AF_UNIX connect(2) follows symlinks, so a symlink keeps the rest of the +// daemon code unchanged. Computes both host targets from the chroot root and +// the chroot-internal name, so the API socket and the vsock socket stay in +// sync regardless of how the launch request laid them out. 
+func (s *Server) exposeJailerSockets(req FirecrackerLaunchRequest) error { + if req.Jailer == nil { + return nil + } + chrootRoot := firecracker.JailerChrootRoot(req.Jailer.ChrootBaseDir, req.VMID) + hostAPI := filepath.Join(chrootRoot, strings.TrimPrefix(firecracker.JailerSocketName, "/")) + if err := atomicSymlink(hostAPI, req.SocketPath); err != nil { + return fmt.Errorf("api socket symlink: %w", err) + } + if strings.TrimSpace(req.VSockPath) != "" { + hostVSock := filepath.Join(chrootRoot, strings.TrimPrefix(firecracker.JailerVSockName, "/")) + if err := atomicSymlink(hostVSock, req.VSockPath); err != nil { + return fmt.Errorf("vsock symlink: %w", err) + } + } + return nil +} + +func atomicSymlink(target, link string) error { + if err := os.Remove(link); err != nil && !os.IsNotExist(err) { + return err + } + return os.Symlink(target, link) +} + +// chrootDriveName returns the bare filename a drive should appear as inside +// the chroot. We use the drive ID when present (rootfs, work, …) so the +// chroot listing is self-explanatory; falling back to the source's basename +// covers the unnamed case. 
+func chrootDriveName(d firecracker.DriveConfig) string { + if id := strings.TrimSpace(d.ID); id != "" { + return id + } + return filepath.Base(d.Path) +} + +func (s *Server) validateLaunchDrivePath(drive firecracker.DriveConfig, stateDir string) error { + if err := s.validateManagedPath(drive.Path, stateDir); err == nil { + return nil + } + if drive.IsRoot { + if err := validateDMDevicePath(drive.Path); err == nil { + return nil + } + } + return fmt.Errorf("path %q is outside banger-managed directories", drive.Path) +} + +func (s *Server) ensureSocketAccess(ctx context.Context, socketPath, label string) error { + return fcproc.New(s.runner, fcproc.Config{}, s.logger).EnsureSocketAccessFor(ctx, socketPath, label, s.meta.OwnerUID, s.meta.OwnerGID) +} + +func (s *Server) validateManagedPath(path string, roots ...string) error { + path = strings.TrimSpace(path) + if path == "" { + return errors.New("path is required") + } + if !filepath.IsAbs(path) { + return fmt.Errorf("path %q must be absolute", path) + } + cleaned := filepath.Clean(path) + var matched string + for _, root := range roots { + root = strings.TrimSpace(root) + if root == "" { + continue + } + root = filepath.Clean(root) + if cleaned == root || strings.HasPrefix(cleaned, root+string(os.PathSeparator)) { + matched = root + break + } + } + if matched == "" { + return fmt.Errorf("path %q is outside banger-managed directories", path) + } + // Walk each component below the matched root with Lstat and refuse + // symlinks. Without this, validation was textual-only: a daemon-UID + // attacker could plant a symlink under StateDir/RuntimeDir and get + // the helper to drive losetup, ln -f, debugfs, e2cp, fsck, etc. at + // the dereferenced target (host devices, /etc/shadow, …). + // + // ENOENT is tolerated: some callers pass paths that firecracker + // creates after this check (sockets, log files). 
Anything missing + // can't be a symlink at this instant; whoever materialises it later + // goes through the helper's create primitives, which validate again. + if cleaned == matched { + return nil + } + suffix := strings.TrimPrefix(cleaned, matched+string(os.PathSeparator)) + cur := matched + for _, seg := range strings.Split(suffix, string(os.PathSeparator)) { + if seg == "" { + continue + } + cur = filepath.Join(cur, seg) + info, err := os.Lstat(cur) + if err != nil { + if os.IsNotExist(err) { + return nil + } + return fmt.Errorf("inspect %q: %w", cur, err) + } + if info.Mode()&os.ModeSymlink != 0 { + return fmt.Errorf("path %q has a symlink at %q", path, cur) + } + } + return nil +} + +// validateExt4ImagePath accepts a path that is either inside the +// banger StateDir (regular ext4 image files we manage) or a managed +// DM-snapshot device (/dev/mapper/fc-rootfs-*). Both shapes are +// legitimate inputs for the helper's debugfs/e2cp/e2rm RPCs; anything +// else would let a compromised daemon point those tools at arbitrary +// host files. +func (s *Server) validateExt4ImagePath(path string) error { + if err := s.validateManagedPath(path, paths.ResolveSystem().StateDir); err == nil { + return nil + } + if err := validateDMDevicePath(path); err == nil { + return nil + } + return fmt.Errorf("path %q is not a banger-managed ext4 image", path) +} + +// bangerBridgeNamePrefix pins the only iface-name shape the helper +// will mutate via priv.ensure_bridge / priv.create_tap / the resolver +// routing RPCs. Anything that doesn't match — host primary interfaces +// like eth0/wlan0/lo, foreign managed bridges like docker0/virbr0, +// arbitrary attacker-chosen names — is refused outright. Banger's +// daemon-config default for BridgeName is "br-fc"; users wanting a +// different name must keep the "br-fc-" prefix so the helper can +// recognise it as banger-managed. 
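The naming convention described above can be illustrated standalone (a sketch with a hypothetical `isBangerBridge` helper that mirrors, but does not reuse, the validators below):

```go
package main

import (
	"fmt"
	"strings"
)

const bridgePrefix = "br-fc" // mirrors bangerBridgeNamePrefix

// isBangerBridge reports whether name is a banger-managed bridge:
// exactly "br-fc" or "br-fc-<suffix>", and short enough for a Linux
// interface name (IFNAMSIZ-1 = 15 bytes). Note "br-fcx" fails: the
// suffix must be introduced by "-", not merely share the prefix.
func isBangerBridge(name string) bool {
	name = strings.TrimSpace(name)
	if name == "" || len(name) > 15 {
		return false
	}
	return name == bridgePrefix || strings.HasPrefix(name, bridgePrefix+"-")
}

func main() {
	for _, n := range []string{"br-fc", "br-fc-user1", "br-fcx", "docker0", "eth0", "br-fc-very-long-name"} {
		fmt.Printf("%-20s %v\n", n, isBangerBridge(n))
	}
}
```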
+const bangerBridgeNamePrefix = "br-fc" + +// validateBangerBridgeName enforces the banger naming convention on +// any bridge name a helper RPC mutates. Without this, a compromised +// owner-uid daemon could ask the helper (which runs with +// CAP_NET_ADMIN) to bring up arbitrary host interfaces, attach +// per-VM taps to other users' bridges, or flap the host's primary +// iface — argv-style exec rules out shell injection but the kernel +// happily honours these requests against any iface the caller +// names. +func validateBangerBridgeName(name string) error { + if err := validateLinuxIfaceName(name); err != nil { + return err + } + trimmed := strings.TrimSpace(name) + if trimmed == bangerBridgeNamePrefix { + return nil + } + if strings.HasPrefix(trimmed, bangerBridgeNamePrefix+"-") { + return nil + } + return fmt.Errorf("bridge name %q is not banger-managed (must equal %q or start with %q)", name, bangerBridgeNamePrefix, bangerBridgeNamePrefix+"-") +} + +// validateCIDRPrefix accepts a numeric IPv4 prefix length in [8, 32]. +// fcproc.EnsureBridge concatenates BridgeIP + "/" + CIDR into the +// `ip addr add` argument, so anything that doesn't parse as a small +// integer in that range either errors out (helpful) or, worse, +// silently widens the bridge subnet beyond what the daemon intends. +func validateCIDRPrefix(s string) error { + trimmed := strings.TrimSpace(s) + if trimmed == "" { + return errors.New("cidr prefix is required") + } + n, err := strconv.Atoi(trimmed) + if err != nil { + return fmt.Errorf("cidr prefix %q is not numeric", s) + } + if n < 8 || n > 32 { + return fmt.Errorf("cidr prefix %d is outside [8, 32]", n) + } + return nil +} + +// validateNetworkConfig is the single chokepoint for every helper RPC +// that takes a bridge name + IP + CIDR triple. Bundling the checks +// here keeps every caller in lockstep on what counts as a +// well-formed banger network config. 
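The chokepoint idea, one function gating the whole (IP, prefix) pair before any `ip addr add` argument is assembled from it, can be sketched standalone (hypothetical `networkConfigOK`, not the helper's code):

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// networkConfigOK rejects a malformed bridge (ip, cidr) pair in one
// place: the address must parse as IPv4 and the prefix length must
// be a plain integer in [8, 32]. Hypothetical sketch for illustration.
func networkConfigOK(ip, cidr string) bool {
	p := net.ParseIP(strings.TrimSpace(ip))
	if p == nil || p.To4() == nil {
		return false
	}
	n, err := strconv.Atoi(strings.TrimSpace(cidr))
	return err == nil && n >= 8 && n <= 32
}

func main() {
	fmt.Println(networkConfigOK("172.30.0.1", "24")) // true
	fmt.Println(networkConfigOK("fe80::1", "24"))    // false: not IPv4
	fmt.Println(networkConfigOK("172.30.0.1", "0"))  // false: prefix too wide
}
```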
+func validateNetworkConfig(cfg NetworkConfig) error { + if err := validateBangerBridgeName(cfg.BridgeName); err != nil { + return err + } + if err := validateIPv4(cfg.BridgeIP); err != nil { + return fmt.Errorf("bridge ip: %w", err) + } + if err := validateCIDRPrefix(cfg.CIDR); err != nil { + return fmt.Errorf("bridge cidr: %w", err) + } + return nil +} + +// validateLoopDevicePath confirms path is `/dev/loopN` for some N≥0. +// dmsnap.Cleanup detaches loops via `losetup -d `; without this +// a compromised daemon could ask the helper to detach an arbitrary +// device node. +func validateLoopDevicePath(path string) error { + path = strings.TrimSpace(path) + if path == "" { + return errors.New("loop device path is required") + } + const prefix = "/dev/loop" + if !strings.HasPrefix(path, prefix) { + return fmt.Errorf("loop device %q must live under /dev/loop", path) + } + suffix := path[len(prefix):] + if suffix == "" { + return fmt.Errorf("loop device %q is missing its index", path) + } + for _, r := range suffix { + if r < '0' || r > '9' { + return fmt.Errorf("loop device %q has non-numeric suffix", path) + } + } + return nil +} + +// validateDMSnapshotHandles checks every non-empty field on a Handles +// passed to priv.cleanup_dm_snapshot. Empty fields are tolerated (the +// dmsnap layer treats them as "nothing to clean here") but anything +// set must look like a banger-managed object. 
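The `/dev/loopN` shape check above is strict on purpose: a partition suffix or any non-digit tail is refused. A standalone sketch of the same rule (hypothetical `isLoopDevice`, not the helper's validator):

```go
package main

import (
	"fmt"
	"strings"
)

// isLoopDevice reports whether path has the exact /dev/loopN shape,
// with N a non-negative integer and nothing after it. Hypothetical
// sketch mirroring the validator's rule.
func isLoopDevice(path string) bool {
	const prefix = "/dev/loop"
	rest := strings.TrimPrefix(path, prefix)
	if rest == path || rest == "" { // no prefix, or no index
		return false
	}
	for _, r := range rest {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isLoopDevice("/dev/loop0"))   // true
	fmt.Println(isLoopDevice("/dev/loop12"))  // true
	fmt.Println(isLoopDevice("/dev/loop"))    // false: missing index
	fmt.Println(isLoopDevice("/dev/loop0p1")) // false: partition suffix
	fmt.Println(isLoopDevice("/dev/sda"))     // false: not a loop device
}
```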
+func validateDMSnapshotHandles(h dmsnap.Handles) error { + if h.DMName != "" { + if err := validateDMName(h.DMName); err != nil { + return err + } + } + if h.DMDev != "" { + if err := validateDMDevicePath(h.DMDev); err != nil { + return err + } + } + if h.BaseLoop != "" { + if err := validateLoopDevicePath(h.BaseLoop); err != nil { + return err + } + } + if h.COWLoop != "" { + if err := validateLoopDevicePath(h.COWLoop); err != nil { + return err + } + } + return nil +} + +// validateDMRemoveTarget covers the union accepted by `dmsetup remove`: +// either the bare DM name or the /dev/mapper/ path. Both shapes +// are produced by dmsnap.Cleanup; nothing else should reach the helper. +func validateDMRemoveTarget(target string) error { + target = strings.TrimSpace(target) + if target == "" { + return errors.New("dm target is required") + } + if strings.HasPrefix(target, "/dev/mapper/") { + return validateDMDevicePath(target) + } + return validateDMName(target) +} + +// validateLinuxIfaceName mirrors the kernel's __dev_valid_name rules +// in a permissive subset: 1-15 chars, no whitespace, no slash, no +// colon, and not the special "." or "..". Used for bridge-name +// arguments to resolvectl. argv-style exec already prevents shell +// injection, but a compromised daemon could otherwise flap any +// system-managed link by passing its name here. +func validateLinuxIfaceName(name string) error { + name = strings.TrimSpace(name) + if name == "" { + return errors.New("interface name is required") + } + if len(name) > 15 { + return fmt.Errorf("interface %q exceeds 15 chars", name) + } + if name == "." || name == ".." { + return fmt.Errorf("interface name %q is reserved", name) + } + for _, r := range name { + if r <= ' ' || r == '/' || r == ':' || r == 0x7f { + return fmt.Errorf("interface %q contains invalid char %q", name, r) + } + } + return nil +} + +// validateIPv4 confirms ip parses as an IPv4 address. 
The NAT helpers +// build /32 iptables rules from this string; non-v4 input would +// produce malformed rules at best and unexpected ones at worst. +func validateIPv4(ip string) error { + ip = strings.TrimSpace(ip) + if ip == "" { + return errors.New("ipv4 address is required") + } + parsed := net.ParseIP(ip) + if parsed == nil || parsed.To4() == nil { + return fmt.Errorf("invalid ipv4 address %q", ip) + } + return nil +} + +// validateResolverAddr confirms s parses as an IP address, optionally +// with a ":port" suffix. resolvectl accepts both bare IPs and the +// "IP:port" form (used to point at a non-default DNS port — banger's +// in-process server binds to 127.0.0.1:42069). Reject anything that +// doesn't parse so a compromised daemon can't wedge resolved with +// garbage input. +func validateResolverAddr(s string) error { + s = strings.TrimSpace(s) + if s == "" { + return errors.New("resolver address is required") + } + if net.ParseIP(s) != nil { + return nil + } + if host, _, err := net.SplitHostPort(s); err == nil && net.ParseIP(host) != nil { + return nil + } + return fmt.Errorf("invalid resolver address %q", s) +} + +func validateTapName(tapName string) error { + tapName = strings.TrimSpace(tapName) + if strings.HasPrefix(tapName, vmTapPrefix) || strings.HasPrefix(tapName, tapPoolPrefix) { + return nil + } + return fmt.Errorf("tap %q is outside banger-managed naming", tapName) +} + +func validateDMName(dmName string) error { + dmName = strings.TrimSpace(dmName) + if strings.HasPrefix(dmName, rootfsDMNamePrefix) { + return nil + } + return fmt.Errorf("dm target %q is outside banger-managed naming", dmName) +} + +func validateDMDevicePath(path string) error { + path = strings.TrimSpace(path) + if path == "" { + return errors.New("dm device path is required") + } + if !filepath.IsAbs(path) { + return fmt.Errorf("dm device path %q must be absolute", path) + } + cleaned := filepath.Clean(path) + if filepath.Dir(cleaned) != "/dev/mapper" { + return 
fmt.Errorf("dm device path %q is outside /dev/mapper", path) + } + return validateDMName(filepath.Base(cleaned)) +} + +// validateNotSymlink rejects paths whose final component is a symlink. +// validateManagedPath does textual prefix matching only; pairing it +// with an Lstat check stops a daemon-uid attacker from planting a +// symlink at a managed path and using helper RPCs that operate on +// that path (chown/chmod sockets, umount/rm chroot trees) to reach +// arbitrary host objects. There is a small TOCTOU window between +// this check and the syscall that follows; for sockets the +// fcproc-level O_PATH|O_NOFOLLOW open closes that window, and for +// the chroot cleanup the umount step is bracketed by a findmnt +// guard inside fcproc.CleanupJailerChroot. +func validateNotSymlink(path string) error { + info, err := os.Lstat(path) + if err != nil { + return fmt.Errorf("inspect %s: %w", path, err) + } + if info.Mode()&os.ModeSymlink != 0 { + return fmt.Errorf("path %q must not be a symlink", path) + } + return nil +} + +// validateFirecrackerPID confirms pid refers to a running firecracker +// process that banger itself launched, not just any firecracker on +// the host. Two acceptance modes: +// +// - Cgroup match (the supported path): /proc//cgroup contains +// bangerd-root.service. systemd places every direct child of the +// helper unit into this cgroup at fork time and the kernel keeps +// it there for the process's lifetime, so no daemon-UID code can +// forge it. Other users' firecracker processes live in different +// cgroups (e.g. user@1000.service) and fail this check. +// - API-socket match (direct/legacy and orphan-recovery fallback): +// /proc//cmdline carries `--api-sock `, and the path +// is under banger's RuntimeDir. Firecracker launched directly +// (no jailer) keeps the host socket path in cmdline; a leftover +// firecracker after a helper crash might also still match this +// way, so daemon reconcile can clean it up. 
+// +// Without these checks the helper's previous substring-only +// "firecracker is in the cmdline" gate let any owner-UID caller +// signal any firecracker process on the host — a shared-host +// problem when multiple users run firecracker. +func validateFirecrackerPID(pid int) error { + if pid <= 0 { + return fmt.Errorf("pid %d is invalid", pid) + } + procDir := filepath.Join("/proc", strconv.Itoa(pid)) + cmdlineData, err := os.ReadFile(filepath.Join(procDir, "cmdline")) + if err != nil { + return fmt.Errorf("inspect pid %d: %w", pid, err) + } + cmdline := strings.ReplaceAll(string(cmdlineData), "\x00", " ") + if !strings.Contains(cmdline, "firecracker") { + return fmt.Errorf("pid %d is not a firecracker process", pid) + } + + // Primary check: the kernel-managed cgroup. systemd assigns every + // service child to that service's cgroup; a firecracker launched + // by another systemd unit, by a user's shell, or in someone else's + // container won't be in bangerd-root.service. + if cgroupData, err := os.ReadFile(filepath.Join(procDir, "cgroup")); err == nil { + if strings.Contains(string(cgroupData), installmeta.DefaultRootHelperService) { + return nil + } + } + + // Fallback: cmdline carries the host-side --api-sock under banger's + // RuntimeDir. Catches the legacy direct-firecracker path (no + // jailer, no chroot) and helps daemon reconcile clean up after a + // helper crash that orphaned firecracker children outside the + // service cgroup. + if apiSock := extractFirecrackerAPISock(cmdline); apiSock != "" { + cleaned := filepath.Clean(apiSock) + if pathIsUnder(cleaned, paths.ResolveSystem().RuntimeDir) { + return nil + } + } + + return fmt.Errorf("pid %d is firecracker but not a banger-managed instance", pid) +} + +// pathIsUnder reports whether p is exactly root or sits inside root, +// both pre-cleaned. Pulled out so the check stays consistent with +// validateManagedPath's prefix logic. 
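The separator-aware prefix test matters more than it looks: a bare `strings.HasPrefix` would treat a sibling directory like `/run/bangerfoo` as "inside" `/run/banger`. A standalone sketch with made-up paths (hypothetical `under`, same shape as the check below):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// under reports whether p equals root or lives inside it. Appending
// the path separator before the prefix test is the whole point:
// without it, sibling paths that merely share the root's text would
// pass. Cleaning first also neutralises ".." segments.
func under(p, root string) bool {
	p, root = filepath.Clean(p), filepath.Clean(root)
	return p == root || strings.HasPrefix(p, root+string(os.PathSeparator))
}

func main() {
	fmt.Println(under("/run/banger/vm1/api.sock", "/run/banger")) // true
	fmt.Println(under("/run/banger", "/run/banger"))              // true
	fmt.Println(under("/run/bangerfoo/api.sock", "/run/banger"))  // false: sibling dir
	fmt.Println(under("/run/banger/../shadow", "/run/banger"))    // false: Clean strips the ..
}
```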
+func pathIsUnder(p, root string) bool { + root = filepath.Clean(root) + if root == "" { + return false + } + return p == root || strings.HasPrefix(p, root+string(os.PathSeparator)) +} + +// extractFirecrackerAPISock pulls the --api-sock argument out of a +// space-separated cmdline. Accepts both `--api-sock VALUE` and +// `--api-sock=VALUE` forms; firecracker also accepts the short flag +// `-a VALUE` so we cover that too. +func extractFirecrackerAPISock(cmdline string) string { + fields := strings.Fields(cmdline) + for i, f := range fields { + switch { + case (f == "--api-sock" || f == "-a") && i+1 < len(fields): + return fields[i+1] + case strings.HasPrefix(f, "--api-sock="): + return strings.TrimPrefix(f, "--api-sock=") + } + } + return "" +} + +// signalAllowlist captures the small set of signals banger needs for +// VM lifecycle: graceful stop (TERM, INT, QUIT, HUP), force-stop +// (KILL), and process-introspection signals operators occasionally +// reach for (USR1/USR2, ABRT). Real-time signals, STOP/CONT, and +// numeric forms are refused — the helper running as root must not be +// a generic "send arbitrary signal to my pid" primitive. +var signalAllowlist = map[string]struct{}{ + "TERM": {}, "SIGTERM": {}, + "KILL": {}, "SIGKILL": {}, + "INT": {}, "SIGINT": {}, + "HUP": {}, "SIGHUP": {}, + "QUIT": {}, "SIGQUIT": {}, + "USR1": {}, "SIGUSR1": {}, + "USR2": {}, "SIGUSR2": {}, + "ABRT": {}, "SIGABRT": {}, +} + +// validateSignalName accepts only an explicit name from the allowlist +// (case-insensitive, with or without the SIG prefix). Numeric signals +// are rejected outright — `kill -9` callers must spell KILL. 
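The allowlist idea can be sketched standalone. This variant normalises the optional `SIG` prefix instead of listing both spellings, which is equivalent for lookup purposes (hypothetical `signalOK`, not the helper's validator):

```go
package main

import (
	"fmt"
	"strings"
)

// allowedSignals is the lifecycle set: graceful stop, force-stop,
// and the introspection signals. Numeric forms never match because
// they are not keys here.
var allowedSignals = map[string]bool{
	"TERM": true, "KILL": true, "INT": true, "HUP": true,
	"QUIT": true, "USR1": true, "USR2": true, "ABRT": true,
}

// signalOK accepts a case-insensitive name with or without the SIG
// prefix, and nothing else.
func signalOK(name string) bool {
	s := strings.ToUpper(strings.TrimSpace(name))
	s = strings.TrimPrefix(s, "SIG")
	return allowedSignals[s]
}

func main() {
	fmt.Println(signalOK("term"))    // true
	fmt.Println(signalOK("SIGKILL")) // true
	fmt.Println(signalOK("9"))       // false: numeric form refused
	fmt.Println(signalOK("STOP"))    // false: not on the allowlist
}
```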
+func validateSignalName(name string) error { + upper := strings.ToUpper(strings.TrimSpace(name)) + if upper == "" { + return errors.New("signal name is required") + } + if _, ok := signalAllowlist[upper]; !ok { + return fmt.Errorf("signal %q is not on the helper allowlist (TERM/KILL/INT/HUP/QUIT/USR1/USR2/ABRT)", name) + } + return nil +} + +// validateRootExecutable opens the path with O_PATH|O_NOFOLLOW and re-checks +// every constraint via Fstat on the resulting fd. Going through O_PATH (rather +// than the previous os.Stat) gives two improvements: +// +// - O_NOFOLLOW rejects path-level symlinks outright, so a swap of the +// binary's path component to point at an attacker-controlled target is +// caught here rather than slipping through to the SDK. +// - Fstat reads metadata from the inode the kernel just resolved, narrowing +// the TOCTOU window between validation and exec to the time it takes the +// SDK to fork+exec — sub-millisecond on a healthy host. The window can't +// be fully closed without re-pointing the SDK at /proc/self/fd/N (the +// known-good idiom), which would require keeping the fd alive across +// fork+exec; we accept the tiny residual window for the simpler shape. 
+func validateRootExecutable(path string) error { + fd, err := unix.Open(path, unix.O_PATH|unix.O_NOFOLLOW|unix.O_CLOEXEC, 0) + if err != nil { + return fmt.Errorf("open executable %q: %w", path, err) + } + defer unix.Close(fd) + var st unix.Stat_t + if err := unix.Fstat(fd, &st); err != nil { + return fmt.Errorf("fstat executable %q: %w", path, err) + } + if st.Mode&unix.S_IFMT != unix.S_IFREG { + return fmt.Errorf("firecracker binary %q is not a regular file", path) + } + if st.Mode&0o111 == 0 { + return fmt.Errorf("firecracker binary %q is not executable", path) + } + if st.Mode&0o022 != 0 { + return fmt.Errorf("firecracker binary %q must not be group/world writable", path) + } + if st.Uid != 0 { + return fmt.Errorf("firecracker binary %q must be root-owned in system mode", path) + } + return nil +} + +func marshalResultOrError(v any, err error) rpc.Response { + if err != nil { + return rpc.NewError("operation_failed", err.Error()) + } + resp, marshalErr := rpc.NewResult(v) + if marshalErr != nil { + return rpc.NewError("marshal_failed", marshalErr.Error()) + } + return resp +} diff --git a/internal/roothelper/roothelper_test.go b/internal/roothelper/roothelper_test.go new file mode 100644 index 0000000..441a1e4 --- /dev/null +++ b/internal/roothelper/roothelper_test.go @@ -0,0 +1,673 @@ +package roothelper + +import ( + "os" + "path/filepath" + "testing" + + "banger/internal/daemon/dmsnap" + "banger/internal/firecracker" + "banger/internal/paths" +) + +func TestValidateDMDevicePath(t *testing.T) { + t.Parallel() + + for _, tc := range []struct { + name string + path string + ok bool + }{ + {name: "valid", path: "/dev/mapper/fc-rootfs-test", ok: true}, + {name: "wrong_prefix", path: "/dev/mapper/not-banger", ok: false}, + {name: "wrong_dir", path: "/tmp/fc-rootfs-test", ok: false}, + {name: "relative", path: "fc-rootfs-test", ok: false}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + err := validateDMDevicePath(tc.path) + if tc.ok && err 
!= nil { + t.Fatalf("validateDMDevicePath(%q) = %v, want nil", tc.path, err) + } + if !tc.ok && err == nil { + t.Fatalf("validateDMDevicePath(%q) succeeded, want error", tc.path) + } + }) + } +} + +func TestValidateFirecrackerPID(t *testing.T) { + t.Parallel() + + if err := validateFirecrackerPID(0); err == nil { + t.Fatal("validateFirecrackerPID(0) succeeded, want error") + } + if err := validateFirecrackerPID(-1); err == nil { + t.Fatal("validateFirecrackerPID(-1) succeeded, want error") + } + // Self pid points at the go test binary, whose cmdline does not + // contain "firecracker" — rejection proves the helper would refuse + // to kill arbitrary host processes. + if err := validateFirecrackerPID(os.Getpid()); err == nil { + t.Fatal("validateFirecrackerPID(test pid) succeeded, want error") + } + // PID 1 is init/systemd on Linux — a juicy target for a compromised + // daemon, and definitely not firecracker. Make sure we'd refuse. + if err := validateFirecrackerPID(1); err == nil { + t.Fatal("validateFirecrackerPID(1) succeeded, want error") + } +} + +// TestValidateRootExecutableRejectsSymlink pins the O_NOFOLLOW +// guarantee: even if the path string passes a textual check, a symlink +// at the leaf is refused before we ever stat the target. +func TestValidateRootExecutableRejectsSymlink(t *testing.T) { + t.Parallel() + dir := t.TempDir() + regular := filepath.Join(dir, "real") + if err := os.WriteFile(regular, []byte{}, 0o755); err != nil { + t.Fatalf("write regular: %v", err) + } + link := filepath.Join(dir, "link") + if err := os.Symlink(regular, link); err != nil { + t.Fatalf("symlink: %v", err) + } + if err := validateRootExecutable(link); err == nil { + t.Fatal("validateRootExecutable(symlink) succeeded, want error") + } +} + +// TestValidateRootExecutableRejectsNonRootOwned exercises the Fstat +// uid check on a file the test user just created: it can't possibly +// be uid 0, so the validator must refuse it. 
This is the regression
+// guard against the previous os.Stat code path drifting back in.
+func TestValidateRootExecutableRejectsNonRootOwned(t *testing.T) {
+	t.Parallel()
+	if os.Getuid() == 0 {
+		t.Skip("test runs as root; cannot construct a non-root-owned file in a tempdir we can write")
+	}
+	path := filepath.Join(t.TempDir(), "binary")
+	if err := os.WriteFile(path, []byte{}, 0o755); err != nil {
+		t.Fatalf("write: %v", err)
+	}
+	err := validateRootExecutable(path)
+	if err == nil {
+		t.Fatal("validateRootExecutable(user-owned) succeeded, want error")
+	}
+	if !contains(err.Error(), "root-owned") {
+		t.Fatalf("err = %v, want root-owned rejection", err)
+	}
+}
+
+func TestValidateRootExecutableRejectsGroupWritable(t *testing.T) {
+	t.Parallel()
+	if os.Getuid() == 0 {
+		t.Skip("test runs as root; can't construct a non-root-owned file")
+	}
+	path := filepath.Join(t.TempDir(), "binary")
+	if err := os.WriteFile(path, []byte{}, 0o775); err != nil {
+		t.Fatalf("write: %v", err)
+	}
+	// os.WriteFile's perm argument is filtered through the umask (022
+	// on most hosts clears the group-write bit, which would let the
+	// test pass on the uid check instead of the one under test), so
+	// chmod explicitly to guarantee the file really is group-writable.
+	if err := os.Chmod(path, 0o775); err != nil {
+		t.Fatalf("chmod: %v", err)
+	}
+	err := validateRootExecutable(path)
+	if err == nil {
+		t.Fatal("validateRootExecutable(group-writable) succeeded, want error")
+	}
+	if !contains(err.Error(), "writable") {
+		t.Fatalf("err = %v, want group/world-writable rejection", err)
+	}
+}
+
+// contains is a local substring helper that mirrors strings.Contains
+// without pulling in the package — kept tiny so the test file's
+// dependency surface stays close to the thing being tested.
+func contains(s, sub string) bool { + for i := 0; i+len(sub) <= len(s); i++ { + if s[i:i+len(sub)] == sub { + return true + } + } + return false +} + +func TestValidateSignalName(t *testing.T) { + t.Parallel() + for _, tc := range []struct { + name string + arg string + ok bool + }{ + {name: "TERM", arg: "TERM", ok: true}, + {name: "SIGTERM", arg: "SIGTERM", ok: true}, + {name: "lowercase_kill", arg: "kill", ok: true}, + {name: "with_whitespace", arg: " HUP ", ok: true}, + {name: "USR1", arg: "USR1", ok: true}, + {name: "ABRT", arg: "ABRT", ok: true}, + {name: "empty", arg: "", ok: false}, + {name: "numeric_9", arg: "9", ok: false}, + {name: "STOP_DoS", arg: "STOP", ok: false}, + {name: "CONT", arg: "CONT", ok: false}, + {name: "realtime", arg: "RTMIN+1", ok: false}, + {name: "garbage", arg: "FOOBAR", ok: false}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + err := validateSignalName(tc.arg) + if tc.ok && err != nil { + t.Fatalf("validateSignalName(%q) = %v, want nil", tc.arg, err) + } + if !tc.ok && err == nil { + t.Fatalf("validateSignalName(%q) succeeded, want error", tc.arg) + } + }) + } +} + +func TestExtractFirecrackerAPISock(t *testing.T) { + t.Parallel() + for _, tc := range []struct { + name string + cmdline string + want string + }{ + {name: "long_form_space", cmdline: "firecracker --api-sock /run/banger/fc-abc.sock --id abc", want: "/run/banger/fc-abc.sock"}, + {name: "long_form_equals", cmdline: "firecracker --api-sock=/run/banger/fc-abc.sock --id abc", want: "/run/banger/fc-abc.sock"}, + {name: "short_form", cmdline: "firecracker -a /run/banger/fc-abc.sock --id abc", want: "/run/banger/fc-abc.sock"}, + {name: "absent", cmdline: "firecracker --id abc", want: ""}, + {name: "trailing_flag", cmdline: "firecracker --api-sock", want: ""}, + {name: "empty", cmdline: "", want: ""}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + got := extractFirecrackerAPISock(tc.cmdline) + if got != tc.want { + 
t.Fatalf("extractFirecrackerAPISock(%q) = %q, want %q", tc.cmdline, got, tc.want) + } + }) + } +} + +func TestPathIsUnder(t *testing.T) { + t.Parallel() + for _, tc := range []struct { + name string + p string + root string + want bool + }{ + {name: "exact", p: "/var/lib/banger", root: "/var/lib/banger", want: true}, + {name: "nested", p: "/var/lib/banger/jail/x", root: "/var/lib/banger", want: true}, + {name: "sibling", p: "/var/lib/banger-other", root: "/var/lib/banger", want: false}, + {name: "outside", p: "/etc/passwd", root: "/var/lib/banger", want: false}, + {name: "empty_root", p: "/anywhere", root: "", want: false}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + if got := pathIsUnder(tc.p, tc.root); got != tc.want { + t.Fatalf("pathIsUnder(%q, %q) = %v, want %v", tc.p, tc.root, got, tc.want) + } + }) + } +} + +func TestValidateLoopDevicePath(t *testing.T) { + t.Parallel() + + for _, tc := range []struct { + name string + arg string + ok bool + }{ + {name: "loop0", arg: "/dev/loop0", ok: true}, + {name: "loop12", arg: "/dev/loop12", ok: true}, + {name: "no_index", arg: "/dev/loop", ok: false}, + {name: "non_numeric", arg: "/dev/loop-x", ok: false}, + {name: "wrong_prefix", arg: "/dev/sda1", ok: false}, + {name: "empty", arg: "", ok: false}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + err := validateLoopDevicePath(tc.arg) + if tc.ok && err != nil { + t.Fatalf("validateLoopDevicePath(%q) = %v, want nil", tc.arg, err) + } + if !tc.ok && err == nil { + t.Fatalf("validateLoopDevicePath(%q) succeeded, want error", tc.arg) + } + }) + } +} + +func TestValidateDMRemoveTarget(t *testing.T) { + t.Parallel() + + for _, tc := range []struct { + name string + arg string + ok bool + }{ + {name: "dm_name", arg: "fc-rootfs-abc", ok: true}, + {name: "dm_device_path", arg: "/dev/mapper/fc-rootfs-abc", ok: true}, + {name: "wrong_prefix", arg: "not-banger", ok: false}, + {name: "device_wrong_prefix", arg: 
"/dev/mapper/not-banger", ok: false}, + {name: "empty", arg: "", ok: false}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + err := validateDMRemoveTarget(tc.arg) + if tc.ok && err != nil { + t.Fatalf("validateDMRemoveTarget(%q) = %v, want nil", tc.arg, err) + } + if !tc.ok && err == nil { + t.Fatalf("validateDMRemoveTarget(%q) succeeded, want error", tc.arg) + } + }) + } +} + +func TestValidateDMSnapshotHandles(t *testing.T) { + t.Parallel() + + // Empty handles are tolerated — the dmsnap layer treats every + // missing field as a no-op for that step. + if err := validateDMSnapshotHandles(dmsnap.Handles{}); err != nil { + t.Fatalf("validateDMSnapshotHandles(empty) = %v, want nil", err) + } + good := dmsnap.Handles{ + BaseLoop: "/dev/loop0", + COWLoop: "/dev/loop1", + DMName: "fc-rootfs-abc", + DMDev: "/dev/mapper/fc-rootfs-abc", + } + if err := validateDMSnapshotHandles(good); err != nil { + t.Fatalf("validateDMSnapshotHandles(good) = %v, want nil", err) + } + for _, tc := range []struct { + name string + mutate func(dmsnap.Handles) dmsnap.Handles + wantErr bool + }{ + {name: "bad_dm_name", mutate: func(h dmsnap.Handles) dmsnap.Handles { + h.DMName = "rogue" + return h + }, wantErr: true}, + {name: "bad_dm_device", mutate: func(h dmsnap.Handles) dmsnap.Handles { + h.DMDev = "/dev/sda1" + return h + }, wantErr: true}, + {name: "bad_base_loop", mutate: func(h dmsnap.Handles) dmsnap.Handles { + h.BaseLoop = "/dev/sda1" + return h + }, wantErr: true}, + {name: "bad_cow_loop", mutate: func(h dmsnap.Handles) dmsnap.Handles { + h.COWLoop = "/etc/shadow" + return h + }, wantErr: true}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + err := validateDMSnapshotHandles(tc.mutate(good)) + if tc.wantErr && err == nil { + t.Fatalf("validateDMSnapshotHandles(%s) succeeded, want error", tc.name) + } + if !tc.wantErr && err != nil { + t.Fatalf("validateDMSnapshotHandles(%s) = %v, want nil", tc.name, err) + } + }) + } +} + +// 
TestValidateManagedPathRejectsSymlinkLeaf pins the leaf-symlink +// rejection: even when the path string sits inside a managed root, a +// symlink at the final component must be refused. Otherwise a +// daemon-UID attacker could plant `/foo -> /etc/shadow` and +// get the helper to drive privileged tooling against host files. +func TestValidateManagedPathRejectsSymlinkLeaf(t *testing.T) { + t.Parallel() + srv := &Server{} + root := t.TempDir() + target := filepath.Join(t.TempDir(), "outside") + if err := os.WriteFile(target, []byte("secret"), 0o600); err != nil { + t.Fatalf("write target: %v", err) + } + link := filepath.Join(root, "leak") + if err := os.Symlink(target, link); err != nil { + t.Fatalf("symlink: %v", err) + } + err := srv.validateManagedPath(link, root) + if err == nil { + t.Fatal("validateManagedPath(symlink leaf) succeeded, want error") + } +} + +// TestValidateManagedPathRejectsSymlinkIntermediate pins ancestor +// symlink rejection. Without the walk, an attacker plants +// `/dir -> /etc` and a path like `/dir/passwd` +// passes the textual prefix check but resolves to /etc/passwd at op +// time. +func TestValidateManagedPathRejectsSymlinkIntermediate(t *testing.T) { + t.Parallel() + srv := &Server{} + root := t.TempDir() + target := t.TempDir() + link := filepath.Join(root, "redirect") + if err := os.Symlink(target, link); err != nil { + t.Fatalf("symlink: %v", err) + } + err := srv.validateManagedPath(filepath.Join(link, "passwd"), root) + if err == nil { + t.Fatal("validateManagedPath(symlink intermediate) succeeded, want error") + } +} + +// TestValidateManagedPathToleratesMissingLeaf confirms ENOENT does +// not flip the validator into a fail. Several callers pass paths +// firecracker (or the helper's own staging) creates AFTER validation +// — sockets, log files, kernel hard-link targets — and a strict +// existence check would break those flows. 
+func TestValidateManagedPathToleratesMissingLeaf(t *testing.T) { + t.Parallel() + srv := &Server{} + root := t.TempDir() + missing := filepath.Join(root, "deeper", "not-yet") + if err := srv.validateManagedPath(missing, root); err != nil { + t.Fatalf("validateManagedPath(missing leaf) = %v, want nil", err) + } +} + +// TestValidateManagedPathPassesPlainSubpath is the happy path: a +// regular file inside a real subdir should sail through the new walk. +func TestValidateManagedPathPassesPlainSubpath(t *testing.T) { + t.Parallel() + srv := &Server{} + root := t.TempDir() + subdir := filepath.Join(root, "vms", "abc") + if err := os.MkdirAll(subdir, 0o755); err != nil { + t.Fatalf("mkdir: %v", err) + } + leaf := filepath.Join(subdir, "rootfs.ext4") + if err := os.WriteFile(leaf, []byte("data"), 0o644); err != nil { + t.Fatalf("write leaf: %v", err) + } + if err := srv.validateManagedPath(leaf, root); err != nil { + t.Fatalf("validateManagedPath(plain subpath) = %v, want nil", err) + } +} + +func TestValidateBangerBridgeName(t *testing.T) { + t.Parallel() + for _, tc := range []struct { + name string + arg string + ok bool + }{ + {name: "default", arg: "br-fc", ok: true}, + {name: "suffixed", arg: "br-fc-alt", ok: true}, + {name: "with_whitespace", arg: " br-fc ", ok: true}, + {name: "wrong_prefix", arg: "br0", ok: false}, + {name: "host_iface", arg: "eth0", ok: false}, + {name: "docker", arg: "docker0", ok: false}, + {name: "loopback", arg: "lo", ok: false}, + {name: "empty", arg: "", ok: false}, + {name: "br_dash_only", arg: "br-", ok: false}, // not "br-fc" exactly + {name: "almost_match", arg: "br-fcx", ok: false}, + {name: "with_slash", arg: "br-fc/x", ok: false}, + {name: "too_long", arg: "br-fc-aaaaaaaaaa", ok: false}, // 16 chars + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + err := validateBangerBridgeName(tc.arg) + if tc.ok && err != nil { + t.Fatalf("validateBangerBridgeName(%q) = %v, want nil", tc.arg, err) + } + if !tc.ok && err == 
nil { + t.Fatalf("validateBangerBridgeName(%q) succeeded, want error", tc.arg) + } + }) + } +} + +func TestValidateCIDRPrefix(t *testing.T) { + t.Parallel() + for _, tc := range []struct { + name string + arg string + ok bool + }{ + {name: "default_24", arg: "24", ok: true}, + {name: "min_8", arg: "8", ok: true}, + {name: "max_32", arg: "32", ok: true}, + {name: "with_whitespace", arg: " 16 ", ok: true}, + {name: "below_min", arg: "7", ok: false}, + {name: "above_max", arg: "33", ok: false}, + {name: "non_numeric", arg: "abc", ok: false}, + {name: "ipv6_prefix", arg: "64", ok: false}, // outside [8, 32] + {name: "with_slash", arg: "/24", ok: false}, + {name: "empty", arg: "", ok: false}, + {name: "negative", arg: "-1", ok: false}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + err := validateCIDRPrefix(tc.arg) + if tc.ok && err != nil { + t.Fatalf("validateCIDRPrefix(%q) = %v, want nil", tc.arg, err) + } + if !tc.ok && err == nil { + t.Fatalf("validateCIDRPrefix(%q) succeeded, want error", tc.arg) + } + }) + } +} + +func TestValidateNetworkConfig(t *testing.T) { + t.Parallel() + good := NetworkConfig{ + BridgeName: "br-fc", + BridgeIP: "172.16.0.1", + CIDR: "24", + } + if err := validateNetworkConfig(good); err != nil { + t.Fatalf("validateNetworkConfig(default) = %v, want nil", err) + } + for _, tc := range []struct { + name string + mutate func(NetworkConfig) NetworkConfig + }{ + {name: "bad_bridge", mutate: func(c NetworkConfig) NetworkConfig { c.BridgeName = "eth0"; return c }}, + {name: "bad_ip", mutate: func(c NetworkConfig) NetworkConfig { c.BridgeIP = "::1"; return c }}, + {name: "bad_cidr", mutate: func(c NetworkConfig) NetworkConfig { c.CIDR = "/24"; return c }}, + {name: "missing_ip", mutate: func(c NetworkConfig) NetworkConfig { c.BridgeIP = ""; return c }}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + if err := validateNetworkConfig(tc.mutate(good)); err == nil { + t.Fatalf("validateNetworkConfig(%s) 
succeeded, want error", tc.name) + } + }) + } +} + +func TestValidateLinuxIfaceName(t *testing.T) { + t.Parallel() + + for _, tc := range []struct { + name string + arg string + ok bool + }{ + {name: "typical_bridge", arg: "br-banger", ok: true}, + {name: "uplink", arg: "enp5s0", ok: true}, + {name: "max_len", arg: "a234567890abcde", ok: true}, // 15 chars + {name: "empty", arg: "", ok: false}, + {name: "too_long", arg: "a234567890abcdef", ok: false}, + {name: "with_slash", arg: "br/0", ok: false}, + {name: "with_space", arg: "br 0", ok: false}, + {name: "with_colon", arg: "br:0", ok: false}, + {name: "dot", arg: ".", ok: false}, + {name: "dotdot", arg: "..", ok: false}, + {name: "control_char", arg: "br\x01", ok: false}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + err := validateLinuxIfaceName(tc.arg) + if tc.ok && err != nil { + t.Fatalf("validateLinuxIfaceName(%q) = %v, want nil", tc.arg, err) + } + if !tc.ok && err == nil { + t.Fatalf("validateLinuxIfaceName(%q) succeeded, want error", tc.arg) + } + }) + } +} + +func TestValidateIPv4(t *testing.T) { + t.Parallel() + + for _, tc := range []struct { + name string + arg string + ok bool + }{ + {name: "valid", arg: "172.16.0.2", ok: true}, + {name: "with_whitespace", arg: " 10.0.0.1 ", ok: true}, + {name: "empty", arg: "", ok: false}, + {name: "ipv6", arg: "::1", ok: false}, + {name: "garbage", arg: "not-an-ip", ok: false}, + {name: "with_cidr", arg: "10.0.0.1/24", ok: false}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + err := validateIPv4(tc.arg) + if tc.ok && err != nil { + t.Fatalf("validateIPv4(%q) = %v, want nil", tc.arg, err) + } + if !tc.ok && err == nil { + t.Fatalf("validateIPv4(%q) succeeded, want error", tc.arg) + } + }) + } +} + +func TestValidateResolverAddr(t *testing.T) { + t.Parallel() + + for _, tc := range []struct { + name string + arg string + ok bool + }{ + {name: "ipv4", arg: "192.168.1.1", ok: true}, + {name: "ipv6", arg: "fe80::1", ok: 
true}, + {name: "ipv4_with_port", arg: "127.0.0.1:42069", ok: true}, + {name: "ipv6_with_port", arg: "[fe80::1]:42069", ok: true}, + {name: "empty", arg: "", ok: false}, + {name: "garbage", arg: "resolver.example", ok: false}, + {name: "garbage_with_port", arg: "resolver.example:53", ok: false}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + err := validateResolverAddr(tc.arg) + if tc.ok && err != nil { + t.Fatalf("validateResolverAddr(%q) = %v, want nil", tc.arg, err) + } + if !tc.ok && err == nil { + t.Fatalf("validateResolverAddr(%q) succeeded, want error", tc.arg) + } + }) + } +} + +func TestValidateExt4ImagePath(t *testing.T) { + t.Parallel() + + srv := &Server{} + stateDir := paths.ResolveSystem().StateDir + for _, tc := range []struct { + name string + arg string + ok bool + }{ + {name: "managed_image", arg: filepath.Join(stateDir, "vms", "abc", "rootfs.ext4"), ok: true}, + {name: "managed_dm_device", arg: "/dev/mapper/fc-rootfs-test", ok: true}, + {name: "outside_state", arg: "/etc/shadow", ok: false}, + {name: "wrong_dm", arg: "/dev/mapper/not-banger", ok: false}, + {name: "relative", arg: "rootfs.ext4", ok: false}, + {name: "empty", arg: "", ok: false}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + err := srv.validateExt4ImagePath(tc.arg) + if tc.ok && err != nil { + t.Fatalf("validateExt4ImagePath(%q) = %v, want nil", tc.arg, err) + } + if !tc.ok && err == nil { + t.Fatalf("validateExt4ImagePath(%q) succeeded, want error", tc.arg) + } + }) + } +} + +func TestValidateNotSymlink(t *testing.T) { + t.Parallel() + + dir := t.TempDir() + regular := filepath.Join(dir, "real") + if err := os.WriteFile(regular, []byte("ok"), 0o600); err != nil { + t.Fatalf("write regular: %v", err) + } + link := filepath.Join(dir, "link") + if err := os.Symlink(regular, link); err != nil { + t.Fatalf("symlink: %v", err) + } + + if err := validateNotSymlink(regular); err != nil { + t.Fatalf("validateNotSymlink(real) = %v, want 
nil", err) + } + if err := validateNotSymlink(link); err == nil { + t.Fatal("validateNotSymlink(symlink) succeeded, want error") + } + if err := validateNotSymlink(filepath.Join(dir, "missing")); err == nil { + t.Fatal("validateNotSymlink(missing) succeeded, want error") + } + // Symlink pointing into the system tree is the threat we care about. + // A daemon-uid attacker plants this kind of link and hopes the helper + // follows it; this test pins the rejection. + hostileLink := filepath.Join(dir, "hostile") + if err := os.Symlink("/etc/shadow", hostileLink); err != nil { + t.Fatalf("symlink: %v", err) + } + if err := validateNotSymlink(hostileLink); err == nil { + t.Fatal("validateNotSymlink(symlink-to-/etc/shadow) succeeded, want error") + } +} + +func TestValidateLaunchDrivePathAllowsManagedRootDMDevice(t *testing.T) { + t.Parallel() + + srv := &Server{} + if err := srv.validateLaunchDrivePath(firecracker.DriveConfig{ + ID: "rootfs", + Path: "/dev/mapper/fc-rootfs-test", + IsRoot: true, + }, "/var/lib/banger"); err != nil { + t.Fatalf("validateLaunchDrivePath(root dm) = %v, want nil", err) + } + + if err := srv.validateLaunchDrivePath(firecracker.DriveConfig{ + ID: "work", + Path: "/dev/mapper/fc-rootfs-test", + IsRoot: false, + }, "/var/lib/banger"); err == nil { + t.Fatal("validateLaunchDrivePath(non-root dm) succeeded, want error") + } +} diff --git a/internal/rpc/rpc.go b/internal/rpc/rpc.go index 3abfb59..00e1ec3 100644 --- a/internal/rpc/rpc.go +++ b/internal/rpc/rpc.go @@ -18,6 +18,40 @@ type Request struct { Version int `json:"version"` Method string `json:"method"` Params json.RawMessage `json:"params,omitempty"` + // OpID is the per-RPC correlation id. Optional on the wire so + // older clients (which don't set it) and older servers (which + // don't read it) keep interoperating. 
The daemon attaches it on + // every incoming request via dispatch; rpc.Call forwards + // whatever id is on ctx so a helper RPC carries the same id as + // the daemon RPC that triggered it. + OpID string `json:"op_id,omitempty"` +} + +// opIDKey is the context-value key for the per-RPC correlation id +// that flows from CLI → daemon → root helper. Lives in the rpc +// package because rpc.Call needs to read it without depending on +// the daemon package; daemon and roothelper both import it. +type opIDKey struct{} + +// WithOpID stores opID on ctx. Used by the daemon dispatch layer to +// inject the per-request id; rpc.Call picks it up automatically. +func WithOpID(ctx context.Context, opID string) context.Context { + if ctx == nil || opID == "" { + return ctx + } + return context.WithValue(ctx, opIDKey{}, opID) +} + +// OpIDFromContext returns the op id stored on ctx by WithOpID, or +// "" if none was set. +func OpIDFromContext(ctx context.Context) string { + if ctx == nil { + return "" + } + if id, _ := ctx.Value(opIDKey{}).(string); id != "" { + return id + } + return "" } type Response struct { @@ -29,6 +63,29 @@ type Response struct { type ErrorResponse struct { Code string `json:"code"` Message string `json:"message"` + // OpID is the daemon-assigned correlation id for the RPC that + // produced this error. Optional and may be empty (older daemons + // don't set it); when present the CLI surfaces it so an operator + // can grep journalctl by that id and find the full context. + OpID string `json:"op_id,omitempty"` +} + +// Error makes ErrorResponse satisfy the error interface so callers +// can errors.As it out of an rpc.Call return value and read the +// structured fields directly. The default string form is +// "code: message (op-id)" — the op id only appears when the daemon +// attached one. 
CLI code paths that want a translated, user-facing +// message render the typed fields themselves; this fallback is for +// log lines, fmt.Errorf %w wrappers, and any caller that hasn't +// bothered to errors.As yet. +func (e *ErrorResponse) Error() string { + if e == nil { + return "" + } + if e.OpID == "" { + return e.Code + ": " + e.Message + } + return e.Code + ": " + e.Message + " (" + e.OpID + ")" } func NewResult(v any) (Response, error) { @@ -43,6 +100,12 @@ func NewError(code, message string) Response { return Response{OK: false, Error: &ErrorResponse{Code: code, Message: message}} } +// NewErrorWithOpID is the variant for daemon dispatch sites that have +// resolved an op id by the time they encode the response. +func NewErrorWithOpID(code, message, opID string) Response { + return Response{OK: false, Error: &ErrorResponse{Code: code, Message: message, OpID: opID}} +} + func DecodeParams[T any](req Request) (T, error) { var zero T if len(req.Params) == 0 { @@ -78,7 +141,7 @@ func Call[T any](ctx context.Context, socketPath, method string, params any) (T, _ = conn.SetDeadline(deadline) } - request := Request{Version: Version, Method: method} + request := Request{Version: Version, Method: method, OpID: OpIDFromContext(ctx)} if params != nil { raw, err := json.Marshal(params) if err != nil { @@ -105,7 +168,10 @@ func Call[T any](ctx context.Context, socketPath, method string, params any) (T, if response.Error == nil { return zero, errors.New("rpc error") } - return zero, fmt.Errorf("%s: %s", response.Error.Code, response.Error.Message) + // Return the typed error directly so callers that need code + // or op_id can errors.As it out. err.Error() format is + // preserved for callers that only print the message. 
+ return zero, response.Error } if len(response.Result) == 0 { return zero, nil diff --git a/internal/rpc/rpc_test.go b/internal/rpc/rpc_test.go index c59a8e9..10e64c2 100644 --- a/internal/rpc/rpc_test.go +++ b/internal/rpc/rpc_test.go @@ -92,6 +92,62 @@ func TestCallReturnsRemoteError(t *testing.T) { } } +func TestCallExposesTypedErrorWithOpID(t *testing.T) { + t.Parallel() + + socketPath, cleanup := serveRPCOnce(t, func(conn net.Conn) { + defer conn.Close() + var req Request + if err := json.NewDecoder(bufio.NewReader(conn)).Decode(&req); err != nil { + t.Fatalf("decode request: %v", err) + } + if err := json.NewEncoder(conn).Encode(NewErrorWithOpID("not_found", "vm \"foo\" not found", "op-deadbeef00ff")); err != nil { + t.Fatalf("encode error response: %v", err) + } + }) + defer cleanup() + + _, err := Call[map[string]string](context.Background(), socketPath, "vm.show", nil) + if err == nil { + t.Fatal("Call() returned nil error") + } + var rpcErr *ErrorResponse + if !errors.As(err, &rpcErr) { + t.Fatalf("Call() error %T (%v) is not *ErrorResponse — CLI cannot read the op_id", err, err) + } + if rpcErr.Code != "not_found" || rpcErr.OpID != "op-deadbeef00ff" { + t.Fatalf("typed error = %+v, want code=not_found op-deadbeef00ff", rpcErr) + } + // String form keeps the op_id in parens so callers that only + // log err.Error() still surface the id. 
+ if got := rpcErr.Error(); !strings.Contains(got, "(op-deadbeef00ff)") { + t.Fatalf("err.Error() = %q, want op-id suffix", got) + } +} + +func TestCallForwardsOpIDFromContext(t *testing.T) { + t.Parallel() + + var seenReq Request + socketPath, cleanup := serveRPCOnce(t, func(conn net.Conn) { + defer conn.Close() + if err := json.NewDecoder(bufio.NewReader(conn)).Decode(&seenReq); err != nil { + t.Fatalf("decode request: %v", err) + } + resp, _ := NewResult(map[string]string{"status": "ok"}) + _ = json.NewEncoder(conn).Encode(resp) + }) + defer cleanup() + + ctx := WithOpID(context.Background(), "op-cafef00d1234") + if _, err := Call[map[string]string](ctx, socketPath, "ping", nil); err != nil { + t.Fatalf("Call: %v", err) + } + if seenReq.OpID != "op-cafef00d1234" { + t.Fatalf("server saw op_id = %q, want op-cafef00d1234", seenReq.OpID) + } +} + func TestCallRejectsMalformedResponse(t *testing.T) { t.Parallel() diff --git a/internal/smoketest/doc.go b/internal/smoketest/doc.go new file mode 100644 index 0000000..af7d17e --- /dev/null +++ b/internal/smoketest/doc.go @@ -0,0 +1,24 @@ +//go:build smoke + +// Package smoketest is the end-to-end smoke gate for banger's supported +// two-service systemd model. It runs only when the build is tagged +// `smoke`, which keeps it out of `go test ./...` on contributor +// machines and CI. +// +// The suite touches global host state: it installs instrumented +// bangerd.service + bangerd-root.service, drives real Firecracker/KVM +// scenarios, copies covdata back out, then purges the smoke-owned +// install on exit. It refuses to run if a non-smoke install is already +// on the host (see the marker file under /etc/banger). +// +// The harness expects three env vars, normally set by `make smoke`: +// +// BANGER_SMOKE_BIN_DIR — instrumented banger / bangerd / vsock-agent +// BANGER_SMOKE_COVER_DIR — coverage output directory (GOCOVERDIR) +// BANGER_SMOKE_XDG_DIR — scratch root for fake homes, fake repos, etc. 
+// +// Coverage: the test binary itself is not instrumented, but every +// banger / bangerd subprocess it spawns is, and writes covdata into +// BANGER_SMOKE_COVER_DIR. Service-side covdata under /var/lib/banger +// is copied out at teardown. +package smoketest diff --git a/internal/smoketest/fixtures_test.go b/internal/smoketest/fixtures_test.go new file mode 100644 index 0000000..b6e1105 --- /dev/null +++ b/internal/smoketest/fixtures_test.go @@ -0,0 +1,50 @@ +//go:build smoke + +package smoketest + +import ( + "fmt" + "os" + "os/exec" + "path/filepath" +) + +// setupRepoFixture builds the throwaway git repo at runtimeDir/fake-repo +// that every repodir-class scenario consumes. Mirrors +// scripts/smoke.sh:441-456. The path is stored in the package-level +// repoDir so scenarios can reference it directly. +func setupRepoFixture() error { + repoDir = filepath.Join(runtimeDir, "fake-repo") + if err := os.MkdirAll(repoDir, 0o755); err != nil { + return fmt.Errorf("setupRepoFixture: mkdir %s: %w", repoDir, err) + } + steps := [][]string{ + {"git", "init", "-q", "-b", "main"}, + {"git", "config", "commit.gpgsign", "false"}, + {"git", "config", "user.name", "smoke"}, + {"git", "config", "user.email", "smoke@smoke"}, + } + for _, args := range steps { + cmd := exec.Command(args[0], args[1:]...) + cmd.Dir = repoDir + if out, err := cmd.CombinedOutput(); err != nil { + return fmt.Errorf("setupRepoFixture: %s: %w\n%s", args, err, out) + } + } + marker := filepath.Join(repoDir, "smoke-file.txt") + if err := os.WriteFile(marker, []byte("smoke-workspace-marker\n"), 0o644); err != nil { + return fmt.Errorf("setupRepoFixture: write marker: %w", err) + } + commit := [][]string{ + {"git", "add", "."}, + {"git", "commit", "-q", "-m", "init"}, + } + for _, args := range commit { + cmd := exec.Command(args[0], args[1:]...) 
+ cmd.Dir = repoDir + if out, err := cmd.CombinedOutput(); err != nil { + return fmt.Errorf("setupRepoFixture: %s: %w\n%s", args, err, out) + } + } + return nil +} diff --git a/internal/smoketest/helpers_test.go b/internal/smoketest/helpers_test.go new file mode 100644 index 0000000..4379e73 --- /dev/null +++ b/internal/smoketest/helpers_test.go @@ -0,0 +1,201 @@ +//go:build smoke + +package smoketest + +import ( + "bytes" + "os" + "os/exec" + "strings" + "testing" + "time" +) + +// result captures the output and exit status of a banger invocation. +// stdout / stderr are kept separate so assertions can target one or the +// other (matches the bash suite's `out=$(cmd)` vs `2>&1` patterns). +type result struct { + stdout string + stderr string + rc int +} + +// runCmd executes the given exec.Cmd, capturing stdout and stderr into +// the returned result. Non-zero exits are returned as a non-zero rc, not +// as an error — scenarios decide for themselves whether non-zero is a +// failure or the assertion under test. +func runCmd(t *testing.T, cmd *exec.Cmd) result { + t.Helper() + var outBuf, errBuf bytes.Buffer + cmd.Stdout = &outBuf + cmd.Stderr = &errBuf + err := cmd.Run() + res := result{stdout: outBuf.String(), stderr: errBuf.String()} + if err != nil { + if exitErr, ok := err.(*exec.ExitError); ok { + res.rc = exitErr.ExitCode() + } else { + t.Fatalf("exec %s: %v\nstderr: %s", strings.Join(cmd.Args, " "), err, res.stderr) + } + } + return res +} + +// banger runs the instrumented `banger` binary with the given arguments +// and returns the captured result. GOCOVERDIR is inherited from the +// process environment (TestMain exports it), so child covdata lands +// under BANGER_SMOKE_COVER_DIR automatically. +func banger(t *testing.T, args ...string) result { + t.Helper() + return runCmd(t, exec.Command(bangerBin, args...)) +} + +// mustBanger runs `banger` and Fatals if it exits non-zero. Returns the +// captured stdout for downstream `wantContains`. 
Most happy-path +// scenarios use this; scenarios that assert on non-zero exits use +// banger() directly. +func mustBanger(t *testing.T, args ...string) string { + t.Helper() + res := banger(t, args...) + if res.rc != 0 { + t.Fatalf("banger %s: exit %d\nstdout: %s\nstderr: %s", + strings.Join(args, " "), res.rc, res.stdout, res.stderr) + } + return res.stdout +} + +// sudoBanger runs `banger` under `sudo env GOCOVERDIR=...`. Sudo strips +// the env by default; explicit re-export keeps coverage flowing for +// scenarios that exercise the privileged path (system install / restart +// / update / daemon stop). +func sudoBanger(t *testing.T, args ...string) result { + t.Helper() + full := append([]string{"env", "GOCOVERDIR=" + coverDir, bangerBin}, args...) + return runCmd(t, exec.Command("sudo", full...)) +} + +// wantContains asserts that haystack contains needle. label is a short +// human-readable identifier for the failure message. +func wantContains(t *testing.T, haystack, needle, label string) { + t.Helper() + if !strings.Contains(haystack, needle) { + t.Fatalf("%s missing %q\ngot: %s", label, needle, haystack) + } +} + +// wantNotContains is the negative-assertion counterpart. Used by +// scenarios that verify a warning has been suppressed (e.g. the post- +// auto-prepare clean-state check in vm_exec) or that an export patch +// did NOT capture a guest-side commit. +func wantNotContains(t *testing.T, haystack, needle, label string) { + t.Helper() + if strings.Contains(haystack, needle) { + t.Fatalf("%s unexpectedly contains %q\ngot: %s", label, needle, haystack) + } +} + +// wantExit asserts the captured result exited with want. Used for +// scenarios that test exit-code propagation or refusal paths. +func wantExit(t *testing.T, got result, want int, label string) { + t.Helper() + if got.rc != want { + t.Fatalf("%s: exit %d, want %d\nstdout: %s\nstderr: %s", + label, got.rc, want, got.stdout, got.stderr) + } +} + +// vmDelete removes a VM, ignoring failure. 
Used in t.Cleanup hooks +// where the VM may already be gone (deleted by the scenario itself). +func vmDelete(name string) { + cmd := exec.Command(bangerBin, "vm", "delete", name) + _ = cmd.Run() +} + +// vmCreate creates a VM with the given name and registers a cleanup +// hook to delete it. extraArgs is forwarded after `vm create --name X` +// so callers can pass --vcpu N / --nat / --no-start / etc. Fatals if +// creation fails — every scenario that uses vmCreate needs the VM up. +func vmCreate(t *testing.T, name string, extraArgs ...string) { + t.Helper() + args := append([]string{"vm", "create", "--name", name}, extraArgs...) + mustBanger(t, args...) + t.Cleanup(func() { vmDelete(name) }) +} + +// bangerHome runs `banger` with HOME overridden to the given directory. +// Used by ssh-config scenarios that mutate ~/.ssh/config under a fake +// home so the test doesn't touch the contributor's real config. +func bangerHome(t *testing.T, home string, args ...string) result { + t.Helper() + cmd := exec.Command(bangerBin, args...) + cmd.Env = append(os.Environ(), "HOME="+home) + return runCmd(t, cmd) +} + +// mustBangerHome is bangerHome + Fatal-on-non-zero. Returns stdout. +func mustBangerHome(t *testing.T, home string, args ...string) string { + t.Helper() + res := bangerHome(t, home, args...) + if res.rc != 0 { + t.Fatalf("banger %s (HOME=%s): exit %d\nstdout: %s\nstderr: %s", + strings.Join(args, " "), home, res.rc, res.stdout, res.stderr) + } + return res.stdout +} + +// waitForSSH polls `banger vm ssh -- true` until SSH answers, +// up to 120 seconds. The original bash suite used 60s and occasionally +// flaked under load (post-update VM, large parallel pool); 120s gives +// enough headroom for the post-update / post-rollback paths where the +// daemon has just restarted, without making genuine breakage slow to +// surface. 
+func waitForSSH(t *testing.T, name string) {
+	t.Helper()
+	const timeout = 120 * time.Second
+	deadline := time.Now().Add(timeout)
+	var lastOut []byte
+	for time.Now().Before(deadline) {
+		cmd := exec.Command(bangerBin, "vm", "ssh", name, "--", "true")
+		out, err := cmd.CombinedOutput()
+		if err == nil {
+			return
+		}
+		// Keep the most recent failure output so a timeout is debuggable
+		// instead of a bare "did not come up".
+		lastOut = out
+		time.Sleep(1 * time.Second)
+	}
+	t.Fatalf("vm %q ssh did not come up within %s\nlast attempt output: %s", name, timeout, lastOut)
+}
+
+// requirePasswordlessSudo skips the test if `sudo -n true` cannot run.
+// Mirrors the bash `if ! sudo -n true 2>/dev/null; then return 0; fi`
+// pattern used by scenarios that exercise privileged paths.
+func requirePasswordlessSudo(t *testing.T) {
+	t.Helper()
+	if err := exec.Command("sudo", "-n", "true").Run(); err != nil {
+		t.Skip("passwordless sudo unavailable")
+	}
+}
+
+// requireSudoIptables skips the test if iptables can't be queried under
+// `sudo -n`. Used by the NAT scenario whose assertions read POSTROUTING.
+func requireSudoIptables(t *testing.T) {
+	t.Helper()
+	if err := exec.Command("sudo", "-n", "iptables", "-t", "nat", "-S", "POSTROUTING").Run(); err != nil {
+		t.Skip("passwordless sudo iptables unavailable")
+	}
+}
+
+// installedVersion reads `/usr/local/bin/banger --version` and returns
+// the version token. This is the *installed* binary that `banger update`
+// swaps out — the smoke CLI under $BANGER_SMOKE_BIN_DIR is separate
+// (and unaffected by update). Mirrors the bash `installed_version`
+// helper at scripts/smoke.sh:1156-1162.
+func installedVersion(t *testing.T) string { + t.Helper() + out, err := exec.Command("/usr/local/bin/banger", "--version").Output() + if err != nil { + t.Fatalf("read installed version: %v", err) + } + parts := strings.Fields(string(out)) + if len(parts) < 2 { + t.Fatalf("unparseable installed --version output: %q", string(out)) + } + return parts[1] +} diff --git a/internal/smoketest/release_server_test.go b/internal/smoketest/release_server_test.go new file mode 100644 index 0000000..45d5398 --- /dev/null +++ b/internal/smoketest/release_server_test.go @@ -0,0 +1,310 @@ +//go:build smoke + +package smoketest + +import ( + "archive/tar" + "compress/gzip" + "crypto/ecdsa" + "crypto/elliptic" + "crypto/rand" + "crypto/sha256" + "crypto/x509" + "encoding/base64" + "encoding/pem" + "fmt" + "io" + "net/http" + "net/http/httptest" + "os" + "os/exec" + "path/filepath" + "strings" + "sync" +) + +// Release-server state set up lazily by prepareSmokeReleases. The HTTP +// server stays up for the duration of TestMain (shut down in teardown). +// smokeRelOnce serializes concurrent first-callers; smokeRelErr is the +// stored result for replay so subsequent callers see the same outcome. +var ( + smokeRelOnce sync.Once + smokeRelErr error + manifestURL string + pubkeyFile string + releaseHTTPServer *httptest.Server + releaseRelDir string + smokeRelKey *ecdsa.PrivateKey +) + +const ( + smokeReleaseGood = "v0.smoke.0" + smokeReleaseBroken = "v0.smoke.broken-bangerd" +) + +// prepareSmokeReleases is the Go port of scripts/smoke.sh's +// prepare_smoke_releases. It generates an ECDSA P-256 keypair (matching +// cosign blob signatures, which are ASN.1 DER ECDSA over SHA256(body), +// base64-encoded), builds two coverage-instrumented release tarballs +// signed with that key, writes a manifest, and stands up an httptest +// file server. The hidden --manifest-url / --pubkey-file flags on +// `banger update` redirect the updater at this fake bucket. +// +// Idempotent. 
The first caller pays the build/server cost; later +// callers replay the cached result. +func prepareSmokeReleases() error { + smokeRelOnce.Do(func() { + smokeRelErr = doPrepareSmokeReleases() + }) + return smokeRelErr +} + +func doPrepareSmokeReleases() error { + releaseRelDir = filepath.Join(scratchRoot, "release") + if err := os.RemoveAll(releaseRelDir); err != nil { + return fmt.Errorf("clean release dir: %w", err) + } + if err := os.MkdirAll(releaseRelDir, 0o755); err != nil { + return fmt.Errorf("mkdir release dir: %w", err) + } + + priv, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader) + if err != nil { + return fmt.Errorf("generate ECDSA key: %w", err) + } + smokeRelKey = priv + + pubDER, err := x509.MarshalPKIXPublicKey(&priv.PublicKey) + if err != nil { + return fmt.Errorf("marshal pub key: %w", err) + } + pubPEM := pem.EncodeToMemory(&pem.Block{Type: "PUBLIC KEY", Bytes: pubDER}) + pubkeyFile = filepath.Join(releaseRelDir, "cosign.pub") + if err := os.WriteFile(pubkeyFile, pubPEM, 0o644); err != nil { + return fmt.Errorf("write pub key: %w", err) + } + + if err := buildSmokeReleaseTarball(smokeReleaseGood); err != nil { + return err + } + if err := buildSmokeReleaseTarball(smokeReleaseBroken); err != nil { + return err + } + + releaseHTTPServer = httptest.NewServer(http.FileServer(http.Dir(releaseRelDir))) + manifestPath := filepath.Join(releaseRelDir, "manifest.json") + if err := writeSmokeManifest(manifestPath, releaseHTTPServer.URL); err != nil { + return err + } + manifestURL = releaseHTTPServer.URL + "/manifest.json" + return nil +} + +func shutdownReleaseServer() { + if releaseHTTPServer != nil { + releaseHTTPServer.Close() + } +} + +// buildSmokeReleaseTarball is the Go port of build_smoke_release_tarball +// from scripts/smoke.sh. It compiles banger / bangerd / banger-vsock-agent +// with the requested Version baked in, packages them as a gzip tarball, +// and writes SHA256SUMS + SHA256SUMS.sig alongside. 
+// +// The v0.smoke.broken-* family ships a shell-script bangerd that passes +// `--check-migrations` (so the swap proceeds) but exits non-zero in +// service mode (so the post-swap restart fails and rollbackAndWrap +// fires). Same trick the bash version uses. +func buildSmokeReleaseTarball(version string) error { + outDir := filepath.Join(releaseRelDir, version) + stage := filepath.Join(outDir, ".stage") + if err := os.MkdirAll(stage, 0o755); err != nil { + return fmt.Errorf("mkdir stage: %w", err) + } + + ldflags := "-X banger/internal/buildinfo.Version=" + version + + " -X banger/internal/buildinfo.Commit=smoke" + + " -X banger/internal/buildinfo.BuiltAt=2026-04-30T00:00:00Z" + + root, err := repoRoot() + if err != nil { + return err + } + + build := func(target, output string, extraEnv ...string) error { + cmd := exec.Command("go", "build", "-ldflags", ldflags, "-o", output, target) + cmd.Dir = root + if len(extraEnv) > 0 { + cmd.Env = append(os.Environ(), extraEnv...) + } + if out, err := cmd.CombinedOutput(); err != nil { + return fmt.Errorf("build %s@%s: %w\n%s", target, version, err, out) + } + return nil + } + + if err := build("./cmd/banger", filepath.Join(stage, "banger")); err != nil { + return err + } + + if strings.HasPrefix(version, "v0.smoke.broken-") { + const brokenScript = `#!/bin/sh +case "$*" in + *--check-migrations*) + printf 'compatible: smoke broken-bangerd pretends to be ready\n' + exit 0 + ;; + *) + printf 'smoke broken-bangerd: refusing to run as daemon\n' >&2 + exit 1 + ;; +esac +` + if err := os.WriteFile(filepath.Join(stage, "bangerd"), []byte(brokenScript), 0o755); err != nil { + return fmt.Errorf("write broken bangerd: %w", err) + } + } else { + if err := build("./cmd/bangerd", filepath.Join(stage, "bangerd")); err != nil { + return err + } + } + + if err := build("./cmd/banger-vsock-agent", filepath.Join(stage, "banger-vsock-agent"), + "CGO_ENABLED=0", "GOOS=linux", "GOARCH=amd64"); err != nil { + return err + } + + tarballName 
:= fmt.Sprintf("banger-%s-linux-amd64.tar.gz", version) + tarballPath := filepath.Join(outDir, tarballName) + if err := writeTarGz(stage, tarballPath); err != nil { + return fmt.Errorf("tar %s: %w", version, err) + } + + body, err := os.ReadFile(tarballPath) + if err != nil { + return fmt.Errorf("read tarball: %w", err) + } + hash := sha256.Sum256(body) + sumsBody := fmt.Sprintf("%x %s\n", hash, tarballName) + if err := os.WriteFile(filepath.Join(outDir, "SHA256SUMS"), []byte(sumsBody), 0o644); err != nil { + return fmt.Errorf("write SHA256SUMS: %w", err) + } + + sig, err := signCosignBlob(smokeRelKey, []byte(sumsBody)) + if err != nil { + return fmt.Errorf("sign SHA256SUMS for %s: %w", version, err) + } + if err := os.WriteFile(filepath.Join(outDir, "SHA256SUMS.sig"), []byte(sig), 0o644); err != nil { + return fmt.Errorf("write sig: %w", err) + } + + return os.RemoveAll(stage) +} + +// signCosignBlob produces a cosign-compatible blob signature: ASN.1 DER +// ECDSA over SHA256(body), base64 encoded with no newline. This is the +// exact wire format cosign produces and the Go updater verifies, and +// matches the bash chain `openssl dgst -sha256 -sign | base64 -w0`. +func signCosignBlob(priv *ecdsa.PrivateKey, body []byte) (string, error) { + hash := sha256.Sum256(body) + sig, err := ecdsa.SignASN1(rand.Reader, priv, hash[:]) + if err != nil { + return "", err + } + return base64.StdEncoding.EncodeToString(sig), nil +} + +// writeTarGz packages every regular file in srcDir at the root of a +// gzip tarball at dst. Mirrors the bash `tar czf` of the staged binary +// trio (banger, bangerd, banger-vsock-agent). 
+func writeTarGz(srcDir, dst string) error {
+	out, err := os.Create(dst)
+	if err != nil {
+		return err
+	}
+	defer out.Close()
+	gw := gzip.NewWriter(out)
+	tw := tar.NewWriter(gw)
+
+	entries, err := os.ReadDir(srcDir)
+	if err != nil {
+		return err
+	}
+	for _, e := range entries {
+		if !e.Type().IsRegular() {
+			continue
+		}
+		path := filepath.Join(srcDir, e.Name())
+		st, err := os.Stat(path)
+		if err != nil {
+			return err
+		}
+		hdr := &tar.Header{
+			Name:    e.Name(),
+			Mode:    int64(st.Mode().Perm()),
+			Size:    st.Size(),
+			ModTime: st.ModTime(),
+		}
+		if err := tw.WriteHeader(hdr); err != nil {
+			return err
+		}
+		f, err := os.Open(path)
+		if err != nil {
+			return err
+		}
+		if _, err := io.Copy(tw, f); err != nil {
+			f.Close()
+			return err
+		}
+		f.Close()
+	}
+	// Close tar before gzip so both flush in order, and return any flush
+	// error: a deferred Close would swallow it and report a silently
+	// truncated tarball as success. The deferred out.Close() above then
+	// becomes a benign double-close on the error-free path.
+	if err := tw.Close(); err != nil {
+		return err
+	}
+	if err := gw.Close(); err != nil {
+		return err
+	}
+	return out.Close()
+}
+
+func writeSmokeManifest(path, base string) error {
+	body := fmt.Sprintf(`{
+  "schema_version": 1,
+  "latest_stable": %q,
+  "releases": [
+    {
+      "version": %q,
+      "tarball_url": "%s/%s/banger-%s-linux-amd64.tar.gz",
+      "sha256sums_url": "%s/%s/SHA256SUMS",
+      "sha256sums_sig_url": "%s/%s/SHA256SUMS.sig",
+      "released_at": "2026-04-29T00:00:00Z"
+    },
+    {
+      "version": %q,
+      "tarball_url": "%s/%s/banger-%s-linux-amd64.tar.gz",
+      "sha256sums_url": "%s/%s/SHA256SUMS",
+      "sha256sums_sig_url": "%s/%s/SHA256SUMS.sig",
+      "released_at": "2026-04-30T00:00:00Z"
+    }
+  ]
+}
+`,
+		smokeReleaseGood,
+		smokeReleaseGood,
+		base, smokeReleaseGood, smokeReleaseGood,
+		base, smokeReleaseGood,
+		base, smokeReleaseGood,
+		smokeReleaseBroken,
+		base, smokeReleaseBroken, smokeReleaseBroken,
+		base, smokeReleaseBroken,
+		base, smokeReleaseBroken,
+	)
+	return os.WriteFile(path, []byte(body), 0o644)
+}
+
+// repoRoot resolves the repo root (where go.mod lives) from the test
+// binary's cwd. `go test` runs each package's tests from that package's
+// source dir, so internal/smoketest -> ../.. lands at the root.
+func repoRoot() (string, error) { + cwd, err := os.Getwd() + if err != nil { + return "", err + } + return filepath.Abs(filepath.Join(cwd, "..", "..")) +} diff --git a/internal/smoketest/scenarios_global_test.go b/internal/smoketest/scenarios_global_test.go new file mode 100644 index 0000000..b75ea49 --- /dev/null +++ b/internal/smoketest/scenarios_global_test.go @@ -0,0 +1,368 @@ +//go:build smoke + +package smoketest + +import ( + "os/exec" + "regexp" + "strings" + "testing" +) + +// testInvalidSpec is the Go port of scenario_invalid_spec. Asserts that +// `vm run --rm --vcpu 0 ...` is rejected and that no VM row is leaked +// in the process. Global-class because it asserts on host-wide vm-list +// counts; running concurrently with pure-class VM creation would race. +func testInvalidSpec(t *testing.T) { + preCount := vmListAllCount(t) + + res := banger(t, "vm", "run", "--rm", "--vcpu", "0", "--", "echo", "unused") + if res.rc == 0 { + t.Fatalf("invalid spec: vm run unexpectedly succeeded with --vcpu 0\nstdout: %s\nstderr: %s", + res.stdout, res.stderr) + } + + postCount := vmListAllCount(t) + if preCount != postCount { + t.Fatalf("invalid spec leaked a VM row: pre=%d, post=%d", preCount, postCount) + } +} + +// vmListAllCount returns the line count of `banger vm list --all`. +// Mirrors the bash `vm list --all | wc -l` idiom; the absolute count +// doesn't matter, only that it doesn't change across the rejected +// invocation. +func vmListAllCount(t *testing.T) int { + t.Helper() + out := mustBanger(t, "vm", "list", "--all") + return strings.Count(out, "\n") +} + +// testVMPrune ports scenario_vm_prune. `vm prune -f` should remove +// stopped VMs while preserving running ones. Global-class because it +// asserts on host-wide vm-list contents. 
+func testVMPrune(t *testing.T) { + mustBanger(t, "vm", "create", "--name", "smoke-prune-running") + t.Cleanup(func() { vmDelete("smoke-prune-running") }) + mustBanger(t, "vm", "create", "--name", "smoke-prune-stopped") + t.Cleanup(func() { vmDelete("smoke-prune-stopped") }) + mustBanger(t, "vm", "stop", "smoke-prune-stopped") + + mustBanger(t, "vm", "prune", "-f") + + if banger(t, "vm", "show", "smoke-prune-running").rc != 0 { + t.Fatalf("vm prune: running VM was deleted (regression!)") + } + if banger(t, "vm", "show", "smoke-prune-stopped").rc == 0 { + t.Fatalf("vm prune: stopped VM survived prune") + } +} + +// guestIPRE captures `"guest_ip": "172.16.0.X"` from `vm show` JSON. +// Used by testNAT to map VMs to their POSTROUTING rule subjects. +var guestIPRE = regexp.MustCompile(`"guest_ip":\s*"([^"]+)"`) + +// vmGuestIP returns the guest_ip field from `vm show`. Fatals if +// missing — every running VM has one. +func vmGuestIP(t *testing.T, name string) string { + t.Helper() + show := mustBanger(t, "vm", "show", name) + m := guestIPRE.FindStringSubmatch(show) + if len(m) != 2 { + t.Fatalf("could not read guest_ip from vm show %q:\n%s", name, show) + } + return m[1] +} + +// testNAT ports scenario_nat. Verifies that `--nat` installs a per-VM +// MASQUERADE rule, that the rule survives stop/start, and that delete +// cleans it up. The control VM (no --nat) must NOT have a rule. 
+func testNAT(t *testing.T) { + requireSudoIptables(t) + + mustBanger(t, "vm", "create", "--name", "smoke-nat", "--nat") + t.Cleanup(func() { vmDelete("smoke-nat") }) + mustBanger(t, "vm", "create", "--name", "smoke-nocnat") + t.Cleanup(func() { vmDelete("smoke-nocnat") }) + + natIP := vmGuestIP(t, "smoke-nat") + ctlIP := vmGuestIP(t, "smoke-nocnat") + + postrouting := iptablesPostrouting(t) + natRule := "-s " + natIP + "/32" + if !strings.Contains(postrouting, natRule) || !strings.Contains(postrouting, "MASQUERADE") { + t.Fatalf("NAT: --nat VM has no POSTROUTING MASQUERADE rule for %s; got:\n%s", natIP, postrouting) + } + if strings.Contains(postrouting, "-s "+ctlIP+"/32") { + t.Fatalf("NAT: control VM unexpectedly has a MASQUERADE rule for %s", ctlIP) + } + + mustBanger(t, "vm", "stop", "smoke-nat") + mustBanger(t, "vm", "start", "smoke-nat") + postrouting = iptablesPostrouting(t) + count := strings.Count(postrouting, natRule) + if count != 1 { + t.Fatalf("NAT: MASQUERADE rule count for %s = %d after restart, want 1", natIP, count) + } + + mustBanger(t, "vm", "delete", "smoke-nat") + mustBanger(t, "vm", "delete", "smoke-nocnat") + postrouting = iptablesPostrouting(t) + if strings.Contains(postrouting, natRule) { + t.Fatalf("NAT: delete left a MASQUERADE rule behind for %s", natIP) + } +} + +func iptablesPostrouting(t *testing.T) string { + t.Helper() + out, err := exec.Command("sudo", "-n", "iptables", "-t", "nat", "-S", "POSTROUTING").Output() + if err != nil { + t.Fatalf("read iptables POSTROUTING: %v", err) + } + return string(out) +} + +// testInvalidName ports scenario_invalid_name. A handful of malformed +// names must all be rejected and none of them may leak a VM row. 
+func testInvalidName(t *testing.T) { + preCount := vmListAllCount(t) + for _, bad := range []string{"MyBox", "my box", "box.vm", "-box"} { + res := banger(t, "vm", "create", "--name", bad, "--no-start") + if res.rc == 0 { + t.Fatalf("invalid name: vm create accepted %q", bad) + } + } + if postCount := vmListAllCount(t); postCount != preCount { + t.Fatalf("invalid name leaked VM row(s): pre=%d, post=%d", preCount, postCount) + } +} + +// updateBaseArgs are the manifest/pubkey flags every update scenario +// needs to redirect the updater away from the production R2 bucket +// and at our smoke release server. Built lazily because manifestURL / +// pubkeyFile are populated by prepareSmokeReleases. +func updateBaseArgs() []string { + return []string{"--manifest-url", manifestURL, "--pubkey-file", pubkeyFile} +} + +// testUpdateCheck ports scenario_update_check. `update --check` must +// succeed against the smoke release server and announce the available +// version on stdout. +func testUpdateCheck(t *testing.T) { + if err := prepareSmokeReleases(); err != nil { + t.Fatalf("prepare smoke releases: %v", err) + } + args := append([]string{"update", "--check"}, updateBaseArgs()...) + res := banger(t, args...) + if res.rc != 0 { + t.Fatalf("update --check failed: rc=%d\nstdout: %s\nstderr: %s", + res.rc, res.stdout, res.stderr) + } + wantContains(t, res.stdout+res.stderr, "update available: ", "update --check stdout") +} + +// testUpdateToUnknown ports scenario_update_to_unknown. Asking for a +// version not in the manifest must fail before any host mutation — +// the installed binary's version stays put. +func testUpdateToUnknown(t *testing.T) { + if err := prepareSmokeReleases(); err != nil { + t.Fatalf("prepare smoke releases: %v", err) + } + preVer := installedVersion(t) + args := append([]string{"update", "--to", "v9.9.9"}, updateBaseArgs()...) + res := banger(t, args...) 
+ if res.rc == 0 { + t.Fatalf("update --to v9.9.9: exit 0 (out: %s%s)", res.stdout, res.stderr) + } + combined := strings.ToLower(res.stdout + res.stderr) + if !strings.Contains(combined, "not found") { + t.Fatalf("update --to v9.9.9: error doesn't say 'not found'; got: %s%s", res.stdout, res.stderr) + } + if postVer := installedVersion(t); preVer != postVer { + t.Fatalf("update --to v9.9.9 mutated the install: %s -> %s", preVer, postVer) + } +} + +// testUpdateNoRoot ports scenario_update_no_root. Non-sudo invocation +// of `update --to` must refuse with a root-required error and leave +// the install untouched. +func testUpdateNoRoot(t *testing.T) { + if err := prepareSmokeReleases(); err != nil { + t.Fatalf("prepare smoke releases: %v", err) + } + preVer := installedVersion(t) + args := append([]string{"update", "--to", smokeReleaseGood}, updateBaseArgs()...) + res := banger(t, args...) + if res.rc == 0 { + t.Fatalf("update without sudo: exit 0 (out: %s%s)", res.stdout, res.stderr) + } + combined := strings.ToLower(res.stdout + res.stderr) + if !strings.Contains(combined, "root") { + t.Fatalf("update without sudo: error doesn't mention root; got: %s%s", res.stdout, res.stderr) + } + if postVer := installedVersion(t); preVer != postVer { + t.Fatalf("update without sudo mutated the install: %s -> %s", preVer, postVer) + } +} + +// testUpdateDryRun ports scenario_update_dry_run. `--dry-run` fetches +// + verifies the new release but must not swap the binary. +func testUpdateDryRun(t *testing.T) { + requirePasswordlessSudo(t) + if err := prepareSmokeReleases(); err != nil { + t.Fatalf("prepare smoke releases: %v", err) + } + preVer := installedVersion(t) + args := append([]string{"update", "--to", smokeReleaseGood, "--dry-run"}, updateBaseArgs()...) + res := sudoBanger(t, args...) 
+ if res.rc != 0 { + t.Fatalf("update --dry-run failed: %s%s", res.stdout, res.stderr) + } + wantContains(t, res.stdout+res.stderr, "dry-run:", "update --dry-run stdout") + if postVer := installedVersion(t); preVer != postVer { + t.Fatalf("update --dry-run swapped the binary: %s -> %s", preVer, postVer) + } +} + +// vmBootID reads /proc/sys/kernel/random/boot_id from the guest. The +// kernel regenerates it on every boot, so an unchanged value across a +// daemon restart proves the firecracker process survived. Used by both +// update scenarios that assert "the VM stays alive". +func vmBootID(t *testing.T, name string) string { + t.Helper() + out, _ := exec.Command(bangerBin, "vm", "ssh", name, "--", "cat", "/proc/sys/kernel/random/boot_id").Output() + return strings.TrimSpace(string(out)) +} + +var installTomlVersionRE = regexp.MustCompile(`(?m)^version\s*=\s*"([^"]+)"`) + +// installedTomlVersion reads /etc/banger/install.toml's version field +// (under sudo since the dir is not always world-readable). +func installedTomlVersion(t *testing.T) string { + t.Helper() + out, err := exec.Command("sudo", "cat", "/etc/banger/install.toml").Output() + if err != nil { + t.Fatalf("read /etc/banger/install.toml: %v", err) + } + m := installTomlVersionRE.FindStringSubmatch(string(out)) + if len(m) != 2 { + t.Fatalf("install.toml: no version field in:\n%s", out) + } + return m[1] +} + +// testUpdateKeepsVMAlive ports scenario_update_keeps_vm_alive. The +// long-running update scenario: a real swap to v0.smoke.0, must not +// reboot the running VM, must update the install metadata, and the VM +// must still answer SSH afterwards. 
+func testUpdateKeepsVMAlive(t *testing.T) { + requirePasswordlessSudo(t) + if err := prepareSmokeReleases(); err != nil { + t.Fatalf("prepare smoke releases: %v", err) + } + const name = "smoke-update" + vmCreate(t, name) + waitForSSH(t, name) + preBoot := vmBootID(t, name) + if preBoot == "" { + t.Fatalf("pre-update boot_id capture failed") + } + preVer := installedVersion(t) + + args := append([]string{"update", "--to", smokeReleaseGood}, updateBaseArgs()...) + if res := sudoBanger(t, args...); res.rc != 0 { + t.Fatalf("update --to %s failed: %s%s", smokeReleaseGood, res.stdout, res.stderr) + } + + postVer := installedVersion(t) + if postVer != smokeReleaseGood { + t.Fatalf("post-update /usr/local/bin/banger version = %s, want %s", postVer, smokeReleaseGood) + } + if preVer == postVer { + t.Fatalf("update did not change the binary version (pre==post=%s)", postVer) + } + if metaVer := installedTomlVersion(t); metaVer != smokeReleaseGood { + t.Fatalf("install.toml version = %q, want %s", metaVer, smokeReleaseGood) + } + + waitForSSH(t, name) + postBoot := vmBootID(t, name) + if postBoot == "" { + t.Fatalf("post-update boot_id read failed") + } + if preBoot != postBoot { + t.Fatalf("VM rebooted during update: boot_id %s -> %s", preBoot, postBoot) + } +} + +// testUpdateRollbackKeepsVMAlive ports scenario_update_rollback_keeps_vm_alive. +// Rollback drill: install the broken-bangerd release, which passes the +// pre-swap migration sanity but fails as a service. runUpdate's +// rollbackAndWrap must restore the previous binaries, and the VM must +// survive the whole drill. 
+func testUpdateRollbackKeepsVMAlive(t *testing.T) { + requirePasswordlessSudo(t) + if err := prepareSmokeReleases(); err != nil { + t.Fatalf("prepare smoke releases: %v", err) + } + preVer := installedVersion(t) + + const name = "smoke-rollback" + vmCreate(t, name) + waitForSSH(t, name) + preBoot := vmBootID(t, name) + if preBoot == "" { + t.Fatalf("pre-drill boot_id capture failed") + } + + args := append([]string{"update", "--to", smokeReleaseBroken}, updateBaseArgs()...) + res := sudoBanger(t, args...) + if res.rc == 0 { + t.Fatalf("rollback drill: update returned exit 0 despite broken bangerd\nstdout: %s\nstderr: %s", + res.stdout, res.stderr) + } + + if postVer := installedVersion(t); postVer != preVer { + t.Fatalf("rollback drill: post-rollback version = %s, want %s", postVer, preVer) + } + + waitForSSH(t, name) + postBoot := vmBootID(t, name) + if postBoot == "" { + t.Fatalf("post-rollback boot_id read failed") + } + if preBoot != postBoot { + t.Fatalf("VM rebooted during rollback drill: boot_id %s -> %s", preBoot, postBoot) + } +} + +// testDaemonAdmin ports scenario_daemon_admin. MUST be the last global +// scenario in the run order: `banger daemon stop` tears the installed +// services down, so anything after it that talks to the daemon would +// fail. The teardown path re-stops idempotently. 
+func testDaemonAdmin(t *testing.T) { + socket := strings.TrimSpace(mustBanger(t, "daemon", "socket")) + if socket != "/run/banger/bangerd.sock" { + t.Fatalf("daemon socket: got %q, want /run/banger/bangerd.sock", socket) + } + + migOut, err := exec.Command(bangerdBin, "--system", "--check-migrations").CombinedOutput() + if err != nil { + t.Fatalf("bangerd --check-migrations: %v\n%s", err, migOut) + } + if !strings.HasPrefix(strings.TrimSpace(string(migOut)), "compatible:") { + t.Fatalf("bangerd --check-migrations: stdout missing 'compatible:' prefix; got: %s", migOut) + } + + requirePasswordlessSudo(t) + if res := sudoBanger(t, "daemon", "stop"); res.rc != 0 { + t.Fatalf("banger daemon stop: %s%s", res.stdout, res.stderr) + } + status, _ := exec.Command(bangerBin, "system", "status").Output() + if !regexp.MustCompile(`(?m)^active\s+inactive`).Match(status) { + t.Fatalf("owner daemon still active after daemon stop:\n%s", status) + } + if !regexp.MustCompile(`(?m)^helper_active\s+inactive`).Match(status) { + t.Fatalf("root helper still active after daemon stop:\n%s", status) + } +} diff --git a/internal/smoketest/scenarios_pure_test.go b/internal/smoketest/scenarios_pure_test.go new file mode 100644 index 0000000..fd92add --- /dev/null +++ b/internal/smoketest/scenarios_pure_test.go @@ -0,0 +1,311 @@ +//go:build smoke + +package smoketest + +import ( + "os" + "os/exec" + "path/filepath" + "regexp" + "strings" + "sync" + "testing" +) + +// testBareRun is the Go port of scenario_bare_run from +// scripts/smoke.sh. Bare ephemeral VM run: create + start + ssh + +// echo + --rm. +func testBareRun(t *testing.T) { + t.Parallel() + out := mustBanger(t, "vm", "run", "--rm", "--", "echo", "smoke-bare-ok") + wantContains(t, out, "smoke-bare-ok", "bare vm run stdout") +} + +// testExitCode is the Go port of scenario_exit_code. Asserts that +// `vm run -- sh -c 'exit 42'` propagates rc=42 verbatim. 
+func testExitCode(t *testing.T) { + t.Parallel() + res := banger(t, "vm", "run", "--rm", "--", "sh", "-c", "exit 42") + wantExit(t, res, 42, "exit-code propagation") +} + +// testConcurrentRun fires two `vm run --rm` invocations simultaneously +// and asserts both succeed and emit their respective markers. Bash uses +// `& ; wait`; Go uses two goroutines that capture the result and a +// WaitGroup. Note: t.Fatalf cannot be called from a goroutine, so the +// children write to result slots and assertions run on the main goroutine. +func testConcurrentRun(t *testing.T) { + t.Parallel() + var wg sync.WaitGroup + var resA, resB result + run := func(dst *result, marker string) { + defer wg.Done() + cmd := exec.Command(bangerBin, "vm", "run", "--rm", "--", "echo", marker) + var out, errBuf strings.Builder + cmd.Stdout = &out + cmd.Stderr = &errBuf + err := cmd.Run() + dst.stdout = out.String() + dst.stderr = errBuf.String() + if err != nil { + if exitErr, ok := err.(*exec.ExitError); ok { + dst.rc = exitErr.ExitCode() + } else { + dst.rc = -1 + dst.stderr += "\nexec error: " + err.Error() + } + } + } + wg.Add(2) + go run(&resA, "smoke-concurrent-a") + go run(&resB, "smoke-concurrent-b") + wg.Wait() + wantExit(t, resA, 0, "concurrent A exit") + wantExit(t, resB, 0, "concurrent B exit") + wantContains(t, resA.stdout, "smoke-concurrent-a", "concurrent A stdout") + wantContains(t, resB.stdout, "smoke-concurrent-b", "concurrent B stdout") +} + +// testDetachRun ports scenario_detach_run. Verifies -d combined with +// --rm or with a guest command is rejected before VM creation, then +// that -d --name leaves the VM running and ssh-able. 
+func testDetachRun(t *testing.T) { + t.Parallel() + + res := banger(t, "vm", "run", "-d", "--rm") + if res.rc == 0 { + t.Fatalf("detach: -d --rm should be rejected before VM creation") + } + + res = banger(t, "vm", "run", "-d", "--", "echo", "hi") + if res.rc == 0 { + t.Fatalf("detach: -d -- should be rejected before VM creation") + } + + const name = "smoke-detach" + mustBanger(t, "vm", "run", "-d", "--name", name) + t.Cleanup(func() { vmDelete(name) }) + + show := mustBanger(t, "vm", "show", name) + wantContains(t, show, `"state": "running"`, "detach: post-detach state") + + out := mustBanger(t, "vm", "ssh", name, "--", "echo", "detach-marker") + wantContains(t, out, "detach-marker", "detach: ssh stdout") +} + +// testBootstrapPrecondition ports scenario_bootstrap_precondition. +// A workspace with .mise.toml requires NAT (or --no-bootstrap) to run. +// The fake repo lives in a TempDir so it doesn't pollute the shared +// repodir fixture used by repodir-class scenarios. +func testBootstrapPrecondition(t *testing.T) { + t.Parallel() + miseRepo := t.TempDir() + gitInit := func(args ...string) { + t.Helper() + cmd := exec.Command(args[0], args[1:]...) 
+ cmd.Dir = miseRepo + if out, err := cmd.CombinedOutput(); err != nil { + t.Fatalf("setup mise repo: %s: %v\n%s", args, err, out) + } + } + gitInit("git", "init", "-q") + gitInit("git", "-c", "user.email=smoke@banger", "-c", "user.name=smoke", + "commit", "--allow-empty", "-q", "-m", "init") + if err := os.WriteFile(filepath.Join(miseRepo, ".mise.toml"), []byte("[tools]\n"), 0o644); err != nil { + t.Fatalf("write .mise.toml: %v", err) + } + gitInit("git", "add", ".mise.toml") + gitInit("git", "-c", "user.email=smoke@banger", "-c", "user.name=smoke", + "commit", "-q", "-m", "add mise") + + res := banger(t, "vm", "run", "--rm", miseRepo, "--", "echo", "nope") + if res.rc == 0 { + t.Fatalf("bootstrap: workspace with .mise.toml should refuse without --nat / --no-bootstrap") + } + + out := mustBanger(t, "vm", "run", "--rm", "--no-bootstrap", miseRepo, "--", "echo", "no-bootstrap-ok") + wantContains(t, out, "no-bootstrap-ok", "bootstrap: --no-bootstrap stdout") +} + +// testVMLifecycle ports scenario_vm_lifecycle. Drives an explicit +// create / show / ssh / stop / start / ssh / delete and asserts the +// state transitions are visible in `vm show`. 
+func testVMLifecycle(t *testing.T) { + t.Parallel() + const name = "smoke-lifecycle" + vmCreate(t, name) + + show := mustBanger(t, "vm", "show", name) + wantContains(t, show, `"state": "running"`, "post-create state") + + waitForSSH(t, name) + out := mustBanger(t, "vm", "ssh", name, "--", "echo", "hello-1") + wantContains(t, out, "hello-1", "vm ssh #1") + + mustBanger(t, "vm", "stop", name) + show = mustBanger(t, "vm", "show", name) + wantContains(t, show, `"state": "stopped"`, "post-stop state") + + mustBanger(t, "vm", "start", name) + show = mustBanger(t, "vm", "show", name) + wantContains(t, show, `"state": "running"`, "post-start state") + + waitForSSH(t, name) + out = mustBanger(t, "vm", "ssh", name, "--", "echo", "hello-2") + wantContains(t, out, "hello-2", "vm ssh #2 (post-restart)") + + mustBanger(t, "vm", "delete", name) + res := banger(t, "vm", "show", name) + if res.rc == 0 { + t.Fatalf("vm show still finds %q after delete\nstdout: %s", name, res.stdout) + } +} + +// testVMSet ports scenario_vm_set. Creates with --vcpu 2, asserts +// guest sees 2 CPUs, reconfigures to 4 while stopped, asserts guest +// sees 4 after restart. +func testVMSet(t *testing.T) { + t.Parallel() + const name = "smoke-set" + vmCreate(t, name, "--vcpu", "2") + waitForSSH(t, name) + + out := mustBanger(t, "vm", "ssh", name, "--", "nproc") + if got := strings.TrimSpace(out); got != "2" { + t.Fatalf("vm set: initial nproc got %q, want 2", got) + } + + mustBanger(t, "vm", "stop", name) + mustBanger(t, "vm", "set", name, "--vcpu", "4") + mustBanger(t, "vm", "start", name) + waitForSSH(t, name) + + out = mustBanger(t, "vm", "ssh", name, "--", "nproc") + if got := strings.TrimSpace(out); got != "4" { + t.Fatalf("vm set: post-reconfig nproc got %q, want 4 (spec change didn't land)", got) + } +} + +// testVMRestart ports scenario_vm_restart. 
Reads /proc/sys/kernel/random/boot_id before +// and after `vm restart`; the kernel regenerates it on every boot, so +// distinct values prove the verb actually rebooted the guest. +func testVMRestart(t *testing.T) { + t.Parallel() + const name = "smoke-restart" + vmCreate(t, name) + waitForSSH(t, name) + + bootBefore := strings.TrimSpace(mustBanger(t, "vm", "ssh", name, "--", "cat", "/proc/sys/kernel/random/boot_id")) + if bootBefore == "" { + t.Fatalf("vm restart: could not read initial boot_id") + } + + mustBanger(t, "vm", "restart", name) + waitForSSH(t, name) + + bootAfter := strings.TrimSpace(mustBanger(t, "vm", "ssh", name, "--", "cat", "/proc/sys/kernel/random/boot_id")) + if bootAfter == "" { + t.Fatalf("vm restart: could not read post-restart boot_id") + } + if bootBefore == bootAfter { + t.Fatalf("vm restart: boot_id unchanged (%s); verb didn't actually reboot the guest", bootBefore) + } +} + +// dmDevRE captures the dm-snapshot device name from `vm show` JSON. +// Used by testVMKill to check that `vm kill --signal KILL` cleans up +// the dm device alongside the firecracker process. +var dmDevRE = regexp.MustCompile(`"dm_dev":\s*"(fc-rootfs-[^"]+)"`) + +// testVMKill ports scenario_vm_kill. `vm kill --signal KILL` must stop +// the VM and clean up its dm-snapshot device. The dm-name capture +// degrades gracefully — older builds without the field still pass the +// state-check half. 
+func testVMKill(t *testing.T) { + t.Parallel() + const name = "smoke-kill" + vmCreate(t, name) + + show := mustBanger(t, "vm", "show", name) + var dmName string + if m := dmDevRE.FindStringSubmatch(show); len(m) == 2 { + dmName = m[1] + } + + mustBanger(t, "vm", "kill", "--signal", "KILL", name) + show = mustBanger(t, "vm", "show", name) + wantContains(t, show, `"state": "stopped"`, "post-kill state") + + if dmName != "" { + out, _ := exec.Command("sudo", "-n", "dmsetup", "ls").CombinedOutput() + for _, line := range strings.Split(string(out), "\n") { + fields := strings.Fields(line) + if len(fields) > 0 && fields[0] == dmName { + t.Fatalf("vm kill: dm device %q still mapped (cleanup didn't run)", dmName) + } + } + } +} + +// testVMPorts ports scenario_vm_ports. Asserts `vm ports` reports the +// guest's sshd listener under the VM's DNS name. +func testVMPorts(t *testing.T) { + t.Parallel() + const name = "smoke-ports" + vmCreate(t, name) + waitForSSH(t, name) + + out := mustBanger(t, "vm", "ports", name) + wantContains(t, out, "smoke-ports.vm:22", "vm ports stdout (host:port)") + wantContains(t, out, "sshd", "vm ports stdout (process name)") +} + +// testSSHConfig ports scenario_ssh_config. Drives ssh-config +// install/uninstall against a fake $HOME so the contributor's real +// ~/.ssh/config is never touched. Verifies idempotent install, +// preservation of pre-existing user content, and clean uninstall. 
+func testSSHConfig(t *testing.T) { + t.Parallel() + fakeHome := t.TempDir() + if err := os.MkdirAll(filepath.Join(fakeHome, ".ssh"), 0o700); err != nil { + t.Fatalf("mkdir .ssh: %v", err) + } + cfg := filepath.Join(fakeHome, ".ssh", "config") + if err := os.WriteFile(cfg, []byte("Host myserver\n HostName example.invalid\n"), 0o600); err != nil { + t.Fatalf("write fake config: %v", err) + } + + mustBangerHome(t, fakeHome, "ssh-config", "--install") + cfgBytes, err := os.ReadFile(cfg) + if err != nil { + t.Fatalf("read fake config after install: %v", err) + } + body := string(cfgBytes) + if !strings.Contains(body, "\nInclude ") && !strings.HasPrefix(body, "Include ") { + t.Fatalf("ssh-config: install didn't add Include line:\n%s", body) + } + wantContains(t, body, "Host myserver", "ssh-config: install must preserve user content") + + mustBangerHome(t, fakeHome, "ssh-config", "--install") + cfgBytes, _ = os.ReadFile(cfg) + body = string(cfgBytes) + includeCount := 0 + for _, line := range strings.Split(body, "\n") { + if strings.HasPrefix(line, "Include ") && strings.Contains(line, "banger") { + includeCount++ + } + } + if includeCount != 1 { + t.Fatalf("ssh-config: install not idempotent (Include appeared %d times)", includeCount) + } + + mustBangerHome(t, fakeHome, "ssh-config", "--uninstall") + cfgBytes, _ = os.ReadFile(cfg) + body = string(cfgBytes) + for _, line := range strings.Split(body, "\n") { + if strings.HasPrefix(line, "Include ") && strings.Contains(line, "banger") { + t.Fatalf("ssh-config: uninstall left the Include line behind:\n%s", body) + } + } + wantContains(t, body, "Host myserver", "ssh-config: uninstall must keep user content") +} diff --git a/internal/smoketest/scenarios_repodir_test.go b/internal/smoketest/scenarios_repodir_test.go new file mode 100644 index 0000000..65f1e22 --- /dev/null +++ b/internal/smoketest/scenarios_repodir_test.go @@ -0,0 +1,205 @@ +//go:build smoke + +package smoketest + +import ( + "os" + "os/exec" + "path/filepath" 
+ "strings" + "testing" +) + +// testWorkspaceRun ports scenario_workspace_run. Ships the throwaway +// git repo to a fresh VM and reads the marker file from the guest. +func testWorkspaceRun(t *testing.T) { + out := mustBanger(t, "vm", "run", "--rm", repoDir, "--", "cat", "/root/repo/smoke-file.txt") + wantContains(t, out, "smoke-workspace-marker", "workspace vm run guest read") +} + +// testWorkspaceDryrun ports scenario_workspace_dryrun. `--dry-run` +// lists the tracked files and the resolved transfer mode without +// creating a VM. +func testWorkspaceDryrun(t *testing.T) { + out := mustBanger(t, "vm", "run", "--dry-run", repoDir) + wantContains(t, out, "smoke-file.txt", "dry-run file list") + wantContains(t, out, "mode: tracked only", "dry-run mode line") +} + +// testIncludeUntracked ports scenario_include_untracked. Drops an +// untracked file in the fixture and asserts --include-untracked picks +// it up. The cleanup hook removes the file even if the scenario fails +// so downstream repodir scenarios see the original tree. +func testIncludeUntracked(t *testing.T) { + untracked := filepath.Join(repoDir, "smoke-untracked.txt") + if err := os.WriteFile(untracked, []byte("untracked-marker\n"), 0o644); err != nil { + t.Fatalf("write untracked file: %v", err) + } + t.Cleanup(func() { _ = os.Remove(untracked) }) + + out := mustBanger(t, "vm", "run", "--rm", "--include-untracked", repoDir, + "--", "cat", "/root/repo/smoke-untracked.txt") + wantContains(t, out, "untracked-marker", "include-untracked guest read") +} + +// testWorkspaceExport ports scenario_workspace_export. Round-trips a +// guest-side edit back out as a patch via `vm workspace export`. 
+func testWorkspaceExport(t *testing.T) { + const name = "smoke-export" + vmCreate(t, name, "--image", "debian-bookworm") + mustBanger(t, "vm", "workspace", "prepare", name, repoDir) + mustBanger(t, "vm", "ssh", name, "--", "sh", "-c", + "echo guest-edit > /root/repo/new-guest-file.txt") + + patch := filepath.Join(runtimeDir, "smoke-export.diff") + mustBanger(t, "vm", "workspace", "export", name, "--output", patch) + + st, err := os.Stat(patch) + if err != nil { + t.Fatalf("export: stat patch %s: %v", patch, err) + } + if st.Size() == 0 { + t.Fatalf("export: patch file empty at %s", patch) + } + body, err := os.ReadFile(patch) + if err != nil { + t.Fatalf("export: read patch: %v", err) + } + wantContains(t, string(body), "new-guest-file.txt", "export: patch must reference new-guest-file.txt") +} + +// testWorkspaceFullCopy ports scenario_workspace_full_copy. Verifies +// the alternate transfer path (--mode full_copy) lands the same fixture +// in the guest. +func testWorkspaceFullCopy(t *testing.T) { + const name = "smoke-fc" + vmCreate(t, name) + mustBanger(t, "vm", "workspace", "prepare", name, repoDir, "--mode", "full_copy") + + out := mustBanger(t, "vm", "ssh", name, "--", "cat", "/root/repo/smoke-file.txt") + wantContains(t, out, "smoke-workspace-marker", "full_copy: marker missing in guest") +} + +// testWorkspaceBasecommit ports scenario_workspace_basecommit. Confirms +// that `vm workspace export` without --base-commit captures only the +// working-copy diff, while --base-commit also captures guest-side +// commits made on top of HEAD. 
+func testWorkspaceBasecommit(t *testing.T) { + const name = "smoke-basecommit" + vmCreate(t, name) + mustBanger(t, "vm", "workspace", "prepare", name, repoDir) + + baseSHA := strings.TrimSpace(mustBanger(t, "vm", "ssh", name, "--", + "sh", "-c", "cd /root/repo && git rev-parse HEAD")) + if len(baseSHA) != 40 { + t.Fatalf("export base: bad base sha: %q", baseSHA) + } + + mustBanger(t, "vm", "ssh", name, "--", "sh", "-c", + "cd /root/repo && "+ + "git -c user.email=smoke@smoke -c user.name=smoke checkout -b smoke-branch >/dev/null 2>&1 && "+ + "echo committed-marker > smoke-committed.txt && "+ + "git add smoke-committed.txt && "+ + "git -c user.email=smoke@smoke -c user.name=smoke commit -q -m 'guest side'") + + plain := filepath.Join(runtimeDir, "smoke-plain.diff") + mustBanger(t, "vm", "workspace", "export", name, "--output", plain) + if body, err := os.ReadFile(plain); err == nil { + wantNotContains(t, string(body), "smoke-committed.txt", + "export base: plain export must NOT capture guest-side commit") + } + + base := filepath.Join(runtimeDir, "smoke-base.diff") + mustBanger(t, "vm", "workspace", "export", name, "--base-commit", baseSHA, "--output", base) + st, err := os.Stat(base) + if err != nil || st.Size() == 0 { + t.Fatalf("export base: --base-commit patch empty/missing: stat=%v err=%v", st, err) + } + body, _ := os.ReadFile(base) + wantContains(t, string(body), "smoke-committed.txt", + "export base: --base-commit patch must include committed marker") +} + +// testWorkspaceRestart ports scenario_workspace_restart. Verifies the +// workspace marker survives a stop/start cycle (rootfs persistence). 
+func testWorkspaceRestart(t *testing.T) { + const name = "smoke-wsrestart" + vmCreate(t, name) + mustBanger(t, "vm", "workspace", "prepare", name, repoDir) + + pre := mustBanger(t, "vm", "ssh", name, "--", "cat", "/root/repo/smoke-file.txt") + wantContains(t, pre, "smoke-workspace-marker", "workspace stop/start: pre-cycle marker") + + mustBanger(t, "vm", "stop", name) + mustBanger(t, "vm", "start", name) + waitForSSH(t, name) + + post := mustBanger(t, "vm", "ssh", name, "--", "cat", "/root/repo/smoke-file.txt") + wantContains(t, post, "smoke-workspace-marker", "workspace stop/start: post-cycle marker") +} + +// testVMExec ports scenario_vm_exec. The longest scenario in the suite +// — covers auto-cd, exit-code propagation, stale-workspace detection, +// --auto-prepare resync, and the not-running refusal. The repodir +// commit added mid-scenario is rolled back via t.Cleanup so subsequent +// repodir-chain scenarios see the original fixture state. +func testVMExec(t *testing.T) { + const name = "smoke-exec" + vmCreate(t, name) + mustBanger(t, "vm", "workspace", "prepare", name, repoDir) + + show := mustBanger(t, "vm", "show", name) + wantContains(t, show, `"guest_path": "/root/repo"`, + "vm exec: workspace.guest_path not persisted") + + out := mustBanger(t, "vm", "exec", name, "--", "cat", "smoke-file.txt") + wantContains(t, out, "smoke-workspace-marker", "vm exec: workspace marker") + + if got := strings.TrimSpace(mustBanger(t, "vm", "exec", name, "--", "pwd")); got != "/root/repo" { + t.Fatalf("vm exec: pwd got %q, want /root/repo (auto-cd didn't happen)", got) + } + + res := banger(t, "vm", "exec", name, "--", "sh", "-c", "exit 17") + wantExit(t, res, 17, "vm exec: exit-code propagation") + + // Advance host HEAD so the workspace goes stale, register the + // rollback before mutating so a Fatal anywhere below still + // restores the fixture. 
+ t.Cleanup(func() { + cmd := exec.Command("git", "reset", "--hard", "HEAD~1", "-q") + cmd.Dir = repoDir + _ = cmd.Run() + }) + for _, args := range [][]string{ + {"sh", "-c", "echo post-prepare-marker > smoke-exec-new.txt"}, + {"git", "add", "smoke-exec-new.txt"}, + {"git", "commit", "-q", "-m", "add smoke-exec-new.txt after prepare"}, + } { + cmd := exec.Command(args[0], args[1:]...) + cmd.Dir = repoDir + if out, err := cmd.CombinedOutput(); err != nil { + t.Fatalf("vm exec: stage host commit: %s: %v\n%s", args, err, out) + } + } + + stale := banger(t, "vm", "exec", name, "--", "ls", "smoke-exec-new.txt") + if stale.rc == 0 { + t.Fatalf("vm exec: stale workspace already had the new file (dirty path didn't take effect)") + } + wantContains(t, stale.stderr, "workspace stale", "vm exec: stale-workspace warning on stderr") + wantContains(t, stale.stderr, "--auto-prepare", "vm exec: stale warning must mention --auto-prepare") + + auto := mustBanger(t, "vm", "exec", name, "--auto-prepare", "--", "cat", "smoke-exec-new.txt") + wantContains(t, auto, "post-prepare-marker", "vm exec: --auto-prepare didn't re-sync new file") + + clean := banger(t, "vm", "exec", name, "--", "true") + wantExit(t, clean, 0, "vm exec: post-auto-prepare run") + wantNotContains(t, clean.stderr, "workspace stale", "vm exec: stale warning persisted after --auto-prepare") + + mustBanger(t, "vm", "stop", name) + stopped := banger(t, "vm", "exec", name, "--", "true") + if stopped.rc == 0 { + t.Fatalf("vm exec: exec on stopped VM unexpectedly succeeded") + } + wantContains(t, stopped.stderr, "not running", "vm exec: stopped-VM error message") +} diff --git a/internal/smoketest/smoke_main_test.go b/internal/smoketest/smoke_main_test.go new file mode 100644 index 0000000..e03b3ce --- /dev/null +++ b/internal/smoketest/smoke_main_test.go @@ -0,0 +1,305 @@ +//go:build smoke + +package smoketest + +import ( + "errors" + "fmt" + "io" + "os" + "os/exec" + "os/user" + "path/filepath" + "regexp" + "strings" + 
"testing" +) + +// Package-level state set up in TestMain and consumed by every test. +// Unexported and package-scoped; the `smoke` build tag keeps this +// package and its globals out of ordinary `go test ./...` builds. +var ( + bangerBin string + bangerdBin string + vsockBin string + coverDir string + scratchRoot string + runtimeDir string + repoDir string + smokeOwner string +) + +const ( + serviceCoverDir = "/var/lib/banger" + smokeMarker = "/etc/banger/.smoke-owned" + ownerService = "bangerd.service" + rootService = "bangerd-root.service" +) + +// smokeConfigTOML is the smoke-tuned daemon config dropped at +// /etc/banger/config.toml after install (mirrors scripts/smoke.sh:404-415). +// Small VMs by default — scenarios that need full-size resources override +// --vcpu / --memory / --disk-size explicitly. +const smokeConfigTOML = `# Smoke-tuned defaults — every VM starts small unless the scenario +# overrides --vcpu / --memory / --disk-size explicitly. +[vm_defaults] +vcpu = 2 +memory_mib = 1024 +disk_size = "2G" +system_overlay_size = "2G" +` + +func TestMain(m *testing.M) { + // `go test -list ...` (used by `make smoke-list`) just enumerates + // the test names. Skip the install preamble and let m.Run() print + // the listing — env vars + KVM aren't needed for discovery. + if isListMode() { + os.Exit(m.Run()) + } + + if err := requireEnv(); err != nil { + fmt.Fprintf(os.Stderr, "[smoke] %v\n", err) + // Skip cleanly when run outside `make smoke`. Exiting 0 + // prevents a bare `go test` of this package from being + // reported as a real failure when a contributor runs it + // without the harness env. + os.Exit(0) + } + + // Export GOCOVERDIR so every banger / bangerd subprocess this + // test binary spawns lands its covdata under BANGER_SMOKE_COVER_DIR. + // The test binary itself is not instrumented; only the smoke + // binaries are (they were built with `go build -cover`). 
+ if err := os.Setenv("GOCOVERDIR", coverDir); err != nil { + fmt.Fprintf(os.Stderr, "[smoke] setenv GOCOVERDIR: %v\n", err) + os.Exit(1) + } + + if err := installPreamble(); err != nil { + fmt.Fprintf(os.Stderr, "[smoke] install preamble failed: %v\n", err) + os.Exit(1) + } + + if err := setupRepoFixture(); err != nil { + fmt.Fprintf(os.Stderr, "[smoke] fixture setup failed: %v\n", err) + teardown() + os.Exit(1) + } + + code := m.Run() + teardown() + os.Exit(code) +} + +// isListMode returns true when the test binary was invoked with the +// `-test.list` flag, which `go test -list ...` translates into. In that +// mode the harness only enumerates names and never spawns a test, so +// requireEnv / installPreamble would needlessly block discovery on a +// fresh checkout (no KVM, no sudo). +func isListMode() bool { + for _, a := range os.Args[1:] { + if a == "-test.list" || strings.HasPrefix(a, "-test.list=") { + return true + } + } + return false +} + +// requireEnv reads and validates the three BANGER_SMOKE_* env vars and +// confirms the binaries they point at exist and are executable. Returns +// a single descriptive error so a contributor running by hand sees +// exactly which variable is missing. 
+func requireEnv() error { + binDir := os.Getenv("BANGER_SMOKE_BIN_DIR") + if binDir == "" { + return errors.New("BANGER_SMOKE_BIN_DIR not set; run via `make smoke`") + } + cov := os.Getenv("BANGER_SMOKE_COVER_DIR") + if cov == "" { + return errors.New("BANGER_SMOKE_COVER_DIR not set; run via `make smoke`") + } + xdg := os.Getenv("BANGER_SMOKE_XDG_DIR") + if xdg == "" { + return errors.New("BANGER_SMOKE_XDG_DIR not set; run via `make smoke`") + } + + bangerBin = filepath.Join(binDir, "banger") + bangerdBin = filepath.Join(binDir, "bangerd") + vsockBin = filepath.Join(binDir, "banger-vsock-agent") + coverDir = cov + scratchRoot = xdg + + for _, bin := range []string{bangerBin, bangerdBin, vsockBin} { + st, err := os.Stat(bin) + if err != nil { + return fmt.Errorf("smoke binary missing: %s: %w", bin, err) + } + if st.Mode()&0o111 == 0 { + return fmt.Errorf("smoke binary not executable: %s", bin) + } + } + + if err := os.MkdirAll(coverDir, 0o755); err != nil { + return fmt.Errorf("mkdir cover dir: %w", err) + } + // Reset the scratch root each run — leftover state from a prior + // crashed run would otherwise leak into this one's fixtures. + if err := os.RemoveAll(scratchRoot); err != nil { + return fmt.Errorf("clean scratch root: %w", err) + } + if err := os.MkdirAll(scratchRoot, 0o755); err != nil { + return fmt.Errorf("mkdir scratch root: %w", err) + } + + rt, err := os.MkdirTemp(scratchRoot, "runtime-") + if err != nil { + return fmt.Errorf("mktemp runtime: %w", err) + } + runtimeDir = rt + + u, err := user.Current() + if err != nil { + return fmt.Errorf("user.Current: %w", err) + } + smokeOwner = u.Username + + return nil +} + +// installPreamble mirrors scripts/smoke.sh's install_preamble. Refuses to +// overwrite a non-smoke install, otherwise installs the instrumented +// services, runs doctor, drops the smoke-tuned config, and restarts. 
+func installPreamble() error { + if installExists() { + if markerExists() { + fmt.Fprintln(os.Stderr, "[smoke] found stale smoke-owned install; purging it first") + _ = exec.Command("sudo", "env", "GOCOVERDIR="+coverDir, bangerBin, + "system", "uninstall", "--purge").Run() + } else { + return errors.New("banger is already installed on this host; supported-path smoke refuses to overwrite a non-smoke install") + } + } + + // Wipe the user-side known_hosts. Fresh VMs reuse guest IPs with + // new host keys every run; a stale entry trips StrictHostKeyChecking. + // scripts/smoke.sh:374-380 explains why this is host-side, not + // daemon-side state. + if home, err := os.UserHomeDir(); err == nil { + _ = os.Remove(filepath.Join(home, ".local", "state", "banger", "ssh", "known_hosts")) + } + + fmt.Fprintln(os.Stderr, "[smoke] installing smoke-owned services") + install := exec.Command("sudo", "env", + "GOCOVERDIR="+coverDir, + "BANGER_SYSTEM_GOCOVERDIR="+serviceCoverDir, + "BANGER_ROOT_HELPER_GOCOVERDIR="+serviceCoverDir, + bangerBin, "system", "install", "--owner", smokeOwner, + ) + if out, err := install.CombinedOutput(); err != nil { + return fmt.Errorf("system install: %w\n%s", err, out) + } + if out, err := exec.Command("sudo", "touch", smokeMarker).CombinedOutput(); err != nil { + return fmt.Errorf("touch smoke marker: %w\n%s", err, out) + } + + if err := assertServicesActive("after install"); err != nil { + return err + } + + fmt.Fprintln(os.Stderr, "[smoke] doctor: checking host readiness") + if out, err := exec.Command(bangerBin, "doctor").CombinedOutput(); err != nil { + return fmt.Errorf("doctor reported failures; fix the host before running smoke:\n%s", out) + } + + fmt.Fprintln(os.Stderr, "[smoke] writing smoke-tuned daemon config") + if err := writeSmokeConfig(); err != nil { + return err + } + + fmt.Fprintln(os.Stderr, "[smoke] system restart: services should come back cleanly") + restart := exec.Command("sudo", "env", "GOCOVERDIR="+coverDir, + bangerBin, 
"system", "restart") + if out, err := restart.CombinedOutput(); err != nil { + return fmt.Errorf("system restart: %w\n%s", err, out) + } + return assertServicesActive("after restart") +} + +// installExists checks /etc/banger/install.toml under sudo (the dir is +// not always world-readable). +func installExists() bool { + return exec.Command("sudo", "test", "-f", "/etc/banger/install.toml").Run() == nil +} + +func markerExists() bool { + return exec.Command("sudo", "test", "-f", smokeMarker).Run() == nil +} + +var ( + statusOwnerRE = regexp.MustCompile(`(?m)^active\s+active\b`) + statusHelperRE = regexp.MustCompile(`(?m)^helper_active\s+active\b`) +) + +func assertServicesActive(label string) error { + out, err := exec.Command(bangerBin, "system", "status").CombinedOutput() + if err != nil { + return fmt.Errorf("system status %s: %w\n%s", label, err, out) + } + if !statusOwnerRE.Match(out) { + return fmt.Errorf("owner daemon not active %s:\n%s", label, out) + } + if !statusHelperRE.Match(out) { + return fmt.Errorf("root helper not active %s:\n%s", label, out) + } + return nil +} + +// writeSmokeConfig drops smokeConfigTOML at /etc/banger/config.toml via +// `sudo tee`. tee is the path of least resistance for "write to a root- +// owned file from a non-root process". +func writeSmokeConfig() error { + cmd := exec.Command("sudo", "tee", "/etc/banger/config.toml") + cmd.Stdin = strings.NewReader(smokeConfigTOML) + cmd.Stdout = io.Discard + cmd.Stderr = os.Stderr + if err := cmd.Run(); err != nil { + return fmt.Errorf("write smoke config: %w", err) + } + return nil +} + +// teardown is the equivalent of scripts/smoke.sh's `cleanup` trap. It +// best-efforts every step — partial failures during teardown should +// not mask the test outcome. 
+func teardown() { + shutdownReleaseServer() + stopServicesForCoverage() + collectServiceCoverage() + _ = exec.Command("sudo", "env", "GOCOVERDIR="+coverDir, bangerBin, + "system", "uninstall", "--purge").Run() + _ = os.RemoveAll(scratchRoot) +} + +func stopServicesForCoverage() { + _ = exec.Command("sudo", "systemctl", "stop", ownerService, rootService).Run() +} + +// collectServiceCoverage copies covmeta.* / covcounters.* out of +// /var/lib/banger into BANGER_SMOKE_COVER_DIR, chowning to the test +// user so subsequent `go tool covdata` invocations can read them. +// Mirrors the inline `sudo bash -lc '...'` in scripts/smoke.sh:307-325. +func collectServiceCoverage() { + uid := fmt.Sprint(os.Getuid()) + gid := fmt.Sprint(os.Getgid()) + const script = ` +shopt -s nullglob +for file in "$1"/covmeta.* "$1"/covcounters.*; do + base="${file##*/}" + cp "$file" "$2/$base" + chown "$3:$4" "$2/$base" + chmod 0644 "$2/$base" +done +` + _ = exec.Command("sudo", "bash", "-c", script, "bash", + serviceCoverDir, coverDir, uid, gid).Run() +} diff --git a/internal/smoketest/smoke_test.go b/internal/smoketest/smoke_test.go new file mode 100644 index 0000000..53544b7 --- /dev/null +++ b/internal/smoketest/smoke_test.go @@ -0,0 +1,72 @@ +//go:build smoke + +package smoketest + +import "testing" + +// TestSmoke is the single top-level test that pins run-order across +// scenario classes: +// +// - "pool" runs pure scenarios concurrently (each calls t.Parallel) +// alongside the repodir chain, which runs its own subtests +// sequentially. The pool subtest only returns once every t.Parallel +// child has finished. +// - "global" runs after pool, serially, in registry order. These +// scenarios assert host-wide state (iptables, vm row counts, +// ssh-config under a fake HOME, the update / rollback flow, daemon +// stop) and would race with the parallel pool. +// +// `go test -parallel N` controls fan-out within the pool. 
`-run +// TestSmoke/pool/bare_run` runs a single scenario without changing +// the install preamble path. +func TestSmoke(t *testing.T) { + t.Run("pool", func(t *testing.T) { + // Pure scenarios — t.Parallel inside each, fan out under -parallel. + t.Run("bare_run", testBareRun) + t.Run("exit_code", testExitCode) + t.Run("concurrent_run", testConcurrentRun) + t.Run("detach_run", testDetachRun) + t.Run("bootstrap_precondition", testBootstrapPrecondition) + t.Run("vm_lifecycle", testVMLifecycle) + t.Run("vm_set", testVMSet) + t.Run("vm_restart", testVMRestart) + t.Run("vm_kill", testVMKill) + t.Run("vm_ports", testVMPorts) + t.Run("ssh_config", testSSHConfig) + + // Repodir chain — single virtual job in the pool. Subtests run + // sequentially because they share the throwaway git repo at + // repoDir and mutate it; t.Parallel() is intentionally absent. + // The chain itself competes with the pure scenarios for a + // parallel slot at this outer level. + t.Run("repodir_chain", func(t *testing.T) { + t.Parallel() + t.Run("workspace_run", testWorkspaceRun) + t.Run("workspace_dryrun", testWorkspaceDryrun) + t.Run("include_untracked", testIncludeUntracked) + t.Run("workspace_export", testWorkspaceExport) + t.Run("workspace_full_copy", testWorkspaceFullCopy) + t.Run("workspace_basecommit", testWorkspaceBasecommit) + t.Run("workspace_restart", testWorkspaceRestart) + t.Run("vm_exec", testVMExec) + }) + }) + + // Global scenarios — serial, after the pool drains. Order matters: + // daemon_admin tears the installed services down and must be LAST. + // The order otherwise mirrors scripts/smoke.sh's SMOKE_SCENARIOS + // registry so the run shape is comparable. 
+ t.Run("global", func(t *testing.T) { + t.Run("vm_prune", testVMPrune) + t.Run("nat", testNAT) + t.Run("invalid_spec", testInvalidSpec) + t.Run("invalid_name", testInvalidName) + t.Run("update_check", testUpdateCheck) + t.Run("update_to_unknown", testUpdateToUnknown) + t.Run("update_no_root", testUpdateNoRoot) + t.Run("update_dry_run", testUpdateDryRun) + t.Run("update_keeps_vm_alive", testUpdateKeepsVMAlive) + t.Run("update_rollback_keeps_vm_alive", testUpdateRollbackKeepsVMAlive) + t.Run("daemon_admin", testDaemonAdmin) + }) +} diff --git a/internal/store/migrations.go b/internal/store/migrations.go new file mode 100644 index 0000000..1734c03 --- /dev/null +++ b/internal/store/migrations.go @@ -0,0 +1,303 @@ +package store + +import ( + "database/sql" + "fmt" + "sort" + "time" +) + +// migration is one ordered, atomic schema step. id must be unique and +// strictly increasing across the slice. name is a human-readable label +// stored alongside the id for debugging, and up receives a *sql.Tx so +// DDL + data backfills land atomically — either the migration fully +// applies and a schema_migrations row is written, or the whole thing +// rolls back and gets retried on next Open(). +type migration struct { + id int + name string + up func(*sql.Tx) error +} + +// migrations is the canonical ordered history. Append new migrations +// at the bottom with the next id. Never edit or reorder existing +// entries — installed DBs key off the id column. +var migrations = []migration{ + {id: 1, name: "baseline", up: migrateBaseline}, + {id: 2, name: "drop_images_docker", up: migrateDropImagesDocker}, + {id: 3, name: "add_vm_workspace", up: migrateAddVMWorkspace}, +} + +// runMigrations ensures schema_migrations exists, then applies every +// migration whose id hasn't been recorded yet, in id order. 
+func runMigrations(db *sql.DB) error { + if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS schema_migrations ( + id INTEGER PRIMARY KEY, + name TEXT NOT NULL, + applied_at TEXT NOT NULL + )`); err != nil { + return fmt.Errorf("create schema_migrations: %w", err) + } + + applied, err := loadAppliedMigrations(db) + if err != nil { + return err + } + + sorted := make([]migration, len(migrations)) + copy(sorted, migrations) + sort.Slice(sorted, func(i, j int) bool { return sorted[i].id < sorted[j].id }) + seen := map[int]bool{} + for _, m := range sorted { + if seen[m.id] { + return fmt.Errorf("duplicate migration id %d (%q)", m.id, m.name) + } + seen[m.id] = true + } + + for _, m := range sorted { + if _, ok := applied[m.id]; ok { + continue + } + if err := applyMigration(db, m); err != nil { + return fmt.Errorf("migration %d (%s): %w", m.id, m.name, err) + } + } + return nil +} + +// SchemaCompatibility classifies the relationship between this +// binary's known migrations and a (possibly stale) DB's applied set. +type SchemaCompatibility int + +const ( + // SchemaCompatible: every applied id is known to this binary AND + // every known id has been applied. Binary and DB are in lockstep. + SchemaCompatible SchemaCompatibility = iota + // SchemaMigrationsNeeded: binary knows ids the DB hasn't applied + // yet. Open() would auto-migrate; safe. + SchemaMigrationsNeeded + // SchemaIncompatible: DB has applied ids this binary doesn't + // know about. Binary is older than the running install. Refuse + // the swap. + SchemaIncompatible +) + +// SchemaState describes the migration status of a DB relative to +// this binary's compiled-in `migrations` slice. Used by +// `bangerd --check-migrations` to gate `banger update`'s binary swap +// before service restart — a staged binary must not be allowed to +// take over a DB whose schema it doesn't know how to read. 
+type SchemaState struct { + Compatibility SchemaCompatibility + AppliedIDs []int + KnownMaxID int + Pending []int // known IDs not yet applied + Unknown []int // applied IDs the binary doesn't recognise +} + +// InspectSchemaState opens path read-only and reports how the DB's +// applied-migration set compares to the binary's known set. Returns +// an error only on real I/O failures (file missing, permission +// denied, corrupt SQLite); a "DB ahead of binary" state is reported +// via Compatibility, not as an error. +func InspectSchemaState(path string) (SchemaState, error) { + dsn, err := sqliteReadOnlyDSN(path) + if err != nil { + return SchemaState{}, err + } + db, err := sql.Open("sqlite", dsn) + if err != nil { + return SchemaState{}, err + } + defer db.Close() + if err := db.Ping(); err != nil { + return SchemaState{}, err + } + // schema_migrations may not exist on a fresh install. Treat that + // as "applied = ∅" rather than an error — the equivalent of + // "the new binary will create the table on first Open". + rows, err := db.Query("SELECT id FROM schema_migrations") + if err != nil { + // modernc.org/sqlite doesn't expose a typed "no such table" + // error; sniff the message. Anything else bubbles. 
+ if errMissingTable(err) { + return classifySchemaState(nil), nil + } + return SchemaState{}, err + } + defer rows.Close() + var applied []int + for rows.Next() { + var id int + if err := rows.Scan(&id); err != nil { + return SchemaState{}, err + } + applied = append(applied, id) + } + if err := rows.Err(); err != nil { + return SchemaState{}, err + } + return classifySchemaState(applied), nil +} + +func classifySchemaState(applied []int) SchemaState { + known := map[int]struct{}{} + knownMax := 0 + for _, m := range migrations { + known[m.id] = struct{}{} + if m.id > knownMax { + knownMax = m.id + } + } + appliedSet := map[int]struct{}{} + var unknown []int + for _, id := range applied { + appliedSet[id] = struct{}{} + if _, ok := known[id]; !ok { + unknown = append(unknown, id) + } + } + var pending []int + for _, m := range migrations { + if _, ok := appliedSet[m.id]; !ok { + pending = append(pending, m.id) + } + } + state := SchemaState{ + AppliedIDs: append([]int(nil), applied...), + KnownMaxID: knownMax, + Pending: pending, + Unknown: unknown, + } + switch { + case len(unknown) > 0: + state.Compatibility = SchemaIncompatible + case len(pending) > 0: + state.Compatibility = SchemaMigrationsNeeded + default: + state.Compatibility = SchemaCompatible + } + return state +} + +func errMissingTable(err error) bool { + if err == nil { + return false + } + msg := err.Error() + // modernc.org/sqlite wraps the underlying SQLITE_ERROR with this + // canonical sub-string for missing-table errors. 
+ return contains(msg, "no such table: schema_migrations") +} + +func contains(s, sub string) bool { + if len(sub) > len(s) { + return false + } + for i := 0; i+len(sub) <= len(s); i++ { + if s[i:i+len(sub)] == sub { + return true + } + } + return false +} + +func loadAppliedMigrations(db *sql.DB) (map[int]struct{}, error) { + rows, err := db.Query("SELECT id FROM schema_migrations") + if err != nil { + return nil, fmt.Errorf("load schema_migrations: %w", err) + } + defer rows.Close() + applied := map[int]struct{}{} + for rows.Next() { + var id int + if err := rows.Scan(&id); err != nil { + return nil, err + } + applied[id] = struct{}{} + } + return applied, rows.Err() +} + +func applyMigration(db *sql.DB, m migration) error { + tx, err := db.Begin() + if err != nil { + return err + } + if err := m.up(tx); err != nil { + _ = tx.Rollback() + return err + } + if _, err := tx.Exec( + "INSERT INTO schema_migrations (id, name, applied_at) VALUES (?, ?, ?)", + m.id, m.name, time.Now().UTC().Format(time.RFC3339), + ); err != nil { + _ = tx.Rollback() + return fmt.Errorf("record migration: %w", err) + } + return tx.Commit() +} + +// migrateBaseline creates the full current schema. 
+func migrateBaseline(tx *sql.Tx) error { + stmts := []string{ + `CREATE TABLE IF NOT EXISTS images ( + id TEXT PRIMARY KEY, + name TEXT NOT NULL UNIQUE, + managed INTEGER NOT NULL DEFAULT 0, + artifact_dir TEXT, + rootfs_path TEXT NOT NULL, + work_seed_path TEXT, + kernel_path TEXT NOT NULL, + initrd_path TEXT, + modules_dir TEXT, + build_size TEXT, + seeded_ssh_public_key_fingerprint TEXT, + docker INTEGER NOT NULL DEFAULT 0, + created_at TEXT NOT NULL, + updated_at TEXT NOT NULL + );`, + `CREATE TABLE IF NOT EXISTS vms ( + id TEXT PRIMARY KEY, + name TEXT NOT NULL UNIQUE, + image_id TEXT NOT NULL, + guest_ip TEXT NOT NULL UNIQUE, + state TEXT NOT NULL, + created_at TEXT NOT NULL, + updated_at TEXT NOT NULL, + last_touched_at TEXT NOT NULL, + spec_json TEXT NOT NULL, + runtime_json TEXT NOT NULL, + stats_json TEXT NOT NULL DEFAULT '{}', + FOREIGN KEY(image_id) REFERENCES images(id) ON DELETE RESTRICT + );`, + } + for _, stmt := range stmts { + if _, err := tx.Exec(stmt); err != nil { + return err + } + } + return nil +} + +// migrateDropImagesDocker removes the legacy images.docker column. +// SQLite supports ALTER TABLE ... DROP COLUMN since 3.35 (2021), and +// banger ships against modern SQLite, so a single statement is enough. +// Existing values are simply discarded — the field never affected +// runtime behaviour. +func migrateDropImagesDocker(tx *sql.Tx) error { + _, err := tx.Exec(`ALTER TABLE images DROP COLUMN docker;`) + return err +} + +// migrateAddVMWorkspace adds the workspace_json column that records +// the last workspace.prepare result (guest path, host source path, +// HEAD commit, and timestamp) per VM. Default '{}' means no workspace +// has been prepared yet. The column is managed exclusively via +// Store.SetVMWorkspace; lifecycle UpsertVM calls never touch it so +// workspace state survives VM stop/start cycles. 
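+//
+// The persisted JSON is small; its shape is roughly (field names
+// illustrative here, see model.VMWorkspace for the authoritative set):
+//
+//	{"guest_path":"/work/repo","host_source":"/home/user/repo","head_commit":"<sha>","prepared_at":"<rfc3339>"}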
+func migrateAddVMWorkspace(tx *sql.Tx) error { + _, err := tx.Exec(`ALTER TABLE vms ADD COLUMN workspace_json TEXT NOT NULL DEFAULT '{}'`) + return err +} diff --git a/internal/store/migrations_test.go b/internal/store/migrations_test.go new file mode 100644 index 0000000..580fc6c --- /dev/null +++ b/internal/store/migrations_test.go @@ -0,0 +1,374 @@ +package store + +import ( + "database/sql" + "errors" + "path/filepath" + "testing" + + _ "modernc.org/sqlite" +) + +// openRawDB opens a SQLite DB at a fresh tempfile without running any +// migrations, so tests can observe migration-runner behaviour directly. +func openRawDB(t *testing.T) *sql.DB { + t.Helper() + path := filepath.Join(t.TempDir(), "state.db") + dsn, err := sqliteDSN(path) + if err != nil { + t.Fatalf("sqliteDSN: %v", err) + } + db, err := sql.Open("sqlite", dsn) + if err != nil { + t.Fatalf("sql.Open: %v", err) + } + t.Cleanup(func() { _ = db.Close() }) + return db +} + +func TestRunMigrationsAppliesBaselineOnFreshDB(t *testing.T) { + db := openRawDB(t) + if err := runMigrations(db); err != nil { + t.Fatalf("runMigrations: %v", err) + } + // All declared migrations must be recorded. + for _, m := range migrations { + var got string + if err := db.QueryRow("SELECT name FROM schema_migrations WHERE id = ?", m.id).Scan(&got); err != nil { + t.Fatalf("migration %d not recorded: %v", m.id, err) + } + if got != m.name { + t.Errorf("migration %d name = %q, want %q", m.id, got, m.name) + } + } + // Baseline must have created the real tables. 
+ for _, table := range []string{"images", "vms"} { + var name string + if err := db.QueryRow("SELECT name FROM sqlite_master WHERE type='table' AND name=?", table).Scan(&name); err != nil { + t.Fatalf("table %s missing: %v", table, err) + } + } +} + +func TestRunMigrationsIsIdempotent(t *testing.T) { + db := openRawDB(t) + if err := runMigrations(db); err != nil { + t.Fatalf("runMigrations first pass: %v", err) + } + if err := runMigrations(db); err != nil { + t.Fatalf("runMigrations second pass: %v", err) + } + // One row per migration, no duplicates. + var count int + if err := db.QueryRow("SELECT COUNT(*) FROM schema_migrations").Scan(&count); err != nil { + t.Fatalf("count: %v", err) + } + if count != len(migrations) { + t.Errorf("schema_migrations rows = %d, want %d", count, len(migrations)) + } +} + +func TestRunMigrationsSkipsAlreadyApplied(t *testing.T) { + db := openRawDB(t) + + // Swap in a test-only migration whose body would error if invoked, + // pre-insert its id into schema_migrations, and confirm the runner + // recognises the marker and skips the body entirely. 
+ orig := migrations + t.Cleanup(func() { migrations = orig }) + migrations = []migration{ + {id: 1, name: "baseline", up: migrateBaseline}, + {id: 99, name: "explodes-if-run", up: func(*sql.Tx) error { + return errors.New("must not execute") + }}, + } + + if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS schema_migrations ( + id INTEGER PRIMARY KEY, + name TEXT NOT NULL, + applied_at TEXT NOT NULL + )`); err != nil { + t.Fatalf("seed schema_migrations table: %v", err) + } + if _, err := db.Exec( + "INSERT INTO schema_migrations (id, name, applied_at) VALUES (?, ?, ?)", + 99, "explodes-if-run", "2026-04-20T00:00:00Z", + ); err != nil { + t.Fatalf("seed applied row: %v", err) + } + + if err := runMigrations(db); err != nil { + t.Fatalf("runMigrations: %v", err) + } +} + +func TestApplyMigrationRollsBackOnBodyError(t *testing.T) { + db := openRawDB(t) + if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS schema_migrations ( + id INTEGER PRIMARY KEY, + name TEXT NOT NULL, + applied_at TEXT NOT NULL + )`); err != nil { + t.Fatalf("seed schema_migrations: %v", err) + } + + err := applyMigration(db, migration{ + id: 7, + name: "creates-then-fails", + up: func(tx *sql.Tx) error { + if _, err := tx.Exec("CREATE TABLE transient (x INTEGER)"); err != nil { + return err + } + return errors.New("synthetic failure") + }, + }) + if err == nil { + t.Fatal("expected applyMigration to surface body error") + } + + // The transient table must NOT survive the failed migration. + var name string + if err := db.QueryRow("SELECT name FROM sqlite_master WHERE type='table' AND name='transient'").Scan(&name); err == nil { + t.Fatal("transient table survived rollback") + } + // And no schema_migrations row for id=7. 
+ var count int + if err := db.QueryRow("SELECT COUNT(*) FROM schema_migrations WHERE id=7").Scan(&count); err != nil { + t.Fatalf("count: %v", err) + } + if count != 0 { + t.Fatalf("schema_migrations recorded failed migration: count=%d", count) + } +} + +// TestOpenReadOnlyDoesNotRunMigrations pins the doctor contract: +// OpenReadOnly must not mutate the DB. Seed a DB whose baseline +// migration row has been forcibly removed (simulating a "behind" +// state), open it read-only, and confirm nothing was re-applied. +func TestOpenReadOnlyDoesNotRunMigrations(t *testing.T) { + path := filepath.Join(t.TempDir(), "state.db") + full, err := Open(path) + if err != nil { + t.Fatalf("Open: %v", err) + } + if _, err := full.db.Exec("DELETE FROM schema_migrations WHERE id = 1"); err != nil { + t.Fatalf("remove baseline marker: %v", err) + } + _ = full.Close() + + ro, err := OpenReadOnly(path) + if err != nil { + t.Fatalf("OpenReadOnly: %v", err) + } + defer ro.Close() + + var migCount int + if err := ro.db.QueryRow("SELECT COUNT(*) FROM schema_migrations WHERE id = 1").Scan(&migCount); err != nil { + t.Fatalf("query schema_migrations: %v", err) + } + if migCount != 0 { + t.Fatal("OpenReadOnly re-recorded a migration row — the open path mutated the DB") + } +} + +// TestOpenReadOnlyRefusesWrites confirms SQLite's mode=ro is in effect +// — no matter what a caller tries, writes are rejected at the driver +// level. Belt-and-braces guard against a future refactor that might +// plumb a write method through. 
+func TestOpenReadOnlyRefusesWrites(t *testing.T) { + path := filepath.Join(t.TempDir(), "state.db") + if s, err := Open(path); err != nil { + t.Fatalf("seed Open: %v", err) + } else { + _ = s.Close() + } + ro, err := OpenReadOnly(path) + if err != nil { + t.Fatalf("OpenReadOnly: %v", err) + } + defer ro.Close() + if _, err := ro.db.Exec("INSERT INTO schema_migrations (id, name, applied_at) VALUES (999, 'x', 'x')"); err == nil { + t.Fatal("write succeeded against a read-only store") + } +} + +// TestRunMigrationsIgnoresUnknownAppliedIDs simulates an older banger +// opening a DB that was written by a newer version: schema_migrations +// carries rows with ids the current binary's migrations slice doesn't +// know about. The runner must leave those rows alone and still apply +// any of its own known migrations that haven't been recorded yet. +// +// Without this behaviour, upgrading forward then downgrading back +// (or running two daemon versions against the same state dir) would +// either fail outright or start destructively reinterpreting rows. +func TestRunMigrationsIgnoresUnknownAppliedIDs(t *testing.T) { + db := openRawDB(t) + + // Bootstrap schema_migrations and pre-seed a row for a migration + // id the current binary doesn't know. Use a high id so it's + // clearly outside our slice. + if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS schema_migrations ( + id INTEGER PRIMARY KEY, + name TEXT NOT NULL, + applied_at TEXT NOT NULL + )`); err != nil { + t.Fatalf("seed schema_migrations: %v", err) + } + if _, err := db.Exec( + "INSERT INTO schema_migrations (id, name, applied_at) VALUES (?, ?, ?)", + 9001, "from-the-future", "2099-01-01T00:00:00Z", + ); err != nil { + t.Fatalf("seed future migration row: %v", err) + } + + if err := runMigrations(db); err != nil { + t.Fatalf("runMigrations: %v", err) + } + + // The alien row is untouched. 
+ var name string + if err := db.QueryRow("SELECT name FROM schema_migrations WHERE id = 9001").Scan(&name); err != nil { + t.Fatalf("alien migration row disappeared: %v", err) + } + if name != "from-the-future" { + t.Fatalf("alien row name = %q, want 'from-the-future'", name) + } + + // Every known migration in our slice was applied — their rows + // should exist too. + for _, m := range migrations { + var got string + if err := db.QueryRow("SELECT name FROM schema_migrations WHERE id = ?", m.id).Scan(&got); err != nil { + t.Fatalf("migration %d not recorded despite unknown alien row: %v", m.id, err) + } + } +} + +// TestInspectSchemaStateCompatible pins the happy path: a fully- +// migrated DB reports SchemaCompatible. +func TestInspectSchemaStateCompatible(t *testing.T) { + path := filepath.Join(t.TempDir(), "state.db") + dsn, _ := sqliteDSN(path) + db, err := sql.Open("sqlite", dsn) + if err != nil { + t.Fatalf("sql.Open: %v", err) + } + if err := runMigrations(db); err != nil { + t.Fatalf("runMigrations: %v", err) + } + _ = db.Close() + + state, err := InspectSchemaState(path) + if err != nil { + t.Fatalf("InspectSchemaState: %v", err) + } + if state.Compatibility != SchemaCompatible { + t.Fatalf("Compatibility = %d, want SchemaCompatible (state=%+v)", state.Compatibility, state) + } + if len(state.Pending) != 0 || len(state.Unknown) != 0 { + t.Fatalf("expected empty pending/unknown; got %+v", state) + } +} + +// TestInspectSchemaStateMigrationsNeeded covers the "binary newer +// than DB" case: the DB has only the baseline, so migrations 2 and 3 +// show up in Pending and Compatibility = SchemaMigrationsNeeded. +func TestInspectSchemaStateMigrationsNeeded(t *testing.T) { + path := filepath.Join(t.TempDir(), "state.db") + dsn, _ := sqliteDSN(path) + db, err := sql.Open("sqlite", dsn) + if err != nil { + t.Fatalf("sql.Open: %v", err) + } + // Create just schema_migrations + record only id=1. 
+ if _, err := db.Exec(`CREATE TABLE schema_migrations (id INTEGER PRIMARY KEY, name TEXT NOT NULL, applied_at TEXT NOT NULL)`); err != nil { + t.Fatalf("create schema_migrations: %v", err) + } + if _, err := db.Exec(`INSERT INTO schema_migrations VALUES (1, 'baseline', '2026-01-01T00:00:00Z')`); err != nil { + t.Fatalf("insert: %v", err) + } + _ = db.Close() + + state, err := InspectSchemaState(path) + if err != nil { + t.Fatalf("InspectSchemaState: %v", err) + } + if state.Compatibility != SchemaMigrationsNeeded { + t.Fatalf("Compatibility = %d, want SchemaMigrationsNeeded (state=%+v)", state.Compatibility, state) + } + if len(state.Pending) == 0 { + t.Fatal("expected non-empty pending list") + } +} + +// TestInspectSchemaStateIncompatible covers the "DB ahead of binary" +// case: the DB records migration id=99 that this binary doesn't +// know about. Compatibility = SchemaIncompatible; Unknown contains 99. +func TestInspectSchemaStateIncompatible(t *testing.T) { + path := filepath.Join(t.TempDir(), "state.db") + dsn, _ := sqliteDSN(path) + db, err := sql.Open("sqlite", dsn) + if err != nil { + t.Fatalf("sql.Open: %v", err) + } + if err := runMigrations(db); err != nil { + t.Fatalf("runMigrations: %v", err) + } + if _, err := db.Exec(`INSERT INTO schema_migrations VALUES (99, 'from_the_future', '2030-01-01T00:00:00Z')`); err != nil { + t.Fatalf("insert future: %v", err) + } + _ = db.Close() + + state, err := InspectSchemaState(path) + if err != nil { + t.Fatalf("InspectSchemaState: %v", err) + } + if state.Compatibility != SchemaIncompatible { + t.Fatalf("Compatibility = %d, want SchemaIncompatible (state=%+v)", state.Compatibility, state) + } + if len(state.Unknown) != 1 || state.Unknown[0] != 99 { + t.Fatalf("Unknown = %v, want [99]", state.Unknown) + } +} + +// TestInspectSchemaStateMissingTable handles the fresh-install case: +// a DB file exists but schema_migrations doesn't (the file was created +// by something other than banger, or banger was halted 
before its +// first migration). Treat this as "all migrations pending". +func TestInspectSchemaStateMissingTable(t *testing.T) { + path := filepath.Join(t.TempDir(), "state.db") + dsn, _ := sqliteDSN(path) + db, err := sql.Open("sqlite", dsn) + if err != nil { + t.Fatalf("sql.Open: %v", err) + } + if err := db.Ping(); err != nil { + t.Fatalf("ping: %v", err) + } + _ = db.Close() + + state, err := InspectSchemaState(path) + if err != nil { + t.Fatalf("InspectSchemaState: %v", err) + } + if state.Compatibility != SchemaMigrationsNeeded { + t.Fatalf("Compatibility = %d, want SchemaMigrationsNeeded (no schema_migrations table)", state.Compatibility) + } + if len(state.Pending) != len(migrations) { + t.Fatalf("Pending = %v, want all %d migrations", state.Pending, len(migrations)) + } +} + +func TestRunMigrationsRejectsDuplicateID(t *testing.T) { + db := openRawDB(t) + orig := migrations + t.Cleanup(func() { migrations = orig }) + migrations = []migration{ + {id: 1, name: "first", up: func(*sql.Tx) error { return nil }}, + {id: 1, name: "dupe", up: func(*sql.Tx) error { return nil }}, + } + err := runMigrations(db) + if err == nil { + t.Fatal("expected error for duplicate migration id") + } +} diff --git a/internal/store/store.go b/internal/store/store.go index 1ef1dca..e3c7502 100644 --- a/internal/store/store.go +++ b/internal/store/store.go @@ -31,13 +31,42 @@ func Open(path string) (*Store, error) { return nil, err } store := &Store{db: db} - if err := store.migrate(); err != nil { + if err := runMigrations(db); err != nil { _ = db.Close() return nil, err } return store, nil } +// OpenReadOnly opens the state DB without running migrations and with +// SQLite's mode=ro flag so no write can slip through — the file and +// its WAL sidecar stay untouched. Used by `banger doctor`, which must +// be pure inspection: running it should never mutate user state, and +// it must not trigger a schema migration the user didn't ask for. 
+// +// Returns the usual sql.ErrNoRows-compatible errors from the read +// queries if the DB's schema is older than the current code expects; +// doctor surfaces those as failing checks rather than a hard crash. +func OpenReadOnly(path string) (*Store, error) { + dsn, err := sqliteReadOnlyDSN(path) + if err != nil { + return nil, err + } + db, err := sql.Open("sqlite", dsn) + if err != nil { + return nil, err + } + // Ping forces SQLite to actually open the file, so a missing or + // unreadable DB fails here rather than at first query. Match the + // existing Open contract: caller expects success to mean "ready + // to read." + if err := db.Ping(); err != nil { + _ = db.Close() + return nil, err + } + return &Store{db: db}, nil +} + func (s *Store) Close() error { return s.db.Close() } @@ -66,52 +95,26 @@ func sqliteDSN(path string) (string, error) { }).String(), nil } -func (s *Store) migrate() error { - stmts := []string{ - `CREATE TABLE IF NOT EXISTS images ( - id TEXT PRIMARY KEY, - name TEXT NOT NULL UNIQUE, - managed INTEGER NOT NULL DEFAULT 0, - artifact_dir TEXT, - rootfs_path TEXT NOT NULL, - work_seed_path TEXT, - kernel_path TEXT NOT NULL, - initrd_path TEXT, - modules_dir TEXT, - packages_path TEXT, - build_size TEXT, - seeded_ssh_public_key_fingerprint TEXT, - docker INTEGER NOT NULL DEFAULT 0, - created_at TEXT NOT NULL, - updated_at TEXT NOT NULL - );`, - `CREATE TABLE IF NOT EXISTS vms ( - id TEXT PRIMARY KEY, - name TEXT NOT NULL UNIQUE, - image_id TEXT NOT NULL, - guest_ip TEXT NOT NULL UNIQUE, - state TEXT NOT NULL, - created_at TEXT NOT NULL, - updated_at TEXT NOT NULL, - last_touched_at TEXT NOT NULL, - spec_json TEXT NOT NULL, - runtime_json TEXT NOT NULL, - stats_json TEXT NOT NULL DEFAULT '{}', - FOREIGN KEY(image_id) REFERENCES images(id) ON DELETE RESTRICT - );`, +// sqliteReadOnlyDSN builds a DSN that opens the DB in SQLite's +// read-only mode. 
Deliberately omits journal_mode=WAL and the other +// write-adjacent pragmas set by sqliteDSN — mode=ro refuses them +// anyway, and keeping the list minimal means the query never touches +// the file. foreign_keys and busy_timeout are the only pragmas worth +// keeping for read paths (semantics parity + lock backoff). +func sqliteReadOnlyDSN(path string) (string, error) { + absPath, err := filepath.Abs(path) + if err != nil { + return "", fmt.Errorf("resolve sqlite path: %w", err) } - for _, stmt := range stmts { - if _, err := s.db.Exec(stmt); err != nil { - return err - } - } - if err := ensureColumnExists(s.db, "images", "work_seed_path", "TEXT"); err != nil { - return err - } - if err := ensureColumnExists(s.db, "images", "seeded_ssh_public_key_fingerprint", "TEXT"); err != nil { - return err - } - return nil + query := url.Values{} + query.Set("mode", "ro") + query.Add("_pragma", "foreign_keys(1)") + query.Add("_pragma", "busy_timeout(5000)") + return (&url.URL{ + Scheme: "file", + Path: filepath.ToSlash(absPath), + RawQuery: query.Encode(), + }).String(), nil } func (s *Store) UpsertImage(ctx context.Context, image model.Image) error { @@ -120,8 +123,8 @@ func (s *Store) UpsertImage(ctx context.Context, image model.Image) error { const query = ` INSERT INTO images ( id, name, managed, artifact_dir, rootfs_path, work_seed_path, kernel_path, initrd_path, - modules_dir, build_size, seeded_ssh_public_key_fingerprint, docker, created_at, updated_at - ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) + modules_dir, build_size, seeded_ssh_public_key_fingerprint, created_at, updated_at + ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) 
ON CONFLICT(id) DO UPDATE SET name=excluded.name, managed=excluded.managed, @@ -133,7 +136,6 @@ func (s *Store) UpsertImage(ctx context.Context, image model.Image) error { modules_dir=excluded.modules_dir, build_size=excluded.build_size, seeded_ssh_public_key_fingerprint=excluded.seeded_ssh_public_key_fingerprint, - docker=excluded.docker, updated_at=excluded.updated_at` _, err := s.db.ExecContext(ctx, query, image.ID, @@ -147,7 +149,6 @@ func (s *Store) UpsertImage(ctx context.Context, image model.Image) error { image.ModulesDir, image.BuildSize, image.SeededSSHPublicKeyFingerprint, - boolToInt(image.Docker), image.CreatedAt.Format(time.RFC3339), image.UpdatedAt.Format(time.RFC3339), ) @@ -155,15 +156,15 @@ func (s *Store) UpsertImage(ctx context.Context, image model.Image) error { } func (s *Store) GetImageByName(ctx context.Context, name string) (model.Image, error) { - return s.getImage(ctx, "SELECT id, name, managed, artifact_dir, rootfs_path, work_seed_path, kernel_path, initrd_path, modules_dir, build_size, seeded_ssh_public_key_fingerprint, docker, created_at, updated_at FROM images WHERE name = ?", name) + return s.getImage(ctx, "SELECT id, name, managed, artifact_dir, rootfs_path, work_seed_path, kernel_path, initrd_path, modules_dir, build_size, seeded_ssh_public_key_fingerprint, created_at, updated_at FROM images WHERE name = ?", name) } func (s *Store) GetImageByID(ctx context.Context, id string) (model.Image, error) { - return s.getImage(ctx, "SELECT id, name, managed, artifact_dir, rootfs_path, work_seed_path, kernel_path, initrd_path, modules_dir, build_size, seeded_ssh_public_key_fingerprint, docker, created_at, updated_at FROM images WHERE id = ?", id) + return s.getImage(ctx, "SELECT id, name, managed, artifact_dir, rootfs_path, work_seed_path, kernel_path, initrd_path, modules_dir, build_size, seeded_ssh_public_key_fingerprint, created_at, updated_at FROM images WHERE id = ?", id) } func (s *Store) ListImages(ctx context.Context) ([]model.Image, 
error) { - rows, err := s.db.QueryContext(ctx, "SELECT id, name, managed, artifact_dir, rootfs_path, work_seed_path, kernel_path, initrd_path, modules_dir, build_size, seeded_ssh_public_key_fingerprint, docker, created_at, updated_at FROM images ORDER BY created_at ASC") + rows, err := s.db.QueryContext(ctx, "SELECT id, name, managed, artifact_dir, rootfs_path, work_seed_path, kernel_path, initrd_path, modules_dir, build_size, seeded_ssh_public_key_fingerprint, created_at, updated_at FROM images ORDER BY created_at ASC") if err != nil { return nil, err } @@ -235,7 +236,7 @@ func (s *Store) UpsertVM(ctx context.Context, vm model.VMRecord) error { func (s *Store) GetVM(ctx context.Context, idOrName string) (model.VMRecord, error) { const query = ` SELECT id, name, image_id, guest_ip, state, created_at, updated_at, last_touched_at, - spec_json, runtime_json, stats_json + spec_json, runtime_json, stats_json, workspace_json FROM vms WHERE id = ? OR name = ? ` @@ -246,15 +247,29 @@ func (s *Store) GetVM(ctx context.Context, idOrName string) (model.VMRecord, err func (s *Store) GetVMByID(ctx context.Context, id string) (model.VMRecord, error) { row := s.db.QueryRowContext(ctx, ` SELECT id, name, image_id, guest_ip, state, created_at, updated_at, last_touched_at, - spec_json, runtime_json, stats_json + spec_json, runtime_json, stats_json, workspace_json FROM vms WHERE id = ?`, id) return scanVMRow(row) } +// GetVMByName is the exact-name lookup used for creation-time +// uniqueness checks. Unlike GetVM (which matches id OR name) and +// Daemon.FindVM (which also falls back to prefix-matching), this +// returns sql.ErrNoRows for anything except a literal name hit, so +// a new VM can't be rejected just because its name prefixes an +// existing VM's id or an existing VM's name. 
+func (s *Store) GetVMByName(ctx context.Context, name string) (model.VMRecord, error) { + row := s.db.QueryRowContext(ctx, ` + SELECT id, name, image_id, guest_ip, state, created_at, updated_at, last_touched_at, + spec_json, runtime_json, stats_json, workspace_json + FROM vms WHERE name = ?`, name) + return scanVMRow(row) +} + func (s *Store) ListVMs(ctx context.Context) ([]model.VMRecord, error) { rows, err := s.db.QueryContext(ctx, ` SELECT id, name, image_id, guest_ip, state, created_at, updated_at, last_touched_at, - spec_json, runtime_json, stats_json + spec_json, runtime_json, stats_json, workspace_json FROM vms ORDER BY created_at ASC`) if err != nil { return nil, err @@ -278,10 +293,27 @@ func (s *Store) DeleteVM(ctx context.Context, id string) error { return err } +// SetVMWorkspace persists the workspace state from a workspace.prepare +// result onto the VM row. Called after a successful prepare so the +// guest path, host source path, and HEAD commit survive daemon +// restarts and are available to `vm exec` without re-stating them. +// Best-effort from the caller's perspective — a failure here does not +// roll back the prepare itself. +func (s *Store) SetVMWorkspace(ctx context.Context, vmID string, workspace model.VMWorkspace) error { + s.writeMu.Lock() + defer s.writeMu.Unlock() + data, err := json.Marshal(workspace) + if err != nil { + return err + } + _, err = s.db.ExecContext(ctx, "UPDATE vms SET workspace_json = ? 
WHERE id = ?", string(data), vmID) + return err +} + func (s *Store) FindVMsUsingImage(ctx context.Context, imageID string) ([]model.VMRecord, error) { rows, err := s.db.QueryContext(ctx, ` SELECT id, name, image_id, guest_ip, state, created_at, updated_at, last_touched_at, - spec_json, runtime_json, stats_json + spec_json, runtime_json, stats_json, workspace_json FROM vms WHERE image_id = ?`, imageID) if err != nil { return nil, err @@ -339,7 +371,7 @@ type scanner interface { func scanImageRow(row scanner) (model.Image, error) { var image model.Image - var managed, docker int + var managed int var workSeedPath sql.NullString var seededSSHPublicKeyFingerprint sql.NullString var createdAt, updatedAt string @@ -355,7 +387,6 @@ func scanImageRow(row scanner) (model.Image, error) { &image.ModulesDir, &image.BuildSize, &seededSSHPublicKeyFingerprint, - &docker, &createdAt, &updatedAt, ) @@ -363,7 +394,6 @@ func scanImageRow(row scanner) (model.Image, error) { return image, err } image.Managed = managed == 1 - image.Docker = docker == 1 image.WorkSeedPath = workSeedPath.String image.SeededSSHPublicKeyFingerprint = seededSSHPublicKeyFingerprint.String image.CreatedAt, err = time.Parse(time.RFC3339, createdAt) @@ -387,7 +417,7 @@ func scanVMRows(rows scanner) (model.VMRecord, error) { func scanVMInto(row scanner) (model.VMRecord, error) { var vm model.VMRecord - var state, createdAt, updatedAt, touchedAt, specJSON, runtimeJSON, statsJSON string + var state, createdAt, updatedAt, touchedAt, specJSON, runtimeJSON, statsJSON, workspaceJSON string err := row.Scan( &vm.ID, &vm.Name, @@ -400,6 +430,7 @@ func scanVMInto(row scanner) (model.VMRecord, error) { &specJSON, &runtimeJSON, &statsJSON, + &workspaceJSON, ) if err != nil { return vm, err @@ -416,6 +447,11 @@ func scanVMInto(row scanner) (model.VMRecord, error) { return vm, err } } + if workspaceJSON != "" && workspaceJSON != "{}" { + if err := json.Unmarshal([]byte(workspaceJSON), &vm.Workspace); err != nil { + return vm, 
err + } + } var parseErr error vm.CreatedAt, parseErr = time.Parse(time.RFC3339, createdAt) if parseErr != nil { @@ -432,38 +468,23 @@ func scanVMInto(row scanner) (model.VMRecord, error) { return vm, nil } -func ensureColumnExists(db *sql.DB, table, column, columnType string) error { - rows, err := db.Query(fmt.Sprintf("PRAGMA table_info(%s)", table)) - if err != nil { - return err - } - defer rows.Close() - for rows.Next() { - var ( - cid int - name string - valueType string - notNull int - defaultV sql.NullString - pk int - ) - if err := rows.Scan(&cid, &name, &valueType, ¬Null, &defaultV, &pk); err != nil { - return err - } - if name == column { - return nil - } - } - if err := rows.Err(); err != nil { - return err - } - _, err = db.Exec(fmt.Sprintf("ALTER TABLE %s ADD COLUMN %s %s", table, column, columnType)) - return err -} - func boolToInt(value bool) int { if value { return 1 } return 0 } + +func nullableTimeString(value time.Time) any { + if value.IsZero() { + return nil + } + return value.Format(time.RFC3339) +} + +func nullableInt(value *int) any { + if value == nil { + return nil + } + return *value +} diff --git a/internal/store/store_test.go b/internal/store/store_test.go index 164ad4e..29589e5 100644 --- a/internal/store/store_test.go +++ b/internal/store/store_test.go @@ -4,6 +4,7 @@ import ( "context" "database/sql" "errors" + "os" "path/filepath" "reflect" "strconv" @@ -178,8 +179,8 @@ func TestGetImageRejectsMalformedTimestamp(t *testing.T) { _, err := store.db.ExecContext(ctx, ` INSERT INTO images ( id, name, managed, artifact_dir, rootfs_path, kernel_path, initrd_path, - modules_dir, packages_path, build_size, docker, created_at, updated_at - ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`, + modules_dir, build_size, created_at, updated_at + ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`, "image-bad-time", "image-bad-time", 0, @@ -189,8 +190,6 @@ func TestGetImageRejectsMalformedTimestamp(t *testing.T) { "", "", "", - "", - 0, "not-a-time", 
"not-a-time", ) @@ -320,6 +319,58 @@ func TestStoreConfiguresSQLitePragmasOnPooledConnections(t *testing.T) { } } +// TestOpenRejectsCorruptDB pins the actionable-error contract when +// state.db exists on disk but isn't a valid SQLite file. Users can +// hit this after a disk-full crash mid-write, a copy that truncated, +// or accidental manual editing. banger must surface the error +// cleanly so the operator can delete-and-retry — never panic, never +// silently overwrite, never leak a partially-opened sql.DB handle. +func TestOpenRejectsCorruptDB(t *testing.T) { + t.Parallel() + + dir := t.TempDir() + path := filepath.Join(dir, "state.db") + garbage := []byte("this is definitely not a sqlite database") + if err := os.WriteFile(path, garbage, 0o600); err != nil { + t.Fatalf("WriteFile: %v", err) + } + + s, err := Open(path) + if err == nil { + _ = s.Close() + t.Fatal("Open: want error on corrupt DB file") + } + + // The garbage bytes must still be there — Open must not have + // overwritten the file mid-attempt. A user recovering from a + // mid-write crash needs that invariant to hand the file to a + // tool like sqlite3_analyzer. + got, readErr := os.ReadFile(path) + if readErr != nil { + t.Fatalf("ReadFile: %v", readErr) + } + if string(got) != string(garbage) { + t.Fatalf("Open touched the garbage file: got %q, want %q", string(got), string(garbage)) + } +} + +// TestOpenReadOnlyRejectsMissingDB pins the "no silent creation" +// contract for the doctor path: OpenReadOnly against a path that +// doesn't exist must error, not create an empty DB that later reads +// would mistake for "no VMs yet." 
+func TestOpenReadOnlyRejectsMissingDB(t *testing.T) { + t.Parallel() + missing := filepath.Join(t.TempDir(), "never-existed.db") + s, err := OpenReadOnly(missing) + if err == nil { + _ = s.Close() + t.Fatal("OpenReadOnly: want error when the DB file doesn't exist") + } + if _, statErr := os.Stat(missing); !os.IsNotExist(statErr) { + t.Fatalf("OpenReadOnly silently created %q (stat err = %v)", missing, statErr) + } +} + func openTestStore(t *testing.T) *Store { t.Helper() store, err := Open(filepath.Join(t.TempDir(), "state.db")) @@ -346,7 +397,6 @@ func sampleImage(name string) model.Image { ModulesDir: "/modules/" + name, BuildSize: "8G", SeededSSHPublicKeyFingerprint: "seeded-fingerprint", - Docker: true, CreatedAt: now, UpdatedAt: now, } @@ -372,7 +422,6 @@ func sampleVM(name, imageID, guestIP string) model.VMRecord { Runtime: model.VMRuntime{ State: model.VMStateStopped, GuestIP: guestIP, - TapDevice: "tap-" + name, APISockPath: "/tmp/" + name + ".sock", LogPath: "/tmp/" + name + ".log", MetricsPath: "/tmp/" + name + ".metrics", diff --git a/internal/system/ext4.go b/internal/system/ext4.go new file mode 100644 index 0000000..e0c8fcd --- /dev/null +++ b/internal/system/ext4.go @@ -0,0 +1,317 @@ +package system + +import ( + "bytes" + "context" + "fmt" + "os" + "strings" +) + +// ext4 mode bitmasks that debugfs's `set_inode_field ... mode` expects. +// debugfs wants the full file-type + permission word, not just the +// permission bits. Callers pass the permission portion; these constants +// OR it into the right file type. +const ( + ext4ModeRegularFile = 0o100000 // S_IFREG + ext4ModeDirectory = 0o040000 // S_IFDIR +) + +// MkdirExt4 creates a directory inside the ext4 image, setting its +// owner/group/mode to whatever uid/gid/<mode> the caller passes +// (typically root:root). Idempotent: if the directory already exists, it's +// left alone and only the metadata (uid/gid/mode) is reset to what +// was requested.
Runs a single `debugfs -w` invocation so ~all the +// state transitions land in one fs-lock window. +// +// guestPath must be an absolute path inside the ext4 image (e.g. +// "/.ssh"). The function escapes the path for debugfs before sending +// it down the wire. +func MkdirExt4(ctx context.Context, runner CommandRunner, imagePath, guestPath string, mode os.FileMode, uid, gid int) error { + escaped, err := escapeDebugfsGuestPath(guestPath) + if err != nil { + return err + } + var script bytes.Buffer + // `mkdir` errors if the entry already exists. Tolerate that by + // running `stat` first: on "exists" we skip the mkdir line and + // fall through to the metadata resets, which are idempotent. + exists, err := Ext4PathExists(ctx, runner, imagePath, guestPath) + if err != nil { + return err + } + if !exists { + fmt.Fprintf(&script, "mkdir %s\n", escaped) + } + fmt.Fprintf(&script, "set_inode_field %s mode 0%o\n", escaped, ext4ModeDirectory|(uint32(mode.Perm())&0o7777)) + fmt.Fprintf(&script, "set_inode_field %s uid %d\n", escaped, uid) + fmt.Fprintf(&script, "set_inode_field %s gid %d\n", escaped, gid) + return debugfsScript(ctx, runner, imagePath, &script) +} + +// MkdirAllExt4 creates each intermediate directory in guestPath that +// doesn't already exist, with the given mode/uid/gid. Mirrors +// os.MkdirAll's shape, not mkdir(1) -p: existing directories are left +// with their current metadata untouched (we don't reset mode/uid/gid +// on pre-existing parents, only on the final segment). Paths starting +// at "/" are allowed — the root is treated as pre-existing. 
+func MkdirAllExt4(ctx context.Context, runner CommandRunner, imagePath, guestPath string, mode os.FileMode, uid, gid int) error { + if err := rejectDebugfsUnsafePath(guestPath); err != nil { + return err + } + segments := strings.Split(strings.Trim(guestPath, "/"), "/") + cur := "" + for i, seg := range segments { + if seg == "" { + continue + } + cur = cur + "/" + seg + exists, err := Ext4PathExists(ctx, runner, imagePath, cur) + if err != nil { + return err + } + if exists && i < len(segments)-1 { + // Pre-existing parents keep their current metadata. + continue + } + // Intermediate dirs inherit the requested mode/uid/gid too — + // callers that want a different mode on parents should create + // them explicitly. Matches the most common use (mkdir -p a + // config tree where every hop is root-owned). The final + // segment always goes through MkdirExt4, whose metadata reset + // is idempotent even when the directory already exists. + if err := MkdirExt4(ctx, runner, imagePath, cur, mode, uid, gid); err != nil { + return err + } + } + return nil +} + +// WriteExt4FileOwned copies `data` into <image>:<path> and +// forces the inode's uid/gid/mode to the requested values. Unlike +// WriteExt4FileMode, this helper does NOT assume the image is a +// root-owned block device: if the image is a regular file the daemon +// user owns, every call runs without sudo. That's the common case for +// work-disk writes (vm_authsync, image_seed, runFileSync). +// +// Safety: always remove the destination first so e2cp sees a clean +// target (avoids copy-into-existing-file quirks on older e2tools). +func WriteExt4FileOwned(ctx context.Context, runner CommandRunner, imagePath, guestPath string, mode os.FileMode, uid, gid int, data []byte) error { + tmp, err := stageDataTempfile(data, mode) + if err != nil { + return err + } + defer os.Remove(tmp) + + _, _ = extfsRun(ctx, runner, imagePath, "e2rm", imagePath+":"+guestPath) + if _, err := extfsRun(ctx, runner, imagePath, "e2cp", tmp, imagePath+":"+guestPath); err != nil { + return err + } + + // Fix per-file uid/gid/mode in a debugfs batch.
e2cp -O/-G exist + // but ship inconsistently across distros; driving the inode via + // set_inode_field matches how imagepull.ApplyOwnership has worked + // reliably in production. + escaped, err := escapeDebugfsGuestPath(guestPath) + if err != nil { + return err + } + var script bytes.Buffer + fmt.Fprintf(&script, "set_inode_field %s mode 0%o\n", escaped, ext4ModeRegularFile|(uint32(mode.Perm())&0o7777)) + fmt.Fprintf(&script, "set_inode_field %s uid %d\n", escaped, uid) + fmt.Fprintf(&script, "set_inode_field %s gid %d\n", escaped, gid) + return debugfsScript(ctx, runner, imagePath, &script) +} + +// EnsureExt4RootPerms sets the filesystem root inode (inode <2>, +// which is what `/` resolves to) to the given directory mode + owner. +// sshd's StrictModes inside the guest walks the home directory's +// ownership; the work disk is mounted at /root in the guest, so its +// root inode is /root as far as sshd is concerned. Default-safe +// value: 0755 root:root. +// +// Note on debugfs mode semantics: `set_inode_field mode N` +// OVERWRITES the full i_mode word — it does NOT preserve the type +// nibble. Passing just the permission bits (e.g. 0755) would reset +// the root inode to a regular-file shape, and the next kernel mount +// would fail with "Structure needs cleaning." The constant ORed +// below restores the S_IFDIR type bits explicitly. +func EnsureExt4RootPerms(ctx context.Context, runner CommandRunner, imagePath string, mode os.FileMode, uid, gid int) error { + fullMode := ext4ModeDirectory | (uint32(mode.Perm()) & 0o7777) + var script bytes.Buffer + fmt.Fprintf(&script, "set_inode_field <2> mode 0%o\n", fullMode) + fmt.Fprintf(&script, "set_inode_field <2> uid %d\n", uid) + fmt.Fprintf(&script, "set_inode_field <2> gid %d\n", gid) + return debugfsScript(ctx, runner, imagePath, &script) +} + +// Ext4PathExists reports whether guestPath resolves inside imagePath. +// Missing-path is NOT an error — the boolean distinguishes them. 
+// Uses `debugfs -R "stat <path>"` and inspects the combined output +// and error text for the standard "File not found" message e2fsprogs +// emits. +func Ext4PathExists(ctx context.Context, runner CommandRunner, imagePath, guestPath string) (bool, error) { + // debugfs stat wants the path without any extra quoting beyond + // what debugfs already does; we still reject quoting-hostile + // chars up front. + if err := rejectDebugfsUnsafePath(guestPath); err != nil { + return false, err + } + out, err := extfsRun(ctx, runner, imagePath, "debugfs", "-R", "stat "+guestPath, imagePath) + combined := strings.ToLower(string(out) + " " + fmt.Sprint(err)) + if strings.Contains(combined, "file not found") { + return false, nil + } + if err != nil { + return false, err + } + return true, nil +} + +// ReadExt4File reads guestPath from imagePath as raw bytes. Wraps the +// older ReadDebugFSText with a []byte return and the same unsafe-path +// rejection the write helpers use. +func ReadExt4File(ctx context.Context, runner CommandRunner, imagePath, guestPath string) ([]byte, error) { + if err := rejectDebugfsUnsafePath(guestPath); err != nil { + return nil, err + } + out, err := extfsRun(ctx, runner, imagePath, "debugfs", "-R", "cat "+guestPath, imagePath) + if err != nil { + return nil, err + } + return out, nil +} + +// ---- internal helpers ---- + +// extfsRun executes an ext4-toolkit command against imagePath, +// auto-elevating to sudo when imagePath is a block device (dm-snapshot +// targets, raw loop devices) and staying as the invoking user when +// it's a regular file (the user-owned .ext4 files under StateDir that +// this refactor targets). Tests that don't care can pass any runner +// that satisfies CommandRunner. +func extfsRun(ctx context.Context, runner CommandRunner, imagePath, name string, args ...string) ([]byte, error) { + if needsElevation(imagePath) { + all := append([]string{name}, args...) + return runner.RunSudo(ctx, all...) + } + return runner.Run(ctx, name, args...)
+} + +// needsElevation returns true when imagePath is something only root +// can write to (block devices owned root:disk). For regular files +// the invoking user owns, returns false. On stat failure we err on +// the side of NOT elevating — the subsequent tool invocation will +// surface a clearer error than a bogus sudo escalation would. +func needsElevation(imagePath string) bool { + info, err := os.Stat(imagePath) + if err != nil { + return false + } + return !info.Mode().IsRegular() +} + +// debugfsScript streams a scripted batch to `debugfs -w -f - +// <image>`. Requires the runner to implement StdinRunner — every +// production runner in banger does, but test doubles may not, in +// which case we fall back to one debugfs invocation per line. The +// fallback is a correctness net; production always gets the batched +// single-invocation path. +func debugfsScript(ctx context.Context, runner CommandRunner, imagePath string, script *bytes.Buffer) error { + if script.Len() == 0 { + return nil + } + stdinRunner, ok := runner.(StdinRunner) + if ok { + // StdinRunner's interface always runs un-elevated (it's a + // Runner method, not RunSudo). For block devices we need sudo. + // When elevation is required, fall through to the per-line + // path which routes through extfsRun. + if !needsElevation(imagePath) { + out, err := stdinRunner.RunStdin(ctx, script, "debugfs", "-w", "-f", "-", imagePath) + if err != nil { + return fmt.Errorf("debugfs batch: %w: %s", err, bytes.TrimSpace(out)) + } + return nil + } + } + // Per-line fallback. Not ideal for throughput but preserves + // semantics in tests and in the rare case we run against a + // block device via this toolkit.
+ for _, line := range strings.Split(script.String(), "\n") { + line = strings.TrimSpace(line) + if line == "" { + continue + } + if _, err := extfsRun(ctx, runner, imagePath, "debugfs", "-w", "-R", line, imagePath); err != nil { + return fmt.Errorf("debugfs %q: %w", line, err) + } + } + return nil +} + +// escapeDebugfsGuestPath produces a debugfs-safe rendition of the +// guest path. debugfs tokenises on whitespace by default; paths with +// spaces must be double-quoted. Paths containing the double-quote +// itself, backslashes, or newlines are rejected outright — quoting +// those reliably in debugfs's hand-rolled parser is lore we don't +// want to inherit. +func escapeDebugfsGuestPath(guestPath string) (string, error) { + if err := rejectDebugfsUnsafePath(guestPath); err != nil { + return "", err + } + if strings.ContainsAny(guestPath, " \t") { + return `"` + guestPath + `"`, nil + } + return guestPath, nil +} + +func rejectDebugfsUnsafePath(guestPath string) error { + if guestPath == "" { + return fmt.Errorf("guest path is required") + } + if !strings.HasPrefix(guestPath, "/") { + return fmt.Errorf("guest path %q must be absolute", guestPath) + } + if strings.ContainsAny(guestPath, "\"\\\n\r") { + return fmt.Errorf("guest path %q contains characters debugfs cannot safely encode", guestPath) + } + return nil +} + +func stageDataTempfile(data []byte, mode os.FileMode) (string, error) { + tmp, err := os.CreateTemp("", "banger-ext4-*") + if err != nil { + return "", err + } + path := tmp.Name() + if _, err := tmp.Write(data); err != nil { + _ = tmp.Close() + _ = os.Remove(path) + return "", err + } + if err := tmp.Close(); err != nil { + _ = os.Remove(path) + return "", err + } + if err := os.Chmod(path, mode.Perm()); err != nil { + _ = os.Remove(path) + return "", err + } + return path, nil +} + +// RdumpExt4Dir shells out to `debugfs -R "rdump <src> <dst>" <image>` +// to spill a tree from the ext4 image into a host directory.
Used by +// ensureWorkDisk's no-seed path to extract /root from the base rootfs +// without mounting. Content is preserved; per-entry metadata (uid, +// gid, mode) is captured via a subsequent stat walk inside debugfs. +// Creates dstDir if needed; on success returns nil with the dumped +// tree rooted at dstDir. +func RdumpExt4Dir(ctx context.Context, runner CommandRunner, imagePath, srcPath, dstDir string) error { + if err := rejectDebugfsUnsafePath(srcPath); err != nil { + return err + } + if err := os.MkdirAll(dstDir, 0o755); err != nil { + return err + } + _, err := extfsRun(ctx, runner, imagePath, "debugfs", "-R", "rdump "+srcPath+" "+dstDir, imagePath) + return err +} diff --git a/internal/system/ext4_test.go b/internal/system/ext4_test.go new file mode 100644 index 0000000..3e23bb5 --- /dev/null +++ b/internal/system/ext4_test.go @@ -0,0 +1,322 @@ +package system + +import ( + "bytes" + "context" + "errors" + "io" + "os" + "path/filepath" + "strings" + "testing" +) + +// stdinFuncRunner is funcRunner extended with a RunStdin hook so we +// can assert the exact debugfs batch script that callers stream in. +type stdinFuncRunner struct { + funcRunner + runStdin func(ctx context.Context, stdin io.Reader, name string, args ...string) ([]byte, error) +} + +func (r stdinFuncRunner) RunStdin(ctx context.Context, stdin io.Reader, name string, args ...string) ([]byte, error) { + if r.runStdin == nil { + return nil, errors.New("unexpected RunStdin call") + } + return r.runStdin(ctx, stdin, name, args...) +} + +// userOwnedImage writes a zero-length regular file in a tempdir and +// returns its path. Regular files trigger extfsRun's non-sudo branch, +// which is the whole point of the new toolkit.
+func userOwnedImage(t *testing.T) string { + t.Helper() + path := filepath.Join(t.TempDir(), "work.ext4") + if err := os.WriteFile(path, []byte{}, 0o644); err != nil { + t.Fatalf("write image: %v", err) + } + return path +} + +func TestExt4PathExists(t *testing.T) { + image := userOwnedImage(t) + + t.Run("path found", func(t *testing.T) { + r := funcRunner{ + run: func(_ context.Context, name string, args ...string) ([]byte, error) { + if name != "debugfs" { + t.Fatalf("name = %q, want debugfs", name) + } + want := []string{"-R", "stat /root/.ssh", image} + for i := range want { + if args[i] != want[i] { + t.Fatalf("args[%d] = %q, want %q (full %v)", i, args[i], want[i], args) + } + } + return []byte("Inode: 12 Type: directory"), nil + }, + } + ok, err := Ext4PathExists(context.Background(), r, image, "/root/.ssh") + if err != nil { + t.Fatalf("Ext4PathExists: %v", err) + } + if !ok { + t.Fatal("expected exists = true") + } + }) + + t.Run("path missing", func(t *testing.T) { + r := funcRunner{ + run: func(context.Context, string, ...string) ([]byte, error) { + // debugfs prints the "File not found" message to stdout + // on lookup miss. No exit error (debugfs exits 0 for + // soft misses on `stat`). 
+ return []byte("stat: File not found by ext2_lookup while starting pathname"), nil + }, + } + ok, err := Ext4PathExists(context.Background(), r, image, "/root/.ssh") + if err != nil { + t.Fatalf("Ext4PathExists: %v", err) + } + if ok { + t.Fatal("expected exists = false") + } + }) + + t.Run("rejects hostile path", func(t *testing.T) { + r := funcRunner{} + if _, err := Ext4PathExists(context.Background(), r, image, `/evil"path`); err == nil { + t.Fatal("expected rejection for path containing double-quote") + } + }) +} + +func TestReadExt4File(t *testing.T) { + image := userOwnedImage(t) + r := funcRunner{ + run: func(_ context.Context, name string, args ...string) ([]byte, error) { + if name != "debugfs" { + t.Fatalf("name = %q, want debugfs", name) + } + if args[0] != "-R" || args[1] != "cat /etc/fstab" { + t.Fatalf("args = %v, want -R \"cat /etc/fstab\" ...", args) + } + return []byte("tmpfs /tmp tmpfs defaults 0 0\n"), nil + }, + } + got, err := ReadExt4File(context.Background(), r, image, "/etc/fstab") + if err != nil { + t.Fatalf("ReadExt4File: %v", err) + } + if !bytes.Contains(got, []byte("tmpfs /tmp")) { + t.Fatalf("got = %q, want contains tmpfs line", got) + } +} + +func TestMkdirExt4_BatchesStatMkdirAndMetadata(t *testing.T) { + image := userOwnedImage(t) + + var capturedScript string + r := stdinFuncRunner{ + funcRunner: funcRunner{ + run: func(_ context.Context, name string, args ...string) ([]byte, error) { + // The only non-stdin call should be the existence check. 
+ if name == "debugfs" && len(args) >= 2 && args[0] == "-R" && strings.HasPrefix(args[1], "stat ") { + return []byte("stat: File not found"), nil + } + t.Fatalf("unexpected Run(%q, %v)", name, args) + return nil, nil + }, + }, + runStdin: func(_ context.Context, stdin io.Reader, name string, args ...string) ([]byte, error) { + if name != "debugfs" { + t.Fatalf("stdin runner name = %q, want debugfs", name) + } + want := []string{"-w", "-f", "-", image} + for i, w := range want { + if args[i] != w { + t.Fatalf("stdin args[%d] = %q, want %q", i, args[i], w) + } + } + b, _ := io.ReadAll(stdin) + capturedScript = string(b) + return nil, nil + }, + } + + if err := MkdirExt4(context.Background(), r, image, "/.ssh", 0o700, 0, 0); err != nil { + t.Fatalf("MkdirExt4: %v", err) + } + + // mkdir line must be present (path didn't exist). + if !strings.Contains(capturedScript, "mkdir /.ssh") { + t.Fatalf("script missing mkdir line:\n%s", capturedScript) + } + // Mode must include the directory file-type nibble (040000 | 0700 = 040700). + if !strings.Contains(capturedScript, "set_inode_field /.ssh mode 040700") { + t.Fatalf("script missing mode line with S_IFDIR+0700:\n%s", capturedScript) + } + if !strings.Contains(capturedScript, "set_inode_field /.ssh uid 0") { + t.Fatalf("script missing uid line:\n%s", capturedScript) + } + if !strings.Contains(capturedScript, "set_inode_field /.ssh gid 0") { + t.Fatalf("script missing gid line:\n%s", capturedScript) + } +} + +func TestMkdirExt4_SkipsMkdirWhenDirectoryExists(t *testing.T) { + image := userOwnedImage(t) + + var capturedScript string + r := stdinFuncRunner{ + funcRunner: funcRunner{ + run: func(_ context.Context, name string, args ...string) ([]byte, error) { + // First call: existence probe. Return success. 
+ if name == "debugfs" && args[0] == "-R" && strings.HasPrefix(args[1], "stat ") { + return []byte("Inode: 12 Type: directory Mode: 0700"), nil + } + t.Fatalf("unexpected Run(%q, %v)", name, args) + return nil, nil + }, + }, + runStdin: func(_ context.Context, stdin io.Reader, _ string, _ ...string) ([]byte, error) { + b, _ := io.ReadAll(stdin) + capturedScript = string(b) + return nil, nil + }, + } + + if err := MkdirExt4(context.Background(), r, image, "/.ssh", 0o700, 0, 0); err != nil { + t.Fatalf("MkdirExt4: %v", err) + } + + // Directory existed — no mkdir line, but metadata lines still fire. + if strings.Contains(capturedScript, "mkdir ") { + t.Fatalf("script should not contain mkdir for pre-existing path:\n%s", capturedScript) + } + if !strings.Contains(capturedScript, "set_inode_field /.ssh mode") { + t.Fatalf("script missing metadata reset for pre-existing dir:\n%s", capturedScript) + } +} + +func TestWriteExt4FileOwned_StagesTempfileAndBatchesOwnership(t *testing.T) { + image := userOwnedImage(t) + + var observedTemp string + var capturedScript string + r := stdinFuncRunner{ + funcRunner: funcRunner{ + run: func(_ context.Context, name string, args ...string) ([]byte, error) { + switch name { + case "e2rm": + // First non-stdin call — best-effort, we don't + // verify the target since e2rm on a missing path + // returns a visible error but the caller ignores it. + return nil, nil + case "e2cp": + if len(args) != 2 { + t.Fatalf("e2cp args = %v, want 2 (src, dst)", args) + } + observedTemp = args[0] + // Assert the dst uses the image:path form. + if args[1] != image+":/root/.ssh/authorized_keys" { + t.Fatalf("e2cp dst = %q, want image:path", args[1]) + } + // Assert the temp file was populated with our data + // BEFORE e2cp was called. 
+ data, err := os.ReadFile(args[0]) + if err != nil { + t.Fatalf("temp missing at e2cp time: %v", err) + } + if !bytes.Equal(data, []byte("managed-key\n")) { + t.Fatalf("temp contents = %q, want managed-key", data) + } + return nil, nil + } + t.Fatalf("unexpected Run(%q, %v)", name, args) + return nil, nil + }, + }, + runStdin: func(_ context.Context, stdin io.Reader, _ string, _ ...string) ([]byte, error) { + b, _ := io.ReadAll(stdin) + capturedScript = string(b) + return nil, nil + }, + } + + err := WriteExt4FileOwned( + context.Background(), r, image, + "/root/.ssh/authorized_keys", + 0o600, 0, 0, + []byte("managed-key\n"), + ) + if err != nil { + t.Fatalf("WriteExt4FileOwned: %v", err) + } + + // Temp cleanup ran — we saved observedTemp while it still existed; + // by now it should be gone. + if observedTemp == "" { + t.Fatal("e2cp source path was never captured") + } + if _, err := os.Stat(observedTemp); !os.IsNotExist(err) { + t.Fatalf("temp file not cleaned up: stat err = %v", err) + } + + // Mode line must bake in S_IFREG (0100000) + 0600 = 0100600. + if !strings.Contains(capturedScript, "set_inode_field /root/.ssh/authorized_keys mode 0100600") { + t.Fatalf("script missing regular-file mode line:\n%s", capturedScript) + } +} + +func TestEnsureExt4RootPerms_UsesRootInodeLiteral(t *testing.T) { + image := userOwnedImage(t) + + var capturedScript string + r := stdinFuncRunner{ + funcRunner: funcRunner{}, + runStdin: func(_ context.Context, stdin io.Reader, _ string, _ ...string) ([]byte, error) { + b, _ := io.ReadAll(stdin) + capturedScript = string(b) + return nil, nil + }, + } + + if err := EnsureExt4RootPerms(context.Background(), r, image, 0o755, 0, 0); err != nil { + t.Fatalf("EnsureExt4RootPerms: %v", err) + } + + // Must address inode 2 — the ext4 root directory — with the + // FULL i_mode word (S_IFDIR | 0755 = 040755). 
debugfs's + // set_inode_field doesn't preserve the type nibble, so passing + // just the permission bits (0755) would reset the root inode + // to regular-file shape and break the next kernel mount. + if !strings.Contains(capturedScript, "set_inode_field <2> mode 040755") { + t.Fatalf("script missing root-inode mode line with S_IFDIR+0755:\n%s", capturedScript) + } + if !strings.Contains(capturedScript, "set_inode_field <2> uid 0") { + t.Fatalf("script missing root-inode uid line:\n%s", capturedScript) + } +} + +func TestRejectDebugfsUnsafePath(t *testing.T) { + for _, tc := range []struct { + name string + path string + wantErr bool + }{ + {"empty", "", true}, + {"relative", "relative/path", true}, + {"absolute plain", "/ok", false}, + {"absolute with space", "/ok path", false}, + {"contains double-quote", `/a"b`, true}, + {"contains backslash", `/a\b`, true}, + {"contains newline", "/a\nb", true}, + } { + t.Run(tc.name, func(t *testing.T) { + err := rejectDebugfsUnsafePath(tc.path) + if (err != nil) != tc.wantErr { + t.Fatalf("rejectDebugfsUnsafePath(%q) err = %v, wantErr = %v", tc.path, err, tc.wantErr) + } + }) + } +} diff --git a/internal/system/extra_test.go b/internal/system/extra_test.go new file mode 100644 index 0000000..ce912e4 --- /dev/null +++ b/internal/system/extra_test.go @@ -0,0 +1,133 @@ +package system + +import ( + "context" + "encoding/json" + "os" + "path/filepath" + "runtime" + "testing" +) + +func TestWriteJSONRoundtrip(t *testing.T) { + path := filepath.Join(t.TempDir(), "out.json") + value := map[string]any{"name": "banger", "n": 42.0} + if err := WriteJSON(path, value); err != nil { + t.Fatalf("WriteJSON: %v", err) + } + data, err := os.ReadFile(path) + if err != nil { + t.Fatalf("ReadFile: %v", err) + } + var got map[string]any + if err := json.Unmarshal(data, &got); err != nil { + t.Fatalf("Unmarshal: %v", err) + } + if got["name"] != "banger" || got["n"].(float64) != 42.0 { + t.Fatalf("decoded = %v", got) + } +} + +func 
TestWriteJSONErrorsForUnmarshalable(t *testing.T) { + path := filepath.Join(t.TempDir(), "out.json") + if err := WriteJSON(path, make(chan int)); err == nil { + t.Fatal("expected marshal error for channel value") + } + if _, err := os.Stat(path); !os.IsNotExist(err) { + t.Fatalf("expected no file when marshal fails, got %v", err) + } +} + +func TestTailCommand(t *testing.T) { + cmd := TailCommand("/tmp/log.txt", false) + if cmd == nil || cmd.Path == "" { + t.Fatal("TailCommand(false) returned nil/empty") + } + // follow=false → cat, follow=true → tail -f. + if !hasArg(cmd.Args, "/tmp/log.txt") { + t.Fatalf("cat args missing path: %v", cmd.Args) + } + + followCmd := TailCommand("/tmp/log.txt", true) + if !hasArg(followCmd.Args, "-f") { + t.Fatalf("follow cmd missing -f: %v", followCmd.Args) + } + if !hasArg(followCmd.Args, "/tmp/log.txt") { + t.Fatalf("follow cmd missing path: %v", followCmd.Args) + } +} + +func hasArg(args []string, want string) bool { + for _, a := range args { + if a == want { + return true + } + } + return false +} + +func TestReportAddWarnAndHasFailures(t *testing.T) { + var r Report + r.AddPass("a") + r.AddWarn("b", "detail-1", "detail-2") + if r.HasFailures() { + t.Fatal("HasFailures should be false with only pass+warn") + } + if len(r.Checks) != 2 { + t.Fatalf("len(Checks) = %d, want 2", len(r.Checks)) + } + if r.Checks[1].Status != CheckStatusWarn { + t.Fatalf("check[1].Status = %v, want warn", r.Checks[1].Status) + } + if len(r.Checks[1].Details) != 2 { + t.Fatalf("warn details lost: %v", r.Checks[1].Details) + } + + r.AddFail("c") + if !r.HasFailures() { + t.Fatal("HasFailures should be true after AddFail") + } +} + +func TestRequireCommandsMissing(t *testing.T) { + err := RequireCommands(context.Background(), "this-command-cannot-possibly-exist-xyz-123") + if err == nil { + t.Fatal("expected error for missing command") + } +} + +func TestRequireCommandsPresent(t *testing.T) { + // `go` is guaranteed on PATH during test runs. 
+ if err := RequireCommands(context.Background(), "go"); err != nil { + t.Fatalf("RequireCommands(go): %v", err) + } +} + +func TestReadHostResources(t *testing.T) { + if runtime.GOOS != "linux" { + t.Skip("ReadHostResources reads /proc/meminfo; Linux-only") + } + res, err := ReadHostResources() + if err != nil { + t.Fatalf("ReadHostResources: %v", err) + } + if res.CPUCount <= 0 { + t.Errorf("CPUCount = %d, want > 0", res.CPUCount) + } + if res.TotalMemoryBytes <= 0 { + t.Errorf("TotalMemoryBytes = %d, want > 0", res.TotalMemoryBytes) + } +} + +func TestShortIDEdgeCases(t *testing.T) { + if got := ShortID(""); got != "" { + t.Errorf("ShortID('') = %q, want ''", got) + } + if got := ShortID("short"); got != "short" { + t.Errorf("ShortID('short') = %q, want 'short'", got) + } + long := "0123456789abcdef" + if got := ShortID(long); got != "01234567" { + t.Errorf("ShortID(long) = %q, want 01234567", got) + } +} diff --git a/internal/system/files.go b/internal/system/files.go index 3ca732e..b6ec381 100644 --- a/internal/system/files.go +++ b/internal/system/files.go @@ -16,6 +16,17 @@ const ( minWorkSeedBytes int64 = 512 * 1024 * 1024 workSeedSlackBytes int64 = 256 * 1024 * 1024 workSeedRoundBytes int64 = 64 * 1024 * 1024 + + // MkfsExtraOptions are the -E flags banger always passes to + // mkfs.ext4 for VM-internal images. root_owner stamps inode 2 + // (the fs root) as root:root so sshd's StrictModes accepts the + // resulting /root in the guest. lazy_itable_init + lazy_journal_init + // skip the inode-table and journal zeroing pass at mkfs time — + // the kernel does it lazily on first write inside the guest. On + // an 8 GiB work disk this saves roughly 500-700ms of host CPU/IO + // per 'banger vm create' for a one-time, small per-write cost + // inside the guest that nobody notices. 
+ MkfsExtraOptions = "root_owner=0:0,lazy_itable_init=1,lazy_journal_init=1" ) func CopyFilePreferClone(sourcePath, targetPath string) error { @@ -57,6 +68,72 @@ func CopyFilePreferClone(sourcePath, targetPath string) error { return nil } +// AtomicReplace replaces dst with newSrc, keeping the previous file +// (if any) at dst+suffixPrevious so the caller can roll back on a +// post-restart verification failure. The new path is renamed into +// place atomically (single os.Rename — atomic on a single fs); if +// dst sits on a different filesystem than newSrc, the operation +// returns an error rather than falling back to copy+remove because +// non-atomic copy is the wrong story for executable swap. +// +// Used by `banger update` to swap the three banger binaries: +// +// src = /var/cache/banger/updates/staged/banger +// dst = /usr/local/bin/banger +// dst+previous = /usr/local/bin/banger.previous +// +// Pre-existing dst+previous from a half-finished prior update is +// removed first; the helper assumes the operator has confirmed the +// current install is healthy before invoking it. +func AtomicReplace(newSrc, dst, suffixPrevious string) error { + if suffixPrevious == "" { + return fmt.Errorf("AtomicReplace: empty suffixPrevious would clobber dst") + } + prev := dst + suffixPrevious + if err := os.Remove(prev); err != nil && !os.IsNotExist(err) { + return fmt.Errorf("clear %s: %w", prev, err) + } + if _, err := os.Stat(dst); err == nil { + if err := os.Rename(dst, prev); err != nil { + return fmt.Errorf("backup %s -> %s: %w", dst, prev, err) + } + } else if !os.IsNotExist(err) { + return fmt.Errorf("stat %s: %w", dst, err) + } + if err := os.Rename(newSrc, dst); err != nil { + // Best-effort restore of the backup so we don't leave the + // caller without the binary they had a moment ago. 
+		// Only attempt the restore when a backup was actually made;
+		// on a fresh install there is no prev to rename back, and a
+		// failed restore of a nonexistent backup would bury the real
+		// install error under misleading noise.
+		if _, statErr := os.Stat(prev); statErr == nil {
+			if rErr := os.Rename(prev, dst); rErr != nil {
+				return fmt.Errorf("install %s: %w (and restore from %s failed: %v)", dst, err, prev, rErr)
+			}
+			return fmt.Errorf("install %s: %w (restored previous)", dst, err)
+		}
+		return fmt.Errorf("install %s: %w", dst, err)
+	}
+	return nil
+}
+
+// AtomicReplaceRollback restores the file backed up by an earlier
+// AtomicReplace call. Symmetric inverse: pulls dst+suffixPrevious
+// back to dst. If dst+suffixPrevious doesn't exist (no prior backup,
+// e.g. fresh-install update), returns nil — there's nothing to do.
+func AtomicReplaceRollback(dst, suffixPrevious string) error {
+	prev := dst + suffixPrevious
+	if _, err := os.Stat(prev); os.IsNotExist(err) {
+		return nil
+	} else if err != nil {
+		return err
+	}
+	// Remove the in-place file so the rename of the .previous backup
+	// doesn't fail. os.Rename overwrites silently on Linux, but be
+	// explicit so cross-fs / read-only-mount cases surface here.
+	if err := os.Remove(dst); err != nil && !os.IsNotExist(err) {
+		return fmt.Errorf("remove %s before rollback: %w", dst, err)
+	}
+	if err := os.Rename(prev, dst); err != nil {
+		return fmt.Errorf("rollback %s -> %s: %w", prev, dst, err)
+	}
+	return nil
+}
+
 func WorkSeedPath(rootfsPath string) string {
 	rootfsPath = strings.TrimSpace(rootfsPath)
 	if rootfsPath == "" {
@@ -68,14 +145,33 @@ func WorkSeedPath(rootfsPath string) string {
 	return rootfsPath + ".work-seed"
 }
 
+// BuildWorkSeedImage creates a sized ext4 image at outPath containing
+// the /root subtree of rootfsPath. Uses only sudoless tooling: rdump
+// to extract via debugfs, mkfs.ext4 to create the empty image (the
+// output file is user-owned, so no elevation needed), and the ext4
+// toolkit (MkdirExt4 / WriteExt4FileOwned) to ingest each entry as
+// root:root. Symlinks and special files are skipped — /root in a
+// stock distro contains regular files and dirs only.
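+//
+// Sketch of the intended call sequence (variable names illustrative):
+//
+//	seed := WorkSeedPath(rootfsPath) // rootfsPath + ".work-seed"
+//	if err := BuildWorkSeedImage(ctx, runner, rootfsPath, seed); err != nil {
+//		// caller falls back to the no-seed path
+//	}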
func BuildWorkSeedImage(ctx context.Context, runner CommandRunner, rootfsPath, outPath string) error { - rootMount, cleanupRoot, err := MountTempDir(ctx, runner, rootfsPath, true) + stage, err := os.MkdirTemp("", "banger-work-seed-stage-") if err != nil { return err } - defer cleanupRoot() + defer os.RemoveAll(stage) + + if err := RdumpExt4Dir(ctx, runner, rootfsPath, "/root", stage); err != nil { + return fmt.Errorf("extract /root from %s: %w", rootfsPath, err) + } + rootHome := filepath.Join(stage, "root") + if _, err := os.Stat(rootHome); err != nil { + // rootfs has no /root (unusual). Build an empty seed so the + // caller still gets a usable artifact — VMs cloning it will + // just see an empty fs root, same as the no-seed fallback. + if err := os.MkdirAll(rootHome, 0o755); err != nil { + return err + } + } - rootHome := filepath.Join(rootMount, "root") sizeBytes, err := estimateWorkSeedSize(ctx, runner, rootHome) if err != nil { return err @@ -93,17 +189,95 @@ func BuildWorkSeedImage(ctx context.Context, runner CommandRunner, rootfsPath, o if err := os.Truncate(outPath, sizeBytes); err != nil { return err } - if _, err := runner.Run(ctx, "mkfs.ext4", "-F", outPath); err != nil { + // root_owner stamps inode 2 (which becomes /root in the guest) + // as root:root. Per-entry owners are forced via the ext4 toolkit + // walk below. + if _, err := runner.Run(ctx, "mkfs.ext4", "-F", "-E", MkfsExtraOptions, outPath); err != nil { return err } + return ingestWorkSeedTree(ctx, runner, outPath, rootHome) +} - workMount, cleanupWork, err := MountTempDir(ctx, runner, outPath, false) +// MaterializeWorkDisk creates a fresh ext4 image at workDiskPath sized +// to sizeBytes, then ingests the contents of seedPath (an ext4 image +// produced by BuildWorkSeedImage) into it. 
+// +// Replaces a copy-then-resize flow that needed to push every byte of +// seedPath through the kernel even though the seed is mostly empty +// filesystem padding — minWorkSeedBytes is 512 MiB but the actual +// payload is a handful of dotfiles. The mkfs + walk path runs in +// roughly a second regardless of the requested work-disk size. +func MaterializeWorkDisk(ctx context.Context, runner CommandRunner, seedPath, workDiskPath string, sizeBytes int64) error { + if err := os.RemoveAll(workDiskPath); err != nil && !os.IsNotExist(err) { + return err + } + file, err := os.OpenFile(workDiskPath, os.O_CREATE|os.O_TRUNC|os.O_WRONLY, 0o644) if err != nil { return err } - defer cleanupWork() + if err := file.Close(); err != nil { + return err + } + if err := os.Truncate(workDiskPath, sizeBytes); err != nil { + return err + } + if _, err := runner.Run(ctx, "mkfs.ext4", "-F", "-E", MkfsExtraOptions, workDiskPath); err != nil { + return err + } - return CopyDirContents(ctx, runner, rootHome, workMount, true) + stage, err := os.MkdirTemp("", "banger-work-disk-stage-") + if err != nil { + return err + } + defer os.RemoveAll(stage) + + // rdump / dumps the seed's filesystem root contents directly into + // stage (no extra wrapping directory). lost+found is recreated by + // mkfs above, so the walk skips it at the top level. + if err := RdumpExt4Dir(ctx, runner, seedPath, "/", stage); err != nil { + return fmt.Errorf("extract seed %s: %w", seedPath, err) + } + return ingestWorkSeedTree(ctx, runner, workDiskPath, stage) +} + +// ingestWorkSeedTree walks the staged host tree and writes every +// directory and regular file into the work-seed ext4 as root:root, +// preserving source mode bits. Symlinks and special files are +// skipped silently — they are vanishingly rare in distro /root and +// don't survive the work-seed → work-disk clone path either. 
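+//
+// The mapping is 1:1 from the staged tree into the image: a staged
+// entry srcRoot/foo/bar lands at /foo/bar inside imagePath, written
+// as root:root with the source's permission bits.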
+// +// The top-level lost+found directory is skipped: mkfs.ext4 creates +// it on every fresh image, so re-ingesting it from the seed would +// either duplicate or fail with "exists". +func ingestWorkSeedTree(ctx context.Context, runner CommandRunner, imagePath, srcRoot string) error { + srcRoot = filepath.Clean(srcRoot) + return filepath.Walk(srcRoot, func(hostPath string, info os.FileInfo, walkErr error) error { + if walkErr != nil { + return walkErr + } + if hostPath == srcRoot { + return nil + } + rel, err := filepath.Rel(srcRoot, hostPath) + if err != nil { + return err + } + if rel == "lost+found" { + return filepath.SkipDir + } + guestPath := "/" + filepath.ToSlash(rel) + switch { + case info.IsDir(): + return MkdirExt4(ctx, runner, imagePath, guestPath, info.Mode().Perm(), 0, 0) + case info.Mode().IsRegular(): + data, err := os.ReadFile(hostPath) + if err != nil { + return err + } + return WriteExt4FileOwned(ctx, runner, imagePath, guestPath, info.Mode().Perm(), 0, 0, data) + } + return nil + }) } func estimateWorkSeedSize(ctx context.Context, runner CommandRunner, rootHome string) (int64, error) { diff --git a/internal/system/files_test.go b/internal/system/files_test.go new file mode 100644 index 0000000..6641bf5 --- /dev/null +++ b/internal/system/files_test.go @@ -0,0 +1,154 @@ +package system + +import ( + "os" + "path/filepath" + "strings" + "testing" +) + +// TestAtomicReplaceMovesPreviousAside pins the basic shape: an existing +// dst is moved to dst+suffix, and newSrc is renamed into place. +// Critical for `banger update` — without the .previous backup the +// rollback path has nothing to restore. 
+func TestAtomicReplaceMovesPreviousAside(t *testing.T) { + dir := t.TempDir() + dst := filepath.Join(dir, "banger") + if err := os.WriteFile(dst, []byte("old"), 0o755); err != nil { + t.Fatalf("write dst: %v", err) + } + src := filepath.Join(dir, "banger.new") + if err := os.WriteFile(src, []byte("new"), 0o755); err != nil { + t.Fatalf("write src: %v", err) + } + + if err := AtomicReplace(src, dst, ".previous"); err != nil { + t.Fatalf("AtomicReplace: %v", err) + } + + got, _ := os.ReadFile(dst) + if string(got) != "new" { + t.Fatalf("dst content = %q, want %q", got, "new") + } + prev, _ := os.ReadFile(dst + ".previous") + if string(prev) != "old" { + t.Fatalf("backup content = %q, want %q", prev, "old") + } + // src must be gone (it was renamed, not copied). + if _, err := os.Stat(src); !os.IsNotExist(err) { + t.Fatalf("src should have been renamed away; got %v", err) + } +} + +// TestAtomicReplaceFreshInstall covers the case where dst doesn't +// exist yet (fresh install). Should still install newSrc; no backup +// is left behind. +func TestAtomicReplaceFreshInstall(t *testing.T) { + dir := t.TempDir() + dst := filepath.Join(dir, "banger") + src := filepath.Join(dir, "banger.new") + if err := os.WriteFile(src, []byte("new"), 0o755); err != nil { + t.Fatalf("write src: %v", err) + } + + if err := AtomicReplace(src, dst, ".previous"); err != nil { + t.Fatalf("AtomicReplace: %v", err) + } + + got, _ := os.ReadFile(dst) + if string(got) != "new" { + t.Fatalf("dst content = %q, want %q", got, "new") + } + if _, err := os.Stat(dst + ".previous"); !os.IsNotExist(err) { + t.Fatalf(".previous should not exist for a fresh install") + } +} + +// TestAtomicReplaceClearsStaleBackup: a leftover .previous from a +// half-finished prior update would otherwise block the rename. +// AtomicReplace must clear it. 
+func TestAtomicReplaceClearsStaleBackup(t *testing.T) { + dir := t.TempDir() + dst := filepath.Join(dir, "banger") + if err := os.WriteFile(dst, []byte("old"), 0o755); err != nil { + t.Fatalf("write dst: %v", err) + } + if err := os.WriteFile(dst+".previous", []byte("ancient"), 0o755); err != nil { + t.Fatalf("write stale previous: %v", err) + } + src := filepath.Join(dir, "banger.new") + if err := os.WriteFile(src, []byte("new"), 0o755); err != nil { + t.Fatalf("write src: %v", err) + } + + if err := AtomicReplace(src, dst, ".previous"); err != nil { + t.Fatalf("AtomicReplace: %v", err) + } + prev, _ := os.ReadFile(dst + ".previous") + if string(prev) != "old" { + t.Fatalf("backup content = %q, want %q (stale 'ancient' should have been overwritten with the just-replaced 'old')", prev, "old") + } +} + +// TestAtomicReplaceRefusesEmptySuffix is paranoia: an empty suffix +// would silently no-op the backup AND break rollback. Refuse rather +// than letting the caller paint themselves into a corner. +func TestAtomicReplaceRefusesEmptySuffix(t *testing.T) { + dir := t.TempDir() + dst := filepath.Join(dir, "banger") + src := filepath.Join(dir, "banger.new") + _ = os.WriteFile(dst, []byte("old"), 0o755) + _ = os.WriteFile(src, []byte("new"), 0o755) + err := AtomicReplace(src, dst, "") + if err == nil { + t.Fatal("AtomicReplace with empty suffix succeeded; want error") + } + if !strings.Contains(err.Error(), "suffixPrevious") { + t.Fatalf("err = %v, want suffix-related message", err) + } +} + +// TestAtomicReplaceRollbackRestoresPrevious pins the rollback story +// after a doctor failure: AtomicReplaceRollback restores the .previous +// backup back into place. 
+func TestAtomicReplaceRollbackRestoresPrevious(t *testing.T) { + dir := t.TempDir() + dst := filepath.Join(dir, "banger") + src := filepath.Join(dir, "banger.new") + _ = os.WriteFile(dst, []byte("old"), 0o755) + _ = os.WriteFile(src, []byte("new"), 0o755) + if err := AtomicReplace(src, dst, ".previous"); err != nil { + t.Fatalf("AtomicReplace: %v", err) + } + + if err := AtomicReplaceRollback(dst, ".previous"); err != nil { + t.Fatalf("Rollback: %v", err) + } + got, _ := os.ReadFile(dst) + if string(got) != "old" { + t.Fatalf("post-rollback dst = %q, want %q", got, "old") + } + if _, err := os.Stat(dst + ".previous"); !os.IsNotExist(err) { + t.Fatalf(".previous should be gone after rollback; stat err = %v", err) + } +} + +// TestAtomicReplaceRollbackTolerantWhenNoBackup: rolling back when +// there's nothing to roll back (fresh-install case) must be a no-op, +// not an error. The updater calls Rollback unconditionally on +// failure paths and shouldn't have to track "was there a backup?" +// itself. +func TestAtomicReplaceRollbackTolerantWhenNoBackup(t *testing.T) { + dir := t.TempDir() + dst := filepath.Join(dir, "banger") + if err := os.WriteFile(dst, []byte("current"), 0o755); err != nil { + t.Fatalf("write dst: %v", err) + } + if err := AtomicReplaceRollback(dst, ".previous"); err != nil { + t.Fatalf("Rollback should be a no-op when no backup exists; got %v", err) + } + got, _ := os.ReadFile(dst) + if string(got) != "current" { + t.Fatalf("dst was disturbed despite no backup: %q", got) + } +} diff --git a/internal/system/system.go b/internal/system/system.go index 6368ed6..3c4a5ba 100644 --- a/internal/system/system.go +++ b/internal/system/system.go @@ -27,10 +27,34 @@ type CommandRunner interface { RunSudo(ctx context.Context, args ...string) ([]byte, error) } +// StdinRunner is a duck-typed extension to CommandRunner for callers +// that need to pipe stdin into a command (e.g. `debugfs -w -f -`). 
The +// real system.Runner implements it; test doubles don't need to unless +// they exercise this path. +type StdinRunner interface { + RunStdin(ctx context.Context, stdin io.Reader, name string, args ...string) ([]byte, error) +} + func NewRunner() Runner { return Runner{} } +// ExitCode extracts the process exit code from an error returned by +// Run/RunSudo. Returns -1 when the error isn't an *exec.ExitError +// (e.g. a context cancellation, the command wasn't found). Exposing +// this here keeps daemon-level callers out of os/exec — the +// shellout-policy test rejects direct imports outside system/cli/etc. +func ExitCode(err error) int { + if err == nil { + return 0 + } + var exitErr *exec.ExitError + if errors.As(err, &exitErr) { + return exitErr.ExitCode() + } + return -1 +} + func (Runner) Run(ctx context.Context, name string, args ...string) ([]byte, error) { cmd := exec.CommandContext(ctx, name, args...) var stdout bytes.Buffer @@ -47,11 +71,39 @@ func (Runner) Run(ctx context.Context, name string, args ...string) ([]byte, err } func (r Runner) RunSudo(ctx context.Context, args ...string) ([]byte, error) { + if os.Geteuid() == 0 { + if len(args) == 0 { + return nil, errors.New("command is required") + } + return r.Run(ctx, args[0], args[1:]...) + } all := append([]string{"-n"}, args...) return r.Run(ctx, "sudo", all...) } +// RunStdin executes name with args and pipes stdin in from the provided +// reader. Used for commands like debugfs -w that accept a scripted +// command stream on stdin. +func (Runner) RunStdin(ctx context.Context, stdin io.Reader, name string, args ...string) ([]byte, error) { + cmd := exec.CommandContext(ctx, name, args...) 
+ var stdout bytes.Buffer + var stderr bytes.Buffer + cmd.Stdout = &stdout + cmd.Stderr = &stderr + cmd.Stdin = stdin + if err := cmd.Run(); err != nil { + if stderr.Len() > 0 { + return stdout.Bytes(), fmt.Errorf("%w: %s", err, strings.TrimSpace(stderr.String())) + } + return stdout.Bytes(), err + } + return stdout.Bytes(), nil +} + func EnsureSudo(ctx context.Context) error { + if os.Geteuid() == 0 { + return nil + } cmd := exec.CommandContext(ctx, "sudo", "-v") cmd.Stdout = os.Stdout cmd.Stderr = os.Stderr @@ -60,6 +112,9 @@ func EnsureSudo(ctx context.Context) error { } func CheckSudo(ctx context.Context) error { + if os.Geteuid() == 0 { + return nil + } if _, err := exec.LookPath("sudo"); err != nil { return err } @@ -117,7 +172,19 @@ func ProcessRunning(pid int, apiSock string) bool { return false } cmdline := strings.ReplaceAll(string(data), "\x00", " ") - return strings.Contains(cmdline, "firecracker") && strings.Contains(cmdline, apiSock) + if !strings.Contains(cmdline, "firecracker") { + return false + } + if strings.Contains(cmdline, apiSock) { + return true + } + // Jailer mode: apiSock is a symlink; firecracker's cmdline has the + // chroot-internal path (e.g. "/firecracker.socket"), not the host path. + // Fall back to matching the symlink target's base name. + if target, err := os.Readlink(apiSock); err == nil { + return strings.Contains(cmdline, filepath.Base(target)) + } + return false } type ProcessStats struct { @@ -288,7 +355,10 @@ func lastJSONLine(data []byte) []byte { } func CopyDirContents(ctx context.Context, runner CommandRunner, sourceDir, targetDir string, useSudo bool) error { - args := []string{"-a", filepath.Join(sourceDir, "."), targetDir + "/"} + // Trailing "/." on the source tells cp -a to copy the directory's + // contents rather than the directory itself. filepath.Join would + // strip the dot, hence the manual concat. 
+ args := []string{"-a", strings.TrimRight(sourceDir, "/") + "/.", targetDir + "/"} var err error if useSudo { _, err = runner.RunSudo(ctx, append([]string{"cp"}, args...)...) diff --git a/internal/toolingplan/go.go b/internal/toolingplan/go.go new file mode 100644 index 0000000..9b65b72 --- /dev/null +++ b/internal/toolingplan/go.go @@ -0,0 +1,26 @@ +package toolingplan + +import ( + "context" + "fmt" +) + +type goDetector struct{} + +func (goDetector) detect(_ context.Context, repoRoot string, managedTools map[string]struct{}) detectionResult { + if alreadyManaged("go", managedTools) { + return detectionResult{Skips: []SkipNote{{Target: "go", Reason: "already managed by repo mise declarations"}}} + } + goMod, ok, err := readRepoFile(repoRoot, "go.mod") + if err != nil { + return detectionResult{Skips: []SkipNote{{Target: "go", Reason: fmt.Sprintf("could not read go.mod: %v", err)}}} + } + if !ok { + return detectionResult{Skips: []SkipNote{{Target: "go", Reason: "no go.mod"}}} + } + version, ok := parseGoDirective(goMod) + if !ok { + return detectionResult{Skips: []SkipNote{{Target: "go", Reason: "go.mod has no exact go directive"}}} + } + return detectionResult{Steps: []InstallStep{{Tool: "go", Version: version, Source: "go.mod", Reason: "go directive"}}} +} diff --git a/internal/toolingplan/mise.go b/internal/toolingplan/mise.go new file mode 100644 index 0000000..50803ca --- /dev/null +++ b/internal/toolingplan/mise.go @@ -0,0 +1,88 @@ +package toolingplan + +import ( + "bufio" + "fmt" + "os" + "path/filepath" + "sort" + "strings" + + toml "github.com/pelletier/go-toml" +) + +func repoManagedTools(repoRoot string) (map[string]struct{}, []SkipNote) { + tools := make(map[string]struct{}) + skips := make([]SkipNote, 0) + if err := collectToolVersions(filepath.Join(repoRoot, ".tool-versions"), tools); err != nil { + skips = append(skips, SkipNote{ + Target: "repo mise declarations", + Reason: fmt.Sprintf("could not read .tool-versions: %v", err), + }) + } + if err 
:= collectMiseToml(filepath.Join(repoRoot, ".mise.toml"), tools); err != nil { + skips = append(skips, SkipNote{ + Target: "repo mise declarations", + Reason: fmt.Sprintf("could not parse .mise.toml: %v", err), + }) + } + return tools, skips +} + +func collectToolVersions(path string, tools map[string]struct{}) error { + file, err := os.Open(path) + if err != nil { + if os.IsNotExist(err) { + return nil + } + return err + } + defer file.Close() + scanner := bufio.NewScanner(file) + for scanner.Scan() { + line := strings.TrimSpace(scanner.Text()) + if line == "" || strings.HasPrefix(line, "#") { + continue + } + fields := strings.Fields(line) + if len(fields) == 0 { + continue + } + tools[fields[0]] = struct{}{} + } + return scanner.Err() +} + +func collectMiseToml(path string, tools map[string]struct{}) error { + data, err := os.ReadFile(path) + if err != nil { + if os.IsNotExist(err) { + return nil + } + return err + } + tree, err := toml.LoadBytes(data) + if err != nil { + return err + } + value := tree.Get("tools") + if value == nil { + return nil + } + switch typed := value.(type) { + case *toml.Tree: + for _, key := range typed.Keys() { + tools[key] = struct{}{} + } + case map[string]interface{}: + keys := make([]string, 0, len(typed)) + for key := range typed { + keys = append(keys, key) + } + sort.Strings(keys) + for _, key := range keys { + tools[key] = struct{}{} + } + } + return nil +} diff --git a/internal/toolingplan/node.go b/internal/toolingplan/node.go new file mode 100644 index 0000000..41c9c54 --- /dev/null +++ b/internal/toolingplan/node.go @@ -0,0 +1,109 @@ +package toolingplan + +import ( + "context" + "fmt" + "strings" +) + +type nodeDetector struct{} + +func (nodeDetector) detect(_ context.Context, repoRoot string, managedTools map[string]struct{}) detectionResult { + result := detectionResult{} + + nodeVersion, nodeSource, nodeManaged, nodeSkip := detectNodeVersion(repoRoot, managedTools) + if nodeManaged { + result.Skips = 
append(result.Skips, SkipNote{Target: "node", Reason: "already managed by repo mise declarations"}) + } else if nodeVersion != "" { + result.Steps = append(result.Steps, InstallStep{Tool: "node", Version: nodeVersion, Source: nodeSource, Reason: "exact runtime version"}) + } else { + result.Skips = append(result.Skips, SkipNote{Target: "node", Reason: nodeSkip}) + } + + packageManagerVersion, packageManagerTool, packageManagerSource, packageManagerSkip := detectNodePackageManager(repoRoot) + if packageManagerTool == "" { + if packageManagerSkip != "" { + result.Skips = append(result.Skips, SkipNote{Target: "node package manager", Reason: packageManagerSkip}) + } + return result + } + if alreadyManaged(packageManagerTool, managedTools) { + result.Skips = append(result.Skips, SkipNote{Target: "node package manager", Reason: packageManagerTool + " is already managed by repo mise declarations"}) + return result + } + if nodeVersion == "" && !alreadyManaged("node", managedTools) { + result.Skips = append(result.Skips, SkipNote{Target: "node package manager", Reason: "packageManager is pinned but node is not pinned"}) + return result + } + result.Steps = append(result.Steps, InstallStep{Tool: packageManagerTool, Version: packageManagerVersion, Source: packageManagerSource, Reason: "exact packageManager version"}) + return result +} + +func detectNodeVersion(repoRoot string, managedTools map[string]struct{}) (version string, source string, managed bool, skip string) { + if alreadyManaged("node", managedTools) { + return "", "", true, "" + } + for _, candidate := range []string{".node-version", ".nvmrc"} { + value, ok, err := readRepoFile(repoRoot, candidate) + if err != nil { + return "", "", false, fmt.Sprintf("could not read %s: %v", candidate, err) + } + if !ok { + continue + } + version, ok := normalizeExactVersion(strings.TrimSpace(value)) + if ok { + return version, candidate, false, "" + } + return "", "", false, candidate + " does not pin an exact version" + } + 
packageJSON, ok, err := readRepoFile(repoRoot, "package.json") + if err != nil { + return "", "", false, fmt.Sprintf("could not read package.json: %v", err) + } + if !ok { + return "", "", false, "no pinned node version file" + } + meta, err := parsePackageJSON(packageJSON) + if err != nil { + return "", "", false, fmt.Sprintf("could not parse package.json: %v", err) + } + if version, ok := normalizeExactVersion(meta.Volta.Node); ok { + return version, "package.json#volta.node", false, "" + } + if strings.TrimSpace(meta.Volta.Node) != "" { + return "", "", false, "package.json#volta.node is not an exact version" + } + return "", "", false, "no pinned node version file" +} + +func detectNodePackageManager(repoRoot string) (version string, tool string, source string, skip string) { + packageJSON, ok, err := readRepoFile(repoRoot, "package.json") + if err != nil { + return "", "", "", fmt.Sprintf("could not read package.json: %v", err) + } + if !ok { + return "", "", "", "" + } + meta, err := parsePackageJSON(packageJSON) + if err != nil { + return "", "", "", fmt.Sprintf("could not parse package.json: %v", err) + } + value := strings.TrimSpace(meta.PackageManager) + if value == "" { + return "", "", "", "" + } + parts := strings.SplitN(value, "@", 2) + if len(parts) != 2 { + return "", "", "", "packageManager is not in tool@version form" + } + tool = strings.TrimSpace(parts[0]) + if tool != "pnpm" && tool != "yarn" && tool != "npm" && tool != "bun" { + return "", "", "", "packageManager is not a supported exact installer target" + } + version, ok = normalizeExactVersion(parts[1]) + if !ok { + return "", "", "", "packageManager version is not exact" + } + return version, tool, "package.json#packageManager", "" +} diff --git a/internal/toolingplan/plan.go b/internal/toolingplan/plan.go new file mode 100644 index 0000000..07513c8 --- /dev/null +++ b/internal/toolingplan/plan.go @@ -0,0 +1,94 @@ +package toolingplan + +import ( + "context" + "os" + "path/filepath" + 
"sort" +) + +type InstallStep struct { + Tool string + Version string + Source string + Reason string +} + +type SkipNote struct { + Target string + Reason string +} + +type Plan struct { + RepoManagedTools []string + Steps []InstallStep + Skips []SkipNote +} + +type detector interface { + detect(context.Context, string, map[string]struct{}) detectionResult +} + +type detectionResult struct { + Steps []InstallStep + Skips []SkipNote +} + +var detectors = []detector{ + goDetector{}, + nodeDetector{}, + pythonDetector{}, + rustDetector{}, +} + +func Build(ctx context.Context, repoRoot string) Plan { + managedTools, managedSkips := repoManagedTools(repoRoot) + steps := make([]InstallStep, 0) + skips := append([]SkipNote(nil), managedSkips...) + for _, detector := range detectors { + result := detector.detect(ctx, repoRoot, managedTools) + steps = append(steps, result.Steps...) + skips = append(skips, result.Skips...) + } + sort.Slice(steps, func(i, j int) bool { + if steps[i].Tool != steps[j].Tool { + return steps[i].Tool < steps[j].Tool + } + if steps[i].Version != steps[j].Version { + return steps[i].Version < steps[j].Version + } + return steps[i].Source < steps[j].Source + }) + sort.Slice(skips, func(i, j int) bool { + if skips[i].Target != skips[j].Target { + return skips[i].Target < skips[j].Target + } + return skips[i].Reason < skips[j].Reason + }) + repoManagedList := make([]string, 0, len(managedTools)) + for tool := range managedTools { + repoManagedList = append(repoManagedList, tool) + } + sort.Strings(repoManagedList) + return Plan{ + RepoManagedTools: repoManagedList, + Steps: steps, + Skips: skips, + } +} + +func readRepoFile(repoRoot, relativePath string) (string, bool, error) { + data, err := os.ReadFile(filepath.Join(repoRoot, relativePath)) + if err != nil { + if os.IsNotExist(err) { + return "", false, nil + } + return "", false, err + } + return string(data), true, nil +} + +func alreadyManaged(tool string, managedTools map[string]struct{}) bool { 
+ _, ok := managedTools[tool] + return ok +} diff --git a/internal/toolingplan/plan_test.go b/internal/toolingplan/plan_test.go new file mode 100644 index 0000000..ee4b7a7 --- /dev/null +++ b/internal/toolingplan/plan_test.go @@ -0,0 +1,137 @@ +package toolingplan + +import ( + "context" + "os" + "path/filepath" + "strings" + "testing" +) + +func TestBuildDetectsGoVersionFromGoMod(t *testing.T) { + repoRoot := t.TempDir() + writePlanFile(t, repoRoot, "go.mod", "module example.com/demo\n\ngo 1.25.0\n") + + plan := Build(context.Background(), repoRoot) + + if len(plan.Steps) != 1 { + t.Fatalf("steps = %#v, want one step", plan.Steps) + } + step := plan.Steps[0] + if step.Tool != "go" || step.Version != "1.25.0" || step.Source != "go.mod" { + t.Fatalf("step = %#v, want go@1.25.0 from go.mod", step) + } +} + +func TestBuildSkipsGoWhenRepoMiseAlreadyDeclaresIt(t *testing.T) { + repoRoot := t.TempDir() + writePlanFile(t, repoRoot, ".mise.toml", "[tools]\ngo = '1.25.0'\n") + writePlanFile(t, repoRoot, "go.mod", "module example.com/demo\n\ngo 1.25.0\n") + + plan := Build(context.Background(), repoRoot) + + if len(plan.Steps) != 0 { + t.Fatalf("steps = %#v, want no deterministic go install", plan.Steps) + } + if !containsSkip(plan.Skips, "go", "already managed by repo mise declarations") { + t.Fatalf("skips = %#v, want managed go skip", plan.Skips) + } + if len(plan.RepoManagedTools) != 1 || plan.RepoManagedTools[0] != "go" { + t.Fatalf("repo managed tools = %#v, want [go]", plan.RepoManagedTools) + } +} + +func TestBuildDetectsNodeAndPackageManager(t *testing.T) { + repoRoot := t.TempDir() + writePlanFile(t, repoRoot, ".node-version", "v22.14.0\n") + writePlanFile(t, repoRoot, "package.json", `{"packageManager":"pnpm@9.15.2"}`) + + plan := Build(context.Background(), repoRoot) + + if !containsStep(plan.Steps, "node", "22.14.0", ".node-version") { + t.Fatalf("steps = %#v, want node step", plan.Steps) + } + if !containsStep(plan.Steps, "pnpm", "9.15.2", 
"package.json#packageManager") { + t.Fatalf("steps = %#v, want pnpm step", plan.Steps) + } +} + +func TestBuildSkipsPackageManagerWhenNodeIsNotPinned(t *testing.T) { + repoRoot := t.TempDir() + writePlanFile(t, repoRoot, "package.json", `{"packageManager":"pnpm@9.15.2"}`) + + plan := Build(context.Background(), repoRoot) + + if containsStep(plan.Steps, "pnpm", "9.15.2", "package.json#packageManager") { + t.Fatalf("steps = %#v, want no package manager install", plan.Steps) + } + if !containsSkip(plan.Skips, "node package manager", "packageManager is pinned but node is not pinned") { + t.Fatalf("skips = %#v, want node package manager skip", plan.Skips) + } +} + +func TestBuildDetectsPythonAndRust(t *testing.T) { + repoRoot := t.TempDir() + writePlanFile(t, repoRoot, ".python-version", "3.12.9\n") + writePlanFile(t, repoRoot, "rust-toolchain.toml", "[toolchain]\nchannel = '1.86.0'\n") + + plan := Build(context.Background(), repoRoot) + + if !containsStep(plan.Steps, "python", "3.12.9", ".python-version") { + t.Fatalf("steps = %#v, want python step", plan.Steps) + } + if !containsStep(plan.Steps, "rust", "1.86.0", "rust-toolchain.toml") { + t.Fatalf("steps = %#v, want rust step", plan.Steps) + } +} + +func TestBuildSkipsRustChannelNames(t *testing.T) { + repoRoot := t.TempDir() + writePlanFile(t, repoRoot, "rust-toolchain.toml", "[toolchain]\nchannel = 'stable'\n") + + plan := Build(context.Background(), repoRoot) + + if !containsSkip(plan.Skips, "rust", "rust-toolchain.toml channel is not an exact version") { + t.Fatalf("skips = %#v, want rust exact-version skip", plan.Skips) + } +} + +func TestBuildReportsMalformedMiseTomlAsSkip(t *testing.T) { + repoRoot := t.TempDir() + writePlanFile(t, repoRoot, ".mise.toml", "[tools\nbroken") + + plan := Build(context.Background(), repoRoot) + + if !containsSkip(plan.Skips, "repo mise declarations", "could not parse .mise.toml") { + t.Fatalf("skips = %#v, want malformed .mise.toml skip", plan.Skips) + } +} + +func writePlanFile(t 
*testing.T, repoRoot, relativePath, contents string) { + t.Helper() + path := filepath.Join(repoRoot, relativePath) + if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil { + t.Fatalf("MkdirAll(%s): %v", filepath.Dir(path), err) + } + if err := os.WriteFile(path, []byte(contents), 0o644); err != nil { + t.Fatalf("WriteFile(%s): %v", relativePath, err) + } +} + +func containsStep(steps []InstallStep, tool, version, source string) bool { + for _, step := range steps { + if step.Tool == tool && step.Version == version && step.Source == source { + return true + } + } + return false +} + +func containsSkip(skips []SkipNote, target, reasonContains string) bool { + for _, skip := range skips { + if skip.Target == target && strings.Contains(skip.Reason, reasonContains) { + return true + } + } + return false +} diff --git a/internal/toolingplan/python.go b/internal/toolingplan/python.go new file mode 100644 index 0000000..6df9ef3 --- /dev/null +++ b/internal/toolingplan/python.go @@ -0,0 +1,27 @@ +package toolingplan + +import ( + "context" + "fmt" + "strings" +) + +type pythonDetector struct{} + +func (pythonDetector) detect(_ context.Context, repoRoot string, managedTools map[string]struct{}) detectionResult { + if alreadyManaged("python", managedTools) { + return detectionResult{Skips: []SkipNote{{Target: "python", Reason: "already managed by repo mise declarations"}}} + } + value, ok, err := readRepoFile(repoRoot, ".python-version") + if err != nil { + return detectionResult{Skips: []SkipNote{{Target: "python", Reason: fmt.Sprintf("could not read .python-version: %v", err)}}} + } + if !ok { + return detectionResult{Skips: []SkipNote{{Target: "python", Reason: "no .python-version"}}} + } + version, ok := normalizeExactVersion(strings.TrimSpace(value)) + if !ok { + return detectionResult{Skips: []SkipNote{{Target: "python", Reason: ".python-version does not pin an exact version"}}} + } + return detectionResult{Steps: []InstallStep{{Tool: "python", Version: 
version, Source: ".python-version", Reason: "exact runtime version"}}} +} diff --git a/internal/toolingplan/rust.go b/internal/toolingplan/rust.go new file mode 100644 index 0000000..ddd8090 --- /dev/null +++ b/internal/toolingplan/rust.go @@ -0,0 +1,70 @@ +package toolingplan + +import ( + "context" + "fmt" + "strings" + + toml "github.com/pelletier/go-toml" +) + +type rustDetector struct{} + +func (rustDetector) detect(_ context.Context, repoRoot string, managedTools map[string]struct{}) detectionResult { + if alreadyManaged("rust", managedTools) { + return detectionResult{Skips: []SkipNote{{Target: "rust", Reason: "already managed by repo mise declarations"}}} + } + if version, ok, reason := parseRustToolchainToml(repoRoot); ok { + return detectionResult{Steps: []InstallStep{{Tool: "rust", Version: version, Source: "rust-toolchain.toml", Reason: "exact toolchain channel"}}} + } else if reason != "" { + return detectionResult{Skips: []SkipNote{{Target: "rust", Reason: reason}}} + } + value, ok, err := readRepoFile(repoRoot, "rust-toolchain") + if err != nil { + return detectionResult{Skips: []SkipNote{{Target: "rust", Reason: fmt.Sprintf("could not read rust-toolchain: %v", err)}}} + } + if !ok { + return detectionResult{Skips: []SkipNote{{Target: "rust", Reason: "no rust-toolchain or rust-toolchain.toml"}}} + } + version := firstMeaningfulLine(value) + if normalized, ok := normalizeExactVersion(version); ok { + return detectionResult{Steps: []InstallStep{{Tool: "rust", Version: normalized, Source: "rust-toolchain", Reason: "exact toolchain channel"}}} + } + return detectionResult{Skips: []SkipNote{{Target: "rust", Reason: "rust-toolchain does not pin an exact version"}}} +} + +func parseRustToolchainToml(repoRoot string) (version string, ok bool, reason string) { + data, found, err := readRepoFile(repoRoot, "rust-toolchain.toml") + if err != nil { + return "", false, fmt.Sprintf("could not read rust-toolchain.toml: %v", err) + } + if !found { + return "", false, 
"" + } + tree, err := toml.Load(data) + if err != nil { + return "", false, fmt.Sprintf("could not parse rust-toolchain.toml: %v", err) + } + channelValue := tree.GetDefault("toolchain.channel", "") + channel, _ := channelValue.(string) + channel = strings.TrimSpace(channel) + if channel == "" { + return "", false, "rust-toolchain.toml has no toolchain.channel" + } + version, ok = normalizeExactVersion(channel) + if !ok { + return "", false, "rust-toolchain.toml channel is not an exact version" + } + return version, true, "" +} + +func firstMeaningfulLine(value string) string { + for _, line := range strings.Split(value, "\n") { + trimmed := strings.TrimSpace(line) + if trimmed == "" || strings.HasPrefix(trimmed, "#") { + continue + } + return trimmed + } + return "" +} diff --git a/internal/toolingplan/rust_test.go b/internal/toolingplan/rust_test.go new file mode 100644 index 0000000..e474042 --- /dev/null +++ b/internal/toolingplan/rust_test.go @@ -0,0 +1,23 @@ +package toolingplan + +import "testing" + +func TestFirstMeaningfulLine(t *testing.T) { + cases := []struct { + in, want string + }{ + {"", ""}, + {"\n\n\n", ""}, + {" \n \n", ""}, + {"# just a comment\n# another\n", ""}, + {"1.75.0\n", "1.75.0"}, + {" 1.75.0 ", "1.75.0"}, + {"# pinned toolchain\n1.75.0\nmore junk\n", "1.75.0"}, + {"\n\n stable-x86_64-unknown-linux-gnu \n", "stable-x86_64-unknown-linux-gnu"}, + } + for _, tc := range cases { + if got := firstMeaningfulLine(tc.in); got != tc.want { + t.Errorf("firstMeaningfulLine(%q) = %q, want %q", tc.in, got, tc.want) + } + } +} diff --git a/internal/toolingplan/version.go b/internal/toolingplan/version.go new file mode 100644 index 0000000..c8c225b --- /dev/null +++ b/internal/toolingplan/version.go @@ -0,0 +1,41 @@ +package toolingplan + +import ( + "encoding/json" + "regexp" + "strings" +) + +var ( + exactVersionPattern = regexp.MustCompile(`^v?\d+(?:\.\d+){0,2}(?:[-+][0-9A-Za-z.-]+)?$`) + goDirectivePattern = 
regexp.MustCompile(`(?m)^go\s+([0-9]+(?:\.[0-9]+){1,2})\s*$`) +) + +func normalizeExactVersion(value string) (string, bool) { + trimmed := strings.TrimSpace(value) + if !exactVersionPattern.MatchString(trimmed) { + return "", false + } + return strings.TrimPrefix(trimmed, "v"), true +} + +func parseGoDirective(goMod string) (string, bool) { + matches := goDirectivePattern.FindStringSubmatch(goMod) + if len(matches) != 2 { + return "", false + } + return matches[1], true +} + +type packageJSONMetadata struct { + PackageManager string `json:"packageManager"` + Volta struct { + Node string `json:"node"` + } `json:"volta"` +} + +func parsePackageJSON(data string) (packageJSONMetadata, error) { + var meta packageJSONMetadata + err := json.Unmarshal([]byte(data), &meta) + return meta, err +} diff --git a/internal/updater/download.go b/internal/updater/download.go new file mode 100644 index 0000000..11c8d81 --- /dev/null +++ b/internal/updater/download.go @@ -0,0 +1,117 @@ +package updater + +import ( + "context" + "fmt" + "io" + "net/http" + "os" + "path" + "path/filepath" + + "banger/internal/download" +) + +// DownloadRelease fetches the SHA256SUMS file for `release`, looks up +// the tarball's basename in it, then fetches the tarball with on-the- +// fly hash verification. The tarball lands at dstPath; the function +// errors on any verification failure and removes the partial file +// before returning. +// +// SHA256SUMS bytes are returned alongside so the caller can +// cosign-verify them against an embedded public key before trusting +// the hashes inside. Without that step this function is only as +// secure as TLS; see verify_signature.go for the cosign tie-in. 
+func DownloadRelease(ctx context.Context, client *http.Client, release Release, dstPath string) (sumsBody []byte, err error) { + if client == nil { + client = http.DefaultClient + } + + sumsBody, err = fetchBounded(ctx, client, release.SHA256SumsURL, MaxSHA256SumsBytes) + if err != nil { + return nil, fmt.Errorf("fetch SHA256SUMS: %w", err) + } + sums, err := ParseSHA256Sums(sumsBody) + if err != nil { + return nil, fmt.Errorf("parse SHA256SUMS: %w", err) + } + + tarballName := path.Base(release.TarballURL) + expected, ok := sums[tarballName] + if !ok { + return nil, fmt.Errorf("SHA256SUMS does not list %q", tarballName) + } + if _, err := download.FetchVerified(ctx, client, release.TarballURL, expected, MaxTarballBytes, dstPath); err != nil { + return nil, fmt.Errorf("fetch tarball: %w", err) + } + return sumsBody, nil +} + +// fetchBounded does a small bounded GET — used for the manifest, the +// SHA256SUMS file, and (later) the cosign signature. Anything bigger +// goes through download.FetchVerified, which adds the on-the-fly +// hash check. 
+func fetchBounded(ctx context.Context, client *http.Client, url string, maxBytes int64) ([]byte, error) { + req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil) + if err != nil { + return nil, err + } + resp, err := client.Do(req) + if err != nil { + return nil, fmt.Errorf("fetch %s: %w", url, err) + } + defer resp.Body.Close() + if resp.StatusCode < 200 || resp.StatusCode >= 300 { + return nil, fmt.Errorf("fetch %s: HTTP %s", url, resp.Status) + } + if resp.ContentLength > maxBytes { + return nil, fmt.Errorf("fetch %s: %d bytes exceeds %d-byte cap", url, resp.ContentLength, maxBytes) + } + body, err := io.ReadAll(io.LimitReader(resp.Body, maxBytes+1)) + if err != nil { + return nil, fmt.Errorf("read %s: %w", url, err) + } + if int64(len(body)) > maxBytes { + return nil, fmt.Errorf("%s exceeded %d-byte cap mid-stream", url, maxBytes) + } + return body, nil +} + +// EnsureStagingDir creates the staging directory with restrictive +// permissions (0700, owned by the caller — typically root in system +// mode). Any pre-existing contents are NOT cleared; that's +// PrepareCleanStaging's job. +func EnsureStagingDir(stagingDir string) error { + return os.MkdirAll(stagingDir, 0o700) +} + +// PrepareCleanStaging wipes anything left in the staging dir from a +// prior aborted update, then re-creates the directory. Distinct from +// EnsureStagingDir because we don't want to nuke the dir unless +// we're ABOUT to use it — having a leftover staged tree from a +// prior failed run is sometimes useful for diagnostics. +func PrepareCleanStaging(stagingDir string) error { + if err := os.RemoveAll(stagingDir); err != nil { + return fmt.Errorf("clear staging %s: %w", stagingDir, err) + } + return EnsureStagingDir(stagingDir) +} + +// DefaultStagingDir is where the updater stages downloads + +// extracted binaries when no explicit dir is configured. 
Sits under +// banger's system CacheDir (typically /var/cache/banger/updates) so: +// - the systemd unit's CacheDirectory=banger keeps the path +// writable for the helper. +// - `banger system uninstall --purge` cleans it. +// - it sits beside the OCI and kernel caches without colliding. +// +// Atomicity caveat: we expect /var/cache and /usr/local to share a +// filesystem (default on essentially every Linux install). On a host +// with /usr split onto a separate volume, the swap step's os.Rename +// would fall through to a copy + delete and lose its atomicity +// guarantee. We document this rather than detect-and-error for +// v0.1.0; the worst-case symptom is a brief window where a binary is +// half-written, which `banger doctor` would catch in step 7. +func DefaultStagingDir(cacheDir string) string { + return filepath.Join(cacheDir, "updates") +} diff --git a/internal/updater/flow_test.go b/internal/updater/flow_test.go new file mode 100644 index 0000000..5da29df --- /dev/null +++ b/internal/updater/flow_test.go @@ -0,0 +1,363 @@ +package updater + +import ( + "archive/tar" + "bytes" + "compress/gzip" + "context" + "crypto/sha256" + "encoding/hex" + "fmt" + "net/http" + "net/http/httptest" + "os" + "path/filepath" + "strings" + "testing" +) + +// makeReleaseTarball writes a tarball whose root contains the three +// expected entries with the given bodies. Used by stage + download +// tests so they don't need a real banger build to exercise the +// extraction path. 
+func makeReleaseTarball(t *testing.T, bodies map[string][]byte) []byte { + t.Helper() + var buf bytes.Buffer + gz := gzip.NewWriter(&buf) + tw := tar.NewWriter(gz) + for name, body := range bodies { + hdr := &tar.Header{ + Name: name, + Mode: 0o755, + Size: int64(len(body)), + Typeflag: tar.TypeReg, + } + if err := tw.WriteHeader(hdr); err != nil { + t.Fatalf("write header: %v", err) + } + if _, err := tw.Write(body); err != nil { + t.Fatalf("write body: %v", err) + } + } + if err := tw.Close(); err != nil { + t.Fatalf("close tar: %v", err) + } + if err := gz.Close(); err != nil { + t.Fatalf("close gzip: %v", err) + } + return buf.Bytes() +} + +func sha256Hex(b []byte) string { + sum := sha256.Sum256(b) + return hex.EncodeToString(sum[:]) +} + +func TestStageTarballHappyPath(t *testing.T) { + body := makeReleaseTarball(t, map[string][]byte{ + "banger": []byte("BANGER"), + "bangerd": []byte("BANGERD"), + "banger-vsock-agent": []byte("AGENT"), + }) + tarball := filepath.Join(t.TempDir(), "release.tar.gz") + if err := os.WriteFile(tarball, body, 0o644); err != nil { + t.Fatalf("write tarball: %v", err) + } + staging := filepath.Join(t.TempDir(), "staged") + + got, err := StageTarball(tarball, staging) + if err != nil { + t.Fatalf("StageTarball: %v", err) + } + for _, p := range []string{got.BangerPath, got.BangerdPath, got.VsockAgentPath} { + info, err := os.Stat(p) + if err != nil { + t.Fatalf("stat %s: %v", p, err) + } + if info.Mode().Perm() != 0o755 { + t.Errorf("%s mode = %o, want 0755", p, info.Mode().Perm()) + } + } + bs, _ := os.ReadFile(got.BangerPath) + if string(bs) != "BANGER" { + t.Fatalf("banger content = %q", bs) + } +} + +func TestStageTarballRejectsExtraEntry(t *testing.T) { + body := makeReleaseTarball(t, map[string][]byte{ + "banger": []byte("a"), + "bangerd": []byte("b"), + "banger-vsock-agent": []byte("c"), + "bonus.txt": []byte("not allowed"), + }) + tarball := filepath.Join(t.TempDir(), "rel.tar.gz") + _ = os.WriteFile(tarball, body, 0o644) + 
_, err := StageTarball(tarball, t.TempDir()) + if err == nil || !strings.Contains(err.Error(), "unexpected entry") { + t.Fatalf("err = %v, want unexpected-entry rejection", err) + } +} + +func TestStageTarballRejectsMissingEntry(t *testing.T) { + body := makeReleaseTarball(t, map[string][]byte{ + "banger": []byte("a"), + "bangerd": []byte("b"), + // banger-vsock-agent intentionally missing + }) + tarball := filepath.Join(t.TempDir(), "rel.tar.gz") + _ = os.WriteFile(tarball, body, 0o644) + _, err := StageTarball(tarball, t.TempDir()) + if err == nil || !strings.Contains(err.Error(), "missing required entry") { + t.Fatalf("err = %v, want missing-required rejection", err) + } +} + +func TestStageTarballRejectsPathTraversal(t *testing.T) { + // Build the tarball manually so we can inject a `../` entry — + // makeReleaseTarball's expected-entry filter would otherwise + // catch it earlier. + var buf bytes.Buffer + gz := gzip.NewWriter(&buf) + tw := tar.NewWriter(gz) + for _, e := range []struct{ name, body string }{ + {"banger", "a"}, + {"bangerd", "b"}, + {"../escape", "x"}, + } { + _ = tw.WriteHeader(&tar.Header{Name: e.name, Size: int64(len(e.body)), Mode: 0o755, Typeflag: tar.TypeReg}) + _, _ = tw.Write([]byte(e.body)) + } + _ = tw.Close() + _ = gz.Close() + tarball := filepath.Join(t.TempDir(), "rel.tar.gz") + _ = os.WriteFile(tarball, buf.Bytes(), 0o644) + _, err := StageTarball(tarball, t.TempDir()) + if err == nil || !strings.Contains(err.Error(), "unsafe path") { + t.Fatalf("err = %v, want unsafe-path rejection", err) + } +} + +func TestSwapAndRollback(t *testing.T) { + root := t.TempDir() + binDir := filepath.Join(root, "bin") + libDir := filepath.Join(root, "lib", "banger") + if err := os.MkdirAll(binDir, 0o755); err != nil { + t.Fatal(err) + } + if err := os.MkdirAll(libDir, 0o755); err != nil { + t.Fatal(err) + } + for _, p := range []string{ + filepath.Join(binDir, "banger"), + filepath.Join(binDir, "bangerd"), + filepath.Join(libDir, 
"banger-vsock-agent"), + } { + if err := os.WriteFile(p, []byte("OLD-"+filepath.Base(p)), 0o755); err != nil { + t.Fatal(err) + } + } + + staging := filepath.Join(root, "staging") + _ = os.MkdirAll(staging, 0o700) + staged := StagedRelease{ + BangerPath: filepath.Join(staging, "banger"), + BangerdPath: filepath.Join(staging, "bangerd"), + VsockAgentPath: filepath.Join(staging, "banger-vsock-agent"), + } + for _, pair := range []struct{ p, body string }{ + {staged.BangerPath, "NEW-banger"}, + {staged.BangerdPath, "NEW-bangerd"}, + {staged.VsockAgentPath, "NEW-banger-vsock-agent"}, + } { + if err := os.WriteFile(pair.p, []byte(pair.body), 0o755); err != nil { + t.Fatal(err) + } + } + + targets := InstallTargets{ + Banger: filepath.Join(binDir, "banger"), + Bangerd: filepath.Join(binDir, "bangerd"), + VsockAgent: filepath.Join(libDir, "banger-vsock-agent"), + } + + res, err := Swap(staged, targets) + if err != nil { + t.Fatalf("Swap: %v", err) + } + if len(res.SwappedTargets) != 3 { + t.Fatalf("SwappedTargets len = %d, want 3", len(res.SwappedTargets)) + } + for _, p := range []string{targets.Banger, targets.Bangerd, targets.VsockAgent} { + got, _ := os.ReadFile(p) + want := "NEW-" + filepath.Base(p) + if string(got) != want { + t.Fatalf("%s content = %q, want %q", p, got, want) + } + prev, err := os.ReadFile(p + previousSuffix) + if err != nil { + t.Fatalf("missing backup at %s.previous: %v", p, err) + } + if string(prev) != "OLD-"+filepath.Base(p) { + t.Fatalf(".previous content = %q", prev) + } + } + + if err := Rollback(res); err != nil { + t.Fatalf("Rollback: %v", err) + } + for _, p := range []string{targets.Banger, targets.Bangerd, targets.VsockAgent} { + got, _ := os.ReadFile(p) + want := "OLD-" + filepath.Base(p) + if string(got) != want { + t.Fatalf("post-rollback %s = %q, want %q", p, got, want) + } + if _, err := os.Stat(p + previousSuffix); !os.IsNotExist(err) { + t.Fatalf(".previous should be cleaned after rollback; stat err = %v", err) + } + } +} + 
+func TestSwapPartialFailureRollsBackCleanly(t *testing.T) { + root := t.TempDir() + binDir := filepath.Join(root, "bin") + libDir := filepath.Join(root, "lib", "banger") + if err := os.MkdirAll(binDir, 0o755); err != nil { + t.Fatal(err) + } + if err := os.MkdirAll(libDir, 0o755); err != nil { + t.Fatal(err) + } + // Pre-create the two binaries that will swap successfully. + for _, p := range []string{ + filepath.Join(binDir, "bangerd"), + filepath.Join(libDir, "banger-vsock-agent"), + } { + _ = os.WriteFile(p, []byte("OLD-"+filepath.Base(p)), 0o755) + } + + staging := filepath.Join(root, "staging") + _ = os.MkdirAll(staging, 0o700) + staged := StagedRelease{ + BangerPath: filepath.Join(staging, "banger"), + BangerdPath: filepath.Join(staging, "bangerd"), + VsockAgentPath: filepath.Join(staging, "banger-vsock-agent"), + } + for _, pair := range []struct{ p, body string }{ + {staged.BangerPath, "NEW-banger"}, + {staged.BangerdPath, "NEW-bangerd"}, + {staged.VsockAgentPath, "NEW-banger-vsock-agent"}, + } { + _ = os.WriteFile(pair.p, []byte(pair.body), 0o755) + } + + // Block the banger swap (which is LAST in the order) by putting + // a regular file where its parent dir should be — MkdirAll fails + // with "not a directory". Vsock + bangerd succeed first. + blockedParent := filepath.Join(root, "blocked-bin") + if err := os.WriteFile(blockedParent, []byte("blocking"), 0o644); err != nil { + t.Fatal(err) + } + targets := InstallTargets{ + Banger: filepath.Join(blockedParent, "banger"), + Bangerd: filepath.Join(binDir, "bangerd"), + VsockAgent: filepath.Join(libDir, "banger-vsock-agent"), + } + + res, err := Swap(staged, targets) + if err == nil { + t.Fatal("Swap unexpectedly succeeded; banger parent should be blocked by a regular file") + } + if len(res.SwappedTargets) != 2 { + t.Fatalf("SwappedTargets = %v, want 2 (vsock + bangerd before banger failed)", res.SwappedTargets) + } + // Rolling back the partial swap should restore the filesystem. 
+ if err := Rollback(res); err != nil { + t.Fatalf("Rollback after partial swap: %v", err) + } + for _, p := range res.SwappedTargets { + got, _ := os.ReadFile(p) + want := "OLD-" + filepath.Base(p) + if string(got) != want { + t.Fatalf("post-rollback %s = %q", p, got) + } + } +} + +func TestDownloadReleaseHappyPath(t *testing.T) { + tarballBody := []byte("fake tarball bytes") + tarballSHA := sha256Hex(tarballBody) + mux := http.NewServeMux() + mux.HandleFunc("/banger.tar.gz", func(w http.ResponseWriter, r *http.Request) { + _, _ = w.Write(tarballBody) + }) + mux.HandleFunc("/SHA256SUMS", func(w http.ResponseWriter, r *http.Request) { + fmt.Fprintf(w, "%s banger.tar.gz\n", tarballSHA) + }) + srv := httptest.NewServer(mux) + defer srv.Close() + + dst := filepath.Join(t.TempDir(), "out.tar.gz") + sums, err := DownloadRelease(context.Background(), srv.Client(), Release{ + Version: "v0.1.0", + TarballURL: srv.URL + "/banger.tar.gz", + SHA256SumsURL: srv.URL + "/SHA256SUMS", + }, dst) + if err != nil { + t.Fatalf("DownloadRelease: %v", err) + } + if !strings.Contains(string(sums), "banger.tar.gz") { + t.Fatalf("returned sums body missing tarball name: %q", sums) + } + got, _ := os.ReadFile(dst) + if !bytes.Equal(got, tarballBody) { + t.Fatalf("downloaded body differs from served body") + } +} + +func TestDownloadReleaseRejectsTarballMissingFromSums(t *testing.T) { + mux := http.NewServeMux() + mux.HandleFunc("/banger.tar.gz", func(w http.ResponseWriter, r *http.Request) { + _, _ = w.Write([]byte("body")) + }) + mux.HandleFunc("/SHA256SUMS", func(w http.ResponseWriter, r *http.Request) { + // Sums for a different file; tarball name not listed. 
+ fmt.Fprintf(w, "%s unrelated\n", sha256Hex([]byte("body"))) + }) + srv := httptest.NewServer(mux) + defer srv.Close() + + dst := filepath.Join(t.TempDir(), "out.tar.gz") + _, err := DownloadRelease(context.Background(), srv.Client(), Release{ + TarballURL: srv.URL + "/banger.tar.gz", + SHA256SumsURL: srv.URL + "/SHA256SUMS", + }, dst) + if err == nil || !strings.Contains(err.Error(), "does not list") { + t.Fatalf("err = %v, want SHA256SUMS-missing rejection", err) + } +} + +func TestDownloadReleasePropagatesShaMismatch(t *testing.T) { + mux := http.NewServeMux() + mux.HandleFunc("/banger.tar.gz", func(w http.ResponseWriter, r *http.Request) { + _, _ = w.Write([]byte("served body")) + }) + mux.HandleFunc("/SHA256SUMS", func(w http.ResponseWriter, r *http.Request) { + // Wrong digest for the tarball. + fmt.Fprintf(w, "%s banger.tar.gz\n", sha256Hex([]byte("expected body"))) + }) + srv := httptest.NewServer(mux) + defer srv.Close() + + dst := filepath.Join(t.TempDir(), "out.tar.gz") + _, err := DownloadRelease(context.Background(), srv.Client(), Release{ + TarballURL: srv.URL + "/banger.tar.gz", + SHA256SumsURL: srv.URL + "/SHA256SUMS", + }, dst) + if err == nil || !strings.Contains(err.Error(), "sha256 mismatch") { + t.Fatalf("err = %v, want sha256 mismatch", err) + } + if _, statErr := os.Stat(dst); !os.IsNotExist(statErr) { + t.Fatalf("partial tarball should be removed; stat err = %v", statErr) + } +} diff --git a/internal/updater/manifest.go b/internal/updater/manifest.go new file mode 100644 index 0000000..1ae35d0 --- /dev/null +++ b/internal/updater/manifest.go @@ -0,0 +1,177 @@ +// Package updater drives `banger update`: discover a new release, +// download + verify it, swap binaries atomically, restart the systemd +// units, run doctor, roll back on failure. The package is split across +// files by responsibility — manifest.go owns the release-discovery +// shape, the rest is in their own files. 
+package updater + +import ( + "context" + "encoding/json" + "fmt" + "io" + "net/http" + "strings" + "time" +) + +// manifestURL is the canonical URL of banger's release manifest on +// the Cloudflare R2 bucket. Hardcoded (rather than pulling from +// config) so a compromised daemon config can't redirect the updater +// to a different bucket. Var (not const) only because tests need to +// point at an httptest.Server; production never mutates it. +// +// The bucket lives at releases.thaloco.com; the path /banger/ scopes +// it inside the bucket so the same host can serve other projects' +// release artifacts later. +var manifestURL = "https://releases.thaloco.com/banger/manifest.json" + +// ManifestURL exposes the configured URL for callers that want to +// surface it in user-facing output (e.g. `banger update --check`). +func ManifestURL() string { return manifestURL } + +// MaxManifestBytes caps the manifest download size. The manifest is +// JSON with a small bounded shape (10s of releases × ~200 bytes +// each); 1 MiB is generous and protects us from a server that +// accidentally serves an arbitrary file. +const MaxManifestBytes int64 = 1 << 20 + +// MaxSHA256SumsBytes caps the SHA256SUMS download. One line per +// release artifact (today: one line for the tarball); 16 KiB is +// orders of magnitude over what we'd ever publish. +const MaxSHA256SumsBytes int64 = 16 * 1024 + +// MaxTarballBytes caps the release-tarball download. Banger's three +// binaries plus a SHA256SUMS file fit comfortably under this; if a +// future release approaches the cap, bump intentionally and ship a +// note in CHANGELOG. +const MaxTarballBytes int64 = 256 * 1024 * 1024 + +// Manifest is the top-level shape of releases.thaloco.com/banger/manifest.json. +// SchemaVersion lets us evolve the structure without breaking older +// CLIs — a CLI that doesn't recognise its current SchemaVersion +// refuses to update rather than guessing. 
+type Manifest struct { + SchemaVersion int `json:"schema_version"` + LatestStable string `json:"latest_stable"` + Releases []Release `json:"releases"` +} + +// Release describes one published banger build. The tarball + the +// SHA256SUMS file (and optionally its cosign signature) live at the +// URLs listed here; the actual binary hashes come from SHA256SUMS, +// not from the manifest, so manifest tampering can't substitute a +// hash for a known-good tarball. +type Release struct { + Version string `json:"version"` + TarballURL string `json:"tarball_url"` + SHA256SumsURL string `json:"sha256sums_url"` + SHA256SumsSigURL string `json:"sha256sums_sig_url,omitempty"` + ReleasedAt time.Time `json:"released_at"` +} + +// ManifestSchemaVersion is the SchemaVersion this CLI knows how to +// parse. Bumped together with any breaking change in Manifest / +// Release. +const ManifestSchemaVersion = 1 + +// FetchManifest downloads the release manifest from the embedded +// canonical URL and validates its shape. Returns an error if the +// server is unreachable, returns non-2xx, exceeds the size cap, or +// the schema_version is newer than this CLI knows. +func FetchManifest(ctx context.Context, client *http.Client) (Manifest, error) { + return FetchManifestFrom(ctx, client, manifestURL) +} + +// FetchManifestFrom is FetchManifest against an explicit URL. Used by +// the smoke suite (via `banger update --manifest-url …`) to drive the +// updater against a locally-served fake manifest. Production callers +// stick with FetchManifest. 
+func FetchManifestFrom(ctx context.Context, client *http.Client, url string) (Manifest, error) { + if client == nil { + client = http.DefaultClient + } + req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil) + if err != nil { + return Manifest{}, err + } + resp, err := client.Do(req) + if err != nil { + return Manifest{}, fmt.Errorf("fetch manifest: %w", err) + } + defer resp.Body.Close() + if resp.StatusCode < 200 || resp.StatusCode >= 300 { + return Manifest{}, fmt.Errorf("fetch manifest: HTTP %s", resp.Status) + } + if resp.ContentLength > MaxManifestBytes { + return Manifest{}, fmt.Errorf("manifest is %d bytes, exceeds %d-byte cap", resp.ContentLength, MaxManifestBytes) + } + body, err := io.ReadAll(io.LimitReader(resp.Body, MaxManifestBytes+1)) + if err != nil { + return Manifest{}, fmt.Errorf("read manifest: %w", err) + } + if int64(len(body)) > MaxManifestBytes { + return Manifest{}, fmt.Errorf("manifest body exceeded %d-byte cap", MaxManifestBytes) + } + return ParseManifest(body) +} + +// ParseManifest unmarshals manifest bytes and validates the schema +// version. Exposed as a separate function so tests can drive it +// without an HTTP server. 
+func ParseManifest(body []byte) (Manifest, error) { + var m Manifest + if err := json.Unmarshal(body, &m); err != nil { + return Manifest{}, fmt.Errorf("parse manifest: %w", err) + } + if m.SchemaVersion == 0 { + return Manifest{}, fmt.Errorf("manifest missing schema_version") + } + if m.SchemaVersion > ManifestSchemaVersion { + return Manifest{}, fmt.Errorf("manifest schema_version %d is newer than this CLI knows (%d); upgrade banger to read it", m.SchemaVersion, ManifestSchemaVersion) + } + if strings.TrimSpace(m.LatestStable) == "" && len(m.Releases) > 0 { + return Manifest{}, fmt.Errorf("manifest missing latest_stable") + } + for i, r := range m.Releases { + if strings.TrimSpace(r.Version) == "" { + return Manifest{}, fmt.Errorf("release[%d]: empty version", i) + } + if strings.TrimSpace(r.TarballURL) == "" { + return Manifest{}, fmt.Errorf("release[%d] (%s): empty tarball_url", i, r.Version) + } + if strings.TrimSpace(r.SHA256SumsURL) == "" { + return Manifest{}, fmt.Errorf("release[%d] (%s): empty sha256sums_url", i, r.Version) + } + } + return m, nil +} + +// LookupRelease finds the release with the given version (e.g. +// "v0.1.0") in the manifest. Returns an error when no match exists — +// helpful when a user passes `--to v9.9.9` against a manifest that +// hasn't seen v9.9.9 yet. +func (m Manifest) LookupRelease(version string) (Release, error) { + wanted := strings.TrimSpace(version) + if wanted == "" { + return Release{}, fmt.Errorf("version is required") + } + for _, r := range m.Releases { + if r.Version == wanted { + return r, nil + } + } + available := make([]string, 0, len(m.Releases)) + for _, r := range m.Releases { + available = append(available, r.Version) + } + return Release{}, fmt.Errorf("release %q not found in manifest (available: %s)", wanted, strings.Join(available, ", ")) +} + +// Latest returns the release matching the manifest's latest_stable +// pointer. 
Errors when the pointer doesn't reference any listed +// release — that's a manifest publishing bug worth surfacing rather +// than silently picking some other release. +func (m Manifest) Latest() (Release, error) { + return m.LookupRelease(m.LatestStable) +} diff --git a/internal/updater/manifest_test.go b/internal/updater/manifest_test.go new file mode 100644 index 0000000..abb4efc --- /dev/null +++ b/internal/updater/manifest_test.go @@ -0,0 +1,113 @@ +package updater + +import ( + "context" + "net/http" + "net/http/httptest" + "strings" + "testing" +) + +const sampleManifest = `{ + "schema_version": 1, + "latest_stable": "v0.1.1", + "releases": [ + { + "version": "v0.1.0", + "tarball_url": "https://releases.thaloco.com/banger/v0.1.0/banger-v0.1.0-linux-amd64.tar.gz", + "sha256sums_url": "https://releases.thaloco.com/banger/v0.1.0/SHA256SUMS", + "sha256sums_sig_url": "https://releases.thaloco.com/banger/v0.1.0/SHA256SUMS.sig", + "released_at": "2026-04-29T10:00:00Z" + }, + { + "version": "v0.1.1", + "tarball_url": "https://releases.thaloco.com/banger/v0.1.1/banger-v0.1.1-linux-amd64.tar.gz", + "sha256sums_url": "https://releases.thaloco.com/banger/v0.1.1/SHA256SUMS", + "sha256sums_sig_url": "https://releases.thaloco.com/banger/v0.1.1/SHA256SUMS.sig", + "released_at": "2026-05-01T10:00:00Z" + } + ] +}` + +func TestParseManifestHappyPath(t *testing.T) { + m, err := ParseManifest([]byte(sampleManifest)) + if err != nil { + t.Fatalf("ParseManifest: %v", err) + } + if m.LatestStable != "v0.1.1" || len(m.Releases) != 2 { + t.Fatalf("manifest = %+v, want 2 releases with latest_stable=v0.1.1", m) + } +} + +func TestParseManifestRejectsNewerSchema(t *testing.T) { + body := strings.Replace(sampleManifest, `"schema_version": 1`, `"schema_version": 99`, 1) + _, err := ParseManifest([]byte(body)) + if err == nil || !strings.Contains(err.Error(), "newer than this CLI") { + t.Fatalf("err = %v, want newer-schema rejection", err) + } +} + +func 
TestParseManifestRejectsMissingFields(t *testing.T) { + for _, tc := range []struct { + name string + body string + }{ + {name: "missing_schema_version", body: `{"latest_stable":"v0.1.0","releases":[]}`}, + {name: "missing_tarball_url", body: `{"schema_version":1,"latest_stable":"v0.1.0","releases":[{"version":"v0.1.0","sha256sums_url":"x"}]}`}, + {name: "missing_sha256sums_url", body: `{"schema_version":1,"latest_stable":"v0.1.0","releases":[{"version":"v0.1.0","tarball_url":"x"}]}`}, + {name: "empty_version", body: `{"schema_version":1,"latest_stable":"v0.1.0","releases":[{"tarball_url":"x","sha256sums_url":"y"}]}`}, + {name: "garbage", body: "not json"}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + if _, err := ParseManifest([]byte(tc.body)); err == nil { + t.Fatalf("expected error parsing %s; got success", tc.name) + } + }) + } +} + +func TestManifestLookupRelease(t *testing.T) { + m, _ := ParseManifest([]byte(sampleManifest)) + r, err := m.LookupRelease("v0.1.0") + if err != nil { + t.Fatalf("LookupRelease(v0.1.0): %v", err) + } + if !strings.HasSuffix(r.TarballURL, "banger-v0.1.0-linux-amd64.tar.gz") { + t.Fatalf("wrong tarball url: %s", r.TarballURL) + } + if _, err := m.LookupRelease("v9.9.9"); err == nil { + t.Fatal("expected error looking up missing release") + } +} + +func TestManifestLatest(t *testing.T) { + m, _ := ParseManifest([]byte(sampleManifest)) + r, err := m.Latest() + if err != nil { + t.Fatalf("Latest: %v", err) + } + if r.Version != "v0.1.1" { + t.Fatalf("Latest.Version = %s, want v0.1.1", r.Version) + } +} + +func TestFetchManifestRoundTrip(t *testing.T) { + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + _, _ = w.Write([]byte(sampleManifest)) + })) + defer srv.Close() + + // Drive FetchManifest by overriding the global URL temporarily. 
+ prev := manifestURL + manifestURL = srv.URL + defer func() { manifestURL = prev }() + + m, err := FetchManifest(context.Background(), srv.Client()) + if err != nil { + t.Fatalf("FetchManifest: %v", err) + } + if m.LatestStable != "v0.1.1" { + t.Fatalf("LatestStable = %s", m.LatestStable) + } +} diff --git a/internal/updater/sha256sums.go b/internal/updater/sha256sums.go new file mode 100644 index 0000000..0a12fe6 --- /dev/null +++ b/internal/updater/sha256sums.go @@ -0,0 +1,91 @@ +package updater + +import ( + "bufio" + "fmt" + "strings" +) + +// ParseSHA256Sums turns the body of a sha256sum-format file into a +// filename → hex-digest map. Format per line: +// +// <64 hex chars>  <filename> +// +// Anything else (blank lines, comments starting with '#') is +// tolerated. Returns an error only when a line that LOOKS like an +// entry is malformed — silent skipping of garbage would be the wrong +// failure mode for a security-relevant input. +// +// Used by `banger update` after downloading the SHA256SUMS file +// alongside the release tarball: look up the tarball's basename in +// the resulting map to get its expected hash. +func ParseSHA256Sums(body []byte) (map[string]string, error) { + out := map[string]string{} + scanner := bufio.NewScanner(strings.NewReader(string(body))) + scanner.Buffer(make([]byte, 64*1024), 64*1024) + lineNo := 0 + for scanner.Scan() { + lineNo++ + line := strings.TrimSpace(scanner.Text()) + if line == "" || strings.HasPrefix(line, "#") { + continue + } + // Tolerate the BSD-style `SHA256 (file) = hex` form too — + // some signing pipelines emit it. The GNU-style is what + // `sha256sum` defaults to.
+ if rest, ok := strings.CutPrefix(line, "SHA256 ("); ok { + closingParen := strings.Index(rest, ")") + eq := strings.LastIndex(rest, "= ") + if closingParen <= 0 || eq <= closingParen { + return nil, fmt.Errorf("line %d: malformed BSD-style sum line", lineNo) + } + file := strings.TrimSpace(rest[:closingParen]) + digest := strings.TrimSpace(rest[eq+2:]) + if !looksLikeSHA256(digest) { + return nil, fmt.Errorf("line %d: digest %q is not a 64-char hex sha256", lineNo, digest) + } + out[file] = strings.ToLower(digest) + continue + } + fields := strings.Fields(line) + if len(fields) < 2 { + return nil, fmt.Errorf("line %d: expected `<digest> <filename>`, got %q", lineNo, line) + } + digest := fields[0] + // GNU format may prefix the filename with `*` for binary mode + // (`<digest> *file`) or a leading space; trim it. + filename := strings.TrimSpace(strings.Join(fields[1:], " ")) + filename = strings.TrimPrefix(filename, "*") + if !looksLikeSHA256(digest) { + return nil, fmt.Errorf("line %d: digest %q is not a 64-char hex sha256", lineNo, digest) + } + out[filename] = strings.ToLower(digest) + } + if err := scanner.Err(); err != nil { + return nil, err + } + if len(out) == 0 { + return nil, fmt.Errorf("SHA256SUMS body contained no entries") + } + return out, nil +} + +// looksLikeSHA256 returns true when s is exactly 64 hex characters. +// Doesn't check that those bytes are themselves a valid digest of +// anything — that's the cryptographic verifier's job, not the +// parser's.
+func looksLikeSHA256(s string) bool { + if len(s) != 64 { + return false + } + for _, c := range s { + switch { + case c >= '0' && c <= '9': + case c >= 'a' && c <= 'f': + case c >= 'A' && c <= 'F': + default: + return false + } + } + return true +} diff --git a/internal/updater/sha256sums_test.go b/internal/updater/sha256sums_test.go new file mode 100644 index 0000000..77b3094 --- /dev/null +++ b/internal/updater/sha256sums_test.go @@ -0,0 +1,98 @@ +package updater + +import ( + "strings" + "testing" +) + +func TestParseSHA256SumsGNUFormat(t *testing.T) { + body := []byte(`# header comment +0000000000000000000000000000000000000000000000000000000000000001 banger-v0.1.0-linux-amd64.tar.gz +0000000000000000000000000000000000000000000000000000000000000002 banger-v0.1.0-linux-amd64.tar.gz.sig +`) + got, err := ParseSHA256Sums(body) + if err != nil { + t.Fatalf("ParseSHA256Sums: %v", err) + } + if got["banger-v0.1.0-linux-amd64.tar.gz"] != "0000000000000000000000000000000000000000000000000000000000000001" { + t.Fatalf("tarball digest = %q", got["banger-v0.1.0-linux-amd64.tar.gz"]) + } + if len(got) != 2 { + t.Fatalf("got %d entries, want 2", len(got)) + } +} + +func TestParseSHA256SumsBSDFormat(t *testing.T) { + body := []byte(`SHA256 (banger-v0.1.0-linux-amd64.tar.gz) = 0000000000000000000000000000000000000000000000000000000000000001 +`) + got, err := ParseSHA256Sums(body) + if err != nil { + t.Fatalf("ParseSHA256Sums: %v", err) + } + if got["banger-v0.1.0-linux-amd64.tar.gz"] != "0000000000000000000000000000000000000000000000000000000000000001" { + t.Fatalf("digest = %q", got["banger-v0.1.0-linux-amd64.tar.gz"]) + } +} + +func TestParseSHA256SumsBinaryStarPrefix(t *testing.T) { + // `sha256sum -b` emits `<digest> *<file>`.
+ body := []byte(`0000000000000000000000000000000000000000000000000000000000000001 *banger-v0.1.0-linux-amd64.tar.gz +`) + got, err := ParseSHA256Sums(body) + if err != nil { + t.Fatalf("ParseSHA256Sums: %v", err) + } + if _, ok := got["banger-v0.1.0-linux-amd64.tar.gz"]; !ok { + t.Fatalf("entries = %v, want star-prefix stripped", got) + } +} + +func TestParseSHA256SumsTolerantOfBlankAndComments(t *testing.T) { + body := []byte(` +# top comment +0000000000000000000000000000000000000000000000000000000000000001 a + +# inline comment +0000000000000000000000000000000000000000000000000000000000000002 b +`) + got, err := ParseSHA256Sums(body) + if err != nil { + t.Fatalf("ParseSHA256Sums: %v", err) + } + if len(got) != 2 { + t.Fatalf("got %d, want 2", len(got)) + } +} + +func TestParseSHA256SumsRejectsMalformed(t *testing.T) { + for _, tc := range []struct { + name string + body string + }{ + {name: "short_digest", body: "abc file"}, + {name: "non_hex_digest", body: "zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz file"}, + {name: "no_filename", body: "0000000000000000000000000000000000000000000000000000000000000001"}, + {name: "empty_body", body: ""}, + {name: "only_comments", body: "# comment\n# more\n"}, + {name: "bsd_no_eq", body: "SHA256 (file) 0000000000000000000000000000000000000000000000000000000000000001"}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + if _, err := ParseSHA256Sums([]byte(tc.body)); err == nil { + t.Fatalf("expected error for %s", tc.name) + } + }) + } +} + +func TestParseSHA256SumsLowercasesDigest(t *testing.T) { + body := []byte(`ABCDEF1234567890ABCDEF1234567890ABCDEF1234567890ABCDEF1234567890 upper +`) + got, err := ParseSHA256Sums(body) + if err != nil { + t.Fatalf("ParseSHA256Sums: %v", err) + } + if got["upper"] != strings.ToLower("ABCDEF1234567890ABCDEF1234567890ABCDEF1234567890ABCDEF1234567890") { + t.Fatalf("digest not lowercased: %q", got["upper"]) + } +} diff --git a/internal/updater/stage.go 
b/internal/updater/stage.go new file mode 100644 index 0000000..3a7794c --- /dev/null +++ b/internal/updater/stage.go @@ -0,0 +1,107 @@ +package updater + +import ( + "archive/tar" + "compress/gzip" + "fmt" + "io" + "os" + "path/filepath" + "strings" +) + +// expectedReleaseEntries is the canonical set of files a release +// tarball must contain. Anything missing OR anything extra is +// rejected — banger update should not unpack arbitrary files into +// the staging dir. +var expectedReleaseEntries = []string{ + "banger", + "bangerd", + "banger-vsock-agent", +} + +// StagedRelease describes the result of unpacking a release tarball +// into a staging directory. +type StagedRelease struct { + BangerPath string + BangerdPath string + VsockAgentPath string +} + +// StageTarball reads the gzipped tar at tarballPath and extracts the +// expected three banger binaries into stagingDir. Any extra entries, +// any path-traversal members, any non-regular-file members, and any +// missing required entry are rejected. +// +// The extracted binaries are mode 0o755 regardless of what the +// tarball claims — banger update is a privileged operation; we +// don't honour weird modes from the wire. +func StageTarball(tarballPath, stagingDir string) (StagedRelease, error) { + if err := os.MkdirAll(stagingDir, 0o700); err != nil { + return StagedRelease{}, err + } + f, err := os.Open(tarballPath) + if err != nil { + return StagedRelease{}, err + } + defer f.Close() + gz, err := gzip.NewReader(f) + if err != nil { + return StagedRelease{}, fmt.Errorf("open gzip: %w", err) + } + defer gz.Close() + + expected := map[string]struct{}{} + for _, name := range expectedReleaseEntries { + expected[name] = struct{}{} + } + seen := map[string]string{} + + tr := tar.NewReader(gz) + for { + hdr, err := tr.Next() + if err == io.EOF { + break + } + if err != nil { + return StagedRelease{}, fmt.Errorf("read tar: %w", err) + } + rel := filepath.Clean(hdr.Name) + if rel == "." 
|| rel == string(filepath.Separator) { + continue + } + if filepath.IsAbs(rel) || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) { + return StagedRelease{}, fmt.Errorf("unsafe path in tarball: %q", hdr.Name) + } + if _, ok := expected[rel]; !ok { + return StagedRelease{}, fmt.Errorf("unexpected entry in release tarball: %q (allowed: %v)", hdr.Name, expectedReleaseEntries) + } + if hdr.Typeflag != tar.TypeReg { + return StagedRelease{}, fmt.Errorf("entry %q is not a regular file (typeflag %d)", hdr.Name, hdr.Typeflag) + } + dst := filepath.Join(stagingDir, rel) + out, err := os.OpenFile(dst, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o755) + if err != nil { + return StagedRelease{}, err + } + if _, err := io.Copy(out, tr); err != nil { + _ = out.Close() + return StagedRelease{}, err + } + if err := out.Close(); err != nil { + return StagedRelease{}, err + } + seen[rel] = dst + } + + for _, want := range expectedReleaseEntries { + if _, ok := seen[want]; !ok { + return StagedRelease{}, fmt.Errorf("release tarball is missing required entry %q", want) + } + } + return StagedRelease{ + BangerPath: seen["banger"], + BangerdPath: seen["bangerd"], + VsockAgentPath: seen["banger-vsock-agent"], + }, nil +} diff --git a/internal/updater/swap.go b/internal/updater/swap.go new file mode 100644 index 0000000..f299deb --- /dev/null +++ b/internal/updater/swap.go @@ -0,0 +1,135 @@ +package updater + +import ( + "errors" + "fmt" + "os" + + "banger/internal/system" +) + +// previousSuffix is the filename suffix appended to the +// pre-swap binary so Rollback knows where to restore from. +// Pinned as a constant so the swap and rollback paths can't +// disagree on it. +const previousSuffix = ".previous" + +// InstallTargets lists the absolute on-disk paths the updater +// writes during a swap. Hardcoded to the system-install layout — +// banger update is a system-mode operation; the developer non- +// system-mode flow doesn't go through this code path. 
+type InstallTargets struct { + Banger string // /usr/local/bin/banger + Bangerd string // /usr/local/bin/bangerd + VsockAgent string // /usr/local/lib/banger/banger-vsock-agent +} + +// DefaultInstallTargets returns the canonical paths a system install +// uses (`banger system install` writes to these). Exposed for +// testability; production callers use it as-is. +func DefaultInstallTargets() InstallTargets { + return InstallTargets{ + Banger: "/usr/local/bin/banger", + Bangerd: "/usr/local/bin/bangerd", + VsockAgent: "/usr/local/lib/banger/banger-vsock-agent", + } +} + +// SwapResult records what was swapped, so Rollback knows what to +// undo. A nil SwapResult means no swap was attempted yet (nothing +// to roll back). +type SwapResult struct { + Targets InstallTargets + // SwappedTargets is the subset of Targets that were actually + // renamed into place. If the second of three Renames fails, + // SwappedTargets contains only the first; rollback unwinds in + // reverse order. + SwappedTargets []string +} + +// Swap atomically replaces each of the three banger binaries with +// its staged counterpart. Order: +// +// 1. banger-vsock-agent (companion; not currently running, swap is safe) +// 2. bangerd (the to-be-restarted daemon binary) +// 3. banger (the CLI; least disruptive last) +// +// Each AtomicReplace leaves a `.previous` backup so Rollback can +// restore the prior install if a later step fails. +// +// Returns the SwapResult even on partial failure so the caller can +// drive Rollback against what HAS been swapped. 
+func Swap(staged StagedRelease, targets InstallTargets) (SwapResult, error) { + res := SwapResult{Targets: targets} + steps := []struct { + src, dst string + }{ + {src: staged.VsockAgentPath, dst: targets.VsockAgent}, + {src: staged.BangerdPath, dst: targets.Bangerd}, + {src: staged.BangerPath, dst: targets.Banger}, + } + for _, s := range steps { + if err := ensureParentDir(s.dst); err != nil { + return res, fmt.Errorf("prepare %s: %w", s.dst, err) + } + if err := system.AtomicReplace(s.src, s.dst, previousSuffix); err != nil { + return res, fmt.Errorf("swap %s: %w", s.dst, err) + } + res.SwappedTargets = append(res.SwappedTargets, s.dst) + } + return res, nil +} + +// Rollback undoes a Swap by restoring each .previous backup in +// reverse order. Returns the joined errors of every individual +// rollback that failed; a half-rolled-back tree is the worst case +// and the operator gets enough information to fix it manually. +// +// Tolerant of partial input — passing a SwapResult that only +// recorded the first two of three swaps rolls back exactly those +// two. +func Rollback(res SwapResult) error { + var errs []error + for i := len(res.SwappedTargets) - 1; i >= 0; i-- { + dst := res.SwappedTargets[i] + if err := system.AtomicReplaceRollback(dst, previousSuffix); err != nil { + errs = append(errs, fmt.Errorf("rollback %s: %w", dst, err)) + } + } + return errors.Join(errs...) +} + +// CleanupBackups removes every .previous backup left behind by a +// successful update. Called after `banger doctor` confirms the new +// install is healthy — we don't keep ancient backups around forever. +func CleanupBackups(res SwapResult) error { + var errs []error + for _, dst := range res.SwappedTargets { + if err := os.Remove(dst + previousSuffix); err != nil && !os.IsNotExist(err) { + errs = append(errs, fmt.Errorf("remove %s%s: %w", dst, previousSuffix, err)) + } + } + return errors.Join(errs...) 
+} + +func ensureParentDir(p string) error { + parent := dirOf(p) + if parent == "" { + return nil + } + if _, err := os.Stat(parent); err == nil { + return nil + } + return os.MkdirAll(parent, 0o755) +} + +// dirOf is a tiny path.Dir stand-in that returns "" for paths with +// no separator, where path.Dir would return "." (so the ensure-parent +// logic doesn't try to mkdir(".")). +func dirOf(p string) string { + for i := len(p) - 1; i >= 0; i-- { + if p[i] == '/' { + return p[:i] + } + } + return "" +} diff --git a/internal/updater/verify_signature.go b/internal/updater/verify_signature.go new file mode 100644 index 0000000..d2a9985 --- /dev/null +++ b/internal/updater/verify_signature.go @@ -0,0 +1,144 @@ +package updater + +import ( + "context" + "crypto/ecdsa" + "crypto/sha256" + "crypto/x509" + "encoding/base64" + "encoding/pem" + "errors" + "fmt" + "net/http" + "strings" +) + +// MaxSignatureBytes caps the cosign signature download. A blob +// signature is ~70 bytes raw (an ECDSA P-256 ASN.1 signature) plus +// some base64 overhead and a trailing newline; 1 KiB is generous. +const MaxSignatureBytes int64 = 1024 + +// BangerReleasePublicKey is the cosign-generated public key used to +// verify SHA256SUMS for every banger release. SET ME BEFORE THE +// FIRST RELEASE. The placeholder below is intentionally invalid so +// `banger update` refuses every download until a real key lands. +// +// Production-cut workflow (for the maintainer cutting v0.1.0): +// +// 1. Generate the keypair (one-time, store the private key offline): +// cosign generate-key-pair +// Produces cosign.key (private) and cosign.pub (public). The +// private key is password-protected; remember the password. +// +// 2. Replace the PEM block below with the contents of cosign.pub. +// Commit. From this point on, every banger CLI baked from this +// repo will only trust signatures made with cosign.key. +// +// 3. 
At release time, sign SHA256SUMS: +// cosign sign-blob --key cosign.key --output-signature \ +// SHA256SUMS.sig SHA256SUMS +// Publish SHA256SUMS.sig alongside SHA256SUMS in the bucket; +// the manifest's `sha256sums_sig_url` field references it. +// +// 4. Rotating the key after publication means publishing a new +// banger release that embeds the new key, then re-signing +// every release artifact with the new key. v0.1.x is too +// early to design a clean rotation story; defer. +// +// var (rather than const) only because tests need to swap it for an +// in-test-generated key; production sets it at compile time and +// never mutates it. +var BangerReleasePublicKey = `-----BEGIN PUBLIC KEY----- +MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAElWFSLKLosBrdjfuF8ZS6U01Ufky4 +zNeVPCkA6HEJ/oe634fRqwFxkXKGWg03eGFSnlwRxnUxN2+duXQSsR0pzQ== +-----END PUBLIC KEY-----` + +// ErrSignatureRequired is returned by VerifyManifestRelease when the +// embedded public key is the placeholder. Surfaces as a clear "the +// release maintainer hasn't published their cosign key yet, refusing +// to update" rather than a cryptic crypto error. +var ErrSignatureRequired = errors.New("banger release public key is the placeholder; the maintainer must replace it and re-cut a release before update can proceed") + +// VerifyBlobSignature checks that sigBase64 is a valid cosign-blob +// signature over body, made with the private counterpart of +// BangerReleasePublicKey. +func VerifyBlobSignature(body, sigBase64 []byte) error { + return VerifyBlobSignatureWithKey(body, sigBase64, BangerReleasePublicKey) +} + +// VerifyBlobSignatureWithKey is VerifyBlobSignature against an +// explicit PEM-encoded public key. Used by the smoke suite (via +// `banger update --pubkey-file …`) so an end-to-end update test can +// trust a locally-generated keypair without rebuilding the binary. +// +// Refuses outright if pubKeyPEM is the build-time placeholder so an +// unset key can't slip through as "verification disabled". 
+// +// cosign's blob signature format is a base64-encoded ASN.1-DER ECDSA +// signature over SHA256(body) — that's what ecdsa.VerifyASN1 takes. +func VerifyBlobSignatureWithKey(body, sigBase64 []byte, pubKeyPEM string) error { + if isPlaceholderKey(pubKeyPEM) { + return ErrSignatureRequired + } + block, _ := pem.Decode([]byte(pubKeyPEM)) + if block == nil { + return fmt.Errorf("decode banger release public key: no PEM block") + } + pubAny, err := x509.ParsePKIXPublicKey(block.Bytes) + if err != nil { + return fmt.Errorf("parse banger release public key: %w", err) + } + pub, ok := pubAny.(*ecdsa.PublicKey) + if !ok { + return fmt.Errorf("banger release public key is not ECDSA") + } + sigBytes, err := base64.StdEncoding.DecodeString(strings.TrimSpace(string(sigBase64))) + if err != nil { + return fmt.Errorf("decode signature base64: %w", err) + } + digest := sha256.Sum256(body) + if !ecdsa.VerifyASN1(pub, digest[:], sigBytes) { + return fmt.Errorf("signature does not verify against banger release public key") + } + return nil +} + +// FetchAndVerifySignature pulls the SHA256SUMS.sig URL from the +// release, downloads it (capped), and verifies it against sumsBody. +// Returns nil on a clean pass, or an error describing exactly why +// verification failed. +// +// If release.SHA256SumsSigURL is empty, treat that as "release was +// not signed" — refuse rather than silently proceeding. v0.1.0 +// requires every release to be cosign-signed; an unsigned release +// is a manifest publishing bug we'd rather catch loudly. +func FetchAndVerifySignature(ctx context.Context, client *http.Client, release Release, sumsBody []byte) error { + return FetchAndVerifySignatureWithKey(ctx, client, release, sumsBody, BangerReleasePublicKey) +} + +// FetchAndVerifySignatureWithKey is FetchAndVerifySignature against +// an explicit PEM-encoded public key. 
+func FetchAndVerifySignatureWithKey(ctx context.Context, client *http.Client, release Release, sumsBody []byte, pubKeyPEM string) error { + if strings.TrimSpace(release.SHA256SumsSigURL) == "" { + return fmt.Errorf("release %s has no sha256sums_sig_url; refusing to install an unsigned release", release.Version) + } + if client == nil { + client = http.DefaultClient + } + sig, err := fetchBounded(ctx, client, release.SHA256SumsSigURL, MaxSignatureBytes) + if err != nil { + return fmt.Errorf("fetch signature: %w", err) + } + if err := VerifyBlobSignatureWithKey(sumsBody, sig, pubKeyPEM); err != nil { + return fmt.Errorf("verify SHA256SUMS signature: %w", err) + } + return nil +} + +// isPlaceholderKey detects the build-time placeholder constant. A +// real cosign-generated PEM never contains the string "PLACEHOLDER"; +// a real ECDSA P-256 key block decodes to ~91 bytes of content, +// nowhere near our padded constant. +func isPlaceholderKey(pem string) bool { + return strings.Contains(pem, "PLACEHOLDER") +} diff --git a/internal/updater/verify_signature_test.go b/internal/updater/verify_signature_test.go new file mode 100644 index 0000000..7f0121f --- /dev/null +++ b/internal/updater/verify_signature_test.go @@ -0,0 +1,127 @@ +package updater + +import ( + "crypto/ecdsa" + "crypto/elliptic" + "crypto/rand" + "crypto/sha256" + "crypto/x509" + "encoding/base64" + "encoding/pem" + "errors" + "strings" + "testing" +) + +// generateTestKey produces an ECDSA P-256 keypair in PEM form, +// matching the shape `cosign generate-key-pair` emits for the public +// half. The private half stays in-test for signing. 
+func generateTestKey(t *testing.T) (privKey *ecdsa.PrivateKey, pubPEM string) { + t.Helper() + priv, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader) + if err != nil { + t.Fatalf("generate key: %v", err) + } + der, err := x509.MarshalPKIXPublicKey(&priv.PublicKey) + if err != nil { + t.Fatalf("marshal public key: %v", err) + } + pubPEM = string(pem.EncodeToMemory(&pem.Block{Type: "PUBLIC KEY", Bytes: der})) + return priv, pubPEM +} + +// signBlob mimics `cosign sign-blob`'s output: base64-encoded ASN.1-DER +// ECDSA signature over SHA256(body). +func signBlob(t *testing.T, priv *ecdsa.PrivateKey, body []byte) string { + t.Helper() + digest := sha256.Sum256(body) + sig, err := ecdsa.SignASN1(rand.Reader, priv, digest[:]) + if err != nil { + t.Fatalf("sign: %v", err) + } + return base64.StdEncoding.EncodeToString(sig) +} + +func TestVerifyBlobSignaturePlaceholderRefuses(t *testing.T) { + // A build that hasn't replaced the placeholder key must refuse + // every verify call with ErrSignatureRequired so an un-rotated + // build can't silently accept anything. Swap the embedded key + // out for the placeholder shape and assert that. 
+ prev := BangerReleasePublicKey + BangerReleasePublicKey = `-----BEGIN PUBLIC KEY----- +MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEPLACEHOLDER0000000000000000000 +000000000000000000000000000000000000000000000000000000000000PLACE +-----END PUBLIC KEY-----` + defer func() { BangerReleasePublicKey = prev }() + err := VerifyBlobSignature([]byte("body"), []byte("sig")) + if !errors.Is(err, ErrSignatureRequired) { + t.Fatalf("err = %v, want ErrSignatureRequired", err) + } +} + +func TestVerifyBlobSignatureHappyPath(t *testing.T) { + priv, pubPEM := generateTestKey(t) + prev := BangerReleasePublicKey + BangerReleasePublicKey = pubPEM + defer func() { BangerReleasePublicKey = prev }() + + body := []byte("SHA256SUMS body bytes") + sig := signBlob(t, priv, body) + if err := VerifyBlobSignature(body, []byte(sig)); err != nil { + t.Fatalf("VerifyBlobSignature: %v", err) + } +} + +func TestVerifyBlobSignatureRejectsTamperedBody(t *testing.T) { + priv, pubPEM := generateTestKey(t) + prev := BangerReleasePublicKey + BangerReleasePublicKey = pubPEM + defer func() { BangerReleasePublicKey = prev }() + + body := []byte("original body") + sig := signBlob(t, priv, body) + tampered := []byte("tampered body") + err := VerifyBlobSignature(tampered, []byte(sig)) + if err == nil || !strings.Contains(err.Error(), "does not verify") { + t.Fatalf("err = %v, want signature-mismatch", err) + } +} + +func TestVerifyBlobSignatureRejectsWrongKey(t *testing.T) { + // Sign with one key, verify with a different one. 
+ signingPriv, _ := generateTestKey(t) + _, otherPubPEM := generateTestKey(t) + prev := BangerReleasePublicKey + BangerReleasePublicKey = otherPubPEM + defer func() { BangerReleasePublicKey = prev }() + + body := []byte("body") + sig := signBlob(t, signingPriv, body) + err := VerifyBlobSignature(body, []byte(sig)) + if err == nil || !strings.Contains(err.Error(), "does not verify") { + t.Fatalf("err = %v, want wrong-key rejection", err) + } +} + +func TestVerifyBlobSignatureRejectsMalformed(t *testing.T) { + _, pubPEM := generateTestKey(t) + prev := BangerReleasePublicKey + BangerReleasePublicKey = pubPEM + defer func() { BangerReleasePublicKey = prev }() + for _, tc := range []struct { + name string + sig string + }{ + {name: "not_base64", sig: "!!!not_b64!!!"}, + {name: "empty", sig: ""}, + {name: "garbage_bytes", sig: base64.StdEncoding.EncodeToString([]byte{0x01, 0x02, 0x03})}, + } { + tc := tc + t.Run(tc.name, func(t *testing.T) { + err := VerifyBlobSignature([]byte("body"), []byte(tc.sig)) + if err == nil { + t.Fatalf("expected error for %s; got success", tc.name) + } + }) + } +} diff --git a/internal/updater/verify_smoke_check_test.go b/internal/updater/verify_smoke_check_test.go new file mode 100644 index 0000000..6929880 --- /dev/null +++ b/internal/updater/verify_smoke_check_test.go @@ -0,0 +1,54 @@ +package updater + +import ( + "os/exec" + "path/filepath" + "testing" +) + +// TestVerifyBlobSignatureWithOpenSSL is a confidence test for the +// smoke release-builder path: openssl's `dgst -sha256 -sign` produces +// the exact same encoding cosign emits for blob signatures (base64 +// ASN.1 ECDSA over SHA256(body)). If this ever stops verifying, the +// smoke update scenarios will silently skip the signature check — +// catching it here avoids a heisenbug in scripts/smoke.sh. 
+func TestVerifyBlobSignatureWithOpenSSL(t *testing.T) { + if _, err := exec.LookPath("openssl"); err != nil { + t.Skip("openssl not on PATH") + } + dir := t.TempDir() + keyPath := filepath.Join(dir, "cosign.key") + pubPath := filepath.Join(dir, "cosign.pub") + bodyPath := filepath.Join(dir, "body.txt") + sigPath := filepath.Join(dir, "body.sig") + + mustRun := func(name string, args ...string) { + t.Helper() + out, err := exec.Command(name, args...).CombinedOutput() + if err != nil { + t.Fatalf("%s %v: %v\n%s", name, args, err, string(out)) + } + } + + mustRun("openssl", "ecparam", "-name", "prime256v1", "-genkey", "-noout", "-out", keyPath) + mustRun("openssl", "ec", "-in", keyPath, "-pubout", "-out", pubPath) + mustRun("sh", "-c", "printf 'banger smoke release sums\n' > "+bodyPath) + mustRun("sh", "-c", "openssl dgst -sha256 -sign "+keyPath+" "+bodyPath+" | base64 -w0 > "+sigPath) + + body := readFile(t, bodyPath) + sig := readFile(t, sigPath) + pub := readFile(t, pubPath) + + if err := VerifyBlobSignatureWithKey(body, sig, string(pub)); err != nil { + t.Fatalf("VerifyBlobSignatureWithKey: %v", err) + } +} + +func readFile(t *testing.T, p string) []byte { + t.Helper() + out, err := exec.Command("cat", p).Output() + if err != nil { + t.Fatalf("read %s: %v", p, err) + } + return out +} diff --git a/internal/vmdns/remove_test.go b/internal/vmdns/remove_test.go new file mode 100644 index 0000000..09eb302 --- /dev/null +++ b/internal/vmdns/remove_test.go @@ -0,0 +1,93 @@ +package vmdns + +import ( + "testing" +) + +func TestServerRemoveDropsRecord(t *testing.T) { + server := startTestServer(t) + if err := server.Set("devbox.vm", "172.16.0.8"); err != nil { + t.Fatalf("Set: %v", err) + } + if _, ok := server.Lookup("devbox.vm"); !ok { + t.Fatal("record missing before remove") + } + + if err := server.Remove("devbox.vm"); err != nil { + t.Fatalf("Remove: %v", err) + } + if _, ok := server.Lookup("devbox.vm"); ok { + t.Fatal("record still present after Remove") + } +} + 
+func TestServerRemoveInvalidNameIsNoop(t *testing.T) { + server := startTestServer(t) + // Non-.vm names silently normalize-fail, returning nil. + if err := server.Remove("example.com"); err != nil { + t.Fatalf("Remove: %v", err) + } +} + +func TestServerRemoveNilReceiver(t *testing.T) { + var s *Server + if err := s.Remove("anything.vm"); err != nil { + t.Fatalf("nil Remove: %v", err) + } +} + +func TestServerSetRejectsIPv6(t *testing.T) { + server := startTestServer(t) + if err := server.Set("six.vm", "::1"); err == nil { + t.Fatal("expected error for IPv6 address") + } +} + +func TestServerSetRejectsBadIP(t *testing.T) { + server := startTestServer(t) + if err := server.Set("bad.vm", "not-an-ip"); err == nil { + t.Fatal("expected parse error for bogus IP") + } +} + +func TestServerSetRejectsNonVMName(t *testing.T) { + server := startTestServer(t) + if err := server.Set("example.com", "172.16.0.1"); err == nil { + t.Fatal("expected error for non-.vm name") + } +} + +func TestServerReplaceRejectsBadIP(t *testing.T) { + server := startTestServer(t) + err := server.Replace(map[string]string{"bad.vm": "nope"}) + if err == nil { + t.Fatal("expected parse error") + } +} + +func TestServerReplaceRejectsIPv6(t *testing.T) { + server := startTestServer(t) + err := server.Replace(map[string]string{"six.vm": "::1"}) + if err == nil { + t.Fatal("expected IPv6 rejection") + } +} + +func TestServerNilLookupAndAddr(t *testing.T) { + var s *Server + if _, ok := s.Lookup("x.vm"); ok { + t.Fatal("nil Lookup should return false") + } + if got := s.Addr(); got != "" { + t.Fatalf("nil Addr = %q, want empty", got) + } + if err := s.Close(); err != nil { + t.Fatalf("nil Close: %v", err) + } + if err := s.Set("x.vm", "172.16.0.1"); err != nil { + t.Fatalf("nil Set: %v", err) + } + if err := s.Replace(map[string]string{"x.vm": "172.16.0.1"}); err != nil { + t.Fatalf("nil Replace: %v", err) + } +} diff --git a/internal/webui/assets/app.js b/internal/webui/assets/app.js deleted file mode 
100644 index 0897317..0000000 --- a/internal/webui/assets/app.js +++ /dev/null @@ -1,130 +0,0 @@ -(() => { - const operationCard = document.querySelector("[data-operation-url]"); - if (operationCard) { - const stageNode = document.getElementById("operation-stage"); - const detailNode = document.getElementById("operation-detail"); - const errorNode = document.getElementById("operation-error"); - const logNode = document.getElementById("operation-log"); - const statusUrl = operationCard.dataset.operationUrl; - const successUrl = operationCard.dataset.operationSuccess; - - const poll = async () => { - const response = await fetch(statusUrl, { headers: { Accept: "application/json" } }); - if (!response.ok) { - return; - } - const payload = await response.json(); - const op = payload.operation || {}; - if (stageNode) stageNode.textContent = op.stage || "queued"; - if (detailNode) detailNode.textContent = op.detail || ""; - if (errorNode) errorNode.textContent = op.error || ""; - if (logNode && op.build_log_path) logNode.textContent = op.build_log_path; - if (op.done && op.success && successUrl) { - window.location.assign(successUrl); - return; - } - if (!op.done) { - window.setTimeout(poll, 1000); - } - }; - window.setTimeout(poll, 800); - } - - const copyButtons = document.querySelectorAll("[data-copy-text]"); - copyButtons.forEach((button) => { - button.addEventListener("click", async () => { - try { - await navigator.clipboard.writeText(button.dataset.copyText || ""); - button.textContent = "Copied"; - window.setTimeout(() => { button.textContent = "Copy"; }, 1000); - } catch (_) {} - }); - }); - - document.querySelectorAll("form[data-confirm]").forEach((form) => { - form.addEventListener("submit", (event) => { - const message = form.dataset.confirm || "Are you sure?"; - if (!window.confirm(message)) { - event.preventDefault(); - } - }); - }); - - const logToggle = document.getElementById("log-auto-refresh"); - if (logToggle) { - const schedule = () => { - if 
(!logToggle.checked) return; - window.setTimeout(() => { - if (logToggle.checked) { - window.location.reload(); - } - }, 4000); - }; - logToggle.addEventListener("change", schedule); - schedule(); - } - - const dialog = document.getElementById("path-picker"); - if (!dialog) return; - - const listNode = document.getElementById("picker-list"); - const currentPathNode = document.getElementById("picker-current-path"); - const closeButton = document.getElementById("picker-close"); - const selectCurrentButton = document.getElementById("picker-select-current"); - let currentInput = null; - let currentKind = "file"; - let currentPath = "/"; - - const loadListing = async (path) => { - const response = await fetch(`/api/fs?path=${encodeURIComponent(path)}&kind=${encodeURIComponent(currentKind)}`, { - headers: { Accept: "application/json" } - }); - if (!response.ok) return; - const payload = await response.json(); - currentPath = payload.path; - currentPathNode.textContent = payload.path; - listNode.innerHTML = ""; - payload.entries.forEach((entry) => { - const button = document.createElement("button"); - button.type = "button"; - button.className = "picker-entry"; - button.dataset.kind = entry.kind; - button.dataset.path = entry.path; - button.innerHTML = `${entry.name}${entry.kind}`; - button.addEventListener("click", () => { - if (entry.kind === "dir" || entry.kind === "up") { - loadListing(entry.path); - return; - } - if (currentInput) { - currentInput.value = entry.path; - dialog.close(); - } - }); - listNode.appendChild(button); - }); - }; - - document.querySelectorAll("[data-picker-target]").forEach((button) => { - button.addEventListener("click", () => { - const fieldName = button.dataset.pickerTarget; - currentKind = button.dataset.pickerKind || "file"; - currentInput = document.querySelector(`input[name="${fieldName}"]`); - if (!currentInput) return; - const initialPath = currentInput.value || "/"; - dialog.showModal(); - loadListing(initialPath); - }); - }); - - 
document.querySelectorAll("[data-picker-root]").forEach((button) => { - button.addEventListener("click", () => loadListing(button.dataset.pickerRoot || "/")); - }); - - closeButton.addEventListener("click", () => dialog.close()); - selectCurrentButton.addEventListener("click", () => { - if (!currentInput) return; - currentInput.value = currentPath; - dialog.close(); - }); -})(); diff --git a/internal/webui/assets/style.css b/internal/webui/assets/style.css deleted file mode 100644 index 0b28255..0000000 --- a/internal/webui/assets/style.css +++ /dev/null @@ -1,513 +0,0 @@ -:root { - --bg: #f2eadf; - --panel: rgba(255, 252, 246, 0.92); - --panel-strong: #fffdf7; - --ink: #1f2a22; - --muted: #5f675f; - --accent: #c8622d; - --accent-strong: #9a3f14; - --success: #33643b; - --warning: #9a5b11; - --danger: #8f2f24; - --line: rgba(31, 42, 34, 0.14); - --shadow: 0 24px 60px rgba(57, 41, 24, 0.12); - --radius: 20px; -} - -* { box-sizing: border-box; } -body { - margin: 0; - font-family: "IBM Plex Sans", "Avenir Next", "Segoe UI", sans-serif; - color: var(--ink); - background: - radial-gradient(circle at top left, rgba(200, 98, 45, 0.18), transparent 28%), - radial-gradient(circle at top right, rgba(92, 141, 89, 0.14), transparent 24%), - linear-gradient(180deg, #efe1d1 0%, #f7f1ea 48%, #efe8de 100%); -} - -code, pre, input, select, button { - font-family: "IBM Plex Mono", "SFMono-Regular", monospace; -} - -a { color: inherit; text-decoration: none; } -a[href] { cursor: pointer; } -button:not(:disabled) { cursor: pointer; } - -.app-shell { - max-width: 1320px; - margin: 0 auto; - padding: 28px 20px 56px; -} - -.topbar, .content-panel, .summary-card, .banner, .detail-card, .operation-card { - backdrop-filter: blur(12px); - background: var(--panel); - box-shadow: var(--shadow); -} - -.topbar, .content-panel, .banner { - border-radius: var(--radius); -} - -.topbar { - display: flex; - justify-content: space-between; - align-items: end; - gap: 24px; - padding: 24px 28px; -} - 
-.topbar h1, .panel-head h2, .detail-card h2, .detail-card h3, .operation-card h2, .operation-card h3 { - margin: 0; - font-family: Georgia, "Iowan Old Style", serif; -} - -.eyebrow { - margin: 0 0 8px; - text-transform: uppercase; - letter-spacing: 0.16em; - font-size: 0.72rem; - color: var(--muted); -} - -.nav { - display: flex; - gap: 10px; - flex-wrap: wrap; -} - -.nav a, .button { - display: inline-flex; - align-items: center; - justify-content: center; - border-radius: 999px; - border: 1px solid transparent; - padding: 11px 16px; - transition: 160ms ease; - cursor: pointer; -} - -.nav a { - background: rgba(255, 255, 255, 0.48); -} - -.nav a.active, .nav a:hover { - background: #fff7ee; - border-color: rgba(200, 98, 45, 0.22); -} - -.banner { - margin-top: 18px; - padding: 16px 20px; - display: flex; - gap: 12px; - flex-wrap: wrap; - border: 1px solid var(--line); -} - -.banner.warning { border-color: rgba(154, 91, 17, 0.25); } -.banner.success { border-color: rgba(51, 100, 59, 0.25); } -.banner.error { border-color: rgba(143, 47, 36, 0.25); } -.banner.info { border-color: rgba(31, 42, 34, 0.18); } - -.summary-grid, .detail-grid, .split-grid, .command-grid { - display: grid; - gap: 16px; - margin-top: 20px; -} - -.summary-grid { - grid-template-columns: repeat(auto-fit, minmax(260px, 1fr)); -} - -.summary-card, .detail-card, .operation-card { - border-radius: 18px; - border: 1px solid var(--line); - padding: 18px 20px; -} - -.detail-card h2, .operation-card h2 { - margin-bottom: 12px; - font-size: 1.25rem; -} - -.summary-card p:last-child { margin: 0; color: var(--muted); } - -.resource-card { - display: grid; - gap: 14px; - padding: 20px 22px; - overflow: hidden; - position: relative; -} - -.resource-card::before { - content: ""; - position: absolute; - inset: 0; - opacity: 0.7; - pointer-events: none; -} - -.resource-card.cpu::before { - background: radial-gradient(circle at top right, rgba(200, 98, 45, 0.18), transparent 38%); -} - 
-.resource-card.memory::before { - background: radial-gradient(circle at top right, rgba(92, 141, 89, 0.16), transparent 38%); -} - -.resource-card.disk::before { - background: radial-gradient(circle at top right, rgba(31, 42, 34, 0.1), transparent 42%); -} - -.resource-head, .resource-foot { - display: flex; - justify-content: space-between; - align-items: baseline; - gap: 12px; - flex-wrap: wrap; - position: relative; - z-index: 1; -} - -.resource-card h2 { - margin: 0; - font-size: 1rem; - letter-spacing: 0.08em; - text-transform: uppercase; - color: var(--muted); -} - -.resource-ratio { - font-size: 1.8rem; - line-height: 1; - letter-spacing: -0.04em; -} - -.resource-meter { - position: relative; - z-index: 1; - height: 16px; - border-radius: 999px; - overflow: hidden; - border: 1px solid rgba(31, 42, 34, 0.12); - background: - linear-gradient(180deg, rgba(255, 255, 255, 0.95), rgba(236, 227, 216, 0.9)), - repeating-linear-gradient(90deg, rgba(31, 42, 34, 0.05) 0 32px, transparent 32px 64px); -} - -.resource-fill { - display: block; - height: 100%; - border-radius: inherit; - position: relative; -} - -.resource-fill::after { - content: ""; - position: absolute; - inset: 0; - background: repeating-linear-gradient(135deg, rgba(255, 255, 255, 0.28) 0 10px, transparent 10px 20px); -} - -.resource-card.cpu .resource-fill { - background: linear-gradient(90deg, #c8622d, #e08a4f); -} - -.resource-card.memory .resource-fill { - background: linear-gradient(90deg, #4d8155, #79ab72); -} - -.resource-card.disk .resource-fill { - background: linear-gradient(90deg, #415147, #69806f); -} - -.resource-foot { - font-size: 0.86rem; - color: var(--muted); -} - -.summary-notes { - display: flex; - flex-wrap: wrap; - gap: 10px; - margin-top: 12px; -} - -.summary-notes span { - display: inline-flex; - align-items: center; - gap: 8px; - padding: 8px 12px; - border-radius: 999px; - border: 1px solid var(--line); - background: rgba(255, 252, 246, 0.72); - color: var(--muted); -} - 
-.content-panel { - margin-top: 22px; - padding: 28px; -} - -.panel-head, .section-head { - display: flex; - justify-content: space-between; - align-items: center; - gap: 14px; - flex-wrap: wrap; -} - -.section-head { margin-bottom: 16px; } - -.muted { color: var(--muted); } -.inline-error { - background: rgba(143, 47, 36, 0.08); - color: var(--danger); - border: 1px solid rgba(143, 47, 36, 0.2); - padding: 14px 16px; - border-radius: 14px; - margin-bottom: 18px; -} - -table { - width: 100%; - border-collapse: collapse; - border: 1px solid var(--line); - border-radius: 16px; - overflow: hidden; -} - -th, td { - text-align: left; - padding: 14px 12px; - border-bottom: 1px solid var(--line); - vertical-align: top; -} - -th { - font-size: 0.78rem; - text-transform: uppercase; - letter-spacing: 0.12em; - color: var(--muted); - background: rgba(255,255,255,0.42); -} - -tr:last-child td { border-bottom: 0; } - -.table-link { - font-weight: 600; - transition: 160ms ease; - cursor: pointer; -} - -.table-link:hover { - font-weight: 700; - text-decoration: underline; -} - -.state-pill { - display: inline-flex; - align-items: center; - gap: 8px; - border-radius: 999px; - padding: 6px 10px; - font-size: 0.82rem; - border: 1px solid var(--line); -} - -.state-pill.running { color: var(--success); border-color: rgba(51, 100, 59, 0.25); } -.state-pill.stopped { color: var(--muted); } -.state-pill.error { color: var(--danger); border-color: rgba(143, 47, 36, 0.22); } - -.button { - background: var(--accent); - color: #fff8f0; - border: 1px solid rgba(0,0,0,0.04); - font-weight: 600; -} - -.button:hover { - background: var(--accent-strong); - font-weight: 700; - text-decoration: underline; -} -.button.secondary { - background: rgba(255,255,255,0.74); - color: var(--ink); - border-color: rgba(31, 42, 34, 0.12); -} -.button.danger { background: var(--danger); } -.button:disabled { opacity: 0.55; cursor: not-allowed; } - -.form-grid { - display: grid; - grid-template-columns: 
repeat(auto-fit, minmax(230px, 1fr)); - gap: 16px; -} - -.form-grid.compact { margin-top: 12px; } - -label { - display: grid; - gap: 8px; - font-size: 0.94rem; -} - -input[type="text"], input[type="number"], select { - width: 100%; - border: 1px solid rgba(31, 42, 34, 0.18); - border-radius: 14px; - padding: 12px 14px; - background: var(--panel-strong); - color: var(--ink); -} - -.checkbox { - grid-auto-flow: column; - justify-content: start; - align-items: center; -} - -.checkbox.inline { display: inline-flex; gap: 8px; } - -.stack-inline { - display: flex; - gap: 10px; - flex-wrap: wrap; -} - -.form-actions { - grid-column: 1 / -1; - display: flex; - justify-content: flex-end; - gap: 10px; -} - -.detail-grid { - grid-template-columns: repeat(auto-fit, minmax(240px, 1fr)); -} - -.split-grid { - grid-template-columns: repeat(auto-fit, minmax(360px, 1fr)); -} - -.command-grid { - grid-template-columns: repeat(auto-fit, minmax(320px, 1fr)); - margin: 18px 0; -} - -dl { - margin: 14px 0 0; - display: grid; - grid-template-columns: auto 1fr; - gap: 10px 12px; -} - -dt { color: var(--muted); } -dd { margin: 0; word-break: break-word; } - -pre { - margin: 0; - white-space: pre-wrap; - word-break: break-word; -} - -.log-output { - min-height: 260px; - padding: 16px; - border-radius: 16px; - background: #201d1a; - color: #f3eee4; - overflow: auto; -} - -.picker-field { grid-column: 1 / -1; } -.picker-input { display: flex; gap: 10px; } -.picker-input input { flex: 1; } - -.picker-dialog { - border: 0; - padding: 0; - border-radius: 22px; - width: min(960px, calc(100vw - 24px)); - max-width: 100%; -} - -.picker-dialog::backdrop { - background: rgba(17, 12, 8, 0.48); -} - -.picker-shell { - display: grid; - grid-template-columns: 220px 1fr; - min-height: 420px; -} - -.picker-sidebar { - padding: 20px; - border-right: 1px solid var(--line); - background: rgba(255,255,255,0.56); -} - -.picker-roots { - display: grid; - gap: 8px; -} - -.picker-root, .picker-entry { - display: 
flex; - width: 100%; - align-items: center; - justify-content: space-between; - gap: 12px; - border: 1px solid var(--line); - background: white; - border-radius: 12px; - padding: 10px 12px; - cursor: pointer; -} - -.picker-main { - padding: 20px; -} - -.picker-bar { - display: flex; - justify-content: space-between; - align-items: center; - gap: 14px; - flex-wrap: wrap; -} - -.picker-actions { - display: flex; - gap: 10px; -} - -.picker-list { - display: grid; - gap: 8px; - max-height: 320px; - overflow: auto; - margin-top: 16px; -} - -.picker-help { color: var(--muted); margin: 12px 0 0; } - -.operation-card { - min-height: 180px; - display: grid; - gap: 12px; - align-content: start; -} - -@media (max-width: 760px) { - .app-shell { padding: 18px 14px 40px; } - .topbar, .content-panel { padding: 20px; } - .resource-ratio { font-size: 1.45rem; } - .picker-shell { grid-template-columns: 1fr; } - .picker-sidebar { border-right: 0; border-bottom: 1px solid var(--line); } - .picker-input { flex-direction: column; } -} diff --git a/internal/webui/server.go b/internal/webui/server.go deleted file mode 100644 index 0199b41..0000000 --- a/internal/webui/server.go +++ /dev/null @@ -1,1229 +0,0 @@ -package webui - -import ( - "context" - "crypto/rand" - "embed" - "encoding/base64" - "encoding/hex" - "encoding/json" - "errors" - "fmt" - "html/template" - "io/fs" - "math" - "net/http" - "net/url" - "os" - "path/filepath" - "sort" - "strconv" - "strings" - "time" - - "banger/internal/api" - "banger/internal/model" - "banger/internal/paths" -) - -type Backend interface { - Config() model.DaemonConfig - Layout() paths.Layout - DashboardSummary(context.Context) (api.DashboardSummary, error) - ListVMs(context.Context) ([]model.VMRecord, error) - FindVM(context.Context, string) (model.VMRecord, error) - GetVMStats(context.Context, string) (model.VMRecord, model.VMStats, error) - BeginVMCreate(context.Context, api.VMCreateParams) (api.VMCreateOperation, error) - 
VMCreateStatus(context.Context, string) (api.VMCreateOperation, error) - StartVM(context.Context, string) (model.VMRecord, error) - StopVM(context.Context, string) (model.VMRecord, error) - RestartVM(context.Context, string) (model.VMRecord, error) - DeleteVM(context.Context, string) (model.VMRecord, error) - SetVM(context.Context, api.VMSetParams) (model.VMRecord, error) - PortsVM(context.Context, string) (api.VMPortsResult, error) - ListImages(context.Context) ([]model.Image, error) - FindImage(context.Context, string) (model.Image, error) - BeginImageBuild(context.Context, api.ImageBuildParams) (api.ImageBuildOperation, error) - ImageBuildStatus(context.Context, string) (api.ImageBuildOperation, error) - RegisterImage(context.Context, api.ImageRegisterParams) (model.Image, error) - PromoteImage(context.Context, string) (model.Image, error) - DeleteImage(context.Context, string) (model.Image, error) -} - -type Server struct { - backend Backend - templates *template.Template - pickerFS fs.FS -} - -type pickerRoot struct { - Label string - Path string -} - -type flashMessage struct { - Kind string - Message string -} - -type vmCreateForm struct { - Name string - ImageName string - VCPU string - Memory string - SystemOverlaySize string - WorkDiskSize string - NATEnabled bool - NoStart bool -} - -type vmSetForm struct { - VCPU string - Memory string - WorkDiskSize string - NATEnabled bool -} - -type imageBuildForm struct { - Name string - FromImage string - Size string - KernelPath string - InitrdPath string - ModulesDir string - Docker bool -} - -type imageRegisterForm struct { - Name string - RootfsPath string - WorkSeedPath string - KernelPath string - InitrdPath string - ModulesDir string - Docker bool -} - -type pageData struct { - Title string - BodyTemplate string - BodyHTML template.HTML - Section string - Summary api.DashboardSummary - Flash *flashMessage - CSRFToken string - PickerRoots []pickerRoot - MutationAllowed bool - ErrorMessage string - VMs 
[]model.VMRecord - VM model.VMRecord - VMImage model.Image - VMStats model.VMStats - VMPorts api.VMPortsResult - VMPortsError string - VMCreateForm vmCreateForm - VMSetForm vmSetForm - Images []model.Image - Image model.Image - ImageUsers int - ImageBuildForm imageBuildForm - ImageRegisterForm imageRegisterForm - LogText string - VMCreateOperation *api.VMCreateOperation - ImageBuildOperation *api.ImageBuildOperation - OperationStatusURL string - OperationSuccessURL string - OperationLogPath string - OperationKind string -} - -type fsEntry struct { - Name string `json:"name"` - Path string `json:"path"` - Kind string `json:"kind"` -} - -type fsListingResponse struct { - Path string `json:"path"` - Parent string `json:"parent,omitempty"` - Kind string `json:"kind"` - Entries []fsEntry `json:"entries"` - Roots []pickerRoot `json:"roots"` -} - -//go:embed templates/*.html assets/* -var embeddedAssets embed.FS - -func NewHandler(backend Backend) http.Handler { - tmpl := template.Must(template.New("page").Funcs(template.FuncMap{ - "shortID": shortID, - "formatBytes": formatBytes, - "formatBytesCompact": formatBytesCompact, - "formatPercent": formatPercent, - "percentOf": percentOf, - "relativeTime": relativeTime, - "formatBool": formatBool, - "stateClass": stateClass, - "findImage": findImage, - "endpointHref": endpointHref, - "sumInt64": sumInt64, - "eq": func(a, b any) bool { return fmt.Sprint(a) == fmt.Sprint(b) }, - }).ParseFS(embeddedAssets, "templates/*.html")) - staticFS, err := fs.Sub(embeddedAssets, "assets") - if err != nil { - panic(err) - } - server := &Server{ - backend: backend, - templates: tmpl, - pickerFS: staticFS, - } - mux := http.NewServeMux() - server.registerRoutes(mux) - return mux -} - -func (s *Server) registerRoutes(mux *http.ServeMux) { - mux.Handle("GET /static/", http.StripPrefix("/static/", http.FileServerFS(s.pickerFS))) - mux.HandleFunc("GET /", s.wrap(s.handleDashboard)) - mux.HandleFunc("GET /vms", s.wrap(s.handleVMList)) - 
mux.HandleFunc("GET /vms/new", s.wrap(s.handleVMNew)) - mux.HandleFunc("POST /vms", s.wrap(s.handleVMCreate)) - mux.HandleFunc("GET /vms/{id}", s.wrap(s.handleVMShow)) - mux.HandleFunc("GET /vms/{id}/logs", s.wrap(s.handleVMLogs)) - mux.HandleFunc("POST /vms/{id}/start", s.wrap(s.handleVMStart)) - mux.HandleFunc("POST /vms/{id}/stop", s.wrap(s.handleVMStop)) - mux.HandleFunc("POST /vms/{id}/restart", s.wrap(s.handleVMRestart)) - mux.HandleFunc("POST /vms/{id}/delete", s.wrap(s.handleVMDelete)) - mux.HandleFunc("POST /vms/{id}/set", s.wrap(s.handleVMSet)) - mux.HandleFunc("GET /images", s.wrap(s.handleImageList)) - mux.HandleFunc("GET /images/build", s.wrap(s.handleImageBuildForm)) - mux.HandleFunc("POST /images/build", s.wrap(s.handleImageBuild)) - mux.HandleFunc("GET /images/register", s.wrap(s.handleImageRegisterForm)) - mux.HandleFunc("POST /images/register", s.wrap(s.handleImageRegister)) - mux.HandleFunc("GET /images/{id}", s.wrap(s.handleImageShow)) - mux.HandleFunc("POST /images/{id}/promote", s.wrap(s.handleImagePromote)) - mux.HandleFunc("POST /images/{id}/delete", s.wrap(s.handleImageDelete)) - mux.HandleFunc("GET /operations/vm-create/{id}", s.wrap(s.handleVMCreateOperationPage)) - mux.HandleFunc("GET /operations/image-build/{id}", s.wrap(s.handleImageBuildOperationPage)) - mux.HandleFunc("GET /api/operations/vm-create/{id}", s.wrap(s.handleVMCreateOperationAPI)) - mux.HandleFunc("GET /api/operations/image-build/{id}", s.wrap(s.handleImageBuildOperationAPI)) - mux.HandleFunc("GET /api/fs", s.wrap(s.handleFSAPI)) -} - -func (s *Server) wrap(fn func(http.ResponseWriter, *http.Request) error) http.HandlerFunc { - return func(w http.ResponseWriter, r *http.Request) { - if err := fn(w, r); err != nil { - s.writeError(w, r, err) - } - } -} - -func (s *Server) writeError(w http.ResponseWriter, r *http.Request, err error) { - status := http.StatusInternalServerError - lower := strings.ToLower(err.Error()) - switch { - case errors.Is(err, os.ErrNotExist), 
strings.Contains(lower, "not found"): - status = http.StatusNotFound - case strings.Contains(lower, "csrf"), strings.Contains(lower, "cross-origin"): - status = http.StatusForbidden - case strings.Contains(lower, "path must"), strings.Contains(lower, "not a directory"): - status = http.StatusBadRequest - } - if status == http.StatusInternalServerError { - http.Error(w, err.Error(), status) - return - } - if renderErr := s.renderPage(w, r, status, "Not Found", "error_content", func(data *pageData) error { - data.Section = "none" - data.ErrorMessage = err.Error() - return nil - }); renderErr != nil { - http.Error(w, err.Error(), status) - } -} - -func (s *Server) renderPage(w http.ResponseWriter, r *http.Request, status int, title, body string, fill func(*pageData) error) error { - summary, err := s.backend.DashboardSummary(r.Context()) - if err != nil { - return err - } - flash := s.popFlash(w, r) - data := &pageData{ - Title: title, - BodyTemplate: body, - Summary: summary, - Flash: flash, - CSRFToken: s.ensureCSRFToken(w, r), - PickerRoots: s.pickerRoots(), - MutationAllowed: summary.Sudo.Available, - } - if fill != nil { - if err := fill(data); err != nil { - return err - } - } - var bodyHTML strings.Builder - if err := s.templates.ExecuteTemplate(&bodyHTML, body, data); err != nil { - return err - } - data.BodyHTML = template.HTML(bodyHTML.String()) - w.Header().Set("Content-Type", "text/html; charset=utf-8") - w.WriteHeader(status) - return s.templates.ExecuteTemplate(w, "page", data) -} - -func (s *Server) handleDashboard(w http.ResponseWriter, r *http.Request) error { - return s.renderPage(w, r, http.StatusOK, "Dashboard", "dashboard_content", func(data *pageData) error { - data.Section = "dashboard" - vms, err := s.backend.ListVMs(r.Context()) - if err != nil { - return err - } - images, err := s.backend.ListImages(r.Context()) - if err != nil { - return err - } - data.VMs = vms - data.Images = images - return nil - }) -} - -func (s *Server) handleVMList(w 
http.ResponseWriter, r *http.Request) error { - return s.renderPage(w, r, http.StatusOK, "VMs", "vm_list_content", func(data *pageData) error { - data.Section = "vms" - vms, err := s.backend.ListVMs(r.Context()) - if err != nil { - return err - } - images, err := s.backend.ListImages(r.Context()) - if err != nil { - return err - } - data.VMs = vms - data.Images = images - return nil - }) -} - -func (s *Server) handleVMNew(w http.ResponseWriter, r *http.Request) error { - return s.renderVMNewPage(w, r, vmCreateForm{ - VCPU: strconv.Itoa(model.DefaultVCPUCount), - Memory: strconv.Itoa(model.DefaultMemoryMiB), - SystemOverlaySize: model.FormatSizeBytes(model.DefaultSystemOverlaySize), - WorkDiskSize: model.FormatSizeBytes(model.DefaultWorkDiskSize), - }, "") -} - -func (s *Server) renderVMNewPage(w http.ResponseWriter, r *http.Request, form vmCreateForm, formErr string) error { - return s.renderPage(w, r, http.StatusOK, "Create VM", "vm_new_content", func(data *pageData) error { - data.Section = "vms" - images, err := s.backend.ListImages(r.Context()) - if err != nil { - return err - } - data.Images = images - data.VMCreateForm = form - data.ErrorMessage = formErr - return nil - }) -} - -func (s *Server) handleVMCreate(w http.ResponseWriter, r *http.Request) error { - if err := s.verifyPOST(w, r); err != nil { - return err - } - allowed, err := s.requireMutationAllowed(r.Context()) - if err != nil { - return err - } - form, params, err := s.parseVMCreateForm(r) - if err != nil { - return s.renderVMNewPage(w, r, form, err.Error()) - } - if !allowed { - return s.renderVMNewPage(w, r, form, "mutating actions are unavailable until `sudo -v` succeeds") - } - op, err := s.backend.BeginVMCreate(r.Context(), params) - if err != nil { - return s.renderVMNewPage(w, r, form, err.Error()) - } - http.Redirect(w, r, "/operations/vm-create/"+url.PathEscape(op.ID), http.StatusSeeOther) - return nil -} - -func (s *Server) handleVMShow(w http.ResponseWriter, r *http.Request) error { - 
_, vmStats, err := s.backend.GetVMStats(r.Context(), r.PathValue("id")) - if err != nil { - return err - } - vm, err := s.backend.FindVM(r.Context(), r.PathValue("id")) - if err != nil { - return err - } - image, _ := s.backend.FindImage(r.Context(), vm.ImageID) - return s.renderPage(w, r, http.StatusOK, vm.Name, "vm_show_content", func(data *pageData) error { - data.Section = "vms" - data.VM = vm - data.VMImage = image - data.VMStats = vmStats - data.VMSetForm = vmSetForm{ - VCPU: strconv.Itoa(vm.Spec.VCPUCount), - Memory: strconv.Itoa(vm.Spec.MemoryMiB), - WorkDiskSize: model.FormatSizeBytes(vm.Spec.WorkDiskSizeBytes), - NATEnabled: vm.Spec.NATEnabled, - } - if vm.State == model.VMStateRunning { - ports, err := s.backend.PortsVM(r.Context(), vm.ID) - if err != nil { - data.VMPortsError = err.Error() - } else { - data.VMPorts = ports - } - } - return nil - }) -} - -func (s *Server) handleVMLogs(w http.ResponseWriter, r *http.Request) error { - vm, err := s.backend.FindVM(r.Context(), r.PathValue("id")) - if err != nil { - return err - } - logText, err := tailFile(vm.Runtime.LogPath, 200) - if err != nil { - logText = err.Error() - } - return s.renderPage(w, r, http.StatusOK, vm.Name+" Logs", "vm_logs_content", func(data *pageData) error { - data.Section = "vms" - data.VM = vm - data.LogText = logText - return nil - }) -} - -func (s *Server) handleVMStart(w http.ResponseWriter, r *http.Request) error { - return s.runVMAction(w, r, func(ctx context.Context, id string) error { - _, err := s.backend.StartVM(ctx, id) - return err - }, "VM started") -} - -func (s *Server) handleVMStop(w http.ResponseWriter, r *http.Request) error { - return s.runVMAction(w, r, func(ctx context.Context, id string) error { - _, err := s.backend.StopVM(ctx, id) - return err - }, "VM stopped") -} - -func (s *Server) handleVMRestart(w http.ResponseWriter, r *http.Request) error { - return s.runVMAction(w, r, func(ctx context.Context, id string) error { - _, err := s.backend.RestartVM(ctx, 
id) - return err - }, "VM restarted") -} - -func (s *Server) handleVMDelete(w http.ResponseWriter, r *http.Request) error { - if err := s.verifyPOST(w, r); err != nil { - return err - } - allowed, err := s.requireMutationAllowed(r.Context()) - if err != nil { - return err - } - if !allowed { - s.setFlash(w, "error", "mutating actions are unavailable until `sudo -v` succeeds") - http.Redirect(w, r, "/vms/"+url.PathEscape(r.PathValue("id")), http.StatusSeeOther) - return nil - } - if _, err := s.backend.DeleteVM(r.Context(), r.PathValue("id")); err != nil { - s.setFlash(w, "error", err.Error()) - http.Redirect(w, r, "/vms/"+url.PathEscape(r.PathValue("id")), http.StatusSeeOther) - return nil - } - s.setFlash(w, "success", "VM deleted") - http.Redirect(w, r, "/vms", http.StatusSeeOther) - return nil -} - -func (s *Server) handleVMSet(w http.ResponseWriter, r *http.Request) error { - if err := s.verifyPOST(w, r); err != nil { - return err - } - allowed, err := s.requireMutationAllowed(r.Context()) - if err != nil { - return err - } - target := "/vms/" + url.PathEscape(r.PathValue("id")) - if !allowed { - s.setFlash(w, "error", "mutating actions are unavailable until `sudo -v` succeeds") - http.Redirect(w, r, target, http.StatusSeeOther) - return nil - } - vm, err := s.backend.FindVM(r.Context(), r.PathValue("id")) - if err != nil { - return err - } - params, err := s.parseVMSetForm(r, vm) - if err != nil { - s.setFlash(w, "error", err.Error()) - http.Redirect(w, r, target, http.StatusSeeOther) - return nil - } - if params.VCPUCount == nil && params.MemoryMiB == nil && params.WorkDiskSize == "" && params.NATEnabled == nil { - s.setFlash(w, "info", "No VM settings changed") - http.Redirect(w, r, target, http.StatusSeeOther) - return nil - } - if _, err := s.backend.SetVM(r.Context(), params); err != nil { - s.setFlash(w, "error", err.Error()) - http.Redirect(w, r, target, http.StatusSeeOther) - return nil - } - s.setFlash(w, "success", "VM settings updated") - 
http.Redirect(w, r, target, http.StatusSeeOther) - return nil -} - -func (s *Server) runVMAction(w http.ResponseWriter, r *http.Request, action func(context.Context, string) error, successMessage string) error { - if err := s.verifyPOST(w, r); err != nil { - return err - } - allowed, err := s.requireMutationAllowed(r.Context()) - if err != nil { - return err - } - target := "/vms/" + url.PathEscape(r.PathValue("id")) - if !allowed { - s.setFlash(w, "error", "mutating actions are unavailable until `sudo -v` succeeds") - http.Redirect(w, r, target, http.StatusSeeOther) - return nil - } - if err := action(r.Context(), r.PathValue("id")); err != nil { - s.setFlash(w, "error", err.Error()) - http.Redirect(w, r, target, http.StatusSeeOther) - return nil - } - s.setFlash(w, "success", successMessage) - http.Redirect(w, r, target, http.StatusSeeOther) - return nil -} - -func (s *Server) handleImageList(w http.ResponseWriter, r *http.Request) error { - return s.renderPage(w, r, http.StatusOK, "Images", "image_list_content", func(data *pageData) error { - data.Section = "images" - images, err := s.backend.ListImages(r.Context()) - if err != nil { - return err - } - data.Images = images - return nil - }) -} - -func (s *Server) handleImageBuildForm(w http.ResponseWriter, r *http.Request) error { - return s.renderImageBuildPage(w, r, imageBuildForm{}, "") -} - -func (s *Server) renderImageBuildPage(w http.ResponseWriter, r *http.Request, form imageBuildForm, formErr string) error { - return s.renderPage(w, r, http.StatusOK, "Build Image", "image_build_content", func(data *pageData) error { - data.Section = "images" - data.ImageBuildForm = form - data.ErrorMessage = formErr - return nil - }) -} - -func (s *Server) handleImageBuild(w http.ResponseWriter, r *http.Request) error { - if err := s.verifyPOST(w, r); err != nil { - return err - } - allowed, err := s.requireMutationAllowed(r.Context()) - if err != nil { - return err - } - form, params, err := s.parseImageBuildForm(r) - 
if err != nil {
- return s.renderImageBuildPage(w, r, form, err.Error())
- }
- if !allowed {
- return s.renderImageBuildPage(w, r, form, "mutating actions are unavailable until `sudo -v` succeeds")
- }
- op, err := s.backend.BeginImageBuild(r.Context(), params)
- if err != nil {
- return s.renderImageBuildPage(w, r, form, err.Error())
- }
- http.Redirect(w, r, "/operations/image-build/"+url.PathEscape(op.ID), http.StatusSeeOther)
- return nil
-}
-
-func (s *Server) handleImageRegisterForm(w http.ResponseWriter, r *http.Request) error {
- return s.renderImageRegisterPage(w, r, imageRegisterForm{}, "")
-}
-
-func (s *Server) renderImageRegisterPage(w http.ResponseWriter, r *http.Request, form imageRegisterForm, formErr string) error {
- return s.renderPage(w, r, http.StatusOK, "Register Image", "image_register_content", func(data *pageData) error {
- data.Section = "images"
- data.ImageRegisterForm = form
- data.ErrorMessage = formErr
- return nil
- })
-}
-
-func (s *Server) handleImageRegister(w http.ResponseWriter, r *http.Request) error {
- if err := s.verifyPOST(w, r); err != nil {
- return err
- }
- allowed, err := s.requireMutationAllowed(r.Context())
- if err != nil {
- return err
- }
- form, params, err := s.parseImageRegisterForm(r)
- if err != nil {
- return s.renderImageRegisterPage(w, r, form, err.Error())
- }
- if !allowed {
- return s.renderImageRegisterPage(w, r, form, "mutating actions are unavailable until `sudo -v` succeeds")
- }
- image, err := s.backend.RegisterImage(r.Context(), params)
- if err != nil {
- return s.renderImageRegisterPage(w, r, form, err.Error())
- }
- s.setFlash(w, "success", "Image registered")
- http.Redirect(w, r, "/images/"+url.PathEscape(image.ID), http.StatusSeeOther)
- return nil
-}
-
-func (s *Server) handleImageShow(w http.ResponseWriter, r *http.Request) error {
- image, err := s.backend.FindImage(r.Context(), r.PathValue("id"))
- if err != nil {
- return err
- }
- vms, err := s.backend.ListVMs(r.Context())
- if err != nil {
- return err
- }
- userCount := 0
- for _, vm := range vms {
- if vm.ImageID == image.ID {
- userCount++
- }
- }
- return s.renderPage(w, r, http.StatusOK, image.Name, "image_show_content", func(data *pageData) error {
- data.Section = "images"
- data.Image = image
- data.ImageUsers = userCount
- return nil
- })
-}
-
-func (s *Server) handleImagePromote(w http.ResponseWriter, r *http.Request) error {
- if err := s.verifyPOST(w, r); err != nil {
- return err
- }
- allowed, err := s.requireMutationAllowed(r.Context())
- if err != nil {
- return err
- }
- target := "/images/" + url.PathEscape(r.PathValue("id"))
- if !allowed {
- s.setFlash(w, "error", "mutating actions are unavailable until `sudo -v` succeeds")
- http.Redirect(w, r, target, http.StatusSeeOther)
- return nil
- }
- if _, err := s.backend.PromoteImage(r.Context(), r.PathValue("id")); err != nil {
- s.setFlash(w, "error", err.Error())
- http.Redirect(w, r, target, http.StatusSeeOther)
- return nil
- }
- s.setFlash(w, "success", "Image promoted")
- http.Redirect(w, r, target, http.StatusSeeOther)
- return nil
-}
-
-func (s *Server) handleImageDelete(w http.ResponseWriter, r *http.Request) error {
- if err := s.verifyPOST(w, r); err != nil {
- return err
- }
- allowed, err := s.requireMutationAllowed(r.Context())
- if err != nil {
- return err
- }
- target := "/images/" + url.PathEscape(r.PathValue("id"))
- if !allowed {
- s.setFlash(w, "error", "mutating actions are unavailable until `sudo -v` succeeds")
- http.Redirect(w, r, target, http.StatusSeeOther)
- return nil
- }
- if _, err := s.backend.DeleteImage(r.Context(), r.PathValue("id")); err != nil {
- s.setFlash(w, "error", err.Error())
- http.Redirect(w, r, target, http.StatusSeeOther)
- return nil
- }
- s.setFlash(w, "success", "Image deleted")
- http.Redirect(w, r, "/images", http.StatusSeeOther)
- return nil
-}
-
-func (s *Server) handleVMCreateOperationPage(w http.ResponseWriter, r *http.Request) error {
- op, err := s.backend.VMCreateStatus(r.Context(), r.PathValue("id"))
- if err != nil {
- return err
- }
- return s.renderPage(w, r, http.StatusOK, "Creating VM", "operation_content", func(data *pageData) error {
- data.Section = "vms"
- data.OperationKind = "vm"
- data.VMCreateOperation = &op
- data.OperationStatusURL = "/api/operations/vm-create/" + url.PathEscape(op.ID)
- if op.VMID != "" {
- data.OperationSuccessURL = "/vms/" + url.PathEscape(op.VMID)
- }
- return nil
- })
-}
-
-func (s *Server) handleImageBuildOperationPage(w http.ResponseWriter, r *http.Request) error {
- op, err := s.backend.ImageBuildStatus(r.Context(), r.PathValue("id"))
- if err != nil {
- return err
- }
- return s.renderPage(w, r, http.StatusOK, "Building Image", "operation_content", func(data *pageData) error {
- data.Section = "images"
- data.OperationKind = "image"
- data.ImageBuildOperation = &op
- data.OperationStatusURL = "/api/operations/image-build/" + url.PathEscape(op.ID)
- if op.ImageID != "" {
- data.OperationSuccessURL = "/images/" + url.PathEscape(op.ImageID)
- }
- data.OperationLogPath = op.BuildLogPath
- return nil
- })
-}
-
-func (s *Server) handleVMCreateOperationAPI(w http.ResponseWriter, r *http.Request) error {
- op, err := s.backend.VMCreateStatus(r.Context(), r.PathValue("id"))
- if err != nil {
- return err
- }
- return writeJSON(w, api.VMCreateStatusResult{Operation: op})
-}
-
-func (s *Server) handleImageBuildOperationAPI(w http.ResponseWriter, r *http.Request) error {
- op, err := s.backend.ImageBuildStatus(r.Context(), r.PathValue("id"))
- if err != nil {
- return err
- }
- return writeJSON(w, api.ImageBuildStatusResult{Operation: op})
-}
-
-func (s *Server) handleFSAPI(w http.ResponseWriter, r *http.Request) error {
- path := strings.TrimSpace(r.URL.Query().Get("path"))
- if path == "" {
- path = s.pickerRoots()[0].Path
- }
- path = filepath.Clean(path)
- if !filepath.IsAbs(path) {
- return fmt.Errorf("path must be absolute")
- }
- info, err := os.Stat(path)
- if err != nil {
- return err
- }
- if !info.IsDir() {
- return fmt.Errorf("%s is not a directory", path)
- }
- kind := r.URL.Query().Get("kind")
- if kind != "dir" {
- kind = "file"
- }
- entries, err := os.ReadDir(path)
- if err != nil {
- return err
- }
- result := fsListingResponse{
- Path: path,
- Kind: kind,
- Entries: make([]fsEntry, 0, len(entries)+1),
- Roots: s.pickerRoots(),
- }
- parent := filepath.Dir(path)
- if parent != path {
- result.Parent = parent
- result.Entries = append(result.Entries, fsEntry{Name: "..", Path: parent, Kind: "up"})
- }
- for _, entry := range entries {
- entryKind := "file"
- if entry.IsDir() {
- entryKind = "dir"
- }
- result.Entries = append(result.Entries, fsEntry{
- Name: entry.Name(),
- Path: filepath.Join(path, entry.Name()),
- Kind: entryKind,
- })
- }
- sort.Slice(result.Entries, func(i, j int) bool {
- left, right := result.Entries[i], result.Entries[j]
- leftRank := kindRank(left.Kind)
- rightRank := kindRank(right.Kind)
- if leftRank != rightRank {
- return leftRank < rightRank
- }
- return strings.ToLower(left.Name) < strings.ToLower(right.Name)
- })
- return writeJSON(w, result)
-}
-
-func kindRank(kind string) int {
- switch kind {
- case "up":
- return 0
- case "dir":
- return 1
- default:
- return 2
- }
-}
-
-func (s *Server) pickerRoots() []pickerRoot {
- seen := map[string]struct{}{}
- roots := []pickerRoot{{Label: "Filesystem", Path: "/"}}
- if home, err := os.UserHomeDir(); err == nil && strings.TrimSpace(home) != "" {
- roots = append(roots, pickerRoot{Label: "Home", Path: home})
- }
- layout := s.backend.Layout()
- if layout.StateDir != "" {
- roots = append(roots, pickerRoot{Label: "State", Path: layout.StateDir})
- }
- result := make([]pickerRoot, 0, len(roots))
- for _, root := range roots {
- root.Path = filepath.Clean(root.Path)
- if _, ok := seen[root.Path]; ok {
- continue
- }
- seen[root.Path] = struct{}{}
- result = append(result, root)
- }
- return result
-}
-
-func (s *Server) verifyPOST(w http.ResponseWriter, r *http.Request) error {
- if r.Method != http.MethodPost {
- return nil
- }
- if err := r.ParseForm(); err != nil {
- return err
- }
- if err := verifySameOrigin(r); err != nil {
- return err
- }
- tokenCookie, err := r.Cookie("banger_csrf")
- if err != nil {
- return errors.New("missing csrf cookie")
- }
- if tokenCookie.Value == "" || r.FormValue("csrf_token") != tokenCookie.Value {
- return errors.New("csrf token mismatch")
- }
- return nil
-}
-
-func verifySameOrigin(r *http.Request) error {
- for _, raw := range []string{r.Header.Get("Origin"), r.Header.Get("Referer")} {
- if strings.TrimSpace(raw) == "" {
- continue
- }
- parsed, err := url.Parse(raw)
- if err != nil {
- return fmt.Errorf("invalid origin: %w", err)
- }
- if parsed.Host != r.Host {
- return errors.New("cross-origin POST rejected")
- }
- return nil
- }
- return nil
-}
-
-func (s *Server) ensureCSRFToken(w http.ResponseWriter, r *http.Request) string {
- if cookie, err := r.Cookie("banger_csrf"); err == nil && strings.TrimSpace(cookie.Value) != "" {
- return cookie.Value
- }
- buf := make([]byte, 32)
- if _, err := rand.Read(buf); err != nil {
- panic(err)
- }
- token := hex.EncodeToString(buf)
- http.SetCookie(w, &http.Cookie{
- Name: "banger_csrf",
- Value: token,
- Path: "/",
- HttpOnly: true,
- SameSite: http.SameSiteLaxMode,
- })
- return token
-}
-
-func (s *Server) setFlash(w http.ResponseWriter, kind, message string) {
- payload := base64.RawURLEncoding.EncodeToString([]byte(kind + "\n" + message))
- http.SetCookie(w, &http.Cookie{
- Name: "banger_flash",
- Value: payload,
- Path: "/",
- HttpOnly: true,
- SameSite: http.SameSiteLaxMode,
- })
-}
-
-func (s *Server) popFlash(w http.ResponseWriter, r *http.Request) *flashMessage {
- cookie, err := r.Cookie("banger_flash")
- if err != nil || cookie.Value == "" {
- return nil
- }
- http.SetCookie(w, &http.Cookie{
- Name: "banger_flash",
- Value: "",
- Path: "/",
- MaxAge: -1,
- HttpOnly: true,
- SameSite: http.SameSiteLaxMode,
- })
- data, err := base64.RawURLEncoding.DecodeString(cookie.Value)
- if err != nil {
- return nil
- }
- parts := strings.SplitN(string(data), "\n", 2)
- if len(parts) != 2 {
- return nil
- }
- return &flashMessage{Kind: parts[0], Message: parts[1]}
-}
-
-func (s *Server) requireMutationAllowed(ctx context.Context) (bool, error) {
- summary, err := s.backend.DashboardSummary(ctx)
- if err != nil {
- return false, err
- }
- return summary.Sudo.Available, nil
-}
-
-func (s *Server) parseVMCreateForm(r *http.Request) (vmCreateForm, api.VMCreateParams, error) {
- if err := s.verifyPOST(nilResponseWriter{}, r); err != nil {
- return vmCreateForm{}, api.VMCreateParams{}, err
- }
- form := vmCreateForm{
- Name: strings.TrimSpace(r.FormValue("name")),
- ImageName: strings.TrimSpace(r.FormValue("image_name")),
- VCPU: strings.TrimSpace(r.FormValue("vcpu")),
- Memory: strings.TrimSpace(r.FormValue("memory")),
- SystemOverlaySize: strings.TrimSpace(r.FormValue("system_overlay_size")),
- WorkDiskSize: strings.TrimSpace(r.FormValue("work_disk_size")),
- NATEnabled: r.FormValue("nat_enabled") == "on",
- NoStart: r.FormValue("no_start") == "on",
- }
- vcpu, err := strconv.Atoi(form.VCPU)
- if err != nil {
- return form, api.VMCreateParams{}, errors.New("vcpu must be an integer")
- }
- memory, err := strconv.Atoi(form.Memory)
- if err != nil {
- return form, api.VMCreateParams{}, errors.New("memory must be an integer")
- }
- params := api.VMCreateParams{
- Name: form.Name,
- ImageName: form.ImageName,
- VCPUCount: &vcpu,
- MemoryMiB: &memory,
- SystemOverlaySize: form.SystemOverlaySize,
- WorkDiskSize: form.WorkDiskSize,
- NATEnabled: form.NATEnabled,
- NoStart: form.NoStart,
- }
- return form, params, nil
-}
-
-func (s *Server) parseVMSetForm(r *http.Request, vm model.VMRecord) (api.VMSetParams, error) {
- if err := s.verifyPOST(nilResponseWriter{}, r); err != nil {
- return api.VMSetParams{}, err
- }
- params := api.VMSetParams{IDOrName: vm.ID}
- if raw := strings.TrimSpace(r.FormValue("vcpu")); raw != "" {
- value, err := strconv.Atoi(raw)
- if err != nil {
- return api.VMSetParams{}, errors.New("vcpu must be an integer")
- }
- if value != vm.Spec.VCPUCount {
- params.VCPUCount = &value
- }
- }
- if raw := strings.TrimSpace(r.FormValue("memory")); raw != "" {
- value, err := strconv.Atoi(raw)
- if err != nil {
- return api.VMSetParams{}, errors.New("memory must be an integer")
- }
- if value != vm.Spec.MemoryMiB {
- params.MemoryMiB = &value
- }
- }
- if raw := strings.TrimSpace(r.FormValue("work_disk_size")); raw != "" && raw != model.FormatSizeBytes(vm.Spec.WorkDiskSizeBytes) {
- params.WorkDiskSize = raw
- }
- if raw := strings.TrimSpace(r.FormValue("nat_enabled")); raw != "" {
- value := raw == "true"
- if value != vm.Spec.NATEnabled {
- params.NATEnabled = &value
- }
- }
- return params, nil
-}
-
-func (s *Server) parseImageBuildForm(r *http.Request) (imageBuildForm, api.ImageBuildParams, error) {
- if err := s.verifyPOST(nilResponseWriter{}, r); err != nil {
- return imageBuildForm{}, api.ImageBuildParams{}, err
- }
- form := imageBuildForm{
- Name: strings.TrimSpace(r.FormValue("name")),
- FromImage: strings.TrimSpace(r.FormValue("from_image")),
- Size: strings.TrimSpace(r.FormValue("size")),
- KernelPath: strings.TrimSpace(r.FormValue("kernel_path")),
- InitrdPath: strings.TrimSpace(r.FormValue("initrd_path")),
- ModulesDir: strings.TrimSpace(r.FormValue("modules_dir")),
- Docker: r.FormValue("docker") == "on",
- }
- params := api.ImageBuildParams{
- Name: form.Name,
- FromImage: form.FromImage,
- Size: form.Size,
- KernelPath: form.KernelPath,
- InitrdPath: form.InitrdPath,
- ModulesDir: form.ModulesDir,
- Docker: form.Docker,
- }
- return form, params, nil
-}
-
-func (s *Server) parseImageRegisterForm(r *http.Request) (imageRegisterForm, api.ImageRegisterParams, error) {
- if err := s.verifyPOST(nilResponseWriter{}, r); err != nil {
- return imageRegisterForm{}, api.ImageRegisterParams{}, err
- }
- form := imageRegisterForm{
- Name: strings.TrimSpace(r.FormValue("name")),
- RootfsPath: strings.TrimSpace(r.FormValue("rootfs_path")),
- WorkSeedPath: strings.TrimSpace(r.FormValue("work_seed_path")),
- KernelPath: strings.TrimSpace(r.FormValue("kernel_path")),
- InitrdPath: strings.TrimSpace(r.FormValue("initrd_path")),
- ModulesDir: strings.TrimSpace(r.FormValue("modules_dir")),
- Docker: r.FormValue("docker") == "on",
- }
- params := api.ImageRegisterParams{
- Name: form.Name,
- RootfsPath: form.RootfsPath,
- WorkSeedPath: form.WorkSeedPath,
- KernelPath: form.KernelPath,
- InitrdPath: form.InitrdPath,
- ModulesDir: form.ModulesDir,
- Docker: form.Docker,
- }
- return form, params, nil
-}
-
-type nilResponseWriter struct{}
-
-func (nilResponseWriter) Header() http.Header { return http.Header{} }
-func (nilResponseWriter) Write([]byte) (int, error) { return 0, nil }
-func (nilResponseWriter) WriteHeader(statusCode int) {}
-
-func writeJSON(w http.ResponseWriter, value any) error {
- w.Header().Set("Content-Type", "application/json")
- return json.NewEncoder(w).Encode(value)
-}
-
-func tailFile(path string, maxLines int) (string, error) {
- if strings.TrimSpace(path) == "" {
- return "", errors.New("log path is unavailable")
- }
- data, err := os.ReadFile(path)
- if err != nil {
- return "", err
- }
- lines := strings.Split(strings.TrimRight(string(data), "\n"), "\n")
- if maxLines > 0 && len(lines) > maxLines {
- lines = lines[len(lines)-maxLines:]
- }
- return strings.Join(lines, "\n"), nil
-}
-
-func findImage(images []model.Image, id string) model.Image {
- for _, image := range images {
- if image.ID == id {
- return image
- }
- }
- return model.Image{}
-}
-
-func endpointHref(endpoint string) string {
- endpoint = strings.TrimSpace(endpoint)
- if strings.HasPrefix(endpoint, "http://") || strings.HasPrefix(endpoint, "https://") {
- return endpoint
- }
- return ""
-}
-
-func shortID(id string) string {
- if len(id) <= 12 {
- return id
- }
- return id[:12]
-}
-
-func sumInt64(values ...int64) int64 {
- var total int64
- for _, value := range values {
- total += value
- }
- return total
-}
-
-func formatBytes(bytes int64) string {
- const (
- ki = 1024
- mi = ki * 1024
- gi = mi * 1024
- ti = gi * 1024
- )
- switch {
- case bytes >= ti:
- return fmt.Sprintf("%.1f TiB", float64(bytes)/float64(ti))
- case bytes >= gi:
- return fmt.Sprintf("%.1f GiB", float64(bytes)/float64(gi))
- case bytes >= mi:
- return fmt.Sprintf("%.1f MiB", float64(bytes)/float64(mi))
- case bytes >= ki:
- return fmt.Sprintf("%.1f KiB", float64(bytes)/float64(ki))
- default:
- return fmt.Sprintf("%d B", bytes)
- }
-}
-
-func formatBytesCompact(bytes int64) string {
- const (
- ki = 1024
- mi = ki * 1024
- gi = mi * 1024
- ti = gi * 1024
- )
- type unit struct {
- size int64
- suffix string
- }
- units := []unit{
- {size: ti, suffix: "T"},
- {size: gi, suffix: "G"},
- {size: mi, suffix: "M"},
- {size: ki, suffix: "K"},
- }
- abs := bytes
- if abs < 0 {
- abs = -abs
- }
- for _, candidate := range units {
- if abs >= candidate.size {
- value := float64(bytes) / float64(candidate.size)
- if math.Abs(value-math.Round(value)) < 0.05 {
- return fmt.Sprintf("%.0f%s", math.Round(value), candidate.suffix)
- }
- return fmt.Sprintf("%.1f%s", value, candidate.suffix)
- }
- }
- return fmt.Sprintf("%dB", bytes)
-}
-
-func percentOf(used, total any) int {
- totalValue := numericValue(total)
- if totalValue <= 0 {
- return 0
- }
- usedValue := numericValue(used)
- percent := int(math.Round((usedValue / totalValue) * 100))
- switch {
- case percent < 0:
- return 0
- case percent > 100:
- return 100
- default:
- return percent
- }
-}
-
-func numericValue(value any) float64 {
- switch typed := value.(type) {
- case int:
- return float64(typed)
- case int8:
- return float64(typed)
- case int16:
- return float64(typed)
- case int32:
- return float64(typed)
- case int64:
- return float64(typed)
- case uint:
- return float64(typed)
- case uint8:
- return float64(typed)
- case uint16:
- return float64(typed)
- case uint32:
- return float64(typed)
- case uint64:
- return float64(typed)
- case float32:
- return float64(typed)
- case float64:
- return typed
- default:
- return 0
- }
-}
-
-func formatPercent(value float64) string {
- return fmt.Sprintf("%.1f%%", value)
-}
-
-func relativeTime(ts time.Time) string {
- if ts.IsZero() {
- return "-"
- }
- delta := time.Since(ts)
- switch {
- case delta < time.Minute:
- return "just now"
- case delta < time.Hour:
- return fmt.Sprintf("%d minutes ago", int(delta.Minutes()))
- case delta < 24*time.Hour:
- return fmt.Sprintf("%d hours ago", int(delta.Hours()))
- default:
- return fmt.Sprintf("%d days ago", int(delta.Hours()/24))
- }
-}
-
-func formatBool(value bool) string {
- if value {
- return "yes"
- }
- return "no"
-}
-
-func stateClass(state model.VMState) string {
- switch state {
- case model.VMStateRunning:
- return "running"
- case model.VMStateStopped:
- return "stopped"
- case model.VMStateError:
- return "error"
- default:
- return "created"
- }
-}
diff --git a/internal/webui/server_test.go b/internal/webui/server_test.go
deleted file mode 100644
index bbe6f0c..0000000
--- a/internal/webui/server_test.go
+++ /dev/null
@@ -1,231 +0,0 @@
-package webui
-
-import (
- "context"
- "io"
- "net/http"
- "net/http/httptest"
- "net/url"
- "os"
- "path/filepath"
- "strings"
- "testing"
-
- "banger/internal/api"
- "banger/internal/model"
- "banger/internal/paths"
-)
-
-type fakeBackend struct {
- layout paths.Layout
- config model.DaemonConfig
- summary api.DashboardSummary
- vms []model.VMRecord
- images []model.Image
- vm model.VMRecord
- image model.Image
- ports api.VMPortsResult
- createOp api.VMCreateOperation
- buildOp api.ImageBuildOperation
-}
-
-func (f fakeBackend) Config() model.DaemonConfig { return f.config }
-func (f fakeBackend) Layout() paths.Layout { return f.layout }
-func (f fakeBackend) DashboardSummary(context.Context) (api.DashboardSummary, error) {
- return f.summary, nil
-}
-func (f fakeBackend) ListVMs(context.Context) ([]model.VMRecord, error) { return f.vms, nil }
-func (f fakeBackend) FindVM(context.Context, string) (model.VMRecord, error) { return f.vm, nil }
-func (f fakeBackend) GetVMStats(context.Context, string) (model.VMRecord, model.VMStats, error) {
- return f.vm, f.vm.Stats, nil
-}
-func (f fakeBackend) BeginVMCreate(context.Context, api.VMCreateParams) (api.VMCreateOperation, error) {
- return f.createOp, nil
-}
-func (f fakeBackend) VMCreateStatus(context.Context, string) (api.VMCreateOperation, error) {
- return f.createOp, nil
-}
-func (f fakeBackend) StartVM(context.Context, string) (model.VMRecord, error) { return f.vm, nil }
-func (f fakeBackend) StopVM(context.Context, string) (model.VMRecord, error) { return f.vm, nil }
-func (f fakeBackend) RestartVM(context.Context, string) (model.VMRecord, error) { return f.vm, nil }
-func (f fakeBackend) DeleteVM(context.Context, string) (model.VMRecord, error) { return f.vm, nil }
-func (f fakeBackend) SetVM(context.Context, api.VMSetParams) (model.VMRecord, error) {
- return f.vm, nil
-}
-func (f fakeBackend) PortsVM(context.Context, string) (api.VMPortsResult, error) { return f.ports, nil }
-func (f fakeBackend) ListImages(context.Context) ([]model.Image, error) { return f.images, nil }
-func (f fakeBackend) FindImage(context.Context, string) (model.Image, error) { return f.image, nil }
-func (f fakeBackend) BeginImageBuild(context.Context, api.ImageBuildParams) (api.ImageBuildOperation, error) {
- return f.buildOp, nil
-}
-func (f fakeBackend) ImageBuildStatus(context.Context, string) (api.ImageBuildOperation, error) {
- return f.buildOp, nil
-}
-func (f fakeBackend) RegisterImage(context.Context, api.ImageRegisterParams) (model.Image, error) {
- return f.image, nil
-}
-func (f fakeBackend) PromoteImage(context.Context, string) (model.Image, error) { return f.image, nil }
-func (f fakeBackend) DeleteImage(context.Context, string) (model.Image, error) { return f.image, nil }
-
-func TestDashboardPageRendersSummaryAndTables(t *testing.T) {
- backend := fakeBackend{
- layout: paths.Layout{StateDir: t.TempDir()},
- config: model.DaemonConfig{SSHKeyPath: "/tmp/id"},
- summary: api.DashboardSummary{
- Host: api.HostSummary{CPUCount: 8, TotalMemoryBytes: 16 << 30, StateFilesystemFreeBytes: 9 << 30, StateFilesystemTotalBytes: 20 << 30},
- Sudo: api.SudoStatus{Available: true, Command: "sudo -v"},
- Banger: api.BangerSummary{
- VMCount: 1, RunningVMCount: 1, ImageCount: 1, ManagedImageCount: 1, ConfiguredVCPUCount: 2,
- ConfiguredMemoryBytes: 1 << 30,
- ConfiguredDiskBytes: 8 << 30,
- UsedWorkDiskBytes: 3 << 30,
- },
- },
- vms: []model.VMRecord{{ID: "vm-1", Name: "smth", State: model.VMStateRunning, CreatedAt: model.Now(), Runtime: model.VMRuntime{GuestIP: "172.16.0.2"}, Spec: model.VMSpec{VCPUCount: 2, MemoryMiB: 1024, WorkDiskSizeBytes: 8 << 30}}},
- images: []model.Image{{ID: "img-1", Name: "void-exp", Managed: true, RootfsPath: "/tmp/rootfs.ext4", CreatedAt: model.Now()}},
- }
-
- req := httptest.NewRequest(http.MethodGet, "/", nil)
- rec := httptest.NewRecorder()
- NewHandler(backend).ServeHTTP(rec, req)
-
- if rec.Code != http.StatusOK {
- t.Fatalf("status = %d, want 200", rec.Code)
- }
- body := rec.Body.String()
- for _, want := range []string{"vCPU", "2 / 8", "1G / 16G", "8G / 20G", "9G free", "smth", "void-exp", "Create VM"} {
- if !strings.Contains(body, want) {
- t.Fatalf("body missing %q\n%s", want, body)
- }
- }
- if len(rec.Result().Cookies()) == 0 {
- t.Fatal("expected csrf cookie to be set")
- }
-}
-
-func TestVMActionRejectsMissingCSRF(t *testing.T) {
- backend := fakeBackend{
- layout: paths.Layout{StateDir: t.TempDir()},
- summary: api.DashboardSummary{Sudo: api.SudoStatus{Available: true}},
- vm: model.VMRecord{ID: "vm-1", Name: "smth"},
- }
- req := httptest.NewRequest(http.MethodPost, "/vms/vm-1/start", strings.NewReader(""))
- req.Header.Set("Origin", "http://example.com")
- rec := httptest.NewRecorder()
- NewHandler(backend).ServeHTTP(rec, req)
- if rec.Code != http.StatusForbidden {
- t.Fatalf("status = %d, want 403", rec.Code)
- }
-}
-
-func TestFSAPIListsEntries(t *testing.T) {
- dir := t.TempDir()
- if err := os.Mkdir(filepath.Join(dir, "nested"), 0o755); err != nil {
- t.Fatalf("mkdir: %v", err)
- }
- if err := os.WriteFile(filepath.Join(dir, "rootfs.ext4"), []byte("data"), 0o644); err != nil {
- t.Fatalf("write: %v", err)
- }
- backend := fakeBackend{
- layout: paths.Layout{StateDir: dir},
- summary: api.DashboardSummary{Sudo: api.SudoStatus{Available: true}},
- }
-
- req := httptest.NewRequest(http.MethodGet, "/api/fs?path="+url.QueryEscape(dir)+"&kind=file", nil)
- rec := httptest.NewRecorder()
- NewHandler(backend).ServeHTTP(rec, req)
-
- if rec.Code != http.StatusOK {
- t.Fatalf("status = %d, want 200", rec.Code)
- }
- data, err := io.ReadAll(rec.Body)
- if err != nil {
- t.Fatalf("ReadAll: %v", err)
- }
- body := string(data)
- for _, want := range []string{"rootfs.ext4", "nested"} {
- if !strings.Contains(body, want) {
- t.Fatalf("body missing %q\n%s", want, body)
- }
- }
-}
-
-func TestVMShowPageRendersRunningActions(t *testing.T) {
- backend := fakeBackend{
- layout: paths.Layout{StateDir: t.TempDir()},
- config: model.DaemonConfig{SSHKeyPath: "/tmp/id"},
- summary: api.DashboardSummary{Sudo: api.SudoStatus{Available: true, Command: "sudo -v"}},
- vm: model.VMRecord{
- ID: "vm-1",
- Name: "smth",
- State: model.VMStateRunning,
- Runtime: model.VMRuntime{
- GuestIP: "172.16.0.2",
- },
- Spec: model.VMSpec{
- VCPUCount: 2,
- MemoryMiB: 1024,
- WorkDiskSizeBytes: 8 << 30,
- },
- Stats: model.VMStats{
- CPUPercent: 12.5,
- RSSBytes: 64 << 20,
- SystemOverlayBytes: 2 << 20,
- WorkDiskBytes: 32 << 20,
- },
- },
- image: model.Image{ID: "img-1", Name: "void-exp"},
- ports: api.VMPortsResult{
- Name: "smth",
- Ports: []api.VMPort{
- {Proto: "tcp", Port: 4096, Endpoint: "http://172.16.0.2:4096", Process: "opencode"},
- },
- },
- }
-
- req := httptest.NewRequest(http.MethodGet, "/vms/vm-1", nil)
- rec := httptest.NewRecorder()
- NewHandler(backend).ServeHTTP(rec, req)
-
- if rec.Code != http.StatusOK {
- t.Fatalf("status = %d, want 200", rec.Code)
- }
- body := rec.Body.String()
- for _, want := range []string{"Stop", "Restart", "href=\"http://172.16.0.2:4096\"", "data-confirm=\"Stop VM smth?\"", "data-confirm=\"Delete VM smth?\""} {
- if !strings.Contains(body, want) {
- t.Fatalf("body missing %q\n%s", want, body)
- }
- }
- for _, unwanted := range []string{"opencode attach", "root@172.16.0.2"} {
- if strings.Contains(body, unwanted) {
- t.Fatalf("body unexpectedly contains %q\n%s", unwanted, body)
- }
- }
-}
-
-func TestVMListShowsImageNameAndLink(t *testing.T) {
- backend := fakeBackend{
- layout: paths.Layout{StateDir: t.TempDir()},
- summary: api.DashboardSummary{Sudo: api.SudoStatus{Available: true}},
- vms: []model.VMRecord{
- {ID: "vm-1", Name: "smth", ImageID: "img-1", State: model.VMStateRunning, CreatedAt: model.Now(), Spec: model.VMSpec{VCPUCount: 2, MemoryMiB: 1024, WorkDiskSizeBytes: 8 << 30}},
- },
- images: []model.Image{
- {ID: "img-1", Name: "void-exp"},
- },
- }
-
- req := httptest.NewRequest(http.MethodGet, "/vms", nil)
- rec := httptest.NewRecorder()
- NewHandler(backend).ServeHTTP(rec, req)
-
- if rec.Code != http.StatusOK {
- t.Fatalf("status = %d, want 200", rec.Code)
- }
- body := rec.Body.String()
- for _, want := range []string{">void-exp", "href=\"/images/img-1\""} {
- if !strings.Contains(body, want) {
- t.Fatalf("body missing %q\n%s", want, body)
- }
- }
-}
diff --git a/internal/webui/templates/base.html b/internal/webui/templates/base.html
deleted file mode 100644
index 2fb2473..0000000
--- a/internal/webui/templates/base.html
+++ /dev/null
@@ -1,124 +0,0 @@
-{{define "page"}}
-
-
-
-
-
- {{.Title}} · banger
-
-
-
-
-
-
-

Local Control Plane

-

banger

-
- -
- - {{if not .MutationAllowed}} - - {{end}} - - {{if .Flash}} - - {{end}} - -
-
-
-

vCPU

- {{.Summary.Banger.ConfiguredVCPUCount}} / {{.Summary.Host.CPUCount}} -
- -
- {{percentOf .Summary.Banger.ConfiguredVCPUCount .Summary.Host.CPUCount}}% allocated - {{.Summary.Banger.RunningVMCount}} running -
-
-
-
-

Memory

- {{formatBytesCompact .Summary.Banger.ConfiguredMemoryBytes}} / {{formatBytesCompact .Summary.Host.TotalMemoryBytes}} -
- -
- {{percentOf .Summary.Banger.ConfiguredMemoryBytes .Summary.Host.TotalMemoryBytes}}% allocated - {{formatBytesCompact .Summary.Banger.RunningRSSBytes}} RSS live -
-
-
-
-

Disk

- {{formatBytesCompact .Summary.Banger.ConfiguredDiskBytes}} / {{formatBytesCompact .Summary.Host.StateFilesystemTotalBytes}} -
- -
- {{formatBytesCompact .Summary.Host.StateFilesystemFreeBytes}} free - {{formatBytesCompact (sumInt64 .Summary.Banger.UsedSystemOverlayBytes .Summary.Banger.UsedWorkDiskBytes)}} actual -
-
-
-
- {{.Summary.Banger.RunningVMCount}} / {{.Summary.Banger.VMCount}} running - {{.Summary.Banger.ImageCount}} images - {{.Summary.Banger.ManagedImageCount}} managed - {{formatPercent .Summary.Banger.RunningCPUPercent}} live CPU -
- -
-
-

{{.Title}}

-
- {{.BodyHTML}} -
-
- - -
-
-

Roots

-
- {{range .PickerRoots}} - - {{end}} -
-
-
-
- / -
- - -
-
-

Choose a host path. Directories open in place; files select immediately.

-
-
-
-
- - - - -{{end}} - -{{define "csrf_field"}} - -{{end}} diff --git a/internal/webui/templates/dashboard.html b/internal/webui/templates/dashboard.html deleted file mode 100644 index aa18698..0000000 --- a/internal/webui/templates/dashboard.html +++ /dev/null @@ -1,65 +0,0 @@ -{{define "dashboard_content"}} -
-
-
-

Virtual Machines

- Create VM -
- - - - - - - - - - - - {{range .VMs}} - - - - - - - - {{else}} - - {{end}} - -
NameStateIPSpecCreated
{{.Name}}{{.State}}{{if .Runtime.GuestIP}}{{.Runtime.GuestIP}}{{else}}-{{end}}{{.Spec.VCPUCount}} vCPU / {{.Spec.MemoryMiB}} MiB / {{formatBytes .Spec.WorkDiskSizeBytes}}{{relativeTime .CreatedAt}}
No VMs yet.
-
-
-
-

Images

-
- Register - Build -
-
- - - - - - - - - - - {{range .Images}} - - - - - - - {{else}} - - {{end}} - -
NameManagedRootfsCreated
{{.Name}}{{formatBool .Managed}}{{.RootfsPath}}{{relativeTime .CreatedAt}}
No images registered.
-
-
-{{end}} diff --git a/internal/webui/templates/error.html b/internal/webui/templates/error.html deleted file mode 100644 index 71e45b1..0000000 --- a/internal/webui/templates/error.html +++ /dev/null @@ -1,3 +0,0 @@ -{{define "error_content"}} -
{{.ErrorMessage}}
-{{end}} diff --git a/internal/webui/templates/images.html b/internal/webui/templates/images.html deleted file mode 100644 index f8e884b..0000000 --- a/internal/webui/templates/images.html +++ /dev/null @@ -1,168 +0,0 @@ -{{define "image_list_content"}} -
-

Manage registered rootfs/kernel stacks and promote unmanaged experiments into daemon-owned artifacts.

- -
- - - - - - - - - - - - {{range .Images}} - - - - - - - - {{else}} - - {{end}} - -
NameManagedDockerRootfsCreated
{{.Name}}{{formatBool .Managed}}{{formatBool .Docker}}{{.RootfsPath}}{{relativeTime .CreatedAt}}
No images registered.
-{{end}} - -{{define "image_build_content"}} -

Build a managed image from an existing registered image, then redirect into the async build progress view.

-{{if .ErrorMessage}} -
{{.ErrorMessage}}
-{{end}} -
- {{template "csrf_field" .}} - - - - - - - -
- Cancel - -
-
-{{end}} - -{{define "image_register_content"}} -

Register an existing host-side image stack. Paths stay on the host; nothing is uploaded through the browser.

-{{if .ErrorMessage}} -
{{.ErrorMessage}}
-{{end}} -
- {{template "csrf_field" .}} - - - - - - - -
- Cancel - -
-
-{{end}} - -{{define "image_show_content"}} -
-
-

{{.Image.Name}}

-
-
ID
{{.Image.ID}}
-
Managed
{{formatBool .Image.Managed}}
-
Docker
{{formatBool .Image.Docker}}
-
Used By
{{.ImageUsers}} VM(s)
-
-
-
-

Artifacts

-
-
Rootfs
{{.Image.RootfsPath}}
-
Work Seed
{{if .Image.WorkSeedPath}}{{.Image.WorkSeedPath}}{{else}}-{{end}}
-
Kernel
{{.Image.KernelPath}}
-
Initrd
{{if .Image.InitrdPath}}{{.Image.InitrdPath}}{{else}}-{{end}}
-
Modules
{{if .Image.ModulesDir}}{{.Image.ModulesDir}}{{else}}-{{end}}
-
-
-
-

Lifecycle

-
-
Created
{{relativeTime .Image.CreatedAt}}
-
Updated
{{relativeTime .Image.UpdatedAt}}
-
Artifact Dir
{{if .Image.ArtifactDir}}{{.Image.ArtifactDir}}{{else}}-{{end}}
-
-
-
- -
- {{if not .Image.Managed}} -
{{template "csrf_field" .}}
- {{end}} -
{{template "csrf_field" .}}
-
-{{end}} diff --git a/internal/webui/templates/operation.html b/internal/webui/templates/operation.html deleted file mode 100644 index 87ff45e..0000000 --- a/internal/webui/templates/operation.html +++ /dev/null @@ -1,20 +0,0 @@ -{{define "operation_content"}} -
-

{{if eq .OperationKind "vm"}}VM readiness{{else}}Managed image build{{end}}

- {{if .VMCreateOperation}} -

{{.VMCreateOperation.Stage}}

-

{{.VMCreateOperation.Detail}}

-

{{.VMCreateOperation.Error}}

- {{end}} - {{if .ImageBuildOperation}} -

{{.ImageBuildOperation.Stage}}

-

{{.ImageBuildOperation.Detail}}

-

{{.ImageBuildOperation.Error}}

- {{end}} - {{if .OperationLogPath}} -

Build log: {{.OperationLogPath}}

- {{else}} -

- {{end}} -
-{{end}} diff --git a/internal/webui/templates/vms.html b/internal/webui/templates/vms.html deleted file mode 100644 index 886e44c..0000000 --- a/internal/webui/templates/vms.html +++ /dev/null @@ -1,191 +0,0 @@ -{{define "vm_list_content"}} -
-

Inspect lifecycle, capacity, and reachability for every VM.

- Create VM -
- - - - - - - - - - - - - - - {{range .VMs}} - - - - - - - - - - - {{else}} - - {{end}} - -
NameStateImageIPvCPUMemoryDiskCreated
{{.Name}}{{.State}}{{$image := findImage $.Images .ImageID}}{{if $image.ID}}{{$image.Name}}{{else}}{{shortID .ImageID}}{{end}}{{if .Runtime.GuestIP}}{{.Runtime.GuestIP}}{{else}}-{{end}}{{.Spec.VCPUCount}}{{.Spec.MemoryMiB}} MiB{{formatBytes .Spec.WorkDiskSizeBytes}}{{relativeTime .CreatedAt}}
No VMs registered.
-{{end}} - -{{define "vm_new_content"}} -

Create a VM and wait until the guest is fully ready. The browser will follow live create progress automatically.

-{{if .ErrorMessage}} -
{{.ErrorMessage}}
-{{end}} -
- {{template "csrf_field" .}} - - - - - - - - -
- Cancel - -
-
-{{end}} - -{{define "vm_show_content"}} -
-
-

{{.VM.Name}}

-
-
ID
{{.VM.ID}}
-
Image
{{if .VMImage.ID}}{{.VMImage.Name}}{{else}}{{shortID .VM.ImageID}}{{end}}
-
State
{{.VM.State}}
-
Guest IP
{{if .VM.Runtime.GuestIP}}{{.VM.Runtime.GuestIP}}{{else}}-{{end}}
-
Created
{{relativeTime .VM.CreatedAt}}
-
-
-
-

Configured Spec

-
-
vCPU
{{.VM.Spec.VCPUCount}}
-
Memory
{{.VM.Spec.MemoryMiB}} MiB
-
Disk
{{formatBytes .VM.Spec.WorkDiskSizeBytes}}
-
NAT
{{formatBool .VM.Spec.NATEnabled}}
-
-
-
-

Current Usage

-
-
CPU
{{formatPercent .VMStats.CPUPercent}}
-
RSS
{{formatBytes .VMStats.RSSBytes}}
-
Overlay
{{formatBytes .VMStats.SystemOverlayBytes}}
-
Work Disk
{{formatBytes .VMStats.WorkDiskBytes}}
-
-
-
- -
-

Actions

- Logs -
-
- {{if eq .VM.State "running"}} -
{{template "csrf_field" .}}
-
{{template "csrf_field" .}}
- {{else}} -
{{template "csrf_field" .}}
- {{end}} -
{{template "csrf_field" .}}
-
- -
-
-

Listening Ports

- {{if .VMPortsError}} -

{{.VMPortsError}}

- {{else}} - - - - - - {{range .VMPorts.Ports}} - - - - - - {{else}} - - {{end}} - -
PortProcessEndpoint
{{.Proto}}/{{.Port}}{{if .Process}}{{.Process}}{{else}}-{{end}}{{if .Endpoint}}{{if endpointHref .Endpoint}}{{.Endpoint}}{{else}}{{.Endpoint}}{{end}}{{else}}-{{end}}
No host-reachable listeners reported.
- {{end}} -
-
-

Update Settings

-
- {{template "csrf_field" .}} - - - - -
- Cancel - -
-
-
-
-{{end}} - -{{define "vm_logs_content"}} -
-

Showing the last 200 lines from the Firecracker log.

-
- - Refresh -
-
-
{{.LogText}}
-{{end}} diff --git a/mise.toml b/mise.toml new file mode 100644 index 0000000..28b5184 --- /dev/null +++ b/mise.toml @@ -0,0 +1,3 @@ +[tools] +go = "1.25.0" +shellcheck = "0.10.0" diff --git a/scripts/bench-create.sh b/scripts/bench-create.sh deleted file mode 100644 index ff30290..0000000 --- a/scripts/bench-create.sh +++ /dev/null @@ -1,120 +0,0 @@ -#!/usr/bin/env bash -set -euo pipefail - -log() { - printf '[bench-create] %s\n' "$*" >&2 -} - -usage() { - cat <<'EOF' -Usage: ./scripts/bench-create.sh [--runs N] [--image NAME] [--keep] - -Measures: - - create_ms: time for `banger vm create` - - ssh_ready_ms: time until `banger vm ssh -- true` succeeds -EOF -} - -RUNS=5 -IMAGE_NAME="" -KEEP=0 - -while [[ $# -gt 0 ]]; do - case "$1" in - --runs) - RUNS="${2:-}" - shift 2 - ;; - --image) - IMAGE_NAME="${2:-}" - shift 2 - ;; - --keep) - KEEP=1 - shift - ;; - -h|--help) - usage - exit 0 - ;; - *) - log "unknown option: $1" - usage - exit 1 - ;; - esac -done - -if ! [[ "$RUNS" =~ ^[0-9]+$ ]] || [[ "$RUNS" -le 0 ]]; then - log "--runs must be a positive integer" - exit 1 -fi - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)" -if [[ -z "${BANGER_BIN:-}" ]]; then - if [[ -x "$REPO_ROOT/build/bin/banger" ]]; then - BANGER_BIN="$REPO_ROOT/build/bin/banger" - else - BANGER_BIN="$REPO_ROOT/banger" - fi -fi -if [[ ! -x "$BANGER_BIN" ]]; then - log "banger binary not found: $BANGER_BIN" - log "run 'make build' or set BANGER_BIN" - exit 1 -fi - -timestamp_ms() { - date +%s%3N -} - -json_escape() { - python3 - <<'PY' "$1" -import json, sys -print(json.dumps(sys.argv[1])) -PY -} - -printf '[\n' -for run in $(seq 1 "$RUNS"); do - vm_name="bench-$(date +%s)-$run" - create_args=("$BANGER_BIN" vm create --name "$vm_name") - if [[ -n "$IMAGE_NAME" ]]; then - create_args+=(--image "$IMAGE_NAME") - fi - - create_start="$(timestamp_ms)" - if ! 
"${create_args[@]}" >/dev/null; then - log "create failed for $vm_name" - exit 1 - fi - create_end="$(timestamp_ms)" - - ssh_start="$create_end" - ssh_ready=0 - deadline=$((ssh_start + 60000)) - while (( $(timestamp_ms) < deadline )); do - if "$BANGER_BIN" vm ssh "$vm_name" -- true >/dev/null 2>&1; then - ssh_ready="$(timestamp_ms)" - break - fi - sleep 0.5 - done - if [[ "$ssh_ready" -eq 0 ]]; then - log "ssh did not become ready for $vm_name" - exit 1 - fi - - if [[ "$KEEP" -ne 1 ]]; then - "$BANGER_BIN" vm delete "$vm_name" >/dev/null || true - fi - - printf ' {"run": %d, "vm_name": %s, "create_ms": %d, "ssh_ready_ms": %d}%s\n' \ - "$run" \ - "$(json_escape "$vm_name")" \ - "$((create_end - create_start))" \ - "$((ssh_ready - create_start))" \ - "$( [[ "$run" -lt "$RUNS" ]] && printf ',' )" -done -printf ']\n' diff --git a/scripts/customize.sh b/scripts/customize.sh deleted file mode 100755 index eacc51e..0000000 --- a/scripts/customize.sh +++ /dev/null @@ -1,571 +0,0 @@ -#!/usr/bin/env bash -set -euo pipefail - -log() { - printf '[customize] %s\n' "$*" -} - -usage() { - cat <<'EOF' -Usage: ./scripts/customize.sh [--out ] [--size ] [--kernel ] [--initrd ] [--docker] [--modules ] - -Creates a copy of rootfs.ext4, optionally resizes it, boots a VM using the -copy as a writable rootfs, then applies base configuration and packages. -EOF -} - -parse_size() { - local raw="$1" - if [[ "$raw" =~ ^([0-9]+)([KMG])?$ ]]; then - local num="${BASH_REMATCH[1]}" - local unit="${BASH_REMATCH[2]}" - case "$unit" in - K) echo $((num * 1024)) ;; - M|"") echo $((num * 1024 * 1024)) ;; - G) echo $((num * 1024 * 1024 * 1024)) ;; - esac - return 0 - fi - return 1 -} - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -REPO_ROOT="$(cd "$SCRIPT_DIR/.." 
&& pwd)" -STATE="${BANGER_STATE_DIR:-${XDG_STATE_HOME:-$HOME/.local/state}/banger/image-build}" -VM_ROOT="$STATE/vms" -mkdir -p "$VM_ROOT" - -BR_DEV="br-fc" -BR_IP="172.16.0.1" -CIDR="24" -DNS_SERVER="1.1.1.1" - -resolve_banger_bin() { - if [[ -n "${BANGER_BIN:-}" ]]; then - printf '%s\n' "$BANGER_BIN" - return - fi - if [[ -x "$REPO_ROOT/build/bin/banger" ]]; then - printf '%s\n' "$REPO_ROOT/build/bin/banger" - return - fi - if [[ -x "$REPO_ROOT/banger" ]]; then - printf '%s\n' "$REPO_ROOT/banger" - return - fi - if command -v banger >/dev/null 2>&1; then - command -v banger - return - fi - log "banger binary not found; install/build banger or set BANGER_BIN" - exit 1 -} - -BANGER_BIN="$(resolve_banger_bin)" -NAT_ACTIVE=0 -FC_BIN="$("$BANGER_BIN" internal firecracker-path)" -SSH_KEY="$("$BANGER_BIN" internal ssh-key-path)" -VSOCK_AGENT="$("$BANGER_BIN" internal vsock-agent-path)" - -banger_nat() { - local action="$1" - "$BANGER_BIN" internal nat "$action" --guest-ip "$GUEST_IP" --tap "$TAP_DEV" -} - -load_package_preset() { - local preset="$1" - local -n out="$2" - mapfile -t out < <("$BANGER_BIN" internal packages "$preset") - (( ${#out[@]} > 0 )) -} - -write_rootfs_manifest_metadata() { - local rootfs_path="$1" - local manifest_hash="$2" - printf '%s\n' "$manifest_hash" > "${rootfs_path}.packages.sha256" -} - -BASE_ROOTFS="" -OUT_ROOTFS="" -SIZE_SPEC="" -INSTALL_DOCKER=0 -KERNEL="" -INITRD="" -MISE_VERSION="v2025.12.0" -MISE_INSTALL_PATH="/usr/local/bin/mise" -MISE_ACTIVATE_LINE='eval "$(/usr/local/bin/mise activate bash)"' -TMUX_PLUGIN_DIR="/root/.tmux/plugins" -TMUX_RESURRECT_DIR="/root/.tmux/resurrect" -TMUX_TPM_REPO="https://github.com/tmux-plugins/tpm" -TMUX_RESURRECT_REPO="https://github.com/tmux-plugins/tmux-resurrect" -TMUX_CONTINUUM_REPO="https://github.com/tmux-plugins/tmux-continuum" -TMUX_MANAGED_START="# >>> banger tmux plugins >>>" -TMUX_MANAGED_END="# <<< banger tmux plugins <<<" -MODULES_DIR="" -while [[ $# -gt 0 ]]; do - case "$1" in - --out) - 
OUT_ROOTFS="${2:-}" - shift 2 - ;; - --size) - SIZE_SPEC="${2:-}" - shift 2 - ;; - --kernel) - KERNEL="${2:-}" - shift 2 - ;; - --initrd) - INITRD="${2:-}" - shift 2 - ;; - --docker) - INSTALL_DOCKER=1 - shift - ;; - --modules) - MODULES_DIR="${2:-}" - shift 2 - ;; - -h|--help) - usage - exit 0 - ;; - *) - if [[ -z "$BASE_ROOTFS" ]]; then - BASE_ROOTFS="$1" - shift - else - log "unknown option: $1" - usage - exit 1 - fi - ;; - esac -done - -if [[ -z "$BASE_ROOTFS" ]]; then - usage - exit 1 -fi - -if [[ ! -f "$BASE_ROOTFS" ]]; then - log "base rootfs not found: $BASE_ROOTFS" - exit 1 -fi - -if [[ -z "$OUT_ROOTFS" ]]; then - base_dir="$(dirname "$BASE_ROOTFS")" - base_name="$(basename "$BASE_ROOTFS")" - OUT_ROOTFS="${base_dir}/docker-${base_name}" -fi -if [[ "$OUT_ROOTFS" == *.ext4 ]]; then - WORK_SEED="${OUT_ROOTFS%.ext4}.work-seed.ext4" -else - WORK_SEED="${OUT_ROOTFS}.work-seed" -fi -if [[ -z "$KERNEL" ]]; then - log "kernel path is required; pass --kernel" - exit 1 -fi -if [[ ! -f "$KERNEL" ]]; then - log "kernel not found: $KERNEL" - exit 1 -fi -if [[ -n "$INITRD" && ! -f "$INITRD" ]]; then - log "initrd not found: $INITRD" - exit 1 -fi -if [[ -n "$MODULES_DIR" && ! -d "$MODULES_DIR" ]]; then - log "modules dir not found: $MODULES_DIR" - exit 1 -fi - -if [[ -e "$OUT_ROOTFS" ]]; then - log "output rootfs already exists: $OUT_ROOTFS" - exit 1 -fi - -if ! command -v resize2fs >/dev/null 2>&1; then - log "resize2fs required" - exit 1 -fi -if ! command -v jq >/dev/null 2>&1; then - log "jq required" - exit 1 -fi -if ! command -v sha256sum >/dev/null 2>&1; then - log "sha256sum required to record package preset metadata" - exit 1 -fi -if [[ ! -x "$VSOCK_AGENT" ]]; then - log "vsock agent not found or not executable: $VSOCK_AGENT" - log "run 'make build'" - exit 1 -fi - -APT_PACKAGES=() -if ! load_package_preset debian APT_PACKAGES; then - log "debian package preset is empty" - exit 1 -fi -if ! 
PACKAGES_HASH="$(printf '%s\n' "${APT_PACKAGES[@]}" | sha256sum | awk '{print $1}')"; then - log "failed to hash package preset" - exit 1 -fi -printf -v APT_PACKAGES_ESCAPED '%q ' "${APT_PACKAGES[@]}" - -log "copying base rootfs to $OUT_ROOTFS" -cp --reflink=auto "$BASE_ROOTFS" "$OUT_ROOTFS" - -if [[ -n "$SIZE_SPEC" ]]; then - SIZE_BYTES="$(parse_size "$SIZE_SPEC")" - BASE_BYTES="$(stat -c%s "$BASE_ROOTFS")" - if [[ -z "$SIZE_BYTES" || "$SIZE_BYTES" -lt "$BASE_BYTES" ]]; then - log "size must be >= base image size" - exit 1 - fi - log "resizing rootfs to $SIZE_SPEC" - truncate -s "$SIZE_BYTES" "$OUT_ROOTFS" - e2fsck -p -f "$OUT_ROOTFS" >/dev/null - resize2fs "$OUT_ROOTFS" >/dev/null -fi - -VM_ID="$(head -c 32 /dev/urandom | xxd -p -c 256)" -VM_TAG="${VM_ID:0:8}" -VM_NAME="customize-${VM_TAG}" -VM_DIR="$VM_ROOT/$VM_ID" -mkdir -p "$VM_DIR" - -API_SOCK="${XDG_RUNTIME_DIR:-/run/user/$(id -u)}/banger/fc-$VM_TAG.sock" -LOG_FILE="$VM_DIR/firecracker.log" -TAP_DEV="tap-fc-$VM_TAG" - -# Allocate guest IP -NEXT_IP_FILE="$STATE/next_ip" -NEXT_IP="$(cat "$NEXT_IP_FILE" 2>/dev/null || echo 2)" -GUEST_IP="172.16.0.$NEXT_IP" -echo "$((NEXT_IP + 1))" > "$NEXT_IP_FILE" - -sudo -v - -cleanup() { - sudo kill "${FC_PID:-}" 2>/dev/null || true - if [[ "$NAT_ACTIVE" -eq 1 ]]; then - banger_nat down >/dev/null 2>&1 || true - fi - sudo ip link del "$TAP_DEV" 2>/dev/null || true - rm -f "$API_SOCK" - rm -rf "$VM_DIR" -} -trap cleanup EXIT - -sudo mkdir -p "$(dirname "$API_SOCK")" -sudo chown "$(id -u):$(id -g)" "$(dirname "$API_SOCK")" - -# Host bridge -if ! 
ip link show "$BR_DEV" >/dev/null 2>&1; then - log "creating host bridge $BR_DEV ($BR_IP/$CIDR)" - sudo ip link add name "$BR_DEV" type bridge - sudo ip addr add "${BR_IP}/${CIDR}" dev "$BR_DEV" - sudo ip link set "$BR_DEV" up -else - sudo ip link set "$BR_DEV" up -fi - -log "creating tap device $TAP_DEV" -TAP_USER="${SUDO_UID:-$(id -u)}" -TAP_GROUP="${SUDO_GID:-$(id -g)}" -sudo ip tuntap add dev "$TAP_DEV" mode tap user "$TAP_USER" group "$TAP_GROUP" -sudo ip link set "$TAP_DEV" master "$BR_DEV" -sudo ip link set "$TAP_DEV" up -sudo ip link set "$BR_DEV" up - -log "starting firecracker process" -rm -f "$API_SOCK" -nohup sudo -E "$FC_BIN" --api-sock "$API_SOCK" >"$LOG_FILE" 2>&1 & -FC_PID="$!" - -log "waiting for firecracker api socket" -for _ in $(seq 1 200); do - [[ -S "$API_SOCK" ]] && break - sleep 0.02 -done -[[ -S "$API_SOCK" ]] || { log "firecracker api socket not ready"; exit 1; } - -log "configuring machine" -sudo -E curl --unix-socket "$API_SOCK" -X PUT http://localhost/machine-config \ - -H "Content-Type: application/json" \ - -d '{ - "vcpu_count": 2, - "mem_size_mib": 1024, - "smt": false - }' >/dev/null - -KCMD="console=ttyS0 reboot=k panic=1 pci=off root=/dev/vda rootfstype=ext4 rw ip=${GUEST_IP}::${BR_IP}:255.255.255.0:${VM_NAME}:eth0:off:${DNS_SERVER} hostname=${VM_NAME} systemd.mask=home.mount systemd.mask=var.mount" - -INITRD_JSON="" -if [[ -n "$INITRD" ]]; then - INITRD_JSON=", \"initrd_path\": \"$INITRD\"" -fi - -sudo -E curl --unix-socket "$API_SOCK" -X PUT http://localhost/boot-source \ - -H "Content-Type: application/json" \ - -d "{ - \"kernel_image_path\": \"$KERNEL\", - \"boot_args\": \"$KCMD\"${INITRD_JSON} - }" >/dev/null - -sudo -E curl --unix-socket "$API_SOCK" -X PUT http://localhost/drives/rootfs \ - -H "Content-Type: application/json" \ - -d "{ - \"drive_id\": \"rootfs\", - \"path_on_host\": \"$OUT_ROOTFS\", - \"is_root_device\": true, - \"is_read_only\": false - }" >/dev/null - -sudo -E curl --unix-socket "$API_SOCK" -X PUT 
http://localhost/network-interfaces/eth0 \ - -H "Content-Type: application/json" \ - -d "{ - \"iface_id\": \"eth0\", - \"host_dev_name\": \"$TAP_DEV\" - }" >/dev/null - -sudo -E curl --unix-socket "$API_SOCK" -X PUT http://localhost/actions \ - -H "Content-Type: application/json" \ - -d '{ "action_type": "InstanceStart" }' >/dev/null - -SUDO_CHILD_PID="$(pgrep -n -f "$API_SOCK" || true)" -if [[ -n "$SUDO_CHILD_PID" ]]; then - FC_PID="$SUDO_CHILD_PID" -fi - -VM_CONFIG_JSON="$(sudo -E curl --unix-socket "$API_SOCK" -sS http://localhost/vm/config)" -CREATED_AT="$(date -Iseconds)" -jq -n \ - --arg id "$VM_ID" \ - --arg name "$VM_NAME" \ - --arg pid "$FC_PID" \ - --arg created_at "$CREATED_AT" \ - --arg guest_ip "$GUEST_IP" \ - --arg tap "$TAP_DEV" \ - --arg api_sock "$API_SOCK" \ - --arg log "$LOG_FILE" \ - --arg rootfs "$OUT_ROOTFS" \ - --arg kernel "$KERNEL" \ - --argjson config "$VM_CONFIG_JSON" \ - '{meta:{id:$id,name:$name,pid:$pid,created_at:$created_at,guest_ip:$guest_ip,tap:$tap,api_sock:$api_sock,log:$log,rootfs:$rootfs,kernel:$kernel},config:$config}' \ - > "$VM_DIR/vm.json" - -log "enabling NAT for customization" -banger_nat up >/dev/null -NAT_ACTIVE=1 - -log "waiting for SSH" -SSH_READY=0 -for _ in $(seq 1 60); do - if ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \ - "root@${GUEST_IP}" "true" >/dev/null 2>&1; then - SSH_READY=1 - break - fi - sleep 1 -done -if [[ "$SSH_READY" -ne 1 ]]; then - log "ssh did not become ready on $GUEST_IP" - exit 1 -fi - -log "configuring guest" -log "installing vsock agent" -scp -i "$SSH_KEY" -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \ - "$VSOCK_AGENT" "root@${GUEST_IP}:/usr/local/bin/banger-vsock-agent" >/dev/null - -ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \ - "root@${GUEST_IP}" bash -lc "set -e -printf 'nameserver %s\n' \"$DNS_SERVER\" > /etc/resolv.conf -echo \"$VM_NAME\" > /etc/hostname -printf '127.0.0.1 localhost\n127.0.1.1 %s\n' 
\"$VM_NAME\" > /etc/hosts -touch /etc/fstab -sed -i '\|^/dev/vdb[[:space:]]\+/home[[:space:]]|d; \|^/dev/vdc[[:space:]]\+/var[[:space:]]|d' /etc/fstab -if ! grep -q '^tmpfs /run ' /etc/fstab; then - echo 'tmpfs /run tmpfs defaults,nodev,nosuid,mode=0755 0 0' >> /etc/fstab -fi -if ! grep -q '^tmpfs /tmp ' /etc/fstab; then - echo 'tmpfs /tmp tmpfs defaults,nodev,nosuid,mode=1777 0 0' >> /etc/fstab -fi -apt-get update -DEBIAN_FRONTEND=noninteractive apt-get -y upgrade -DEBIAN_FRONTEND=noninteractive apt-get -y install ${APT_PACKAGES_ESCAPED} -curl -fsSL https://mise.run | MISE_INSTALL_PATH=\"$MISE_INSTALL_PATH\" MISE_VERSION=\"$MISE_VERSION\" sh -\"$MISE_INSTALL_PATH\" use -g github:anomalyco/opencode -\"$MISE_INSTALL_PATH\" reshim -if [[ ! -e /root/.local/share/mise/shims/opencode ]]; then - echo 'opencode shim not found after mise install' >&2 - exit 1 -fi -ln -snf /root/.local/share/mise/shims/opencode /usr/local/bin/opencode -mkdir -p /etc/profile.d -cat > /etc/profile.d/mise.sh <<'MISEPROFILE' -if [ -n \"\${BASH_VERSION:-}\" ] && [ -x \"$MISE_INSTALL_PATH\" ]; then - eval \"\$($MISE_INSTALL_PATH activate bash)\" -fi -MISEPROFILE -chmod 0644 /etc/profile.d/mise.sh -touch /etc/bash.bashrc -if ! grep -Fqx '$MISE_ACTIVATE_LINE' /etc/bash.bashrc; then - printf '\n%s\n' '$MISE_ACTIVATE_LINE' >> /etc/bash.bashrc -fi -if [[ \"$INSTALL_DOCKER\" == \"1\" ]]; then - DEBIAN_FRONTEND=noninteractive apt-get -y remove containerd || true - if ! 
DEBIAN_FRONTEND=noninteractive apt-get -y install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin; then - DEBIAN_FRONTEND=noninteractive apt-get -y install docker.io - fi - if command -v systemctl >/dev/null 2>&1; then - systemctl enable --now docker || true - fi -fi -rm -f /root/get-docker /root/get-docker.sh /tmp/get-docker /tmp/get-docker.sh -chmod 0755 /usr/local/bin/banger-vsock-agent -mkdir -p /etc/modules-load.d /etc/systemd/system -cat > /etc/systemd/system/banger-opencode.service <<'EOF' -[Unit] -Description=Banger opencode server -After=network.target -RequiresMountsFor=/root - -[Service] -Type=simple -Environment=HOME=/root -WorkingDirectory=/root -ExecStart=/usr/local/bin/opencode serve --hostname 0.0.0.0 --port 4096 -Restart=on-failure -RestartSec=1 - -[Install] -WantedBy=multi-user.target -EOF -chmod 0644 /etc/systemd/system/banger-opencode.service -if command -v systemctl >/dev/null 2>&1; then - systemctl daemon-reload || true - systemctl enable --now banger-opencode.service || true -fi -cat > /etc/modules-load.d/banger-vsock.conf <<'EOF' -vsock -vmw_vsock_virtio_transport -EOF -chmod 0644 /etc/modules-load.d/banger-vsock.conf -cat > /etc/systemd/system/banger-vsock-agent.service <<'EOF' -[Unit] -Description=Banger vsock agent -After=network.target - -[Service] -Type=simple -ExecStart=/usr/local/bin/banger-vsock-agent -Restart=on-failure -RestartSec=1 - -[Install] -WantedBy=multi-user.target -EOF -chmod 0644 /etc/systemd/system/banger-vsock-agent.service -if command -v systemctl >/dev/null 2>&1; then - systemctl daemon-reload || true - systemctl enable --now banger-vsock-agent.service || true -fi -git config --system init.defaultBranch main -" - -log "configuring tmux resurrect" -ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \ - "root@${GUEST_IP}" bash -se < "\$tmp_tmux_conf" -else - : > "\$tmp_tmux_conf" -fi -if [[ -s "\$tmp_tmux_conf" ]]; then - printf '\n' >> "\$tmp_tmux_conf" -fi 
-cat >> "\$tmp_tmux_conf" <<'TMUXCONF' -$TMUX_MANAGED_START -set -g @plugin 'tmux-plugins/tpm' -set -g @plugin 'tmux-plugins/tmux-resurrect' -set -g @plugin 'tmux-plugins/tmux-continuum' -set -g @continuum-save-interval '15' -set -g @continuum-restore 'off' -set -g @resurrect-dir '/root/.tmux/resurrect' -run '~/.tmux/plugins/tpm/tpm' -$TMUX_MANAGED_END -TMUXCONF -mv "\$tmp_tmux_conf" "\$TMUX_CONF" -chmod 0644 "\$TMUX_CONF" -EOF - -if [[ -n "$MODULES_DIR" ]]; then - MODULES_BASE="$(basename "$MODULES_DIR")" - log "copying kernel modules ($MODULES_BASE) into guest" - tar -C "$(dirname "$MODULES_DIR")" -cf - "$MODULES_BASE" | \ - ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \ - "root@${GUEST_IP}" bash -lc "set -e -mkdir -p /lib/modules -tar -C /lib/modules -xf - -depmod -a \"$MODULES_BASE\" - mkdir -p /etc/modules-load.d - printf 'nf_tables\nnft_chain_nat\nveth\nbr_netfilter\noverlay\n' > /etc/modules-load.d/docker-netfilter.conf - mkdir -p /etc/sysctl.d - cat > /etc/sysctl.d/99-docker.conf <<'SYSCTL' -net.bridge.bridge-nf-call-iptables = 1 -net.bridge.bridge-nf-call-ip6tables = 1 -net.ipv4.ip_forward = 1 -SYSCTL - sysctl --system >/dev/null 2>&1 || true -sync -" -fi - -log "shutting down guest" -ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \ - "root@${GUEST_IP}" bash -lc "sync" || true -sudo -E curl --unix-socket "$API_SOCK" -X PUT http://localhost/actions \ - -H "Content-Type: application/json" \ - -d '{ "action_type": "SendCtrlAltDel" }' >/dev/null || true -for _ in $(seq 1 200); do - if ! 
ps -p "$FC_PID" >/dev/null 2>&1; then - break - fi - sleep 0.05 -done -write_rootfs_manifest_metadata "$OUT_ROOTFS" "$PACKAGES_HASH" -log "building work seed $WORK_SEED" -"$BANGER_BIN" internal work-seed --rootfs "$OUT_ROOTFS" --out "$WORK_SEED" -log "done" diff --git a/scripts/install.sh b/scripts/install.sh new file mode 100755 index 0000000..9b8f0fd --- /dev/null +++ b/scripts/install.sh @@ -0,0 +1,237 @@ +#!/usr/bin/env bash +# install.sh — one-command installer for banger. +# +# Designed to be invoked as: +# +# curl -fsSL https://releases.thaloco.com/banger/install.sh | bash +# +# The script runs as the invoking user, downloads + verifies the +# release tarball unprivileged, and only re-execs sudo for the actual +# install step (writing to /usr/local/* and creating systemd units). +# Right before the sudo prompt the user gets a plain-language summary +# of exactly what's about to happen, so they're authorising a known +# scope rather than the whole pipeline. +# +# Flags: +# --yes skip the interactive confirmation (CI / scripted use). +# Same effect as exporting BANGER_INSTALL_NONINTERACTIVE=1, +# which is friendlier through `curl | bash` since you can +# set the env var in the same line. +# --version v install a specific version instead of latest_stable +# +# Trust model: +# * The cosign public key below is pinned at script-write time and +# matches internal/updater/verify_signature.go in the source repo. +# * The script verifies the cosign signature on SHA256SUMS, then +# verifies the tarball's hash against SHA256SUMS, before extracting. +# * Verification uses openssl (every Linux distro ships it). cosign +# is needed only for *signing* a release, never for verifying one. +# * Manifest URL is hardcoded so a DNS-redirect cannot point us at a +# different bucket. 
+ +set -euo pipefail + +MANIFEST_URL="https://releases.thaloco.com/banger/manifest.json" +BUCKET_BASE="https://releases.thaloco.com/banger" +TRUST_DOC_URL="https://git.thaloco.com/thaloco/banger/src/branch/main/docs/privileges.md" + +# This must stay byte-identical to BangerReleasePublicKey in +# internal/updater/verify_signature.go — publish-banger-release.sh +# rejects publishing if they drift apart. +BANGER_PUBLIC_KEY='-----BEGIN PUBLIC KEY----- +MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAElWFSLKLosBrdjfuF8ZS6U01Ufky4 +zNeVPCkA6HEJ/oe634fRqwFxkXKGWg03eGFSnlwRxnUxN2+duXQSsR0pzQ== +-----END PUBLIC KEY-----' + +log() { printf '[banger-install] %s\n' "$*" >&2; } +warn() { printf '[banger-install] WARN: %s\n' "$*" >&2; } +die() { printf '[banger-install] ERROR: %s\n' "$*" >&2; exit 1; } + +# ---------------------------------------------------------------------- +# Flag parsing +# ---------------------------------------------------------------------- +ASSUME_YES="${BANGER_INSTALL_NONINTERACTIVE:-0}" +TARGET_VERSION="" + +while [[ $# -gt 0 ]]; do + case "$1" in + -y|--yes) ASSUME_YES=1 ;; + --version) TARGET_VERSION="${2:-}"; shift ;; + --version=*) TARGET_VERSION="${1#--version=}" ;; + -h|--help) + sed -n '2,/^$/p' "$0" | sed 's/^# \{0,1\}//' + exit 0 + ;; + *) die "unknown argument: $1 (try --help)" ;; + esac + shift +done + +# ---------------------------------------------------------------------- +# Platform + tool prerequisites +# ---------------------------------------------------------------------- +[[ "$(uname -s)" == "Linux" ]] || die "banger only supports Linux (saw $(uname -s))" +[[ "$(uname -m)" == "x86_64" ]] || die "banger only supports amd64 (saw $(uname -m))" + +for tool in curl sha256sum tar openssl mktemp base64 grep sed; do + command -v "$tool" >/dev/null \ + || die "required tool not in PATH: $tool" +done + +# ---------------------------------------------------------------------- +# Resolve target version +# 
---------------------------------------------------------------------- +if [[ -z "$TARGET_VERSION" ]]; then + log "fetching $MANIFEST_URL" + MANIFEST=$(curl -fsSL --max-time 30 "$MANIFEST_URL") \ + || die "failed to fetch manifest" + # Pull `latest_stable` out without depending on jq — manifest shape + # is well-defined and we control it. + TARGET_VERSION=$(printf '%s' "$MANIFEST" \ + | grep -oE '"latest_stable"[[:space:]]*:[[:space:]]*"v[^"]+"' \ + | head -n1 \ + | sed -E 's/.*"v([^"]+)".*/v\1/') + [[ -n "$TARGET_VERSION" ]] || die "could not parse latest_stable from manifest" +fi + +case "$TARGET_VERSION" in + v*.*.*) ;; + *) die "unexpected version shape: $TARGET_VERSION (want vX.Y.Z)" ;; +esac + +log "target version: $TARGET_VERSION" + +# ---------------------------------------------------------------------- +# Download tarball + sums + signature +# ---------------------------------------------------------------------- +WORK_DIR=$(mktemp -d -t banger-install.XXXXXX) +trap 'rm -rf "$WORK_DIR"' EXIT + +TARBALL_NAME="banger-$TARGET_VERSION-linux-amd64.tar.gz" +RELEASE_BASE="$BUCKET_BASE/$TARGET_VERSION" + +log "downloading $TARBALL_NAME" +curl -fsSL --max-time 300 "$RELEASE_BASE/$TARBALL_NAME" -o "$WORK_DIR/$TARBALL_NAME" \ + || die "failed to download tarball" +curl -fsSL --max-time 30 "$RELEASE_BASE/SHA256SUMS" -o "$WORK_DIR/SHA256SUMS" \ + || die "failed to download SHA256SUMS" +curl -fsSL --max-time 30 "$RELEASE_BASE/SHA256SUMS.sig" -o "$WORK_DIR/SHA256SUMS.sig" \ + || die "failed to download SHA256SUMS.sig" + +# ---------------------------------------------------------------------- +# Verify cosign signature on SHA256SUMS (the tarball's hash is INSIDE +# SHA256SUMS, so a valid signature on SHA256SUMS plus a hash match on +# the tarball authenticates the whole release). 
+# ----------------------------------------------------------------------
+log "verifying signature on SHA256SUMS"
+printf '%s\n' "$BANGER_PUBLIC_KEY" > "$WORK_DIR/cosign.pub"
+base64 -d "$WORK_DIR/SHA256SUMS.sig" > "$WORK_DIR/SHA256SUMS.sig.bin" \
+  || die "signature is not valid base64"
+openssl dgst -sha256 \
+  -verify "$WORK_DIR/cosign.pub" \
+  -signature "$WORK_DIR/SHA256SUMS.sig.bin" \
+  "$WORK_DIR/SHA256SUMS" >/dev/null 2>&1 \
+  || die "signature verification failed — refusing to install"
+log " signature OK"
+
+# ----------------------------------------------------------------------
+# Verify tarball hash against SHA256SUMS
+# ----------------------------------------------------------------------
+log "verifying $TARBALL_NAME against SHA256SUMS"
+( cd "$WORK_DIR" && sha256sum -c --status SHA256SUMS ) \
+  || die "tarball hash mismatch — refusing to install"
+log " hash OK"
+
+# ----------------------------------------------------------------------
+# Extract (validation is server-side via StageTarball when banger
+# update runs; the install script trusts the verified tarball).
+# ----------------------------------------------------------------------
+log "extracting"
+mkdir -p "$WORK_DIR/stage"
+tar -xzf "$WORK_DIR/$TARBALL_NAME" -C "$WORK_DIR/stage"
+
+for bin in banger bangerd banger-vsock-agent; do
+  [[ -f "$WORK_DIR/stage/$bin" ]] \
+    || die "tarball missing expected binary: $bin"
+done
+
+# ----------------------------------------------------------------------
+# System install: confirm scope, then re-exec sudo
+# ----------------------------------------------------------------------
+SUMMARY=$(cat <<EOF
+
+About to install banger $TARGET_VERSION: the banger and bangerd
+binaries, the guest vsock agent, and systemd units for the bangerd
+daemon. The privileged helper will:
+ • configure host networking (bridge, tap devices, NAT for <name>.vm)
+ • manage VM storage (rootfs snapshots, loop devices, image files)
+ • launch and stop firecracker processes under jailer isolation
+ • install the binaries to /usr/local and the systemd units above
+
+Once installed, day-to-day commands like 'banger vm run' and
+'banger image pull' run as you.
Only the narrow set of operations
+above goes through the privileged helper service.
+
+For details, see: $TRUST_DOC_URL
+
+EOF
+)
+printf '%s\n' "$SUMMARY"
+
+if [[ "$ASSUME_YES" -ne 1 ]]; then
+  if [[ ! -t 0 ]] && [[ ! -r /dev/tty ]]; then
+    die "no terminal available to confirm; re-run with --yes"
+  fi
+  REPLY=""
+  if [[ -t 0 ]]; then
+    read -r -p "Continue? [y/N] " REPLY
+  else
+    # curl|bash path: stdin is the pipe; reach for the user's tty.
+    read -r -p "Continue? [y/N] " REPLY < /dev/tty
+  fi
+  case "$REPLY" in
+    y|Y|yes|YES) ;;
+    *) die "aborted by user" ;;
+  esac
+fi
+
+log "elevating to sudo for the install step"
+SUDO=""
+if [[ "$EUID" -ne 0 ]]; then
+  command -v sudo >/dev/null \
+    || die "not running as root and sudo is not in PATH"
+  SUDO="sudo"
+fi
+
+# Copy binaries into place. We do the copies + chmod + system install
+# from the *staged* tarball under $WORK_DIR; using `install` is the
+# right tool here because it handles atomic-ish replacement and mode
+# bits in one shot.
+$SUDO install -m 0755 -D "$WORK_DIR/stage/banger" /usr/local/bin/banger
+$SUDO install -m 0755 -D "$WORK_DIR/stage/bangerd" /usr/local/bin/bangerd
+$SUDO install -m 0755 -D "$WORK_DIR/stage/banger-vsock-agent" /usr/local/lib/banger/banger-vsock-agent
+
+log "registering systemd units (banger system install)"
+$SUDO /usr/local/bin/banger system install
+
+cat <<EOF >&2
+ +Next steps: + banger doctor # confirm host readiness + banger vm run # boot a sandbox + banger ssh-config --install # optional: enable 'ssh .vm' + +Updates land via: + banger update --check + sudo banger update + +EOF diff --git a/scripts/interactive.sh b/scripts/interactive.sh deleted file mode 100755 index deb262b..0000000 --- a/scripts/interactive.sh +++ /dev/null @@ -1,306 +0,0 @@ -#!/usr/bin/env bash -set -euo pipefail - -log() { - printf '[interactive] %s\n' "$*" -} - -usage() { - cat <<'EOF' -Usage: ./scripts/interactive.sh --kernel [--initrd ] [--size ] - -Creates a writable copy of the base rootfs and boots a VM so you can -customize it manually over SSH. No automatic package/config changes -are applied. -EOF -} - -parse_size() { - local raw="$1" - if [[ "$raw" =~ ^([0-9]+)([KMG])?$ ]]; then - local num="${BASH_REMATCH[1]}" - local unit="${BASH_REMATCH[2]}" - case "$unit" in - K) echo $((num * 1024)) ;; - M|"") echo $((num * 1024 * 1024)) ;; - G) echo $((num * 1024 * 1024 * 1024)) ;; - esac - return 0 - fi - return 1 -} - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -REPO_ROOT="$(cd "$SCRIPT_DIR/.." 
&& pwd)" -STATE="${BANGER_STATE_DIR:-${XDG_STATE_HOME:-$HOME/.local/state}/banger/interactive}" -VM_ROOT="$STATE/vms" -mkdir -p "$VM_ROOT" - -BR_DEV="br-fc" -BR_IP="172.16.0.1" -CIDR="24" -DNS_SERVER="1.1.1.1" - -resolve_banger_bin() { - if [[ -n "${BANGER_BIN:-}" ]]; then - printf '%s\n' "$BANGER_BIN" - return - fi - if [[ -x "$REPO_ROOT/build/bin/banger" ]]; then - printf '%s\n' "$REPO_ROOT/build/bin/banger" - return - fi - if [[ -x "$REPO_ROOT/banger" ]]; then - printf '%s\n' "$REPO_ROOT/banger" - return - fi - if command -v banger >/dev/null 2>&1; then - command -v banger - return - fi - log "banger binary not found; install/build banger or set BANGER_BIN" - exit 1 -} - -BANGER_BIN="$(resolve_banger_bin)" -NAT_ACTIVE=0 -FC_BIN="$("$BANGER_BIN" internal firecracker-path)" -SSH_KEY="$("$BANGER_BIN" internal ssh-key-path)" -KERNEL="" -INITRD="" - -banger_nat() { - local action="$1" - "$BANGER_BIN" internal nat "$action" --guest-ip "$GUEST_IP" --tap "$TAP_DEV" -} - -BASE_ROOTFS="" -OUT_ROOTFS="" -SIZE_SPEC="" -while [[ $# -gt 0 ]]; do - case "$1" in - --out) - OUT_ROOTFS="${2:-}" - shift 2 - ;; - --size) - SIZE_SPEC="${2:-}" - shift 2 - ;; - --kernel) - KERNEL="${2:-}" - shift 2 - ;; - --initrd) - INITRD="${2:-}" - shift 2 - ;; - -h|--help) - usage - exit 0 - ;; - *) - if [[ -z "$BASE_ROOTFS" ]]; then - BASE_ROOTFS="$1" - shift - else - log "unknown option: $1" - usage - exit 1 - fi - ;; - esac -done - -if [[ -z "$BASE_ROOTFS" ]]; then - usage - exit 1 -fi -if [[ ! -f "$BASE_ROOTFS" ]]; then - log "base rootfs not found: $BASE_ROOTFS" - exit 1 -fi -if [[ -z "$KERNEL" ]]; then - log "kernel path is required; pass --kernel" - exit 1 -fi -if [[ ! -f "$KERNEL" ]]; then - log "kernel not found: $KERNEL" - exit 1 -fi -if [[ -n "$INITRD" && ! 
-f "$INITRD" ]]; then - log "initrd not found: $INITRD" - exit 1 -fi - -if [[ -z "$OUT_ROOTFS" ]]; then - base_dir="$(dirname "$BASE_ROOTFS")" - base_name="$(basename "$BASE_ROOTFS")" - OUT_ROOTFS="${base_dir}/rw-${base_name}" -fi -if [[ -e "$OUT_ROOTFS" ]]; then - log "output rootfs already exists: $OUT_ROOTFS" - exit 1 -fi - -log "copying base rootfs to $OUT_ROOTFS" -cp --reflink=auto "$BASE_ROOTFS" "$OUT_ROOTFS" - -if [[ -n "$SIZE_SPEC" ]]; then - SIZE_BYTES="$(parse_size "$SIZE_SPEC")" - BASE_BYTES="$(stat -c%s "$BASE_ROOTFS")" - if [[ -z "$SIZE_BYTES" || "$SIZE_BYTES" -lt "$BASE_BYTES" ]]; then - log "size must be >= base image size" - exit 1 - fi - log "resizing rootfs to $SIZE_SPEC" - truncate -s "$SIZE_BYTES" "$OUT_ROOTFS" - e2fsck -p -f "$OUT_ROOTFS" >/dev/null - resize2fs "$OUT_ROOTFS" >/dev/null -fi - -VM_ID="$(head -c 32 /dev/urandom | xxd -p -c 256)" -VM_TAG="${VM_ID:0:8}" -VM_NAME="interactive-${VM_TAG}" -VM_DIR="$VM_ROOT/$VM_ID" -mkdir -p "$VM_DIR" - -API_SOCK="${XDG_RUNTIME_DIR:-/run/user/$(id -u)}/banger/fc-$VM_TAG.sock" -LOG_FILE="$VM_DIR/firecracker.log" -TAP_DEV="tap-fc-$VM_TAG" - -# Allocate guest IP -NEXT_IP_FILE="$STATE/next_ip" -NEXT_IP="$(cat "$NEXT_IP_FILE" 2>/dev/null || echo 2)" -GUEST_IP="172.16.0.$NEXT_IP" -echo "$((NEXT_IP + 1))" > "$NEXT_IP_FILE" - -sudo -v - -cleanup() { - sudo kill "${FC_PID:-}" 2>/dev/null || true - if [[ "$NAT_ACTIVE" -eq 1 ]]; then - banger_nat down >/dev/null 2>&1 || true - fi - sudo ip link del "$TAP_DEV" 2>/dev/null || true - rm -f "$API_SOCK" - rm -rf "$VM_DIR" -} -trap cleanup EXIT - -sudo mkdir -p "$(dirname "$API_SOCK")" -sudo chown "$(id -u):$(id -g)" "$(dirname "$API_SOCK")" - -# Host bridge -if ! 
ip link show "$BR_DEV" >/dev/null 2>&1; then - log "creating host bridge $BR_DEV ($BR_IP/$CIDR)" - sudo ip link add name "$BR_DEV" type bridge - sudo ip addr add "${BR_IP}/${CIDR}" dev "$BR_DEV" - sudo ip link set "$BR_DEV" up -else - sudo ip link set "$BR_DEV" up -fi - -log "creating tap device $TAP_DEV" -TAP_USER="${SUDO_UID:-$(id -u)}" -TAP_GROUP="${SUDO_GID:-$(id -g)}" -sudo ip tuntap add dev "$TAP_DEV" mode tap user "$TAP_USER" group "$TAP_GROUP" -sudo ip link set "$TAP_DEV" master "$BR_DEV" -sudo ip link set "$TAP_DEV" up -sudo ip link set "$BR_DEV" up - -log "starting firecracker process" -rm -f "$API_SOCK" -nohup sudo -E "$FC_BIN" --api-sock "$API_SOCK" >"$LOG_FILE" 2>&1 & -FC_PID="$!" - -log "waiting for firecracker api socket" -for _ in $(seq 1 200); do - [[ -S "$API_SOCK" ]] && break - sleep 0.02 -done -[[ -S "$API_SOCK" ]] || { log "firecracker api socket not ready"; exit 1; } - -log "configuring machine" -sudo -E curl --unix-socket "$API_SOCK" -X PUT http://localhost/machine-config \ - -H "Content-Type: application/json" \ - -d '{ - "vcpu_count": 2, - "mem_size_mib": 1024, - "smt": false - }' >/dev/null - -KCMD="console=ttyS0 reboot=k panic=1 pci=off root=/dev/vda rootfstype=ext4 rw ip=${GUEST_IP}::${BR_IP}:255.255.255.0:${VM_NAME}:eth0:off:${DNS_SERVER} hostname=${VM_NAME} systemd.mask=home.mount systemd.mask=var.mount" - -sudo -E curl --unix-socket "$API_SOCK" -X PUT http://localhost/boot-source \ - -H "Content-Type: application/json" \ - -d "{ - \"kernel_image_path\": \"$KERNEL\", - \"boot_args\": \"$KCMD\", - \"initrd_path\": \"$INITRD\" - }" >/dev/null - -sudo -E curl --unix-socket "$API_SOCK" -X PUT http://localhost/drives/rootfs \ - -H "Content-Type: application/json" \ - -d "{ - \"drive_id\": \"rootfs\", - \"path_on_host\": \"$OUT_ROOTFS\", - \"is_root_device\": true, - \"is_read_only\": false - }" >/dev/null - -sudo -E curl --unix-socket "$API_SOCK" -X PUT http://localhost/network-interfaces/eth0 \ - -H "Content-Type: application/json" \ - -d 
"{ - \"iface_id\": \"eth0\", - \"host_dev_name\": \"$TAP_DEV\" - }" >/dev/null - -sudo -E curl --unix-socket "$API_SOCK" -X PUT http://localhost/actions \ - -H "Content-Type: application/json" \ - -d '{ "action_type": "InstanceStart" }' >/dev/null - -SUDO_CHILD_PID="$(pgrep -n -f "$API_SOCK" || true)" -if [[ -n "$SUDO_CHILD_PID" ]]; then - FC_PID="$SUDO_CHILD_PID" -fi - -VM_CONFIG_JSON="$(sudo -E curl --unix-socket "$API_SOCK" -sS http://localhost/vm/config)" -CREATED_AT="$(date -Iseconds)" -jq -n \ - --arg id "$VM_ID" \ - --arg name "$VM_NAME" \ - --arg pid "$FC_PID" \ - --arg created_at "$CREATED_AT" \ - --arg guest_ip "$GUEST_IP" \ - --arg tap "$TAP_DEV" \ - --arg api_sock "$API_SOCK" \ - --arg log "$LOG_FILE" \ - --arg rootfs "$OUT_ROOTFS" \ - --arg kernel "$KERNEL" \ - --argjson config "$VM_CONFIG_JSON" \ - '{meta:{id:$id,name:$name,pid:$pid,created_at:$created_at,guest_ip:$guest_ip,tap:$tap,api_sock:$api_sock,log:$log,rootfs:$rootfs,kernel:$kernel},config:$config}' \ - > "$VM_DIR/vm.json" - -log "enabling NAT for interactive session" -banger_nat up >/dev/null -NAT_ACTIVE=1 - -log "waiting for SSH" -log "guest ip: $GUEST_IP" -log "ssh: ssh -i \"$SSH_KEY\" root@${GUEST_IP}" -for _ in $(seq 1 60); do - if ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \ - "root@${GUEST_IP}" "true" >/dev/null 2>&1; then - log "ssh ready" - break - fi - sleep 1 -done - -log "output rootfs: $OUT_ROOTFS" -log "press Ctrl+C to stop and clean up" - -while kill -0 "$FC_PID" >/dev/null 2>&1; do - sleep 1 -done diff --git a/scripts/make-alpine-kernel.sh b/scripts/make-alpine-kernel.sh deleted file mode 100755 index 8bcf2fe..0000000 --- a/scripts/make-alpine-kernel.sh +++ /dev/null @@ -1,363 +0,0 @@ -#!/usr/bin/env bash -set -euo pipefail - -log() { - printf '[make-alpine-kernel] %s\n' "$*" -} - -usage() { - cat <<'EOF' -Usage: ./scripts/make-alpine-kernel.sh [--out-dir ] [--release ] [--mirror ] [--arch ] [--print-register-flags] - -Download and stage an 
Alpine Linux virt kernel under ./build/manual/alpine-kernel -for the experimental Alpine guest flow. - -Defaults: - --out-dir ./build/manual/alpine-kernel - --release 3.23.3 - --mirror https://dl-cdn.alpinelinux.org/alpine - --arch x86_64 - -The staged output contains: - boot/vmlinuz- Alpine virt kernel image - boot/initramfs-.img Matching Alpine initramfs - boot/config- Alpine kernel config when present - lib/modules// Matching kernel modules from modloop-virt - -If --print-register-flags is passed, the script does not download anything. It -prints the banger image register flags for an existing staged Alpine kernel. -EOF -} - -require_command() { - local name="$1" - command -v "$name" >/dev/null 2>&1 || { - log "required command not found: $name" - exit 1 - } -} - -check_elf() { - local path="$1" - readelf -h "$path" >/dev/null 2>&1 -} - -find_latest_matching() { - local dir="$1" - local pattern="$2" - if [[ ! -d "$dir" ]]; then - return 1 - fi - find "$dir" -maxdepth 1 -type f -name "$pattern" | sort | tail -n 1 -} - -find_latest_module_dir() { - local root="$1" - local dir="" - if [[ ! 
-d "$root" ]]; then - return 1 - fi - while IFS= read -r dir; do - if [[ -d "$dir/kernel" || -f "$dir/modules.dep" || -f "$dir/modules.dep.bin" ]]; then - printf '%s\n' "$dir" - return 0 - fi - done < <(find "$root" -mindepth 1 -maxdepth 1 -type d | sort) - return 1 -} - -find_tar_entry() { - local archive="$1" - local needle="$2" - local entry="" - - while IFS= read -r entry; do - case "$entry" in - "$needle"|*/"$needle") - printf '%s\n' "$entry" - return 0 - ;; - esac - done < <(tar -tf "$archive") - - return 1 -} - -find_tar_config_entry() { - local archive="$1" - local entry="" - - while IFS= read -r entry; do - case "$entry" in - config-*-virt|*/config-*-virt) - printf '%s\n' "$entry" - return 0 - ;; - esac - done < <(tar -tf "$archive") - - return 1 -} - -resolve_release_branch() { - local release="$1" - printf 'v%s\n' "${release%.*}" -} - -extract_vmlinux() { - local image="$1" - local out="$2" - local tmp="$TMP_DIR/vmlinux.extract" - - if check_elf "$image"; then - install -m 0644 "$image" "$out" - return 0 - fi - - try_decompress() { - local header="$1" - local marker="$2" - local command="$3" - local pos="" - - while IFS= read -r pos; do - [[ -n "$pos" ]] || continue - pos="${pos%%:*}" - tail -c+"$pos" "$image" | eval "$command" >"$tmp" 2>/dev/null || true - if check_elf "$tmp"; then - install -m 0644 "$tmp" "$out" - return 0 - fi - done < <(tr "$header\n$marker" "\n$marker=" < "$image" | grep -abo "^$marker" || true) - - return 1 - } - - try_decompress '\037\213\010' "xy" "gunzip" && return 0 - try_decompress '\3757zXZ\000' "abcde" "unxz" && return 0 - try_decompress "BZh" "xy" "bunzip2" && return 0 - try_decompress '\135\000\000\000' "xxx" "unlzma" && return 0 - try_decompress '\002!L\030' "xxx" "lz4 -d" && return 0 - try_decompress '(\265/\375' "xxx" "unzstd" && return 0 - - return 1 -} - -print_register_flags() { - local kernel="" - local initrd="" - local modules="" - - kernel="$(find_latest_matching "$OUT_DIR/boot" 'vmlinux-*' || true)" - if [[ -z 
"$kernel" ]]; then - kernel="$(find_latest_matching "$OUT_DIR/boot" 'vmlinuz-*' || true)" - fi - initrd="$(find_latest_matching "$OUT_DIR/boot" 'initramfs-*' || true)" - modules="$(find_latest_module_dir "$OUT_DIR/lib/modules" || true)" - - if [[ -z "$kernel" || -z "$modules" ]]; then - log "staged Alpine kernel not found under $OUT_DIR" - exit 1 - fi - - printf -- '--kernel %q ' "$kernel" - if [[ -n "$initrd" ]]; then - printf -- '--initrd %q ' "$initrd" - fi - printf -- '--modules %q\n' "$modules" -} - -cleanup() { - if [[ "${MODLOOP_MOUNTED:-0}" == "1" ]] && [[ -n "${MODLOOP_MOUNT:-}" ]]; then - sudo umount "$MODLOOP_MOUNT" || true - fi - if [[ -n "${TMP_DIR:-}" && -d "${TMP_DIR:-}" ]]; then - rm -rf "$TMP_DIR" - fi -} - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)" -MANUAL_DIR="${BANGER_MANUAL_DIR:-$REPO_ROOT/build/manual}" -OUT_DIR="$MANUAL_DIR/alpine-kernel" -RELEASE="${ALPINE_RELEASE:-3.23.3}" -MIRROR="https://dl-cdn.alpinelinux.org/alpine" -ARCH="x86_64" -PRINT_REGISTER_FLAGS=0 - -while [[ $# -gt 0 ]]; do - case "$1" in - --out-dir) - OUT_DIR="${2:-}" - shift 2 - ;; - --release) - RELEASE="${2:-}" - shift 2 - ;; - --mirror) - MIRROR="${2:-}" - shift 2 - ;; - --arch) - ARCH="${2:-}" - shift 2 - ;; - --print-register-flags) - PRINT_REGISTER_FLAGS=1 - shift - ;; - -h|--help) - usage - exit 0 - ;; - *) - log "unknown option: $1" - usage - exit 1 - ;; - esac -done - -if [[ "$PRINT_REGISTER_FLAGS" == "1" ]]; then - print_register_flags - exit 0 -fi - -if [[ "$ARCH" != "x86_64" ]]; then - log "unsupported arch: $ARCH" - log "this experimental builder currently supports only x86_64" - exit 1 -fi -if [[ -d "$OUT_DIR" ]]; then - log "output directory already exists: $OUT_DIR" - log "remove it first if you want to re-stage a different Alpine kernel" - exit 1 -fi - -require_command curl -require_command tar -require_command sha256sum -require_command install -require_command find -require_command cp 
-require_command readelf -require_command file -require_command tail -require_command grep -require_command cut -require_command gzip -require_command xz -require_command bzip2 - -if command -v unsquashfs >/dev/null 2>&1; then - USE_UNSQUASHFS=1 -else - USE_UNSQUASHFS=0 - require_command sudo - require_command mount - require_command umount -fi - -TMP_DIR="$(mktemp -d -t banger-alpine-kernel-XXXXXX)" -EXTRACT_DIR="$TMP_DIR/extract" -MODLOOP_DIR="$TMP_DIR/modloop" -MODLOOP_MOUNT="$TMP_DIR/modloop.mount" -ARCHIVE="$TMP_DIR/alpine-netboot.tar.gz" -MODLOOP_MOUNTED=0 -trap cleanup EXIT - -mkdir -p "$EXTRACT_DIR" "$MODLOOP_DIR" "$MODLOOP_MOUNT" - -BRANCH="$(resolve_release_branch "$RELEASE")" -RELEASE_DIR="$MIRROR/$BRANCH/releases/$ARCH" -ARCHIVE_URL="$RELEASE_DIR/alpine-netboot-$RELEASE-$ARCH.tar.gz" -SHA256_URL="$ARCHIVE_URL.sha256" - -log "downloading Alpine netboot bundle from $ARCHIVE_URL" -curl -fsSL "$ARCHIVE_URL" -o "$ARCHIVE" -expected_sha="$(curl -fsSL "$SHA256_URL" | awk '{print $1}')" -actual_sha="$(sha256sum "$ARCHIVE" | awk '{print $1}')" -if [[ -z "$expected_sha" ]]; then - log "failed to read SHA256 from $SHA256_URL" - exit 1 -fi -if [[ "$expected_sha" != "$actual_sha" ]]; then - log "sha256 mismatch for $ARCHIVE_URL" - log "expected: $expected_sha" - log "actual: $actual_sha" - exit 1 -fi - -VMLINUX_ENTRY="$(find_tar_entry "$ARCHIVE" 'vmlinuz-virt' || true)" -INITRD_ENTRY="$(find_tar_entry "$ARCHIVE" 'initramfs-virt' || true)" -MODLOOP_ENTRY="$(find_tar_entry "$ARCHIVE" 'modloop-virt' || true)" -CONFIG_ENTRY="$(find_tar_config_entry "$ARCHIVE" || true)" - -if [[ -z "$VMLINUX_ENTRY" || -z "$INITRD_ENTRY" || -z "$MODLOOP_ENTRY" ]]; then - log "Alpine netboot bundle is missing expected virt boot artifacts" - exit 1 -fi - -log "extracting Alpine virt boot artifacts" -tar_args=("$VMLINUX_ENTRY" "$INITRD_ENTRY" "$MODLOOP_ENTRY") -if [[ -n "$CONFIG_ENTRY" ]]; then - tar_args+=("$CONFIG_ENTRY") -fi -tar -xf "$ARCHIVE" -C "$EXTRACT_DIR" "${tar_args[@]}" - 
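The `tar_args` idiom just above (append optional entries to an array, then expand the array once) is the safe bash pattern for building an optional argument list, and `make-golden-bundle.sh` elsewhere in this diff uses the same trick for its optional flags. A minimal sketch with a hypothetical `build_cmd` helper, not taken from any script here:

```shell
#!/usr/bin/env bash
# Sketch of the optional-arguments-as-array pattern: entries are appended
# only when present, then the array expands to exactly those words, so an
# empty optional never produces a stray empty argument.
set -euo pipefail

build_cmd() {
  local extra="$1"
  local args=(tar -tf archive.tar)
  if [[ -n "$extra" ]]; then
    args+=("$extra")
  fi
  # Join with spaces just to show the resulting command line.
  printf '%s\n' "${args[*]}"
}

build_cmd ""            # tar -tf archive.tar
build_cmd "extra-entry" # tar -tf archive.tar extra-entry
```

Expanding `"${args[@]}"` (rather than an unquoted string) is what keeps entries containing spaces intact when the real command runs.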
-VMLINUX_SRC="$EXTRACT_DIR/$VMLINUX_ENTRY" -INITRD_SRC="$EXTRACT_DIR/$INITRD_ENTRY" -MODLOOP_SRC="$EXTRACT_DIR/$MODLOOP_ENTRY" -CONFIG_SRC="" -if [[ -n "$CONFIG_ENTRY" ]]; then - CONFIG_SRC="$EXTRACT_DIR/$CONFIG_ENTRY" -fi - -if [[ "$USE_UNSQUASHFS" == "1" ]]; then - log "extracting kernel modules with unsquashfs" - unsquashfs -f -d "$MODLOOP_DIR" "$MODLOOP_SRC" >/dev/null -else - log "extracting kernel modules with a read-only loop mount" - sudo mount -o loop,ro "$MODLOOP_SRC" "$MODLOOP_MOUNT" - MODLOOP_MOUNTED=1 - cp -a "$MODLOOP_MOUNT/." "$MODLOOP_DIR/" - sudo umount "$MODLOOP_MOUNT" - MODLOOP_MOUNTED=0 -fi - -MODULES_ROOT="" -if [[ -d "$MODLOOP_DIR/modules" ]]; then - MODULES_ROOT="$MODLOOP_DIR/modules" -elif [[ -d "$MODLOOP_DIR/lib/modules" ]]; then - MODULES_ROOT="$MODLOOP_DIR/lib/modules" -fi -if [[ -z "$MODULES_ROOT" ]]; then - log "extracted modloop is missing a modules directory" - exit 1 -fi - -MODULES_SRC="$(find_latest_module_dir "$MODULES_ROOT" || true)" -if [[ -z "$MODULES_SRC" ]]; then - log "failed to locate a kernel modules tree inside modloop-virt" - exit 1 -fi - -KERNEL_VERSION="$(basename "$MODULES_SRC")" -mkdir -p "$OUT_DIR/boot" "$OUT_DIR/lib/modules" -install -m 0644 "$VMLINUX_SRC" "$OUT_DIR/boot/vmlinuz-$KERNEL_VERSION" -install -m 0644 "$INITRD_SRC" "$OUT_DIR/boot/initramfs-$KERNEL_VERSION.img" -if [[ -n "$CONFIG_SRC" && -f "$CONFIG_SRC" ]]; then - install -m 0644 "$CONFIG_SRC" "$OUT_DIR/boot/config-$KERNEL_VERSION" -fi -cp -a "$MODULES_SRC" "$OUT_DIR/lib/modules/" - -log "extracting Firecracker kernel from vmlinuz-$KERNEL_VERSION" -if ! 
extract_vmlinux "$VMLINUX_SRC" "$OUT_DIR/boot/vmlinux-$KERNEL_VERSION"; then
-  log "failed to extract an uncompressed vmlinux from $VMLINUX_SRC"
-  log "raw kernel image type: $(file -b "$VMLINUX_SRC")"
-  exit 1
-fi
-
-log "staged Alpine kernel artifacts in $OUT_DIR"
-log "kernel version: $KERNEL_VERSION"
diff --git a/scripts/make-generic-kernel.sh b/scripts/make-generic-kernel.sh
new file mode 100755
index 0000000..c732048
--- /dev/null
+++ b/scripts/make-generic-kernel.sh
@@ -0,0 +1,148 @@
+#!/usr/bin/env bash
+# make-generic-kernel.sh
+#
+# Build a minimal Firecracker-optimized vmlinux from upstream kernel.org
+# sources using the vendored Firecracker config. All essential drivers
+# (virtio_blk, virtio_net, ext4, vsock) are compiled in — no modules,
+# no initramfs needed. The result boots any OCI-pulled rootfs directly.
+#
+# Usage:
+#   scripts/make-generic-kernel.sh [--version 6.12.8]
+#
+# Output:
+#   build/manual/generic-kernel/boot/vmlinux-<version>
+#   build/manual/generic-kernel/metadata.json
+
+set -euo pipefail
+
+log() { printf '[make-generic-kernel] %s\n' "$*" >&2; }
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
+OUT_DIR="${BANGER_MANUAL_DIR:-$REPO_ROOT/build/manual}/generic-kernel"
+CONFIG="$REPO_ROOT/configs/firecracker-x86_64-6.1.config"
+KERNEL_VERSION="${KERNEL_VERSION:-6.12.8}"
+KERNEL_MAJOR="${KERNEL_VERSION%%.*}"
+JOBS="${JOBS:-$(nproc)}"
+
+usage() {
+  cat <<EOF
+Usage: scripts/make-generic-kernel.sh [--version <x.y.z>]
+
+Downloads the kernel from kernel.org, applies the vendored Firecracker
+config, and builds a minimal vmlinux. Default version: $KERNEL_VERSION
+
+Output: $OUT_DIR/boot/vmlinux-<version>
+EOF
+}
+
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --version) KERNEL_VERSION="$2"; KERNEL_MAJOR="${KERNEL_VERSION%%.*}"; shift 2;;
+    -h|--help) usage; exit 0;;
+    *) log "unknown arg: $1"; exit 1;;
+  esac
+done
+
+for tool in curl tar xz make gcc gpg gpgv; do
+  command -v "$tool" >/dev/null 2>&1 || { log "missing required tool: $tool"; exit 1; }
+done
+[[ -f "$CONFIG" ]] || { log "config not found: $CONFIG"; exit 1; }
+
+# kernel.org release signing keys. Stable (Greg KH) signs most point
+# releases; mainline (Linus) signs .0 drops; Sasha Levin sometimes
+# signs longterm backports. Listing all three keeps the script
+# working across every release channel the user might pick. Rotations
+# are rare and announced; update this list if gpg complains.
+#
+# Fingerprints verified against kernel.org:
+# https://www.kernel.org/signature.html
+KERNEL_SIGNING_KEYS=(
+  647F28654894E3BD457199BE38DBBDC86092693E  # Greg Kroah-Hartman
+  ABAF11C65A2970B130ABE3C479BE3E4300411886  # Linus Torvalds
+  E27E5D8A3403A2EF66873BBCDEA66FF797772CDC  # Sasha Levin
+)
+
+TARBALL="linux-${KERNEL_VERSION}.tar.xz"
+SIGNATURE="linux-${KERNEL_VERSION}.tar.sign"
+BASE_URL="https://cdn.kernel.org/pub/linux/kernel/v${KERNEL_MAJOR}.x"
+SRC_DIR="$(mktemp -d)"
+trap 'rm -rf "$SRC_DIR"' EXIT
+
+# Isolated GNUPGHOME so the verification step can't accidentally
+# trust whatever the invoking user already has in their keyring. The
+# trap above cleans the whole SRC_DIR, including this.
+GPG_HOME="$SRC_DIR/gnupg"
+install -d -m 0700 "$GPG_HOME"
+export GNUPGHOME="$GPG_HOME"
+
+log "importing kernel.org signing keys"
+# keyserver.ubuntu.com first: it returns keys with user IDs intact,
+# which gpg needs to mark the key as usable. keys.openpgp.org (the
+# current SKS successor) strips unverified UIDs on upload, and the
+# kernel.org devs haven't all completed its email verification flow,
+# so pulling from there returns UID-less keys that gpg then refuses
+# to trust. We fall back to it anyway in case keyserver.ubuntu.com
+# is unreachable.
+if ! gpg --batch --keyserver hkps://keyserver.ubuntu.com --recv-keys "${KERNEL_SIGNING_KEYS[@]}" 2>/dev/null; then
+  log "key fetch from keyserver.ubuntu.com failed; trying keys.openpgp.org"
+  gpg --batch --keyserver hkps://keys.openpgp.org --recv-keys "${KERNEL_SIGNING_KEYS[@]}"
+fi
+
+log "downloading kernel $KERNEL_VERSION from $BASE_URL/$TARBALL"
+curl -fSL --progress-bar -o "$SRC_DIR/$TARBALL" "$BASE_URL/$TARBALL"
+curl -fSL --progress-bar -o "$SRC_DIR/$SIGNATURE" "$BASE_URL/$SIGNATURE"
+
+log "verifying signature"
+# The .tar.sign is a detached signature over the *uncompressed* tar,
+# per kernel.org convention. Pipe the xz-decompressed stream straight
+# into gpg --verify so the uncompressed tar is never materialised on
+# disk. Require VALIDSIG: it is emitted only for a cryptographically
+# valid signature, and it carries the signer's full fingerprint,
+# whereas GOODSIG carries just the short key ID and user ID.
+VERIFY_STATUS="$SRC_DIR/verify.status"
+xz -cd "$SRC_DIR/$TARBALL" | gpg --batch --status-fd 3 --verify "$SRC_DIR/$SIGNATURE" - 3>"$VERIFY_STATUS" 2>/dev/null || true
+if ! grep -qE '^\[GNUPG:\] VALIDSIG' "$VERIFY_STATUS"; then
+  log "signature verification FAILED — refusing to build"
+  log "gpg status:"
+  cat "$VERIFY_STATUS" >&2 || true
+  exit 1
+fi
+log "signature OK (signed by $(awk '/^\[GNUPG:\] VALIDSIG/ {print $3}' "$VERIFY_STATUS"))"
+
+log "extracting"
+tar -xf "$SRC_DIR/$TARBALL" -C "$SRC_DIR" --strip-components=1
+
+log "applying firecracker config"
+cp "$CONFIG" "$SRC_DIR/.config"
+# Adapt the 6.1 config to whatever version we're building. make olddefconfig
+# fills in any new symbols with defaults.
+make -C "$SRC_DIR" olddefconfig >/dev/null 2>&1
+
+log "building vmlinux (jobs=$JOBS)"
+make -C "$SRC_DIR" -j"$JOBS" vmlinux 2>&1 | tail -5
+
+VMLINUX="$SRC_DIR/vmlinux"
+if [[ ! -f "$VMLINUX" ]]; then
+  log "vmlinux not found after build; check build output above"
+  exit 1
+fi
+
+mkdir -p "$OUT_DIR/boot"
+DEST="$OUT_DIR/boot/vmlinux-${KERNEL_VERSION}"
+cp "$VMLINUX" "$DEST"
+
+log "verifying: $(file -b "$DEST" | head -c 80)"
+
+cat > "$OUT_DIR/metadata.json" <<EOF
diff --git a/scripts/make-golden-bundle.sh b/scripts/make-golden-bundle.sh
new file mode 100755
--- /dev/null
+++ b/scripts/make-golden-bundle.sh
+#!/usr/bin/env bash
+# make-golden-bundle.sh
+#
+# docker build -> docker create -> docker export | banger internal make-bundle
+#
+# Usage:
+#   scripts/make-golden-bundle.sh [--name <name>] [--kernel-ref <ref>] \
+#     [--distro <distro>] [--arch <arch>] [--description "..."] \
+#     [--out <path>] [--size <size>] [--platform
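The verification step in make-generic-kernel.sh accepts any `VALIDSIG` line; a stricter variant would also pin the reported fingerprint to the expected key list. A minimal sketch, assuming the documented `[GNUPG:] VALIDSIG <fingerprint> ...` status format; the `validsig_fpr` helper and the hand-written sample status capture are illustrative, not part of the script:

```shell
#!/usr/bin/env bash
# Sketch: pin the VALIDSIG fingerprint to an allowlist instead of accepting
# any valid signature. The status text below is a hand-written sample in the
# gpg --status-fd line format, not real gpg output.
set -euo pipefail

# Print the signer fingerprint from a gpg --status-fd capture
# (field 3 of the VALIDSIG line).
validsig_fpr() {
  awk '/^\[GNUPG:\] VALIDSIG/ { print $3; exit }' "$1"
}

allowed=(
  647F28654894E3BD457199BE38DBBDC86092693E  # Greg Kroah-Hartman
  ABAF11C65A2970B130ABE3C479BE3E4300411886  # Linus Torvalds
)

status_file="$(mktemp)"
cat > "$status_file" <<'EOF'
[GNUPG:] NEWSIG
[GNUPG:] GOODSIG 38DBBDC86092693E Greg Kroah-Hartman (Linux kernel stable release signing key)
[GNUPG:] VALIDSIG 647F28654894E3BD457199BE38DBBDC86092693E 2025-01-02 1735776000 0 4 0 1 10 00
EOF

fpr="$(validsig_fpr "$status_file")"
verdict=reject
for key in "${allowed[@]}"; do
  [[ "$fpr" == "$key" ]] && verdict=accept
done
rm -f "$status_file"
echo "$verdict $fpr"
```

Matching on the full fingerprint rather than on `GOODSIG`'s short key ID avoids trusting a signature merely because some key in the keyring made it.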

+#     <platform>]
+#
+# Defaults:
+#   --name      debian-bookworm
+#   --kernel-ref generic-6.12
+#   --distro    debian
+#   --arch      x86_64
+#   --platform  linux/amd64
+#   --out       <repo-root>/dist/<name>-<arch>.tar.zst
+#
+# Environment overrides:
+#   BANGER_BIN              path to banger binary (default build/bin/banger)
+#   BANGER_VSOCK_AGENT_BIN  path to companion (default build/bin/banger-vsock-agent)
+
+set -euo pipefail
+
+log() { printf '[make-golden-bundle] %s\n' "$*" >&2; }
+die() { log "$*"; exit 1; }
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
+DOCKERFILE="$REPO_ROOT/images/golden/Dockerfile"
+CONTEXT="$REPO_ROOT/images/golden"
+
+NAME="debian-bookworm"
+KERNEL_REF="generic-6.12"
+DISTRO="debian"
+ARCH="x86_64"
+DESCRIPTION=""
+OUT=""
+# 4G is a deliberate over-allocation for the golden image: it leaves
+# room for first-boot apt-installs of sshd on derived pulls and for
+# the user's own apt-installs during sandbox use.
+SIZE="4G"
+PLATFORM="linux/amd64"
+
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --name) NAME="${2:-}"; shift 2;;
+    --kernel-ref) KERNEL_REF="${2:-}"; shift 2;;
+    --distro) DISTRO="${2:-}"; shift 2;;
+    --arch) ARCH="${2:-}"; shift 2;;
+    -d|--description) DESCRIPTION="${2:-}"; shift 2;;
+    --out) OUT="${2:-}"; shift 2;;
+    --size) SIZE="${2:-}"; shift 2;;
+    --platform) PLATFORM="${2:-}"; shift 2;;
+    -h|--help)
+      sed -n '2,/^set -euo/p' "$0" | sed 's/^# \?//' | sed '$d'
+      exit 0
+      ;;
+    *) die "unknown option: $1";;
+  esac
+done
+
+for tool in docker zstd sha256sum; do
+  command -v "$tool" >/dev/null 2>&1 || die "missing required tool: $tool"
+done
+[[ -f "$DOCKERFILE" ]] || die "dockerfile missing: $DOCKERFILE"
+
+BANGER_BIN="${BANGER_BIN:-$REPO_ROOT/build/bin/banger}"
+[[ -x "$BANGER_BIN" ]] || die "banger binary not executable: $BANGER_BIN (run 'make build' or set BANGER_BIN)"
+VSOCK_AGENT="${BANGER_VSOCK_AGENT_BIN:-$REPO_ROOT/build/bin/banger-vsock-agent}"
+[[ -x "$VSOCK_AGENT" ]] || die "banger-vsock-agent not executable: $VSOCK_AGENT (run 'make build')"
+
+if [[ -z "$OUT" ]]; then
+  OUT="$REPO_ROOT/dist/${NAME}-${ARCH}.tar.zst"
+fi
+mkdir -p "$(dirname "$OUT")"
+
+DOCKER_TAG="banger-golden:${NAME}"
+
+log "building $DOCKER_TAG (platform=$PLATFORM)"
+docker build --platform "$PLATFORM" -t "$DOCKER_TAG" -f "$DOCKERFILE" "$CONTEXT"
+
+log "creating docker container (not started)"
+CONTAINER_ID="$(docker create "$DOCKER_TAG")"
+cleanup() { docker rm -f "$CONTAINER_ID" >/dev/null 2>&1 || true; }
+trap cleanup EXIT
+
+log "piping container filesystem into banger internal make-bundle"
+SIZE_FLAG=()
+[[ -n "$SIZE" ]] && SIZE_FLAG=(--size "$SIZE")
+DESC_FLAG=()
+[[ -n "$DESCRIPTION" ]] && DESC_FLAG=(--description "$DESCRIPTION")
+KERNEL_REF_FLAG=()
+[[ -n "$KERNEL_REF" ]] && KERNEL_REF_FLAG=(--kernel-ref "$KERNEL_REF")
+
+export BANGER_VSOCK_AGENT_BIN="$VSOCK_AGENT"
+docker export "$CONTAINER_ID" | \
+  "$BANGER_BIN" internal make-bundle \
+    --rootfs-tar - \
+    --name "$NAME" \
+    --distro "$DISTRO" \
+    --arch "$ARCH" \
+    "${KERNEL_REF_FLAG[@]}" \
+    "${DESC_FLAG[@]}" \
+    "${SIZE_FLAG[@]}" \
+    --out "$OUT"
+
+SHA256="$(sha256sum "$OUT" | awk '{print $1}')"
+SIZE_BYTES="$(stat -c '%s' "$OUT")"
+HUMAN="$(numfmt --to=iec --suffix=B "$SIZE_BYTES" 2>/dev/null || echo "${SIZE_BYTES}B")"
+
+log "bundle: $OUT"
+log "sha256: $SHA256"
+log "size:   $HUMAN ($SIZE_BYTES bytes)"
+printf '%s\n' "$OUT"
diff --git a/scripts/make-rootfs-alpine.sh b/scripts/make-rootfs-alpine.sh
deleted file mode 100755
index a09d907..0000000
--- a/scripts/make-rootfs-alpine.sh
+++ /dev/null
@@ -1,722 +0,0 @@
-#!/usr/bin/env bash
-set -euo pipefail
-
-log() {
-  printf '[make-rootfs-alpine] %s\n' "$*"
-}
-
-usage() {
-  cat <<'EOF'
-Usage: ./scripts/make-rootfs-alpine.sh [--out <path>] [--size <size>] [--release <release>] [--mirror <url>] [--arch <arch>]
-
-Build an experimental Alpine Linux rootfs image plus a matching /root work-seed.
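Both deleted rootfs builders validate the `--size` spec shown in this usage text with a shared `parse_size` helper, reproduced verbatim below. A worked check of its unit handling, including the quirk that a bare number is treated as MiB:

```shell
#!/usr/bin/env bash
# Worked check of the parse_size helper shared by the deleted rootfs
# builders: K/M/G suffixes are binary units, and a bare number is MiB.
set -euo pipefail

parse_size() {
  local raw="$1"
  if [[ "$raw" =~ ^([0-9]+)([KMG])?$ ]]; then
    local num="${BASH_REMATCH[1]}"
    local unit="${BASH_REMATCH[2]}"
    case "$unit" in
      K) printf '%s\n' $((num * 1024)) ;;
      M|"") printf '%s\n' $((num * 1024 * 1024)) ;;
      G) printf '%s\n' $((num * 1024 * 1024 * 1024)) ;;
    esac
    return 0
  fi
  return 1
}

parse_size 2G    # 2147483648
parse_size 512   # 536870912 (bare numbers are MiB)
parse_size 10x && echo "accepted" || echo "rejected: 10x"
```

The non-zero return on a malformed spec is what lets the callers print "invalid size" and exit instead of writing a zero-byte image.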
- -Defaults: - --out ./build/manual/rootfs-alpine.ext4 - --size 2G - --release 3.23.3 - --mirror https://dl-cdn.alpinelinux.org/alpine - --arch x86_64 - -This path is experimental and local-only. If ./build/manual/alpine-kernel exists -it uses the staged Alpine kernel modules from that directory. It does not change -the default Debian image flow. -EOF -} - -parse_size() { - local raw="$1" - if [[ "$raw" =~ ^([0-9]+)([KMG])?$ ]]; then - local num="${BASH_REMATCH[1]}" - local unit="${BASH_REMATCH[2]}" - case "$unit" in - K) printf '%s\n' $((num * 1024)) ;; - M|"") printf '%s\n' $((num * 1024 * 1024)) ;; - G) printf '%s\n' $((num * 1024 * 1024 * 1024)) ;; - esac - return 0 - fi - return 1 -} - -require_command() { - local name="$1" - command -v "$name" >/dev/null 2>&1 || { - log "required command not found: $name" - exit 1 - } -} - -resolve_banger_bin() { - if [[ -n "${BANGER_BIN:-}" ]]; then - printf '%s\n' "$BANGER_BIN" - return - fi - if [[ -x "$REPO_ROOT/build/bin/banger" ]]; then - printf '%s\n' "$REPO_ROOT/build/bin/banger" - return - fi - if [[ -x "$REPO_ROOT/banger" ]]; then - printf '%s\n' "$REPO_ROOT/banger" - return - fi - if command -v banger >/dev/null 2>&1; then - command -v banger - return - fi - log "banger binary not found; build it first with 'make build' or set BANGER_BIN" - exit 1 -} - -find_latest_module_dir() { - local root="$1" - if [[ ! 
-d "$root" ]]; then - return 1 - fi - find "$root" -mindepth 1 -maxdepth 1 -type d | sort | tail -n 1 -} - -resolve_release_branch() { - local release="$1" - printf 'v%s\n' "${release%.*}" -} - -load_package_preset() { - local preset="$1" - local -n out="$2" - mapfile -t out < <("$BANGER_BIN" internal packages "$preset") - (( ${#out[@]} > 0 )) -} - -write_rootfs_manifest_metadata() { - local rootfs_path="$1" - local manifest_hash="$2" - printf '%s\n' "$manifest_hash" > "${rootfs_path}.packages.sha256" -} - -install_root_authorized_key() { - local public_key - public_key="$(ssh-keygen -y -f "$SSH_KEY")" - sudo mkdir -p "$ROOT_MOUNT/root/.ssh" - printf '%s\n' "$public_key" | sudo tee "$ROOT_MOUNT/root/.ssh/authorized_keys" >/dev/null - sudo chmod 700 "$ROOT_MOUNT/root/.ssh" - sudo chmod 600 "$ROOT_MOUNT/root/.ssh/authorized_keys" -} - -ensure_sshd_include() { - local cfg="$ROOT_MOUNT/etc/ssh/sshd_config" - local tmp_cfg="$TMP_DIR/sshd_config" - local include_line="Include /etc/ssh/sshd_config.d/*.conf" - - sudo mkdir -p "$ROOT_MOUNT/etc/ssh/sshd_config.d" - if sudo test -f "$cfg"; then - sudo cat "$cfg" > "$tmp_cfg" - else - : > "$tmp_cfg" - fi - - if ! grep -Eq '^[[:space:]]*Include[[:space:]]+/etc/ssh/sshd_config\.d/\*\.conf([[:space:]]|$)' "$tmp_cfg"; then - { - printf '%s\n' "$include_line" - cat "$tmp_cfg" - } > "${tmp_cfg}.new" - mv "${tmp_cfg}.new" "$tmp_cfg" - sudo install -m 0644 "$tmp_cfg" "$cfg" - fi -} - -normalize_root_shell() { - local passwd="$ROOT_MOUNT/etc/passwd" - local shells="$ROOT_MOUNT/etc/shells" - local wanted_shell="/bin/bash" - local tmp_passwd="$TMP_DIR/passwd" - local root_shell="" - - if [[ ! -x "$ROOT_MOUNT$wanted_shell" ]]; then - log "required root shell is missing from the Alpine image: $wanted_shell" - exit 1 - fi - if [[ ! -f "$shells" ]]; then - log "Alpine image is missing /etc/shells" - exit 1 - fi - if ! 
sudo grep -Fxq "$wanted_shell" "$shells"; then - log "Alpine image does not allow $wanted_shell in /etc/shells" - exit 1 - fi - - sudo cat "$passwd" > "$tmp_passwd" - awk -F: -v OFS=: -v shell="$wanted_shell" ' - $1 == "root" { - $7 = shell - found = 1 - } - { print } - END { - if (!found) { - exit 1 - } - } - ' "$tmp_passwd" > "${tmp_passwd}.new" || { - log "failed to rewrite root shell in /etc/passwd" - exit 1 - } - mv "${tmp_passwd}.new" "$tmp_passwd" - sudo install -m 0644 "$tmp_passwd" "$passwd" - - root_shell="$(sudo awk -F: '$1 == "root" { print $7 }' "$passwd")" - if [[ "$root_shell" != "$wanted_shell" ]]; then - log "root shell normalization failed: expected $wanted_shell, got ${root_shell:-}" - exit 1 - fi -} - -configure_root_bash_prompt() { - local bashrc="$ROOT_MOUNT/root/.bashrc" - local bash_profile="$ROOT_MOUNT/root/.bash_profile" - local profile_prompt="$ROOT_MOUNT/etc/profile.d/banger-bash-prompt.sh" - - sudo mkdir -p "$ROOT_MOUNT/root" "$ROOT_MOUNT/etc/profile.d" - cat <<'EOF' | sudo tee "$bashrc" >/dev/null -# banger: default interactive prompt for experimental Alpine guests -case "$-" in - *i*) ;; - *) return ;; -esac - -if [ -z "${BANGER_MISE_ACTIVATED:-}" ] && [ -x '/usr/local/bin/mise' ]; then - export BANGER_MISE_ACTIVATED=1 - eval "$(/usr/local/bin/mise activate bash)" -fi - -PS1='\u@\h:\w\$ ' -EOF - cat <<'EOF' | sudo tee "$bash_profile" >/dev/null -if [ -f ~/.bashrc ]; then - . 
~/.bashrc -fi -EOF - cat <<'EOF' | sudo tee "$profile_prompt" >/dev/null -case "$-" in - *i*) ;; - *) return 0 2>/dev/null || exit 0 ;; -esac - -if [ -n "${BASH_VERSION:-}" ]; then - PS1='\u@\h:\w\$ ' -fi -EOF - sudo chmod 0644 "$bashrc" "$bash_profile" "$profile_prompt" -} - -install_guest_network_bootstrap() { - sudo mkdir -p "$ROOT_MOUNT/usr/local/libexec" - sudo install -m 0755 "$GUESTNET_BOOTSTRAP_SCRIPT" "$ROOT_MOUNT/usr/local/libexec/banger-network-bootstrap" -} - -install_openrc_services() { - local initd_dir="$ROOT_MOUNT/etc/init.d" - - sudo mkdir -p "$initd_dir" - - cat <<'EOF' | sudo tee "$initd_dir/banger-network" >/dev/null -#!/sbin/openrc-run -description="Banger guest network bootstrap" - -depend() { - need localmount - before sshd docker banger-opencode - provide net -} - -start() { - ebegin "Configuring guest network" - /usr/local/libexec/banger-network-bootstrap - eend $? -} -EOF - - cat <<'EOF' | sudo tee "$initd_dir/banger-docker-preflight" >/dev/null -#!/sbin/openrc-run -description="Banger Docker kernel preflight" - -depend() { - after modules - before docker -} - -start() { - ebegin "Preparing Docker kernel state" - for module in nf_tables nft_chain_nat veth br_netfilter overlay; do - modprobe "$module" 2>/dev/null || true - done - if command -v sysctl >/dev/null 2>&1; then - sysctl -p /etc/sysctl.d/99-docker.conf >/dev/null 2>&1 || true - fi - eend 0 -} -EOF - - cat <<'EOF' | sudo tee "$initd_dir/banger-vsock-agent" >/dev/null -#!/sbin/openrc-run -description="Banger vsock agent" -pidfile="/run/${RC_SVCNAME}.pid" -command="/usr/local/bin/banger-vsock-agent" - -depend() { - need localmount - before banger-network sshd docker banger-opencode -} - -start_pre() { - modprobe vsock 2>/dev/null || true - modprobe vmw_vsock_virtio_transport 2>/dev/null || true -} - -start() { - ebegin "Starting ${RC_SVCNAME}" - start-stop-daemon --start --exec "$command" --background --make-pidfile --pidfile "$pidfile" - eend $? 
-} - -stop() { - ebegin "Stopping ${RC_SVCNAME}" - start-stop-daemon --stop --exec "$command" --pidfile "$pidfile" - eend $? -} -EOF - - cat <<'EOF' | sudo tee "$initd_dir/banger-opencode" >/dev/null -#!/sbin/openrc-run -description="Banger opencode server" -pidfile="/run/${RC_SVCNAME}.pid" -command="/usr/local/bin/opencode" -command_args="serve --hostname 0.0.0.0 --port 4096" - -depend() { - need localmount - after banger-network -} - -start() { - ebegin "Starting ${RC_SVCNAME}" - HOME=/root start-stop-daemon --start --exec "$command" --background --make-pidfile --pidfile "$pidfile" --chdir /root -- $command_args - eend $? -} - -stop() { - ebegin "Stopping ${RC_SVCNAME}" - start-stop-daemon --stop --exec "$command" --pidfile "$pidfile" - eend $? -} -EOF - - sudo chmod 0755 \ - "$initd_dir/banger-network" \ - "$initd_dir/banger-docker-preflight" \ - "$initd_dir/banger-vsock-agent" \ - "$initd_dir/banger-opencode" -} - -configure_docker_bootstrap() { - local modules_conf="$ROOT_MOUNT/etc/modules-load.d/docker-netfilter.conf" - local sysctl_conf="$ROOT_MOUNT/etc/sysctl.d/99-docker.conf" - - sudo mkdir -p "$ROOT_MOUNT/etc/modules-load.d" "$ROOT_MOUNT/etc/sysctl.d" - cat <<'EOF' | sudo tee "$modules_conf" >/dev/null -nf_tables -nft_chain_nat -veth -br_netfilter -overlay -EOF - cat <<'EOF' | sudo tee "$sysctl_conf" >/dev/null -net.bridge.bridge-nf-call-iptables = 1 -net.bridge.bridge-nf-call-ip6tables = 1 -net.ipv4.ip_forward = 1 -EOF - sudo chmod 0644 "$modules_conf" "$sysctl_conf" -} - -configure_vsock_modules() { - local modules_conf="$ROOT_MOUNT/etc/modules-load.d/banger-vsock.conf" - - sudo mkdir -p "$ROOT_MOUNT/etc/modules-load.d" - cat <<'EOF' | sudo tee "$modules_conf" >/dev/null -vsock -vmw_vsock_virtio_transport -EOF - sudo chmod 0644 "$modules_conf" -} - -configure_apk_repositories() { - local repositories="$ROOT_MOUNT/etc/apk/repositories" - - sudo mkdir -p "$ROOT_MOUNT/etc/apk" - cat </dev/null -$APK_RELEASE_URL/main -$APK_RELEASE_URL/community -EOF - sudo 
chmod 0644 "$repositories" - if [[ -r /etc/resolv.conf ]]; then - sudo install -m 0644 /etc/resolv.conf "$ROOT_MOUNT/etc/resolv.conf" - fi -} - -build_alpine_initramfs() { - local kernel_version="$1" - local guest_output="/boot/initramfs-${kernel_version}.img" - local stage_output="$MANUAL_DIR/alpine-kernel/boot/initramfs-${kernel_version}.img" - local mkinitfs_dir="$ROOT_MOUNT/etc/mkinitfs" - local mkinitfs_conf="$mkinitfs_dir/mkinitfs.conf" - - sudo mkdir -p "$mkinitfs_dir" "$ROOT_MOUNT/boot" "$MANUAL_DIR/alpine-kernel/boot" - cat <<'EOF' | sudo tee "$mkinitfs_conf" >/dev/null -features="ata base ide scsi usb virtio ext4 nvme" -EOF - sudo chmod 0644 "$mkinitfs_conf" - - log "building Alpine initramfs for kernel $kernel_version" - sudo env PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \ - chroot "$ROOT_MOUNT" /bin/sh -se <&2 - exit 1 -fi -ln -snf /root/.local/share/mise/shims/opencode /usr/local/bin/opencode -EOF - - cat <<'EOF' | sudo tee "$profile_mise" >/dev/null -if [ -n "${BASH_VERSION:-}" ] && [ -z "${BANGER_MISE_ACTIVATED:-}" ] && [ -x '/usr/local/bin/mise' ]; then - export BANGER_MISE_ACTIVATED=1 - eval "$(/usr/local/bin/mise activate bash)" -fi -EOF - sudo chmod 0644 "$profile_mise" -} - -enable_openrc_services() { - sudo env PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin chroot "$ROOT_MOUNT" /bin/sh -se <<'EOF' -set -eu - -add_service() { - local service="$1" - local runlevel="$2" - if [ ! 
-x "/etc/init.d/$service" ]; then - echo "missing OpenRC service: $service" >&2 - exit 1 - fi - rc-update add "$service" "$runlevel" >/dev/null -} - -for service in devfs dmesg mdev; do - add_service "$service" sysinit -done -for service in hwdrivers modules sysctl hostname bootmisc cgroups; do - add_service "$service" boot -done -for service in banger-network sshd banger-docker-preflight docker banger-vsock-agent banger-opencode; do - add_service "$service" default -done -for service in mount-ro killprocs; do - add_service "$service" shutdown -done -EOF -} - -cleanup() { - if [[ "${SYS_MOUNTED:-0}" == "1" ]] && command -v mountpoint >/dev/null 2>&1 && mountpoint -q "$ROOT_MOUNT/sys"; then - sudo umount "$ROOT_MOUNT/sys" || true - fi - if [[ "${PROC_MOUNTED:-0}" == "1" ]] && command -v mountpoint >/dev/null 2>&1 && mountpoint -q "$ROOT_MOUNT/proc"; then - sudo umount "$ROOT_MOUNT/proc" || true - fi - if [[ "${DEVPTS_MOUNTED:-0}" == "1" ]] && command -v mountpoint >/dev/null 2>&1 && mountpoint -q "$ROOT_MOUNT/dev/pts"; then - sudo umount "$ROOT_MOUNT/dev/pts" || true - fi - if [[ "${DEV_MOUNTED:-0}" == "1" ]] && command -v mountpoint >/dev/null 2>&1 && mountpoint -q "$ROOT_MOUNT/dev"; then - sudo umount "$ROOT_MOUNT/dev" || true - fi - if [[ -n "${ROOT_MOUNT:-}" ]] && command -v mountpoint >/dev/null 2>&1 && mountpoint -q "$ROOT_MOUNT"; then - sudo umount "$ROOT_MOUNT" || true - fi - if [[ "${BUILD_DONE:-0}" != "1" ]]; then - rm -f "${OUT_ROOTFS:-}" "${WORK_SEED:-}" "${OUT_ROOTFS:-}.packages.sha256" - fi - if [[ -n "${TMP_DIR:-}" && -d "${TMP_DIR:-}" ]]; then - rm -rf "$TMP_DIR" - fi -} - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -REPO_ROOT="$(cd "$SCRIPT_DIR/.." 
&& pwd)" -MANUAL_DIR="${BANGER_MANUAL_DIR:-$REPO_ROOT/build/manual}" -BANGER_BIN="$(resolve_banger_bin)" -SSH_KEY="$("$BANGER_BIN" internal ssh-key-path)" -OUT_ROOTFS="$MANUAL_DIR/rootfs-alpine.ext4" -SIZE_SPEC="2G" -RELEASE="${ALPINE_RELEASE:-3.23.3}" -MIRROR="https://dl-cdn.alpinelinux.org/alpine" -ARCH="x86_64" -MISE_VERSION="v2025.12.0" -MISE_INSTALL_PATH="/usr/local/bin/mise" -OPENCODE_TOOL="github:anomalyco/opencode" -GUESTNET_BOOTSTRAP_SCRIPT="$REPO_ROOT/internal/guestnet/assets/bootstrap.sh" -MODULES_DIR="" -ALPINE_KERNEL_MODULES_DIR="$(find_latest_module_dir "$MANUAL_DIR/alpine-kernel/lib/modules" || true)" -VSOCK_AGENT="$("$BANGER_BIN" internal vsock-agent-path)" -if [[ -n "$ALPINE_KERNEL_MODULES_DIR" ]]; then - MODULES_DIR="$ALPINE_KERNEL_MODULES_DIR" -fi - -while [[ $# -gt 0 ]]; do - case "$1" in - --out) - OUT_ROOTFS="${2:-}" - shift 2 - ;; - --size) - SIZE_SPEC="${2:-}" - shift 2 - ;; - --release) - RELEASE="${2:-}" - shift 2 - ;; - --mirror) - MIRROR="${2:-}" - shift 2 - ;; - --arch) - ARCH="${2:-}" - shift 2 - ;; - -h|--help) - usage - exit 0 - ;; - *) - log "unknown option: $1" - usage - exit 1 - ;; - esac -done - -if [[ "$ARCH" != "x86_64" ]]; then - log "unsupported arch: $ARCH" - log "this experimental builder currently supports only x86_64" - exit 1 -fi - -if [[ -z "$MODULES_DIR" || ! -d "$MODULES_DIR" ]]; then - log "modules dir not found; run 'make alpine-kernel' first" - exit 1 -fi -if [[ ! -x "$VSOCK_AGENT" ]]; then - log "vsock agent not found or not executable: $VSOCK_AGENT" - log "run 'make build'" - exit 1 -fi -if [[ ! 
-f "$GUESTNET_BOOTSTRAP_SCRIPT" ]]; then - log "guest network bootstrap script not found: $GUESTNET_BOOTSTRAP_SCRIPT" - exit 1 -fi -if [[ -e "$OUT_ROOTFS" ]]; then - log "output rootfs already exists: $OUT_ROOTFS" - exit 1 -fi - -require_command curl -require_command tar -require_command sudo -require_command mkfs.ext4 -require_command ssh-keygen -require_command mount -require_command umount -require_command install -require_command find -require_command awk -require_command sed -require_command sha256sum -require_command truncate -require_command mountpoint -require_command chroot -require_command cp - -ALPINE_PACKAGES=() -if ! load_package_preset alpine ALPINE_PACKAGES; then - log "alpine package preset is empty" - exit 1 -fi -if ! PACKAGES_HASH="$(printf '%s\n' "${ALPINE_PACKAGES[@]}" | sha256sum | awk '{print $1}')"; then - log "failed to hash package preset" - exit 1 -fi -if ! SIZE_BYTES="$(parse_size "$SIZE_SPEC")"; then - log "invalid size: $SIZE_SPEC" - exit 1 -fi - -if [[ "$OUT_ROOTFS" == *.ext4 ]]; then - WORK_SEED="${OUT_ROOTFS%.ext4}.work-seed.ext4" -else - WORK_SEED="${OUT_ROOTFS}.work-seed" -fi - -BRANCH="$(resolve_release_branch "$RELEASE")" -RELEASE_DIR="$MIRROR/$BRANCH/releases/$ARCH" -MINIROOTFS_URL="$RELEASE_DIR/alpine-minirootfs-$RELEASE-$ARCH.tar.gz" -MINIROOTFS_SHA256_URL="$MINIROOTFS_URL.sha256" -APK_RELEASE_URL="$MIRROR/$BRANCH" - -TMP_DIR="$(mktemp -d -t banger-alpine-rootfs-XXXXXX)" -MINIROOTFS_ARCHIVE="$TMP_DIR/alpine-minirootfs.tar.gz" -ROOT_MOUNT="$TMP_DIR/rootfs" -BUILD_DONE=0 -DEV_MOUNTED=0 -DEVPTS_MOUNTED=0 -PROC_MOUNTED=0 -SYS_MOUNTED=0 -trap cleanup EXIT - -mkdir -p "$ROOT_MOUNT" - -log "downloading Alpine minirootfs from $MINIROOTFS_URL" -curl -fsSL "$MINIROOTFS_URL" -o "$MINIROOTFS_ARCHIVE" -expected_sha="$(curl -fsSL "$MINIROOTFS_SHA256_URL" | awk '{print $1}')" -actual_sha="$(sha256sum "$MINIROOTFS_ARCHIVE" | awk '{print $1}')" -if [[ -z "$expected_sha" ]]; then - log "failed to read SHA256 from $MINIROOTFS_SHA256_URL" - exit 
1 -fi -if [[ "$expected_sha" != "$actual_sha" ]]; then - log "sha256 mismatch for $MINIROOTFS_URL" - log "expected: $expected_sha" - log "actual: $actual_sha" - exit 1 -fi - -log "creating $OUT_ROOTFS ($SIZE_SPEC)" -truncate -s "$SIZE_BYTES" "$OUT_ROOTFS" -mkfs.ext4 -F -m 0 -L banger-alpine-root "$OUT_ROOTFS" >/dev/null -sudo mount -o loop "$OUT_ROOTFS" "$ROOT_MOUNT" - -log "unpacking Alpine minirootfs" -sudo tar -xzf "$MINIROOTFS_ARCHIVE" -C "$ROOT_MOUNT" -configure_apk_repositories -mount_chroot_support - -log "installing Alpine packages into the rootfs" -sudo env HOME=/root PATH=/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin \ - chroot "$ROOT_MOUNT" /bin/sh -se <] [--size ] [--mirror ] [--arch ] - -Build an experimental Void Linux rootfs image plus a matching /root work-seed. - -Defaults: - --out ./build/manual/rootfs-void.ext4 - --size 2G - --mirror https://repo-default.voidlinux.org - --arch x86_64 - -This path is experimental and local-only. If ./build/manual/void-kernel exists -it uses the staged Void kernel modules from that directory. It does not change -the default Debian image flow. 
-EOF -} - -parse_size() { - local raw="$1" - if [[ "$raw" =~ ^([0-9]+)([KMG])?$ ]]; then - local num="${BASH_REMATCH[1]}" - local unit="${BASH_REMATCH[2]}" - case "$unit" in - K) printf '%s\n' $((num * 1024)) ;; - M|"") printf '%s\n' $((num * 1024 * 1024)) ;; - G) printf '%s\n' $((num * 1024 * 1024 * 1024)) ;; - esac - return 0 - fi - return 1 -} - -require_command() { - local name="$1" - command -v "$name" >/dev/null 2>&1 || { - log "required command not found: $name" - exit 1 - } -} - -resolve_banger_bin() { - if [[ -n "${BANGER_BIN:-}" ]]; then - printf '%s\n' "$BANGER_BIN" - return - fi - if [[ -x "$REPO_ROOT/build/bin/banger" ]]; then - printf '%s\n' "$REPO_ROOT/build/bin/banger" - return - fi - if [[ -x "$REPO_ROOT/banger" ]]; then - printf '%s\n' "$REPO_ROOT/banger" - return - fi - if command -v banger >/dev/null 2>&1; then - command -v banger - return - fi - log "banger binary not found; build it first with 'make build' or set BANGER_BIN" - exit 1 -} - -normalize_mirror() { - local mirror="${1%/}" - mirror="${mirror%/current}" - mirror="${mirror%/static}" - printf '%s\n' "$mirror" -} - -find_latest_module_dir() { - local root="$1" - if [[ ! 
-d "$root" ]]; then - return 1 - fi - find "$root" -mindepth 1 -maxdepth 1 -type d | sort | tail -n 1 -} - -find_static_binary() { - local name="$1" - find "$STATIC_DIR" -type f \( -name "$name" -o -name "$name.static" \) -perm -u+x | sort | head -n 1 -} - -find_static_keys_dir() { - find "$STATIC_DIR" -type d -path '*/var/db/xbps/keys' | sort | head -n 1 -} - -load_package_preset() { - local preset="$1" - local -n out="$2" - mapfile -t out < <("$BANGER_BIN" internal packages "$preset") - (( ${#out[@]} > 0 )) -} - -write_rootfs_manifest_metadata() { - local rootfs_path="$1" - local manifest_hash="$2" - printf '%s\n' "$manifest_hash" > "${rootfs_path}.packages.sha256" -} - -install_root_authorized_key() { - local public_key - public_key="$(ssh-keygen -y -f "$SSH_KEY")" - sudo mkdir -p "$ROOT_MOUNT/root/.ssh" - printf '%s\n' "$public_key" | sudo tee "$ROOT_MOUNT/root/.ssh/authorized_keys" >/dev/null - sudo chmod 700 "$ROOT_MOUNT/root/.ssh" - sudo chmod 600 "$ROOT_MOUNT/root/.ssh/authorized_keys" -} - -ensure_sshd_include() { - local cfg="$ROOT_MOUNT/etc/ssh/sshd_config" - local tmp_cfg="$TMP_DIR/sshd_config" - local include_line="Include /etc/ssh/sshd_config.d/*.conf" - - sudo mkdir -p "$ROOT_MOUNT/etc/ssh/sshd_config.d" - if sudo test -f "$cfg"; then - sudo cat "$cfg" > "$tmp_cfg" - else - : > "$tmp_cfg" - fi - - if ! 
grep -Eq '^[[:space:]]*Include[[:space:]]+/etc/ssh/sshd_config\.d/\*\.conf([[:space:]]|$)' "$tmp_cfg"; then - { - printf '%s\n' "$include_line" - cat "$tmp_cfg" - } > "${tmp_cfg}.new" - mv "${tmp_cfg}.new" "$tmp_cfg" - sudo install -m 0644 "$tmp_cfg" "$cfg" - fi -} - -install_vsock_service() { - local service_dir="$ROOT_MOUNT/etc/sv/banger-vsock-agent" - local run_path="$service_dir/run" - local finish_path="$service_dir/finish" - - sudo mkdir -p "$service_dir" - cat <<'EOF' | sudo tee "$run_path" >/dev/null -#!/bin/sh -modprobe vsock 2>/dev/null || true -modprobe vmw_vsock_virtio_transport 2>/dev/null || true -exec /usr/local/bin/banger-vsock-agent -EOF - cat <<'EOF' | sudo tee "$finish_path" >/dev/null -#!/bin/sh -exit 0 -EOF - sudo chmod 0755 "$run_path" "$finish_path" - sudo mkdir -p "$ROOT_MOUNT/etc/runit/runsvdir/default" - sudo ln -snf /etc/sv/banger-vsock-agent "$ROOT_MOUNT/etc/runit/runsvdir/default/banger-vsock-agent" -} - -install_opencode_service() { - local service_dir="$ROOT_MOUNT/etc/sv/banger-opencode" - local run_path="$service_dir/run" - local finish_path="$service_dir/finish" - - sudo mkdir -p "$service_dir" - cat <<'EOF' | sudo tee "$run_path" >/dev/null -#!/bin/sh -set -e -export HOME=/root -cd /root -exec /usr/local/bin/opencode serve --hostname 0.0.0.0 --port 4096 -EOF - cat <<'EOF' | sudo tee "$finish_path" >/dev/null -#!/bin/sh -exit 0 -EOF - sudo chmod 0755 "$run_path" "$finish_path" - sudo mkdir -p "$ROOT_MOUNT/etc/runit/runsvdir/default" - sudo ln -snf /etc/sv/banger-opencode "$ROOT_MOUNT/etc/runit/runsvdir/default/banger-opencode" -} - -install_guest_network_bootstrap() { - sudo mkdir -p "$ROOT_MOUNT/usr/local/libexec" "$ROOT_MOUNT/etc/runit/core-services" - sudo install -m 0755 "$GUESTNET_BOOTSTRAP_SCRIPT" "$ROOT_MOUNT/usr/local/libexec/banger-network-bootstrap" - sudo install -m 0644 "$GUESTNET_VOID_CORE_SERVICE" "$ROOT_MOUNT/etc/runit/core-services/20-banger-network.sh" -} - -configure_docker_bootstrap() { - local 
modules_conf="$ROOT_MOUNT/etc/modules-load.d/docker-netfilter.conf" - local sysctl_conf="$ROOT_MOUNT/etc/sysctl.d/99-docker.conf" - local service_dir="$ROOT_MOUNT/etc/sv/docker" - local run_path="$service_dir/run" - local orig_run_path="$service_dir/run.orig" - local preflight_path="$ROOT_MOUNT/usr/local/bin/banger-docker-preflight" - - sudo mkdir -p "$ROOT_MOUNT/etc/modules-load.d" "$ROOT_MOUNT/etc/sysctl.d" "$ROOT_MOUNT/usr/local/bin" - cat <<'EOF' | sudo tee "$modules_conf" >/dev/null -nf_tables -nft_chain_nat -veth -br_netfilter -overlay -EOF - cat <<'EOF' | sudo tee "$sysctl_conf" >/dev/null -net.bridge.bridge-nf-call-iptables = 1 -net.bridge.bridge-nf-call-ip6tables = 1 -net.ipv4.ip_forward = 1 -EOF - cat <<'EOF' | sudo tee "$preflight_path" >/dev/null -#!/bin/sh -for module in nf_tables nft_chain_nat veth br_netfilter overlay; do - modprobe "$module" 2>/dev/null || true -done -if command -v sysctl >/dev/null 2>&1; then - sysctl --load /etc/sysctl.d/99-docker.conf >/dev/null 2>&1 || true -fi -EOF - - if [[ ! -f "$run_path" ]]; then - log "Void rootfs is missing /etc/sv/docker/run after docker install" - exit 1 - fi - sudo install -m 0755 "$run_path" "$orig_run_path" - cat <<'EOF' | sudo tee "$run_path" >/dev/null -#!/bin/sh -set -e -/usr/local/bin/banger-docker-preflight -exec /etc/sv/docker/run.orig -EOF - sudo chmod 0644 "$modules_conf" "$sysctl_conf" - sudo chmod 0755 "$preflight_path" "$run_path" "$orig_run_path" -} - -enable_sshd_service() { - if [[ ! -d "$ROOT_MOUNT/etc/sv/sshd" ]]; then - log "Void rootfs is missing /etc/sv/sshd after openssh install" - exit 1 - fi - sudo mkdir -p "$ROOT_MOUNT/etc/runit/runsvdir/default" - sudo ln -snf /etc/sv/sshd "$ROOT_MOUNT/etc/runit/runsvdir/default/sshd" -} - -enable_docker_service() { - if [[ ! 
-d "$ROOT_MOUNT/etc/sv/docker" ]]; then - log "Void rootfs is missing /etc/sv/docker after docker install" - exit 1 - fi - sudo mkdir -p "$ROOT_MOUNT/etc/runit/runsvdir/default" - sudo ln -snf /etc/sv/docker "$ROOT_MOUNT/etc/runit/runsvdir/default/docker" -} - -normalize_root_shell() { - local passwd="$ROOT_MOUNT/etc/passwd" - local shells="$ROOT_MOUNT/etc/shells" - local wanted_shell="/bin/bash" - local tmp_passwd="$TMP_DIR/passwd" - local root_shell="" - - if [[ ! -x "$ROOT_MOUNT$wanted_shell" ]]; then - log "required root shell is missing from the Void image: $wanted_shell" - exit 1 - fi - if [[ ! -f "$shells" ]]; then - log "Void image is missing /etc/shells" - exit 1 - fi - if ! sudo grep -Fxq "$wanted_shell" "$shells"; then - log "Void image does not allow $wanted_shell in /etc/shells" - exit 1 - fi - - sudo cat "$passwd" > "$tmp_passwd" - awk -F: -v OFS=: -v shell="$wanted_shell" ' - $1 == "root" { - $7 = shell - found = 1 - } - { print } - END { - if (!found) { - exit 1 - } - } - ' "$tmp_passwd" > "${tmp_passwd}.new" || { - log "failed to rewrite root shell in /etc/passwd" - exit 1 - } - mv "${tmp_passwd}.new" "$tmp_passwd" - sudo install -m 0644 "$tmp_passwd" "$passwd" - - root_shell="$(sudo awk -F: '$1 == "root" { print $7 }' "$passwd")" - if [[ "$root_shell" != "$wanted_shell" ]]; then - log "root shell normalization failed: expected $wanted_shell, got ${root_shell:-}" - exit 1 - fi -} - -configure_root_bash_prompt() { - local bashrc="$ROOT_MOUNT/root/.bashrc" - local bash_profile="$ROOT_MOUNT/root/.bash_profile" - local profile_prompt="$ROOT_MOUNT/etc/profile.d/banger-bash-prompt.sh" - - sudo mkdir -p "$ROOT_MOUNT/root" "$ROOT_MOUNT/etc/profile.d" - cat <<'EOF' | sudo tee "$bashrc" >/dev/null -# banger: default interactive prompt for experimental Void guests -case "$-" in - *i*) ;; - *) return ;; -esac - -if [ -z "${BANGER_MISE_ACTIVATED:-}" ] && [ -x '/usr/local/bin/mise' ]; then - export BANGER_MISE_ACTIVATED=1 - eval "$(/usr/local/bin/mise activate 
bash)" -fi - -PS1='\u@\h:\w\$ ' -EOF - cat <<'EOF' | sudo tee "$bash_profile" >/dev/null -if [ -f ~/.bashrc ]; then - . ~/.bashrc -fi -EOF - cat <<'EOF' | sudo tee "$profile_prompt" >/dev/null -case "$-" in - *i*) ;; - *) return 0 2>/dev/null || exit 0 ;; -esac - -if [ -n "${BASH_VERSION:-}" ]; then - PS1='\u@\h:\w\$ ' -fi -EOF - sudo chmod 0644 "$bashrc" "$bash_profile" "$profile_prompt" -} - -install_mise_and_opencode() { - local profile_mise="$ROOT_MOUNT/etc/profile.d/mise.sh" - - sudo mkdir -p "$ROOT_MOUNT/etc/profile.d" - if [[ -r /etc/resolv.conf ]]; then - sudo install -m 0644 /etc/resolv.conf "$ROOT_MOUNT/etc/resolv.conf" - fi - - sudo env \ - HOME=/root \ - PATH=/usr/local/bin:/usr/bin:/bin \ - chroot "$ROOT_MOUNT" /bin/bash -se <&2 - exit 1 -fi -ln -snf /root/.local/share/mise/shims/opencode /usr/local/bin/opencode -EOF - - cat <<'EOF' | sudo tee "$profile_mise" >/dev/null -if [ -n "${BASH_VERSION:-}" ] && [ -z "${BANGER_MISE_ACTIVATED:-}" ] && [ -x '/usr/local/bin/mise' ]; then - export BANGER_MISE_ACTIVATED=1 - eval "$(/usr/local/bin/mise activate bash)" -fi -EOF - sudo chmod 0644 "$profile_mise" -} - -cleanup() { - if [[ -n "${ROOT_MOUNT:-}" ]] && command -v mountpoint >/dev/null 2>&1 && mountpoint -q "$ROOT_MOUNT"; then - sudo umount "$ROOT_MOUNT" || true - fi - if [[ "${BUILD_DONE:-0}" != "1" ]]; then - rm -f "${OUT_ROOTFS:-}" "${WORK_SEED:-}" "${OUT_ROOTFS:-}.packages.sha256" - fi - if [[ -n "${TMP_DIR:-}" && -d "${TMP_DIR:-}" ]]; then - rm -rf "$TMP_DIR" - fi -} - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -REPO_ROOT="$(cd "$SCRIPT_DIR/.." 
&& pwd)" -MANUAL_DIR="${BANGER_MANUAL_DIR:-$REPO_ROOT/build/manual}" -BANGER_BIN="$(resolve_banger_bin)" -SSH_KEY="$("$BANGER_BIN" internal ssh-key-path)" -OUT_ROOTFS="$MANUAL_DIR/rootfs-void.ext4" -SIZE_SPEC="2G" -MIRROR="https://repo-default.voidlinux.org" -ARCH="x86_64" -MISE_VERSION="v2025.12.0" -MISE_INSTALL_PATH="/usr/local/bin/mise" -OPENCODE_TOOL="github:anomalyco/opencode" -GUESTNET_BOOTSTRAP_SCRIPT="$REPO_ROOT/internal/guestnet/assets/bootstrap.sh" -GUESTNET_VOID_CORE_SERVICE="$REPO_ROOT/internal/guestnet/assets/void-core-service.sh" -MODULES_DIR="" -VOID_KERNEL_MODULES_DIR="$(find_latest_module_dir "$MANUAL_DIR/void-kernel/lib/modules" || true)" -VSOCK_AGENT="$("$BANGER_BIN" internal vsock-agent-path)" -if [[ -n "$VOID_KERNEL_MODULES_DIR" ]]; then - MODULES_DIR="$VOID_KERNEL_MODULES_DIR" -fi - -while [[ $# -gt 0 ]]; do - case "$1" in - --out) - OUT_ROOTFS="${2:-}" - shift 2 - ;; - --size) - SIZE_SPEC="${2:-}" - shift 2 - ;; - --mirror) - MIRROR="${2:-}" - shift 2 - ;; - --arch) - ARCH="${2:-}" - shift 2 - ;; - -h|--help) - usage - exit 0 - ;; - *) - log "unknown option: $1" - usage - exit 1 - ;; - esac -done - -MIRROR="$(normalize_mirror "$MIRROR")" -REPO_URL="$MIRROR/current" -STATIC_ARCHIVE_URL="$MIRROR/static/xbps-static-latest.x86_64-musl.tar.xz" - -if [[ "$ARCH" != "x86_64" ]]; then - log "unsupported arch: $ARCH" - log "this experimental builder currently supports only x86_64-glibc" - exit 1 -fi - -if [[ -z "$MODULES_DIR" || ! -d "$MODULES_DIR" ]]; then - log "modules dir not found; run 'make void-kernel' first" - exit 1 -fi -if [[ ! -x "$VSOCK_AGENT" ]]; then - log "vsock agent not found or not executable: $VSOCK_AGENT" - log "run 'make build'" - exit 1 -fi -if [[ ! -f "$GUESTNET_BOOTSTRAP_SCRIPT" ]]; then - log "guest network bootstrap script not found: $GUESTNET_BOOTSTRAP_SCRIPT" - exit 1 -fi -if [[ ! 
-f "$GUESTNET_VOID_CORE_SERVICE" ]]; then - log "guest network core-service shim not found: $GUESTNET_VOID_CORE_SERVICE" - exit 1 -fi -if [[ -e "$OUT_ROOTFS" ]]; then - log "output rootfs already exists: $OUT_ROOTFS" - exit 1 -fi - -require_command curl -require_command tar -require_command sudo -require_command mkfs.ext4 -require_command ssh-keygen -require_command mount -require_command umount -require_command install -require_command find -require_command awk -require_command sed -require_command sha256sum -require_command truncate -require_command mountpoint - -VOID_PACKAGES=() -if ! load_package_preset void VOID_PACKAGES; then - log "void package preset is empty" - exit 1 -fi -if ! PACKAGES_HASH="$(printf '%s\n' "${VOID_PACKAGES[@]}" | sha256sum | awk '{print $1}')"; then - log "failed to hash package preset" - exit 1 -fi -if ! SIZE_BYTES="$(parse_size "$SIZE_SPEC")"; then - log "invalid size: $SIZE_SPEC" - exit 1 -fi - -if [[ "$OUT_ROOTFS" == *.ext4 ]]; then - WORK_SEED="${OUT_ROOTFS%.ext4}.work-seed.ext4" -else - WORK_SEED="${OUT_ROOTFS}.work-seed" -fi - -TMP_DIR="$(mktemp -d -t banger-void-rootfs-XXXXXX)" -STATIC_DIR="$TMP_DIR/static" -ROOT_MOUNT="$TMP_DIR/rootfs" -STATIC_ARCHIVE="$TMP_DIR/xbps-static.tar.xz" -BUILD_DONE=0 -trap cleanup EXIT - -mkdir -p "$STATIC_DIR" "$ROOT_MOUNT" - -log "downloading static XBPS from $STATIC_ARCHIVE_URL" -curl -fsSL "$STATIC_ARCHIVE_URL" -o "$STATIC_ARCHIVE" -tar -xf "$STATIC_ARCHIVE" -C "$STATIC_DIR" - -XBPS_INSTALL="$(find_static_binary xbps-install)" -XBPS_QUERY="$(find_static_binary xbps-query)" -STATIC_KEYS_DIR="$(find_static_keys_dir)" - -if [[ -z "$XBPS_INSTALL" || ! -x "$XBPS_INSTALL" ]]; then - log "failed to locate xbps-install in the static archive" - exit 1 -fi -if [[ -z "$STATIC_KEYS_DIR" || ! 
-d "$STATIC_KEYS_DIR" ]]; then - log "failed to locate Void repository keys in the static archive" - exit 1 -fi - -log "creating $OUT_ROOTFS ($SIZE_SPEC)" -truncate -s "$SIZE_BYTES" "$OUT_ROOTFS" -mkfs.ext4 -F -m 0 -L banger-void-root "$OUT_ROOTFS" >/dev/null -sudo mount -o loop "$OUT_ROOTFS" "$ROOT_MOUNT" -sudo mkdir -p "$ROOT_MOUNT/var/db/xbps/keys" -sudo cp -a "$STATIC_KEYS_DIR/." "$ROOT_MOUNT/var/db/xbps/keys/" - -log "installing Void packages into the rootfs" -sudo env XBPS_ARCH="$ARCH" "$XBPS_INSTALL" -S -y -r "$ROOT_MOUNT" -R "$REPO_URL" "${VOID_PACKAGES[@]}" - -if [[ -n "$XBPS_QUERY" && -x "$XBPS_QUERY" ]]; then - log "installed package set:" - sudo env XBPS_ARCH="$ARCH" "$XBPS_QUERY" -r "$ROOT_MOUNT" -l | awk '/^ii/ {print " " $2}' || true -fi - -if [[ -n "$VOID_KERNEL_MODULES_DIR" ]]; then - log "copying staged Void kernel modules into the guest" -else - log "copying bundled kernel modules into the guest" -fi -sudo mkdir -p "$ROOT_MOUNT/lib/modules" -sudo cp -a "$MODULES_DIR" "$ROOT_MOUNT/lib/modules/" - -log "installing the guest-side vsock agent" -sudo mkdir -p "$ROOT_MOUNT/usr/local/bin" -sudo install -m 0755 "$VSOCK_AGENT" "$ROOT_MOUNT/usr/local/bin/banger-vsock-agent" - -log "preparing SSH and runit services" -install_guest_network_bootstrap -ensure_sshd_include -enable_sshd_service -install_vsock_service -configure_docker_bootstrap -enable_docker_service -normalize_root_shell -configure_root_bash_prompt -log "installing mise and opencode" -install_mise_and_opencode -install_opencode_service -install_root_authorized_key -sudo touch "$ROOT_MOUNT/etc/fstab" "$ROOT_MOUNT/etc/hostname" -sudo chroot "$ROOT_MOUNT" /usr/bin/ssh-keygen -A - -log "removing bulky caches, docs, and stale installer artifacts from the experimental image" -sudo rm -rf \ - "$ROOT_MOUNT/var/cache/xbps" \ - "$ROOT_MOUNT/usr/share/doc" \ - "$ROOT_MOUNT/usr/share/info" \ - "$ROOT_MOUNT/usr/share/man" -sudo rm -f \ - "$ROOT_MOUNT/root/get-docker" \ - "$ROOT_MOUNT/root/get-docker.sh" \ - 
"$ROOT_MOUNT/root/.cache/opencode" \ - "$ROOT_MOUNT/tmp/get-docker" \ - "$ROOT_MOUNT/tmp/get-docker.sh" -sudo rm -rf \ - "$ROOT_MOUNT/root/.cache/mise" \ - "$ROOT_MOUNT/root/.local/share/mise/downloads" \ - "$ROOT_MOUNT/root/.local/share/mise/tmp" - -sudo umount "$ROOT_MOUNT" - -write_rootfs_manifest_metadata "$OUT_ROOTFS" "$PACKAGES_HASH" - -log "building work-seed $WORK_SEED" -"$BANGER_BIN" internal work-seed --rootfs "$OUT_ROOTFS" --out "$WORK_SEED" - -BUILD_DONE=1 -log "built experimental Void rootfs: $OUT_ROOTFS" -log "built experimental Void work-seed: $WORK_SEED" -log "use examples/void-exp.config.toml as the local config override template" diff --git a/scripts/make-rootfs.sh b/scripts/make-rootfs.sh deleted file mode 100755 index 2c4c405..0000000 --- a/scripts/make-rootfs.sh +++ /dev/null @@ -1,99 +0,0 @@ -#!/usr/bin/env bash -set -euo pipefail - -log() { - printf '[make-rootfs] %s\n' "$*" -} - -usage() { - cat <<'EOF' -Usage: ./scripts/make-rootfs.sh --kernel [--initrd ] [--modules

] [--size ] [--base-rootfs ] - -Builds build/manual/rootfs-docker.ext4 using scripts/customize.sh. If ---base-rootfs is omitted, the first existing file is used: - ./build/manual/rootfs-base.ext4 - ./ubuntu-noble-rootfs/rootfs.ext4 - ./ubuntu-lts/rootfs.ext4 -EOF -} - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)" -MANUAL_DIR="${BANGER_MANUAL_DIR:-$REPO_ROOT/build/manual}" -OUT_ROOTFS="$MANUAL_DIR/rootfs-docker.ext4" -SIZE_SPEC="6G" -BASE_ROOTFS="" -KERNEL_PATH="" -INITRD_PATH="" -MODULES_DIR="" - -while [[ $# -gt 0 ]]; do - case "$1" in - --size) - SIZE_SPEC="${2:-}" - shift 2 - ;; - --base-rootfs) - BASE_ROOTFS="${2:-}" - shift 2 - ;; - --kernel) - KERNEL_PATH="${2:-}" - shift 2 - ;; - --initrd) - INITRD_PATH="${2:-}" - shift 2 - ;; - --modules) - MODULES_DIR="${2:-}" - shift 2 - ;; - -h|--help) - usage - exit 0 - ;; - *) - log "unknown option: $1" - usage - exit 1 - ;; - esac -done - -if [[ -z "$BASE_ROOTFS" ]]; then - if [[ -f "$MANUAL_DIR/rootfs-base.ext4" ]]; then - BASE_ROOTFS="$MANUAL_DIR/rootfs-base.ext4" - elif [[ -f "$REPO_ROOT/ubuntu-noble-rootfs/rootfs.ext4" ]]; then - BASE_ROOTFS="$REPO_ROOT/ubuntu-noble-rootfs/rootfs.ext4" - elif [[ -f "$REPO_ROOT/ubuntu-lts/rootfs.ext4" ]]; then - BASE_ROOTFS="$REPO_ROOT/ubuntu-lts/rootfs.ext4" - else - log "no base rootfs found; pass --base-rootfs" - exit 1 - fi -fi - -if [[ -z "$KERNEL_PATH" ]]; then - log "kernel path is required; pass --kernel" - exit 1 -fi - -mkdir -p "$MANUAL_DIR" - -log "building $OUT_ROOTFS from $BASE_ROOTFS" -args=( - "$SCRIPT_DIR/customize.sh" - "$BASE_ROOTFS" - --out "$OUT_ROOTFS" - --size "$SIZE_SPEC" - --kernel "$KERNEL_PATH" - --docker -) -if [[ -n "$INITRD_PATH" ]]; then - args+=(--initrd "$INITRD_PATH") -fi -if [[ -n "$MODULES_DIR" ]]; then - args+=(--modules "$MODULES_DIR") -fi -exec "${args[@]}" diff --git a/scripts/make-void-kernel.sh b/scripts/make-void-kernel.sh deleted file mode 100755 index 372b456..0000000 --- 
a/scripts/make-void-kernel.sh +++ /dev/null @@ -1,386 +0,0 @@ -#!/usr/bin/env bash -set -euo pipefail - -log() { - printf '[make-void-kernel] %s\n' "$*" -} - -usage() { - cat <<'EOF' -Usage: ./scripts/make-void-kernel.sh [--out-dir <dir>] [--mirror <url>] [--arch <arch>] [--kernel-package <name>] [--print-register-flags] - -Download and stage a Void Linux kernel under ./build/manual/void-kernel for -the -experimental Void guest flow. - -Defaults: - --out-dir ./build/manual/void-kernel - --mirror https://repo-default.voidlinux.org - --arch x86_64 - --kernel-package linux6.12 - -The staged output contains: - boot/vmlinux-<version> Firecracker-usable kernel extracted from vmlinuz - boot/vmlinuz-<version> Raw distro boot image from the Void package - boot/initramfs-<version>.img Matching initramfs generated with dracut - boot/config-<version> Void kernel config - lib/modules/<version>/ Matching kernel modules tree - -If --print-register-flags is passed, the script does not download anything. It -prints the banger image register flags for an existing staged Void kernel. -EOF -} - -require_command() { - local name="$1" - command -v "$name" >/dev/null 2>&1 || { - log "required command not found: $name" - exit 1 - } -} - -normalize_mirror() { - local mirror="${1%/}" - mirror="${mirror%/current}" - mirror="${mirror%/static}" - printf '%s\n' "$mirror" -} - -find_static_binary() { - local name="$1" - find "$STATIC_DIR" -type f \( -name "$name" -o -name "$name.static" \) -perm -u+x | sort | head -n 1 -} - -find_static_keys_dir() { - find "$STATIC_DIR" -type d -path '*/var/db/xbps/keys' | sort | head -n 1 -} - -find_latest_matching() { - local dir="$1" - local pattern="$2" - if [[ ! -d "$dir" ]]; then - return 1 - fi - find "$dir" -maxdepth 1 -type f -name "$pattern" | sort | tail -n 1 -} - -find_latest_module_dir() { - local root="$1" - if [[ !
-d "$root" ]]; then - return 1 - fi - find "$root" -mindepth 1 -maxdepth 1 -type d | sort | tail -n 1 -} - -print_register_flags() { - local kernel="" - local initrd="" - local modules="" - - kernel="$(find_latest_matching "$OUT_DIR/boot" 'vmlinux-*' || true)" - initrd="$(find_latest_matching "$OUT_DIR/boot" 'initramfs-*' || true)" - modules="$(find_latest_module_dir "$OUT_DIR/lib/modules" || true)" - - if [[ -z "$kernel" || -z "$modules" ]]; then - log "staged Void kernel not found under $OUT_DIR" - exit 1 - fi - - printf -- '--kernel %q ' "$kernel" - if [[ -n "$initrd" ]]; then - printf -- '--initrd %q ' "$initrd" - fi - printf -- '--modules %q\n' "$modules" -} - -check_elf() { - local path="$1" - readelf -h "$path" >/dev/null 2>&1 -} - -ensure_stage_root_layout() { - mkdir -p "$STAGE_ROOT/usr" - - if [[ ! -e "$STAGE_ROOT/bin" ]]; then - ln -snf usr/bin "$STAGE_ROOT/bin" - fi - if [[ ! -e "$STAGE_ROOT/sbin" ]]; then - ln -snf usr/bin "$STAGE_ROOT/sbin" - fi - if [[ ! -e "$STAGE_ROOT/usr/sbin" ]]; then - ln -snf bin "$STAGE_ROOT/usr/sbin" - fi - if [[ ! -e "$STAGE_ROOT/lib" ]]; then - ln -snf usr/lib "$STAGE_ROOT/lib" - fi - if [[ ! -e "$STAGE_ROOT/lib64" ]]; then - ln -snf usr/lib "$STAGE_ROOT/lib64" - fi - if [[ ! -e "$STAGE_ROOT/usr/lib64" ]]; then - ln -snf lib "$STAGE_ROOT/usr/lib64" - fi - if [[ -x "$STAGE_ROOT/usr/bin/udevd" ]]; then - mkdir -p "$STAGE_ROOT/usr/lib/udev" "$STAGE_ROOT/usr/lib/systemd" - if [[ ! -e "$STAGE_ROOT/usr/lib/udev/udevd" ]]; then - ln -snf ../../bin/udevd "$STAGE_ROOT/usr/lib/udev/udevd" - fi - if [[ ! -e "$STAGE_ROOT/usr/lib/systemd/systemd-udevd" ]]; then - ln -snf ../../bin/udevd "$STAGE_ROOT/usr/lib/systemd/systemd-udevd" - fi - fi -} - -sync_host_dracut_tree() { - if [[ ! 
-d /usr/lib/dracut ]]; then - log "host dracut support files not found under /usr/lib/dracut" - exit 1 - fi - rm -rf "$STAGE_ROOT/usr/lib/dracut" - mkdir -p "$STAGE_ROOT/usr/lib" - cp -a /usr/lib/dracut "$STAGE_ROOT/usr/lib/dracut" -} - -build_initramfs() { - local kver="$1" - local modules_dir="$2" - local out="$3" - local config_dir="$TMP_DIR/dracut.conf.d" - local tmpdir="$TMP_DIR/dracut-tmp" - local force_drivers="virtio virtio_ring virtio_mmio virtio_blk virtio_net virtio_console ext4 vsock vmw_vsock_virtio_transport" - - mkdir -p "$config_dir" "$tmpdir" - ensure_stage_root_layout - sync_host_dracut_tree - - log "generating initramfs for kernel $kver with host dracut against the staged Void sysroot" - env dracutbasedir="/usr/lib/dracut" dracut \ - --force \ - --kver "$kver" \ - --sysroot "$STAGE_ROOT" \ - --kmoddir "$modules_dir" \ - --conf /dev/null \ - --confdir "$config_dir" \ - --tmpdir "$tmpdir" \ - --no-hostonly \ - --filesystems "ext4" \ - --force-drivers "$force_drivers" \ - --gzip \ - "$out" -} - -extract_vmlinux() { - local image="$1" - local out="$2" - local tmp="$TMP_DIR/vmlinux.extract" - - if check_elf "$image"; then - install -m 0644 "$image" "$out" - return 0 - fi - - try_decompress() { - local header="$1" - local marker="$2" - local command="$3" - local pos="" - - while IFS= read -r pos; do - [[ -n "$pos" ]] || continue - pos="${pos%%:*}" - tail -c+"$pos" "$image" | eval "$command" >"$tmp" 2>/dev/null || true - if check_elf "$tmp"; then - install -m 0644 "$tmp" "$out" - return 0 - fi - done < <(tr "$header\n$marker" "\n$marker=" < "$image" | grep -abo "^$marker" || true) - - return 1 - } - - try_decompress '\037\213\010' "xy" "gunzip" && return 0 - try_decompress '\3757zXZ\000' "abcde" "unxz" && return 0 - try_decompress "BZh" "xy" "bunzip2" && return 0 - try_decompress '\135\000\000\000' "xxx" "unlzma" && return 0 - try_decompress '\002!L\030' "xxx" "lz4 -d" && return 0 - try_decompress '(\265/\375' "xxx" "unzstd" && return 0 - - return 1 -} 
- -resolve_kernel_package_file() { - local escaped_name="" - escaped_name="$(printf '%s\n' "$KERNEL_PACKAGE" | sed 's/[.[\*^$()+?{|]/\\&/g')" - - curl -fsSL "$REPO_URL/" | - grep -o "${escaped_name}-[0-9][^\" >]*\\.${ARCH}\\.xbps" | - sort -u | - tail -n 1 -} - -cleanup() { - if [[ -n "${TMP_DIR:-}" && -d "${TMP_DIR:-}" ]]; then - rm -rf "$TMP_DIR" - fi -} - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)" -MANUAL_DIR="${BANGER_MANUAL_DIR:-$REPO_ROOT/build/manual}" -OUT_DIR="$MANUAL_DIR/void-kernel" -MIRROR="https://repo-default.voidlinux.org" -ARCH="x86_64" -KERNEL_PACKAGE="linux6.12" -PRINT_REGISTER_FLAGS=0 - -while [[ $# -gt 0 ]]; do - case "$1" in - --out-dir) - OUT_DIR="${2:-}" - shift 2 - ;; - --mirror) - MIRROR="${2:-}" - shift 2 - ;; - --arch) - ARCH="${2:-}" - shift 2 - ;; - --kernel-package) - KERNEL_PACKAGE="${2:-}" - shift 2 - ;; - --print-register-flags) - PRINT_REGISTER_FLAGS=1 - shift - ;; - -h|--help) - usage - exit 0 - ;; - *) - log "unknown option: $1" - usage - exit 1 - ;; - esac -done - -MIRROR="$(normalize_mirror "$MIRROR")" -REPO_URL="$MIRROR/current" -STATIC_ARCHIVE_URL="$MIRROR/static/xbps-static-latest.x86_64-musl.tar.xz" - -if [[ "$PRINT_REGISTER_FLAGS" == "1" ]]; then - print_register_flags - exit 0 -fi - -if [[ "$ARCH" != "x86_64" ]]; then - log "unsupported arch: $ARCH" - log "this experimental downloader currently supports only x86_64" - exit 1 -fi -mkdir -p "$RUNTIME_DIR" -if [[ -e "$OUT_DIR" ]]; then - log "output directory already exists: $OUT_DIR" - log "remove it first if you want to re-stage a different Void kernel" - exit 1 -fi - -require_command curl -require_command tar -require_command cp -require_command find -require_command grep -require_command cut -require_command readelf -require_command file -require_command install -require_command tail -require_command xz -require_command gzip -require_command bzip2 -require_command dracut - -TMP_DIR="$(mktemp -d -t 
banger-void-kernel-XXXXXX)" -STATIC_DIR="$TMP_DIR/static" -STAGE_ROOT="$TMP_DIR/root" -STAGE_OUT="$TMP_DIR/out" -STATIC_ARCHIVE="$TMP_DIR/xbps-static.tar.xz" -trap cleanup EXIT - -mkdir -p "$STATIC_DIR" "$STAGE_ROOT/var/db/xbps/keys" "$STAGE_OUT/boot" "$STAGE_OUT/lib/modules" - -log "downloading static XBPS from $STATIC_ARCHIVE_URL" -curl -fsSL "$STATIC_ARCHIVE_URL" -o "$STATIC_ARCHIVE" -tar -xf "$STATIC_ARCHIVE" -C "$STATIC_DIR" - -XBPS_INSTALL="$(find_static_binary xbps-install)" -STATIC_KEYS_DIR="$(find_static_keys_dir)" -if [[ -z "$XBPS_INSTALL" || ! -x "$XBPS_INSTALL" ]]; then - log "failed to locate xbps-install in the static archive" - exit 1 -fi -if [[ -z "$STATIC_KEYS_DIR" || ! -d "$STATIC_KEYS_DIR" ]]; then - log "failed to locate Void repository keys in the static archive" - exit 1 -fi - -cp -a "$STATIC_KEYS_DIR/." "$STAGE_ROOT/var/db/xbps/keys/" - -KERNEL_PACKAGE_FILE="$(resolve_kernel_package_file)" -if [[ -z "$KERNEL_PACKAGE_FILE" ]]; then - log "failed to resolve a package file for $KERNEL_PACKAGE in $REPO_URL" - exit 1 -fi - -log "staging $KERNEL_PACKAGE_FILE into a temporary root" -env XBPS_ARCH="$ARCH" "$XBPS_INSTALL" -S -y -U -r "$STAGE_ROOT" -R "$REPO_URL" linux-base "$KERNEL_PACKAGE" dracut eudev >/dev/null - -VMLINUX_RAW="$(find_latest_matching "$STAGE_ROOT/boot" 'vmlinuz-*' || true)" -KERNEL_CONFIG="$(find_latest_matching "$STAGE_ROOT/boot" 'config-*' || true)" -MODULES_DIR="$(find_latest_module_dir "$STAGE_ROOT/usr/lib/modules" || true)" -KERNEL_VERSION="$(basename "$MODULES_DIR")" -INITRAMFS_NAME="initramfs-${KERNEL_VERSION}.img" -INITRAMFS_RAW="$STAGE_OUT/boot/$INITRAMFS_NAME" - -if [[ -z "$VMLINUX_RAW" || -z "$KERNEL_CONFIG" || -z "$MODULES_DIR" ]]; then - log "staged Void kernel is missing expected boot artifacts" - exit 1 -fi -if [[ ! 
-x "$STAGE_ROOT/usr/bin/udevd" ]]; then - log "staged Void sysroot is missing /usr/bin/udevd after package install" - exit 1 -fi - -VMLINUX_BASE="$(basename "$VMLINUX_RAW")" -VMLINUX_OUT="$STAGE_OUT/boot/vmlinux-${VMLINUX_BASE#vmlinuz-}" -install -m 0644 "$VMLINUX_RAW" "$STAGE_OUT/boot/$VMLINUX_BASE" -install -m 0644 "$KERNEL_CONFIG" "$STAGE_OUT/boot/$(basename "$KERNEL_CONFIG")" -build_initramfs "$KERNEL_VERSION" "$MODULES_DIR" "$INITRAMFS_RAW" -cp -a "$MODULES_DIR" "$STAGE_OUT/lib/modules/" - -log "extracting Firecracker kernel from $(basename "$VMLINUX_RAW")" -if ! extract_vmlinux "$VMLINUX_RAW" "$VMLINUX_OUT"; then - log "failed to extract an uncompressed vmlinux from $VMLINUX_RAW" - log "raw kernel image type: $(file -b "$VMLINUX_RAW")" - exit 1 -fi - -cat >"$STAGE_OUT/metadata.json" <. +# Public URLs in the manifest are ${BASE_URL}/${BUCKET_PATH}// +# (BASE_URL is the bucket's public custom domain, so the bucket name +# itself is implicit there). +# +# Prerequisites: +# * cosign in PATH (https://github.com/sigstore/cosign) +# * rclone in PATH, configured with a remote named ${RCLONE_REMOTE} +# that targets the R2 account hosting ${RCLONE_BUCKET}, which is +# publicly served at ${BASE_URL}. +# * A cosign keypair already generated. The public key MUST already +# be embedded in internal/updater/verify_signature.go's +# BangerReleasePublicKey constant — running this script with a +# placeholder key would publish a release no installed banger can +# verify. +# +# Output (under build/release//): +# banger--linux-amd64.tar.gz +# SHA256SUMS +# SHA256SUMS.sig +# manifest.json (the freshly-mutated copy uploaded to the bucket) + +set -euo pipefail + +log() { printf '[publish-banger-release] %s\n' "$*" >&2; } +die() { log "$*"; exit 1; } + +if [[ $# -lt 1 ]]; then + die "usage: $0 (e.g. 
$0 v0.1.0)" +fi + +VERSION="$1" +case "$VERSION" in + v*.*.*) ;; + *) die "version must look like vX.Y.Z, got $VERSION" ;; +esac + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)" + +COSIGN_KEY="${COSIGN_KEY:-cosign.key}" +RCLONE_REMOTE="${RCLONE_REMOTE:-releases}" +RCLONE_BUCKET="${RCLONE_BUCKET:-releases}" +BUCKET_PATH="${BUCKET_PATH:-banger}" +BASE_URL="${BASE_URL:-https://releases.thaloco.com}" +SKIP_UPLOAD="${SKIP_UPLOAD:-0}" + +RCLONE_DEST_BASE="$RCLONE_REMOTE:$RCLONE_BUCKET/$BUCKET_PATH" + +command -v cosign >/dev/null || die "cosign not in PATH" +command -v rclone >/dev/null || die "rclone not in PATH" +command -v sha256sum >/dev/null || die "sha256sum not in PATH" +command -v jq >/dev/null || die "jq not in PATH" + +[[ -f "$COSIGN_KEY" ]] || die "cosign key not found at $COSIGN_KEY (override with COSIGN_KEY=...)" + +cd "$REPO_ROOT" + +OUT_DIR="$REPO_ROOT/build/release/$VERSION" +TARBALL_NAME="banger-$VERSION-linux-amd64.tar.gz" +TARBALL_PATH="$OUT_DIR/$TARBALL_NAME" + +log "preparing $OUT_DIR" +rm -rf "$OUT_DIR" +mkdir -p "$OUT_DIR" + +log "building binaries with version=$VERSION" +COMMIT="$(git rev-parse HEAD)" +BUILT_AT="$(date -u +%Y-%m-%dT%H:%M:%SZ)" +LDFLAGS="-X banger/internal/buildinfo.Version=$VERSION \ + -X banger/internal/buildinfo.Commit=$COMMIT \ + -X banger/internal/buildinfo.BuiltAt=$BUILT_AT" + +BUILD_STAGE="$OUT_DIR/stage" +mkdir -p "$BUILD_STAGE" +go build -ldflags "$LDFLAGS" -o "$BUILD_STAGE/banger" ./cmd/banger +go build -ldflags "$LDFLAGS" -o "$BUILD_STAGE/bangerd" ./cmd/bangerd +CGO_ENABLED=0 GOOS=linux GOARCH=amd64 \ + go build -ldflags "$LDFLAGS" -o "$BUILD_STAGE/banger-vsock-agent" \ + ./cmd/banger-vsock-agent + +log "tarring → $TARBALL_PATH" +# -C into the stage dir so the tarball's root entries are bare +# basenames (banger, bangerd, banger-vsock-agent) — the updater's +# StageTarball validator rejects anything else. 
+tar -czf "$TARBALL_PATH" -C "$BUILD_STAGE" \ + banger bangerd banger-vsock-agent + +log "computing SHA256SUMS" +( + cd "$OUT_DIR" + sha256sum "$TARBALL_NAME" > SHA256SUMS + cat SHA256SUMS +) >&2 + +log "cosign sign-blob → SHA256SUMS.sig" +# Flag rationale (cosign v3.x): +# --use-signing-config=false bypasses the new signing-config flow that +# otherwise insists on bundle output + Rekor. +# --tlog-upload=false skip the public transparency log; banger's +# trust model is "embedded public key", not +# "Rekor lookup", so the log adds nothing. +# --new-bundle-format=false emit a bare base64 ASN.1 DER signature, +# which is what internal/updater consumes +# via crypto/ecdsa.VerifyASN1. +# These flags also work on cosign v2.x, so the script is forward- and +# backward-compatible across the v2→v3 boundary. +# If COSIGN_PASSWORD is set in the environment, cosign uses it. +# Otherwise cosign prompts on the terminal — which is what we want +# for a password-protected offline key. Don't pre-set it to empty: +# that suppresses the prompt and makes cosign try to decrypt with +# the empty password, failing with "decryption failed". +cosign sign-blob --yes \ + --key "$COSIGN_KEY" \ + --use-signing-config=false \ + --tlog-upload=false \ + --new-bundle-format=false \ + --output-signature "$OUT_DIR/SHA256SUMS.sig" \ + "$OUT_DIR/SHA256SUMS" + +log "verifying signature against the embedded public key" +EMBEDDED_PUB="$OUT_DIR/embedded-pubkey.pem" +# verify_signature.go embeds the PEM inside a Go raw-string literal, so the +# BEGIN line is prefixed with `var ... = ` + backtick and the END line has a +# trailing backtick. Strip those so the result is a clean PEM. 
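That stripping step can be exercised against a synthetic snippet first; a sketch in which the file contents and key material are fabricated purely for illustration:

```shell
# Demo of pulling a clean PEM out of a Go raw-string literal, the same
# shape the sed pipeline below handles in verify_signature.go. The key
# body here is fake.
tmp="$(mktemp -d)"
cat > "$tmp/verify_signature.go" <<'EOF'
package updater

var BangerReleasePublicKey = `-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEfakefakefakefakefakefakefake
-----END PUBLIC KEY-----`
EOF
# Range-print the PEM lines, then strip the Go prefix on the BEGIN line
# and the trailing backtick on the END line.
sed -n '/-----BEGIN PUBLIC KEY-----/,/-----END PUBLIC KEY-----/p' \
  "$tmp/verify_signature.go" \
  | sed -E 's/.*(-----BEGIN PUBLIC KEY-----)/\1/; s/(-----END PUBLIC KEY-----).*/\1/' \
  > "$tmp/pub.pem"
cat "$tmp/pub.pem"
```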
+sed -n '/-----BEGIN PUBLIC KEY-----/,/-----END PUBLIC KEY-----/p' \ + "$REPO_ROOT/internal/updater/verify_signature.go" \ + | sed -E 's/.*(-----BEGIN PUBLIC KEY-----)/\1/; s/(-----END PUBLIC KEY-----).*/\1/' \ + > "$EMBEDDED_PUB" +if grep -q PLACEHOLDER "$EMBEDDED_PUB"; then + die "BangerReleasePublicKey is the placeholder in verify_signature.go; replace it with cosign.pub before publishing" +fi +cosign verify-blob \ + --key "$EMBEDDED_PUB" \ + --insecure-ignore-tlog \ + --signature "$OUT_DIR/SHA256SUMS.sig" \ + "$OUT_DIR/SHA256SUMS" + +# install.sh embeds its own copy of the public key for end-user +# verification (curl|bash trust path). Make sure the two copies didn't +# drift; a release with mismatched keys would either reject all +# `banger update` calls or all `install.sh | bash` runs. +log "checking install.sh embedded key matches verify_signature.go" +INSTALL_PUB="$OUT_DIR/install-script-pubkey.pem" +sed -n "/-----BEGIN PUBLIC KEY-----/,/-----END PUBLIC KEY-----/p" \ + "$REPO_ROOT/scripts/install.sh" \ + | sed -E "s/.*(-----BEGIN PUBLIC KEY-----)/\\1/; s/(-----END PUBLIC KEY-----).*/\\1/" \ + > "$INSTALL_PUB" +diff -q "$EMBEDDED_PUB" "$INSTALL_PUB" >/dev/null \ + || die "scripts/install.sh embedded key differs from internal/updater/verify_signature.go; sync them before publishing" + +# Build the manifest. Pull the existing manifest from the bucket so +# we don't lose previous release entries, append this one, bump +# latest_stable, write back. 
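The merge semantics — replace any existing entry for the version, never append a duplicate — can be demonstrated standalone. A sketch mirroring the jq filter used below, with the entry shape trimmed to two fields and invented URLs:

```shell
# Re-publishing the same version must replace its manifest entry, not
# duplicate it; map(select(.version != $version)) does the dedupe.
merge() {
  jq --arg version "$1" --arg url "$2" '
    .latest_stable = $version
    | .releases = ((.releases // [])
        | map(select(.version != $version))
        | . + [{version: $version, tarball_url: $url}])'
}
seed='{"schema_version":1,"latest_stable":"","releases":[]}'
once="$(printf '%s' "$seed" | merge v0.1.0 https://example.invalid/a.tar.gz)"
# Second publish of the same version: entry replaced, length stays 1.
twice="$(printf '%s' "$once" | merge v0.1.0 https://example.invalid/b.tar.gz)"
printf '%s' "$twice" | jq -r '[.latest_stable, (.releases | length)] | @tsv'
```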
+log "fetching existing manifest" +PREV_MANIFEST="$OUT_DIR/manifest.previous.json" +if curl -fsSL "$BASE_URL/$BUCKET_PATH/manifest.json" -o "$PREV_MANIFEST" 2>/dev/null; then + log " found previous manifest" +else + log " no previous manifest (first release); seeding" + printf '{"schema_version":1,"latest_stable":"","releases":[]}' > "$PREV_MANIFEST" +fi + +NEW_MANIFEST="$OUT_DIR/manifest.json" +RELEASED_AT="$(date -u +%Y-%m-%dT%H:%M:%SZ)" +jq --arg version "$VERSION" \ + --arg tarball_url "$BASE_URL/$BUCKET_PATH/$VERSION/$TARBALL_NAME" \ + --arg sums_url "$BASE_URL/$BUCKET_PATH/$VERSION/SHA256SUMS" \ + --arg sig_url "$BASE_URL/$BUCKET_PATH/$VERSION/SHA256SUMS.sig" \ + --arg released_at "$RELEASED_AT" \ + ' + .schema_version = 1 + | .latest_stable = $version + | .releases = ( + (.releases // []) + | map(select(.version != $version)) + | . + [{ + "version": $version, + "tarball_url": $tarball_url, + "sha256sums_url": $sums_url, + "sha256sums_sig_url": $sig_url, + "released_at": $released_at + }] + ) + ' "$PREV_MANIFEST" > "$NEW_MANIFEST" + +log "manifest:" +jq '.' "$NEW_MANIFEST" >&2 + +if [[ "$SKIP_UPLOAD" == "1" ]]; then + log "SKIP_UPLOAD=1, not uploading. Artifacts staged under $OUT_DIR" + exit 0 +fi + +log "uploading to $RCLONE_DEST_BASE/$VERSION/" +rclone copy "$TARBALL_PATH" "$RCLONE_DEST_BASE/$VERSION/" +rclone copy "$OUT_DIR/SHA256SUMS" "$RCLONE_DEST_BASE/$VERSION/" +rclone copy "$OUT_DIR/SHA256SUMS.sig" "$RCLONE_DEST_BASE/$VERSION/" + +log "uploading manifest" +rclone copy "$NEW_MANIFEST" "$RCLONE_DEST_BASE/" + +# install.sh lives at the bucket root (unversioned) so the +# `curl ... install.sh | bash` URL stays stable across releases. The +# script reads manifest.json to find the current latest_stable, so as +# long as install.sh's logic doesn't break, it keeps working for older +# releases too. +log "uploading install.sh" +rclone copy "$REPO_ROOT/scripts/install.sh" "$RCLONE_DEST_BASE/" + +log "done. 
verify with:" +log " curl -fsSL $BASE_URL/$BUCKET_PATH/manifest.json | jq ." +log " curl -fsSL $BASE_URL/$BUCKET_PATH/install.sh | head -20" +log " banger update --check" diff --git a/scripts/publish-golden-image.sh b/scripts/publish-golden-image.sh new file mode 100755 index 0000000..5348e71 --- /dev/null +++ b/scripts/publish-golden-image.sh @@ -0,0 +1,161 @@ +#!/usr/bin/env bash +# publish-golden-image.sh +# +# Build the banger golden-image bundle, upload it to R2, and patch +# internal/imagecat/catalog.json with the resulting URL + sha256 + +# size. Mirrors publish-kernel.sh for kernelcat. +# +# Usage: +# scripts/publish-golden-image.sh [--name ] [--kernel-ref ] \ +# [--distro ] [--arch ] [--description "..."] \ +# [--size ] [--platform

] [--skip-upload] +# +# Environment overrides: +# RCLONE_REMOTE rclone remote to upload through (default: banger-images) +# RCLONE_BUCKET R2 bucket name (default: banger-images) +# BASE_URL public URL prefix for the bucket (default: https://images.thaloco.com) + +set -euo pipefail + +log() { printf '[publish-golden-image] %s\n' "$*" >&2; } +die() { log "$*"; exit 1; } + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)" +CATALOG_FILE="$REPO_ROOT/internal/imagecat/catalog.json" + +RCLONE_REMOTE="${RCLONE_REMOTE:-banger-images}" +RCLONE_BUCKET="${RCLONE_BUCKET:-banger-images}" +BASE_URL="${BASE_URL:-https://images.thaloco.com}" + +NAME="debian-bookworm" +KERNEL_REF="generic-6.12" +DISTRO="debian" +ARCH="x86_64" +DESCRIPTION="" +SIZE="" +PLATFORM="linux/amd64" +SKIP_UPLOAD=0 + +while [[ $# -gt 0 ]]; do + case "$1" in + --name) NAME="${2:-}"; shift 2;; + --kernel-ref) KERNEL_REF="${2:-}"; shift 2;; + --distro) DISTRO="${2:-}"; shift 2;; + --arch) ARCH="${2:-}"; shift 2;; + -d|--description) DESCRIPTION="${2:-}"; shift 2;; + --size) SIZE="${2:-}"; shift 2;; + --platform) PLATFORM="${2:-}"; shift 2;; + --skip-upload) SKIP_UPLOAD=1; shift;; + -h|--help) + sed -n '2,/^set -euo/p' "$0" | sed 's/^# \?//' | sed '$d' + exit 0 + ;; + *) die "unknown option: $1";; + esac +done + +for tool in jq sha256sum stat; do + command -v "$tool" >/dev/null 2>&1 || die "missing required tool: $tool" +done +[[ -f "$CATALOG_FILE" ]] || die "catalog file not found: $CATALOG_FILE" +if [[ "$SKIP_UPLOAD" -ne 1 ]]; then + for tool in rclone curl; do + command -v "$tool" >/dev/null 2>&1 || die "missing required tool: $tool" + done +fi + +STAGE="$(mktemp -d)" +trap 'rm -rf "$STAGE"' EXIT +# Build to a temp name; the content-addressed final name is chosen +# after sha256 is computed. 
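The content-addressed naming that follows is easy to sketch in isolation (file names and bytes below are illustrative):

```shell
# Content-addressed naming as used in this script: fold the first 12
# hex chars of the bundle's sha256 into the object key, so each rebuild
# lands at a distinct URL and a CDN can never cache stale bytes for it.
tmp="$(mktemp -d)"
printf 'demo bundle bytes' > "$tmp/build.tar.zst"
sha="$(sha256sum "$tmp/build.tar.zst" | awk '{print $1}')"
name="debian-bookworm-x86_64-${sha:0:12}.tar.zst"
mv "$tmp/build.tar.zst" "$tmp/$name"
echo "$name"
```

Rebuilding with byte-identical output reproduces the same name, so a re-upload of an unchanged bundle is a no-op at the same key.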
+BUILD_OUT="$STAGE/build.tar.zst" + +log "building bundle via make-golden-bundle.sh" +SIZE_FLAG=() +[[ -n "$SIZE" ]] && SIZE_FLAG=(--size "$SIZE") +"$SCRIPT_DIR/make-golden-bundle.sh" \ + --name "$NAME" \ + --kernel-ref "$KERNEL_REF" \ + --distro "$DISTRO" \ + --arch "$ARCH" \ + --description "$DESCRIPTION" \ + --platform "$PLATFORM" \ + "${SIZE_FLAG[@]}" \ + --out "$BUILD_OUT" + +SHA256="$(sha256sum "$BUILD_OUT" | awk '{print $1}')" +SIZE_BYTES="$(stat -c '%s' "$BUILD_OUT")" +HUMAN="$(numfmt --to=iec --suffix=B "$SIZE_BYTES" 2>/dev/null || echo "${SIZE_BYTES}B")" + +# Content-addressed filename: every rebuild lives at a unique URL, so +# stale CDN caches can never serve the wrong bytes for the URL the +# catalog points at. First 12 hex chars of sha256 is plenty of +# collision margin for this workload. +SHA_PREFIX="${SHA256:0:12}" +TARBALL_NAME="${NAME}-${ARCH}-${SHA_PREFIX}.tar.zst" +OUT="$STAGE/$TARBALL_NAME" +mv "$BUILD_OUT" "$OUT" + +log "bundle ready: $TARBALL_NAME ($HUMAN, sha256 $SHA256)" + +if [[ "$SKIP_UPLOAD" -eq 1 ]]; then + KEEP="$REPO_ROOT/dist/$TARBALL_NAME" + mkdir -p "$(dirname "$KEEP")" + cp -f "$OUT" "$KEEP" + log "--skip-upload set; catalog not patched" + log "bundle kept at: $KEEP" + exit 0 +fi + +log "uploading to $RCLONE_REMOTE:$RCLONE_BUCKET/$TARBALL_NAME" +# --s3-no-check-bucket skips the HeadBucket preflight; --no-check-dest +# skips the HeadObject preflight. Both fail with 403 on R2 tokens that +# only have PutObject + GetObject but not Head* — a common scoped-token +# setup. 
+rclone copyto \ + --s3-no-check-bucket \ + --no-check-dest \ + "$OUT" "$RCLONE_REMOTE:$RCLONE_BUCKET/$TARBALL_NAME" + +URL="$BASE_URL/$TARBALL_NAME" +log "verifying $URL is reachable" +HEAD_STATUS="$(curl -fsSI -o /dev/null -w '%{http_code}' "$URL" || true)" +if [[ "$HEAD_STATUS" != "200" ]]; then + die "uploaded tarball is not publicly reachable at $URL (HTTP $HEAD_STATUS); check bucket public-access config" +fi + +log "patching $CATALOG_FILE" +NEW_ENTRY="$(jq -n \ + --arg name "$NAME" \ + --arg distro "$DISTRO" \ + --arg arch "$ARCH" \ + --arg kref "$KERNEL_REF" \ + --arg url "$URL" \ + --arg sha "$SHA256" \ + --argjson size "$SIZE_BYTES" \ + --arg desc "$DESCRIPTION" \ + '{ + name: $name, + distro: $distro, + arch: $arch, + kernel_ref: $kref, + tarball_url: $url, + tarball_sha256: $sha, + size_bytes: $size, + description: $desc + } | with_entries(select(.value != null and .value != ""))')" + +CATALOG_TMP="$(mktemp)" +jq --arg name "$NAME" --argjson new "$NEW_ENTRY" ' + .version = (.version // 1) + | .entries = (((.entries // []) | map(select(.name != $name))) + [$new]) + | .entries |= sort_by(.name) +' "$CATALOG_FILE" > "$CATALOG_TMP" +mv "$CATALOG_TMP" "$CATALOG_FILE" + +log "done" +log "next steps:" +log " git diff -- $CATALOG_FILE" +log " git add $CATALOG_FILE && git commit -m 'imagecat: publish $NAME'" +log " make build # rebuild banger so the new catalog is embedded" diff --git a/scripts/publish-kernel.sh b/scripts/publish-kernel.sh new file mode 100755 index 0000000..c627936 --- /dev/null +++ b/scripts/publish-kernel.sh @@ -0,0 +1,139 @@ +#!/usr/bin/env bash +# publish-kernel.sh +# +# Package an entry from the local banger kernel catalog as a tar.zst, +# upload it to the public R2 bucket, and patch internal/kernelcat/catalog.json +# with the resulting URL + sha256 + size. Run after `banger kernel import`. 
+# +# Usage: +# scripts/publish-kernel.sh [--description "..."] +# +# Environment overrides: +# RCLONE_REMOTE rclone remote to upload through (default: r2) +# RCLONE_BUCKET R2 bucket name (default: banger-kernels) +# BASE_URL public URL prefix for the bucket (default: https://kernels.thaloco.com) +# BANGER_KERNELS_DIR local catalog directory (default: ~/.local/state/banger/kernels) + +set -euo pipefail + +log() { printf '[publish-kernel] %s\n' "$*" >&2; } +die() { log "$*"; exit 1; } + +usage() { + cat < [--description ""] + +Reads the locally-imported kernel at \$BANGER_KERNELS_DIR//, packages +it as -.tar.zst, uploads to R2, and updates +internal/kernelcat/catalog.json. + +Run \`banger kernel import --from --distro --arch \` +first. +EOF +} + +RCLONE_REMOTE="${RCLONE_REMOTE:-r2}" +RCLONE_BUCKET="${RCLONE_BUCKET:-banger-kernels}" +BASE_URL="${BASE_URL:-https://kernels.thaloco.com}" +BANGER_KERNELS_DIR="${BANGER_KERNELS_DIR:-$HOME/.local/state/banger/kernels}" + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)" +CATALOG_FILE="$REPO_ROOT/internal/kernelcat/catalog.json" + +NAME="" +DESCRIPTION="" +while [[ $# -gt 0 ]]; do + case "$1" in + -d|--description) DESCRIPTION="${2:-}"; shift 2;; + -h|--help) usage; exit 0;; + --) shift; break;; + -*) die "unknown flag: $1";; + *) + if [[ -z "$NAME" ]]; then + NAME="$1"; shift + else + die "unexpected positional arg: $1" + fi + ;; + esac +done +[[ -n "$NAME" ]] || { usage; exit 1; } + +for tool in jq rclone tar zstd sha256sum stat curl; do + command -v "$tool" >/dev/null 2>&1 || die "missing required tool: $tool" +done +[[ -f "$CATALOG_FILE" ]] || die "catalog file not found: $CATALOG_FILE" + +SRC="$BANGER_KERNELS_DIR/$NAME" +[[ -d "$SRC" ]] || die "$SRC does not exist; run 'banger kernel import $NAME --from

' first" +[[ -f "$SRC/vmlinux" ]] || die "$SRC/vmlinux missing" +[[ -f "$SRC/manifest.json" ]] || die "$SRC/manifest.json missing" + +DISTRO="$(jq -r '.distro // ""' "$SRC/manifest.json")" +ARCH="$(jq -r '.arch // ""' "$SRC/manifest.json")" +KERNEL_VERSION="$(jq -r '.kernel_version // ""' "$SRC/manifest.json")" +[[ -n "$ARCH" ]] || ARCH="x86_64" + +STAGE="$(mktemp -d)" +trap 'rm -rf "$STAGE"' EXIT + +TARBALL_NAME="${NAME}-${ARCH}.tar.zst" +TARBALL="$STAGE/$TARBALL_NAME" + +INCLUDES=(vmlinux) +[[ -f "$SRC/initrd.img" ]] && INCLUDES+=(initrd.img) +[[ -d "$SRC/modules" ]] && INCLUDES+=(modules) + +log "packaging ${INCLUDES[*]} from $SRC" +( cd "$SRC" && tar -cf - "${INCLUDES[@]}" ) | zstd -19 --long -T0 -q -o "$TARBALL" + +SHA256="$(sha256sum "$TARBALL" | awk '{print $1}')" +SIZE="$(stat -c '%s' "$TARBALL")" +HUMAN_SIZE="$(numfmt --to=iec --suffix=B "$SIZE" 2>/dev/null || echo "${SIZE}B")" +log "tarball $TARBALL_NAME: $HUMAN_SIZE, sha256 $SHA256" + +log "uploading to $RCLONE_REMOTE:$RCLONE_BUCKET/$TARBALL_NAME" +rclone copyto "$TARBALL" "$RCLONE_REMOTE:$RCLONE_BUCKET/$TARBALL_NAME" + +URL="$BASE_URL/$TARBALL_NAME" +log "verifying $URL is reachable" +HEAD_STATUS="$(curl -fsSI -o /dev/null -w '%{http_code}' "$URL" || true)" +if [[ "$HEAD_STATUS" != "200" ]]; then + die "uploaded tarball is not publicly reachable at $URL (HTTP $HEAD_STATUS); check bucket public-access config" +fi + +log "patching $CATALOG_FILE" +NEW_ENTRY="$(jq -n \ + --arg name "$NAME" \ + --arg distro "$DISTRO" \ + --arg arch "$ARCH" \ + --arg kver "$KERNEL_VERSION" \ + --arg url "$URL" \ + --arg sha "$SHA256" \ + --argjson size "$SIZE" \ + --arg desc "$DESCRIPTION" \ + '{ + name: $name, + distro: $distro, + arch: $arch, + kernel_version: $kver, + tarball_url: $url, + tarball_sha256: $sha, + size_bytes: $size, + description: $desc + } | with_entries(select(.value != null and .value != ""))')" + +CATALOG_TMP="$(mktemp)" +jq --arg name "$NAME" --argjson new "$NEW_ENTRY" ' + .version = (.version // 1) + | 
.entries = (((.entries // []) | map(select(.name != $name))) + [$new]) + | .entries |= sort_by(.name) +' "$CATALOG_FILE" > "$CATALOG_TMP" +mv "$CATALOG_TMP" "$CATALOG_FILE" + +log "done" +log "next steps:" +log " git diff -- $CATALOG_FILE" +log " git add $CATALOG_FILE && git commit -m 'kernel catalog: add/update $NAME'" +log " make build # rebuild banger so the new catalog is embedded" diff --git a/scripts/register-alpine-image.sh b/scripts/register-alpine-image.sh deleted file mode 100755 index f72e7bc..0000000 --- a/scripts/register-alpine-image.sh +++ /dev/null @@ -1,92 +0,0 @@ -#!/usr/bin/env bash -set -euo pipefail - -log() { - printf '[register-alpine-image] %s\n' "$*" >&2 -} - -find_latest_matching() { - local dir="$1" - local pattern="$2" - if [[ ! -d "$dir" ]]; then - return 1 - fi - find "$dir" -maxdepth 1 -type f -name "$pattern" | sort | tail -n 1 -} - -find_latest_module_dir() { - local root="$1" - if [[ ! -d "$root" ]]; then - return 1 - fi - find "$root" -mindepth 1 -maxdepth 1 -type d | sort | tail -n 1 -} - -resolve_banger_bin() { - if [[ -n "${BANGER_BIN:-}" ]]; then - printf '%s\n' "$BANGER_BIN" - return - fi - if [[ -x "$REPO_ROOT/build/bin/banger" ]]; then - printf '%s\n' "$REPO_ROOT/build/bin/banger" - return - fi - if [[ -x "$REPO_ROOT/banger" ]]; then - printf '%s\n' "$REPO_ROOT/banger" - return - fi - if command -v banger >/dev/null 2>&1; then - command -v banger - return - fi - log "banger binary not found; build it first with 'make build' or set BANGER_BIN" - exit 1 -} - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)" -RUNTIME_DIR="${BANGER_MANUAL_DIR:-$REPO_ROOT/build/manual}" -IMAGE_NAME="${ALPINE_IMAGE_NAME:-alpine}" -BANGER_BIN="$(resolve_banger_bin)" -ROOTFS="$RUNTIME_DIR/rootfs-alpine.ext4" -WORK_SEED="$RUNTIME_DIR/rootfs-alpine.work-seed.ext4" - -if [[ ! -f "$ROOTFS" ]]; then - log "missing Alpine rootfs: $ROOTFS" - exit 1 -fi -if [[ ! 
-f "$WORK_SEED" ]]; then - log "missing Alpine work-seed: $WORK_SEED" - exit 1 -fi - -args=( - image register - --name "$IMAGE_NAME" - --rootfs "$ROOTFS" - --work-seed "$WORK_SEED" - --docker -) - -if [[ ! -d "$RUNTIME_DIR/alpine-kernel" ]]; then - log "missing staged Alpine kernel artifacts: $RUNTIME_DIR/alpine-kernel" - log "run 'make alpine-kernel' before registering $IMAGE_NAME" - exit 1 -fi - -kernel="$(find_latest_matching "$RUNTIME_DIR/alpine-kernel/boot" 'vmlinux-*' || true)" -if [[ -z "$kernel" ]]; then - kernel="$(find_latest_matching "$RUNTIME_DIR/alpine-kernel/boot" 'vmlinuz-*' || true)" -fi -initrd="$(find_latest_matching "$RUNTIME_DIR/alpine-kernel/boot" 'initramfs-*' || true)" -modules="$(find_latest_module_dir "$RUNTIME_DIR/alpine-kernel/lib/modules" || true)" - -if [[ -z "$kernel" || -z "$initrd" || -z "$modules" ]]; then - log "staged Alpine kernel is incomplete; expected kernel, initramfs, and modules under $RUNTIME_DIR/alpine-kernel" - exit 1 -fi - -log "using staged Alpine kernel artifacts from $RUNTIME_DIR/alpine-kernel" -args+=(--kernel "$kernel" --initrd "$initrd" --modules "$modules") - -"$BANGER_BIN" "${args[@]}" diff --git a/scripts/register-void-image.sh b/scripts/register-void-image.sh deleted file mode 100755 index 3ffb8a2..0000000 --- a/scripts/register-void-image.sh +++ /dev/null @@ -1,88 +0,0 @@ -#!/usr/bin/env bash -set -euo pipefail - -log() { - printf '[register-void-image] %s\n' "$*" >&2 -} - -find_latest_matching() { - local dir="$1" - local pattern="$2" - if [[ ! -d "$dir" ]]; then - return 1 - fi - find "$dir" -maxdepth 1 -type f -name "$pattern" | sort | tail -n 1 -} - -find_latest_module_dir() { - local root="$1" - if [[ ! 
-d "$root" ]]; then - return 1 - fi - find "$root" -mindepth 1 -maxdepth 1 -type d | sort | tail -n 1 -} - -resolve_banger_bin() { - if [[ -n "${BANGER_BIN:-}" ]]; then - printf '%s\n' "$BANGER_BIN" - return - fi - if [[ -x "$REPO_ROOT/build/bin/banger" ]]; then - printf '%s\n' "$REPO_ROOT/build/bin/banger" - return - fi - if [[ -x "$REPO_ROOT/banger" ]]; then - printf '%s\n' "$REPO_ROOT/banger" - return - fi - if command -v banger >/dev/null 2>&1; then - command -v banger - return - fi - log "banger binary not found; build it first with 'make build' or set BANGER_BIN" - exit 1 -} - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)" -RUNTIME_DIR="${BANGER_MANUAL_DIR:-$REPO_ROOT/build/manual}" -IMAGE_NAME="${VOID_IMAGE_NAME:-void-exp}" -BANGER_BIN="$(resolve_banger_bin)" -ROOTFS="$RUNTIME_DIR/rootfs-void.ext4" -WORK_SEED="$RUNTIME_DIR/rootfs-void.work-seed.ext4" - -if [[ ! -f "$ROOTFS" ]]; then - log "missing Void rootfs: $ROOTFS" - exit 1 -fi -if [[ ! -f "$WORK_SEED" ]]; then - log "missing Void work-seed: $WORK_SEED" - exit 1 -fi - -args=( - image register - --name "$IMAGE_NAME" - --rootfs "$ROOTFS" - --work-seed "$WORK_SEED" -) - -if [[ ! 
-d "$RUNTIME_DIR/void-kernel" ]]; then - log "missing staged Void kernel artifacts: $RUNTIME_DIR/void-kernel" - log "run 'make void-kernel' before registering $IMAGE_NAME" - exit 1 -fi - -kernel="$(find_latest_matching "$RUNTIME_DIR/void-kernel/boot" 'vmlinux-*' || true)" -initrd="$(find_latest_matching "$RUNTIME_DIR/void-kernel/boot" 'initramfs-*' || true)" -modules="$(find_latest_module_dir "$RUNTIME_DIR/void-kernel/lib/modules" || true)" - -if [[ -z "$kernel" || -z "$initrd" || -z "$modules" ]]; then - log "staged Void kernel is incomplete; expected vmlinux, initramfs, and modules under $RUNTIME_DIR/void-kernel" - exit 1 -fi - -log "using staged Void kernel artifacts from $RUNTIME_DIR/void-kernel" -args+=(--kernel "$kernel" --initrd "$initrd" --modules "$modules") - -"$BANGER_BIN" "${args[@]}" diff --git a/scripts/repro-restart-bug.sh b/scripts/repro-restart-bug.sh new file mode 100755 index 0000000..acf1a9e --- /dev/null +++ b/scripts/repro-restart-bug.sh @@ -0,0 +1,151 @@ +#!/usr/bin/env bash +# +# scripts/repro-restart-bug.sh — minimal reproducer for the +# stop-then-start bug. +# +# Symptom: after `vm create X` → `vm stop X` → `vm start X`, the store +# reports `state=running` but `vm ssh X` returns `not_running` +# because the daemon's `vmAlive(vm)` check returns false. Seen +# reliably on Debian-bookworm default image. +# +# This script: +# 1. Builds instrumented binaries (reuses $(SMOKE_BIN_DIR)) +# 2. Points banger at an isolated XDG so it doesn't touch the +# invoking user's real install +# 3. Runs the create→stop→start sequence +# 4. Asserts `vm ssh -- true` works post-restart +# 5. On failure, dumps the daemon log (vm.start trace), firecracker +# log (guest kernel output), pgrep state (is firecracker +# actually running?), api-sock presence, and handles.json +# +# Exit 0 = bug is fixed. Exit 1 = bug still reproduces. 
+# +# Run directly (builds binaries on demand): +# ./scripts/repro-restart-bug.sh + +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" +REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)" +cd "$REPO_ROOT" + +log() { printf '[repro] %s\n' "$*" >&2; } +die() { printf '[repro] FAIL: %s\n' "$*" >&2; exit 1; } + +# Reuse smoke binaries if present; otherwise build them. They're +# instrumented with -cover, but that's harmless for this test. +make smoke-build >/dev/null + +BIN_DIR="$REPO_ROOT/build/smoke/bin" +for bin in banger bangerd banger-vsock-agent; do + [[ -x "$BIN_DIR/$bin" ]] || die "missing $BIN_DIR/$bin; run make smoke-build" +done + +BANGER="$BIN_DIR/banger" +VMNAME=repro-restart + +# Isolated XDG root, torn down at exit. Unlike smoke.sh we do NOT +# persist across runs — we want a clean slate so the very first +# image pull also exercises the second-start path. +WORKDIR="$(mktemp -d -t banger-repro-XXXXXX)" +trap 'rm -rf "$WORKDIR"' EXIT + +export XDG_CONFIG_HOME="$WORKDIR/config" +export XDG_STATE_HOME="$WORKDIR/state" +export XDG_CACHE_HOME="$WORKDIR/cache" +export XDG_RUNTIME_DIR="$WORKDIR/runtime" +mkdir -p "$XDG_CONFIG_HOME" "$XDG_STATE_HOME" "$XDG_CACHE_HOME" "$XDG_RUNTIME_DIR" +chmod 0700 "$XDG_RUNTIME_DIR" + +export BANGER_DAEMON_BIN="$BIN_DIR/bangerd" +export BANGER_VSOCK_AGENT_BIN="$BIN_DIR/banger-vsock-agent" +export GOCOVERDIR="$WORKDIR/covdata" +mkdir -p "$GOCOVERDIR" + +# Refuse to run if the user's real daemon has :42069 bound — we'd +# fail for the wrong reason. 
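When `ss` isn't installed, the guard below silently passes. A dependency-free probe using bash's /dev/tcp redirection could back it up — a sketch, with the port number taken from the check below:

```shell
# Fallback port probe with no external tools: opening /dev/tcp only
# succeeds if something is accepting connections on the port; a closed
# port yields "connection refused" and a nonzero status. The fd opens
# in a subshell, so it is closed again immediately.
port_busy() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}
if port_busy 42069; then
  echo 'port 42069 busy'
else
  echo 'port 42069 free'
fi
```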
+if command -v ss >/dev/null 2>&1 && ss -Huln 2>/dev/null | awk '{print $4}' | grep -q '[:.]42069$'; then + die 'port 127.0.0.1:42069 already bound; stop your real banger daemon first' +fi + +"$BANGER" daemon stop >/dev/null 2>&1 || true + +LOG_PATH="$XDG_STATE_HOME/banger/bangerd.log" + +diag() { + printf '\n[repro] === DIAGNOSTICS ===\n' >&2 + printf '[repro] vm show:\n' >&2 + "$BANGER" vm show "$VMNAME" >&2 2>/dev/null || true + local vmdir + vmdir="$("$BANGER" vm show "$VMNAME" 2>/dev/null | awk -F'"' '/"vm_dir"/ {print $4}')" + local apisock + apisock="$("$BANGER" vm show "$VMNAME" 2>/dev/null | awk -F'"' '/"api_sock_path"/ {print $4}')" + + printf '\n[repro] handles.json:\n' >&2 + [[ -n "$vmdir" && -f "$vmdir/handles.json" ]] && cat "$vmdir/handles.json" >&2 || echo ' (missing)' >&2 + + printf '\n[repro] pgrep by apiSock (%s):\n' "$apisock" >&2 + [[ -n "$apisock" ]] && (pgrep -af "$apisock" >&2 || echo ' (none)' >&2) || echo ' (no apisock)' >&2 + + printf '\n[repro] apiSock present:\n' >&2 + if [[ -n "$apisock" ]]; then + if [[ -S "$apisock" ]]; then + sudo -n ls -la "$apisock" >&2 2>/dev/null || ls -la "$apisock" >&2 2>/dev/null || echo ' (cannot stat)' >&2 + else + echo ' NOT PRESENT' >&2 + fi + fi + + printf '\n[repro] daemon log — last vm.start trace:\n' >&2 + if [[ -f "$LOG_PATH" ]]; then + # Dump everything from the most recent "operation started" for vm.start. 
+ awk ' + /"operation":"vm\.start"/ && /"msg":"operation started"/ { lastStart=NR } + { lines[NR]=$0 } + END { + if (lastStart) for (i=lastStart; i<=NR; i++) print lines[i] + } + ' "$LOG_PATH" | tail -40 >&2 + else + echo ' (daemon log missing)' >&2 + fi + + printf '\n[repro] firecracker.log tail (guest kernel output):\n' >&2 + [[ -n "$vmdir" && -f "$vmdir/firecracker.log" ]] && tail -30 "$vmdir/firecracker.log" >&2 || echo ' (missing)' >&2 + + printf '\n' >&2 +} + +log "create $VMNAME" +"$BANGER" vm create --name "$VMNAME" >/dev/null || die "create failed" +log 'wait for initial ssh' +deadline=$(( $(date +%s) + 90 )) +while (( $(date +%s) < deadline )); do + "$BANGER" vm ssh "$VMNAME" -- true >/dev/null 2>&1 && break + sleep 1 +done +"$BANGER" vm ssh "$VMNAME" -- true >/dev/null 2>&1 || { diag; die 'initial ssh never came up'; } + +log 'stop' +"$BANGER" vm stop "$VMNAME" >/dev/null || { diag; die 'stop failed'; } + +log 'start (this is where the bug manifests)' +"$BANGER" vm start "$VMNAME" >/dev/null || { diag; die 'start failed'; } + +log 'assert vm ssh succeeds post-restart (60s budget)' +deadline=$(( $(date +%s) + 60 )) +while (( $(date +%s) < deadline )); do + if "$BANGER" vm ssh "$VMNAME" -- true >/dev/null 2>&1; then + log 'PASS — bug appears fixed' + "$BANGER" vm delete "$VMNAME" >/dev/null 2>&1 || true + "$BANGER" daemon stop >/dev/null 2>&1 || true + exit 0 + fi + sleep 1 +done + +log 'FAIL — vm ssh never succeeded post-restart' +diag +"$BANGER" vm delete "$VMNAME" >/dev/null 2>&1 || true +"$BANGER" daemon stop >/dev/null 2>&1 || true +exit 1 diff --git a/scripts/verify.sh b/scripts/verify.sh deleted file mode 100755 index 8546ef6..0000000 --- a/scripts/verify.sh +++ /dev/null @@ -1,334 +0,0 @@ -#!/usr/bin/env bash -set -euo pipefail - -log() { - printf '[verify] %s\n' "$*" -} - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -REPO_ROOT="$(cd "$SCRIPT_DIR/.." 
&& pwd)" -DAEMON_LOG="${XDG_STATE_HOME:-$HOME/.local/state}/banger/bangerd.log" -OPENCODE_PORT=4096 - -resolve_banger_bin() { - if [[ -n "${BANGER_BIN:-}" ]]; then - printf '%s\n' "$BANGER_BIN" - return - fi - if [[ -x "$REPO_ROOT/build/bin/banger" ]]; then - printf '%s\n' "$REPO_ROOT/build/bin/banger" - return - fi - if [[ -x "$REPO_ROOT/banger" ]]; then - printf '%s\n' "$REPO_ROOT/banger" - return - fi - if command -v banger >/dev/null 2>&1; then - command -v banger - return - fi - log "banger binary not found; run 'make build' or set BANGER_BIN" - exit 1 -} - -BANGER_BIN="$(resolve_banger_bin)" -SSH_KEY="$("$BANGER_BIN" internal ssh-key-path)" -if [[ ! -f "$SSH_KEY" ]]; then - log "ssh key not found: $SSH_KEY" - exit 1 -fi -SSH_COMMON_ARGS=( - -F /dev/null - -i "$SSH_KEY" - -o IdentitiesOnly=yes - -o BatchMode=yes - -o PreferredAuthentications=publickey - -o PasswordAuthentication=no - -o KbdInteractiveAuthentication=no - -o StrictHostKeyChecking=no - -o UserKnownHostsFile=/dev/null -) - -firecracker_running() { - local pid="$1" - local api_sock="$2" - local cmdline="" - - if [[ -z "$pid" || "$pid" -le 0 || -z "$api_sock" ]]; then - return 1 - fi - if [[ ! -r "/proc/$pid/cmdline" ]]; then - return 1 - fi - cmdline="$(cat "/proc/$pid/cmdline" 2>/dev/null | tr '\0' ' ' || true)" - [[ "$cmdline" == *firecracker* && "$cmdline" == *"$api_sock"* ]] -} - -pooled_tap() { - local tap="$1" - [[ "$tap" == tap-pool-* ]] -} - -wait_for_ssh() { - local guest_ip="$1" - local deadline="$2" - - while ((SECONDS < deadline)); do - if ssh "${SSH_COMMON_ARGS[@]}" -o ConnectTimeout=2 "root@${guest_ip}" "true" >/dev/null 2>&1; then - return 0 - fi - sleep 1 - done - - return 1 -} - -wait_for_tcp() { - local host="$1" - local port="$2" - local deadline="$3" - - while ((SECONDS < deadline)); do - if (exec 3<>/dev/tcp/"$host"/"$port") >/dev/null 2>&1; then - return 0 - fi - sleep 1 - done - - return 1 -} - -refresh_vm_metadata() { - if ! 
VM_JSON="$("$BANGER_BIN" vm show "$VM_NAME" 2>/dev/null)"; then - return 1 - fi - TAP="$(printf '%s\n' "$VM_JSON" | jq -r '.runtime.tap_device // empty')" - VM_DIR="$(printf '%s\n' "$VM_JSON" | jq -r '.runtime.vm_dir // empty')" - GUEST_IP="$(printf '%s\n' "$VM_JSON" | jq -r '.runtime.guest_ip // empty')" - API_SOCK="$(printf '%s\n' "$VM_JSON" | jq -r '.runtime.api_sock_path // empty')" - PID="$(printf '%s\n' "$VM_JSON" | jq -r '.runtime.pid // 0')" - VM_STATE="$(printf '%s\n' "$VM_JSON" | jq -r '.state // empty')" - LAST_ERROR="$(printf '%s\n' "$VM_JSON" | jq -r '.runtime.last_error // empty')" - return 0 -} - -wait_for_vm_ready() { - local deadline="$1" - - while ((SECONDS < deadline)); do - if ! refresh_vm_metadata; then - sleep 1 - continue - fi - if [[ "$VM_STATE" == "error" || -n "$LAST_ERROR" ]]; then - return 2 - fi - if [[ -n "$API_SOCK" && "${PID:-0}" -gt 0 ]] && ! firecracker_running "$PID" "$API_SOCK"; then - return 3 - fi - if [[ "$VM_STATE" == "running" && -n "$GUEST_IP" && -n "$TAP" && -n "$VM_DIR" && -n "$API_SOCK" && "${PID:-0}" -gt 0 ]]; then - if [[ -S "$API_SOCK" ]] && ip link show "$TAP" >/dev/null 2>&1; then - return 0 - fi - fi - sleep 1 - done - - return 1 -} - -dump_diagnostics() { - log "diagnostics for $VM_NAME" - "$BANGER_BIN" vm show "$VM_NAME" || true - if [[ "${PID:-0}" -gt 0 ]]; then - log "process state for pid $PID" - ps -fp "$PID" || true - fi - log "recent firecracker log" - "$BANGER_BIN" vm logs "$VM_NAME" 2>/dev/null | tail -n 200 || true - if [[ -f "$DAEMON_LOG" ]]; then - log "recent daemon log" - tail -n 200 "$DAEMON_LOG" || true - fi - if [[ -n "${TAP:-}" ]]; then - log "tap state for $TAP" - ip link show "$TAP" || true - fi - if [[ -n "${API_SOCK:-}" ]]; then - log "api socket $API_SOCK" - ls -l "$API_SOCK" 2>/dev/null || true - fi - if (( NAT_ENABLED )) && [[ -n "${UPLINK:-}" && -n "${GUEST_IP:-}" && -n "${TAP:-}" ]]; then - log "nat rules for ${GUEST_IP} via ${UPLINK}" - sudo iptables -t nat -S POSTROUTING | grep 
"${GUEST_IP}/32" || true - sudo iptables -S FORWARD | grep "$TAP" || true - fi -} - -usage() { - cat <<'EOF' -Usage: ./scripts/verify.sh [--nat] [--image ] - -Run a basic smoke test for the Go VM workflow. -Use --nat to additionally verify outbound NAT and host rule cleanup. -Use --image to verify a non-default image such as void-exp. -EOF -} - -NAT_ENABLED=0 -IMAGE_NAME="" -BOOT_TIMEOUT_SECS="${VERIFY_BOOT_TIMEOUT_SECS:-90}" -while [[ $# -gt 0 ]]; do - case "$1" in - --nat) - NAT_ENABLED=1 - shift - ;; - --image) - IMAGE_NAME="${2:-}" - if [[ -z "$IMAGE_NAME" ]]; then - usage - exit 1 - fi - shift 2 - ;; - *) - usage - exit 1 - ;; - esac -done - -VM_NAME="verify-$(date +%s)" -VM_JSON="" -TAP="" -VM_DIR="" -GUEST_IP="" -UPLINK="" -API_SOCK="" -PID="0" -VM_STATE="" -LAST_ERROR="" - -delete_vm() { - if [[ -n "${VM_NAME:-}" ]]; then - "$BANGER_BIN" vm delete "$VM_NAME" - fi -} - -cleanup() { - if [[ -n "${VM_NAME:-}" ]]; then - "$BANGER_BIN" vm delete "$VM_NAME" >/dev/null 2>&1 || true - fi -} - -trap cleanup EXIT - -log "starting VM" -CREATE_ARGS=("$BANGER_BIN" vm create --name "$VM_NAME") -if [[ -n "$IMAGE_NAME" ]]; then - CREATE_ARGS+=(--image "$IMAGE_NAME") -fi -if (( NAT_ENABLED )); then - CREATE_ARGS+=(--nat) -fi -"${CREATE_ARGS[@]}" >/dev/null - -BOOT_DEADLINE=$((SECONDS + BOOT_TIMEOUT_SECS)) - -log "waiting for VM runtime readiness" -if wait_for_vm_ready "$BOOT_DEADLINE"; then - : -else - status=$? 
- case "$status" in - 2) log "vm entered an error state before becoming ready" ;; - 3) log "firecracker exited before the guest became ready" ;; - *) log "vm did not become ready before timeout" ;; - esac - dump_diagnostics - exit 1 -fi - -if (( NAT_ENABLED )); then - UPLINK="$(ip route show default 2>/dev/null | awk '/default/ {print $5; exit}')" - if [[ -z "$UPLINK" ]]; then - log "failed to detect uplink interface" - exit 1 - fi - log "asserting NAT rules are installed" - sudo iptables -t nat -C POSTROUTING -s "${GUEST_IP}/32" -o "$UPLINK" -j MASQUERADE - sudo iptables -C FORWARD -i "$TAP" -o "$UPLINK" -j ACCEPT - sudo iptables -C FORWARD -i "$UPLINK" -o "$TAP" -m state --state RELATED,ESTABLISHED -j ACCEPT -fi - -log "asserting VM is reachable via SSH" -if ! wait_for_ssh "$GUEST_IP" "$BOOT_DEADLINE"; then - log "ssh did not become ready for ${GUEST_IP}" - dump_diagnostics - exit 1 -fi -ssh "${SSH_COMMON_ARGS[@]}" "root@${GUEST_IP}" "uname -a" >/dev/null - -log "asserting opencode is available and listening in the guest" -ssh "${SSH_COMMON_ARGS[@]}" "root@${GUEST_IP}" "command -v opencode >/dev/null 2>&1 && ss -H -lntp | awk '\$4 ~ /:${OPENCODE_PORT}\$/ { found = 1 } END { exit found ? 0 : 1 }'" >/dev/null - -log "asserting opencode server is reachable from the host" -if ! wait_for_tcp "$GUEST_IP" "$OPENCODE_PORT" "$BOOT_DEADLINE"; then - log "opencode server did not become reachable at ${GUEST_IP}:${OPENCODE_PORT}" - dump_diagnostics - exit 1 -fi - -log "asserting opencode port is reported by banger vm ports" -if ! "$BANGER_BIN" vm ports "$VM_NAME" | grep -F ":${OPENCODE_PORT}" >/dev/null 2>&1; then - log "banger vm ports did not report ${OPENCODE_PORT}" - dump_diagnostics - exit 1 -fi - -if (( NAT_ENABLED )); then - log "asserting VM has outbound network access" - ssh "${SSH_COMMON_ARGS[@]}" "root@${GUEST_IP}" "curl -fsS https://example.com >/dev/null" >/dev/null -fi - -log "cleaning up VM" -if ! 
delete_vm; then - log "vm delete failed for $VM_NAME" - dump_diagnostics - exit 1 -fi - -log "asserting cleanup success" -if "$BANGER_BIN" vm show "$VM_NAME" >/dev/null 2>&1; then - log "vm still exists after delete: $VM_NAME" - exit 1 -fi -if ip link show "$TAP" >/dev/null 2>&1; then - if pooled_tap "$TAP"; then - log "tap returned to idle pool: $TAP" - else - log "tap still exists: $TAP" - exit 1 - fi -fi -if [[ -d "$VM_DIR" ]]; then - log "vm dir still exists: $VM_DIR" - exit 1 -fi -if (( NAT_ENABLED )); then - if sudo iptables -t nat -C POSTROUTING -s "${GUEST_IP}/32" -o "$UPLINK" -j MASQUERADE 2>/dev/null; then - log "nat rule still exists for ${GUEST_IP}" - exit 1 - fi - if sudo iptables -C FORWARD -i "$TAP" -o "$UPLINK" -j ACCEPT 2>/dev/null; then - log "forward-out rule still exists for ${TAP}" - exit 1 - fi - if sudo iptables -C FORWARD -i "$UPLINK" -o "$TAP" -m state --state RELATED,ESTABLISHED -j ACCEPT 2>/dev/null; then - log "forward-in rule still exists for ${TAP}" - exit 1 - fi -fi - -log "ok"