* model.ParseSize / FormatSizeBytes: pinned with table tests in
internal/model/types_test.go (TestParseSize 22 cases,
TestFormatSizeBytes 11 cases, TestParseSizeFormatRoundTrip 7
boundaries). Fixed the long-suffix regression: "4GiB", "512MiB",
"4KiB" now parse correctly (parser strips trailing IB before
inspecting the unit byte). Pinned current behaviour for
no-suffix input ("1024" treated as MiB) and FormatSizeBytes(0).
commands_image.go --size flag-help updated to show 4GiB now
that the parser accepts it.
* vm ports --json: matches the JSON-vs-table inconsistency between
vm stats (always JSON) and vm ports (always table). --json on
vm ports flips to the same printJSON path as vm stats. Default
table output unchanged. Other vm subcommands (show, stats,
logs, health, ping) didn't fit the identical pattern; left
alone.
* docs/oci-import.md architecture section moved to a new
docs/oci-import-internals.md (precedent: internal/daemon/
ARCHITECTURE.md). User-facing oci-import.md keeps a one-line
pointer for advanced reading.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A pre-release audit collected ~12 trivial-effort UX and code-hygiene
items. Rolling them up here so the v0.1.0 commit log isn't littered
with one-line tweaks.
CLI help / completion:
* commands_image.go: drop dangling reference to a `banger image
catalog` subcommand that doesn't exist; replace with a pointer
to `banger image list`.
* commands_image.go: --size flag example was "4GiB" but the parser
rejects that suffix. Change example to "4G". (Parser-side fix
is in a separate concern.)
* commands_image.go + completion.go: image pull now wires a
catalog completer (falls back to local image names since there's
no image-catalog RPC yet); image show / delete / promote already
completed local names.
* commands_kernel.go + completion.go: kernel pull now wires a new
completeKernelCatalogNameOnlyAtPos0 backed by the kernel.catalog
RPC, so tab-complete suggests pullable kernels.
* commands_vm.go: vm stats and vm set now have Long + Example
blocks (peers all do); --from flag description updated to spell
out the relationship to --branch.
README:
* Define "golden image" inline at first use.
* Add a one-line Requirements block above Quick Start so users
hit the firecracker / KVM dependency before `make build`.
Code hygiene:
* dashIfEmpty / emptyDash were the same function. Deleted
emptyDash, retargeted three call sites.
* formatBytes (introduced today in image cache prune) duplicated
humanSize. Consolidated to humanSize, now with a space ("1.2
GiB" not "1.2GiB"). formatters_test.go expectations updated.
Logging chattiness:
* "operation started" (logger.go), "daemon request canceled"
(daemon.go), and "helper rpc completed" (roothelper.go) all
fired at INFO per RPC. Downgraded to DEBUG so routine shell
completions don't spam syslog.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
OCI layer blobs accumulate forever — every pull writes layers to
~/.cache/banger/oci/blobs/sha256/<hex> via go-containerregistry's
filesystem cache, and nothing ever evicts them. The cache is purely
a re-pull-avoidance (every flattened image is independent of the
blobs that sourced it), so it's a perfect candidate for an opt-in
operator-driven prune.
New surface:
* api: ImageCachePruneParams{DryRun}, ImageCachePruneResult
{BytesFreed, BlobsFreed, DryRun, CacheDir}.
* daemon: ImageService.PruneOCICache walks layout.OCICacheDir for
a (bytes, blobs) tally, then — outside dry-run — atomically
renames the cache aside, recreates it empty, and rm -rf's the
aside dir. The rename-then-rm avoids leaving the cache in a
half-removed state if a pull starts mid-prune (the in-flight
pull's open files survive the rename via standard Linux
semantics; it just sees a fresh empty cache afterwards). Missing
cache dir is treated as zero — fresh installs that have never
pulled an OCI image don't error.
* dispatch: image.cache.prune RPC (paramHandler-wrapped, mirroring
every other image RPC). Documented-methods test list updated.
* cli: `banger image cache` group with a `prune` subcommand
(--dry-run flag). Output is a single line: "freed 1.2 GiB
across 47 blob(s) in /var/cache/banger/oci" or "would free …".
formatBytes helper for the size pretty-print.
docs/oci-import.md: replaced the "Tech debt: cache eviction" bullet
with a "Cache lifecycle" section describing the new command and
the in-flight-pull caveat.
Tests: PruneOCICache covers the happy path (real prune empties the
cache, recreates an empty dir, doesn't leak the .pruning- aside),
the dry-run path (returns size, leaves blobs intact), and the
fresh-install path (cache dir absent → zero result, no error).
Smoke at JOBS=4 still green; live exercise against an empty cache
on a system install prints the expected zero summary.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each VM's firecracker now runs inside a per-VM chroot dropped to the
registered owner UID via firecracker-jailer. Closes the broad ambient-
sudo escalation surface that survived Phase A: the helper still needs
caps for tap/bridge/dm/loop/iptables, but the VMM itself no longer
runs as root in the host root filesystem.
The host helper stages each chroot up front: hard-links the kernel
and (optional) initrd, mknods block-device drives + /dev/vhost-vsock,
copies in the firecracker binary (jailer opens it O_RDWR so a ro bind
fails with EROFS), and bind-mounts /usr/lib + /lib trees read-only so
the dynamic linker can resolve. Self-binds the chroot first so the
findmnt-guarded cleanup can recurse safely.
AF_UNIX sun_path is 108 bytes; the chroot path easily blows past that.
Daemon-side launch pre-symlinks the short request socket path to the
long chroot socket before Machine.Start so the SDK's poll/connect
sees the short path while the kernel resolves to the chroot socket.
--new-pid-ns is intentionally disabled — jailer's PID-namespace fork
makes the SDK see the parent exit and tear the API socket down too
early.
CapabilityBoundingSet for the helper expands to add CAP_FOWNER,
CAP_KILL, CAP_MKNOD, CAP_SETGID, CAP_SETUID, CAP_SYS_CHROOT alongside
the existing CAP_CHOWN/CAP_DAC_OVERRIDE/CAP_NET_ADMIN/CAP_NET_RAW/
CAP_SYS_ADMIN.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces three interconnected features for persistent VM workflows:
1. `banger vm exec <vm> -- <cmd>`: runs a command in the prepared
workspace, automatically cd-ing into the guest path and wrapping
via `mise exec --` so mise-managed tools are on PATH. Falls back
to a plain exec when mise isn't available. Exit code propagates
verbatim.
2. Workspace persistence: workspace.prepare now stores the guest path,
host source path, and HEAD commit into a new `workspace_json` column
on the vms table (migration 3). This state survives daemon restarts
and informs both dirty-checking and auto-prepare.
3. Dirty detection: `vm exec` compares the stored HEAD commit against
the current host repo HEAD. When stale it warns and, with
--auto-prepare, re-syncs the workspace before running.
Also:
- WORKSPACE column added to `banger ps` / `vm list`
- `banger vm` quick reference updated with `vm exec` entry
Adds three small but high-leverage presentation tweaks for v0.1:
1. internal/cli/style is a new ~70 LOC package with Pass/Fail/Warn/
Dim/Bold helpers. Each is TTY-gated and obeys NO_COLOR. No
external dep. Wired into the doctor PASS/FAIL/WARN status, the
"banger:" error prefix on stderr, and the dim 'ready in <elapsed>'
line.
2. internal/cli/errors translates rpc.ErrorResponse into user-facing
text. operation_failed becomes invisible (the message wins);
not_found, already_exists, bad_request, bad_version, unauthorized,
unknown_method get short labels; unknown codes pass through. The
daemon-attached op_id lands in dim parens — paste into
journalctl --grep to find the daemon log line that produced the
failure.
3. Tabwriter config converges on (0, 8, 2, ' ', 0) across every
list/table command. The vm prune confirmation table picked up the
right config; system install + system status switched from bare
"key: value\n" lines to tabular form. printVMSpecLine drops its
Unicode middle dot for an ASCII '|' so terminals without UTF-8
render cleanly.
Tests cover translateRPCError for every code, style helpers no-op
on non-TTY and under NO_COLOR. Smoke status greps switch from
"key: value" to "key value" to match the new format.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
systemd's Type=simple reports a unit "active" the moment its
ExecStart binary is exec()'d, which for bangerd happens well before
the daemon has read its config and bound /run/banger/bangerd.sock.
'banger system install' and 'banger system restart' both returned
inside that window, so the very next 'banger ...' command would hit
ensureDaemon, miss on a single ping, and exit with "service not
reachable; run sudo banger system restart" — the same restart that
had just succeeded. Smoke tripped over this on every run.
Add waitForDaemonReady: poll daemonPing for up to 15s after the
restart returns. Both the system install and restart paths now
block until the daemon is genuinely accepting RPCs, so the next
CLI invocation can talk to it without retrying.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Print '[vm create] ready in <elapsed>' to stderr once the create
operation completes successfully. Surfaces how long the full
create-to-ready cycle took (image resolve + work disk + boot +
guest agents + capability post-start), which the per-stage
progress lines don't add up to in any visible way.
Format adapts to scale: sub-second prints as 'NNNms', sub-minute
keeps one decimal ('4.7s'), longer prints as 'MmSSs'. Always
emitted (not gated on TTY) so logged and CI output carry the
number too.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 'docker' bit on model.Image was unused at runtime — every code
path that branched on it had been removed earlier, leaving only the
field, the SQL column, the --docker flag, and the
#feature:docker sentinel that BuildMetadataPackages emitted into a
hash file. None of those have callers anymore.
Strip the field from the model, the API params, the SQLite column,
the CLI flag, and BuildMetadataPackages's signature. Add migration
2 (drop_images_docker) so existing installs lose the column on next
daemon start. ALTER TABLE ... DROP COLUMN is fine: SQLite has
supported it since 3.35 (2021).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Frontier models tend to discover a CLI by running --help, scanning
the Long description, and inferring the dominant workflow from the
examples. Today's banger help reads like a man page index — every
verb has a one-line Short and nothing else. This rewrites the
groups (banger, vm, vm workspace, image, kernel, system,
ssh-config) so each landing page answers "what is this for, what's
the 80% command, what comes next" in three to ten lines, with
runnable examples.
Also disambiguates the near-twin lifecycle commands so a model
reading the subcommand index can tell stop/kill/delete apart at a
glance:
start Start a stopped VM
stop Stop a running VM gracefully
restart Stop then start a VM
kill Force-kill a VM (use when 'vm stop' hangs)
delete Stop a VM and remove its disks (irreversible)
vm create / vm ssh / vm logs / vm show pick up Long descriptions
and examples for the same reason. No behaviour changes; help text
only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the supported systemd path to two services: an owner-user bangerd for
orchestration and a narrow root helper for bridge/tap, NAT/resolver, dm/loop,
and Firecracker ownership. This removes repeated sudo from daily vm and image
flows without leaving the general daemon running as root.
Add install metadata, system install/status/restart/uninstall commands, and a
system-owned runtime layout. Keep user SSH/config material in the owner home,
lock file_sync to the owner home, and move daemon known_hosts handling out of
the old root-owned control path.
Route privileged lifecycle steps through typed privilegedOps calls, harden the
two systemd units, and rewrite smoke plus docs around the supported service
model.
Verified with make build, make test, make lint, and make smoke on the
supported systemd host path.
A VM name flows into five places that all have narrower grammars
than "arbitrary string":
- the guest's /etc/hostname (vm_disk.patchRootOverlay)
- the guest's /etc/hosts (same)
- the <name>.vm DNS record (vmdns.RecordName)
- the kernel command line (system.BuildBootArgs*)
- VM-dir file-path fragments (layout.VMsDir/<id>, etc.)
Nothing in the chain was validating the input. A name with
whitespace, newline, dot, slash, colon, or = would produce broken
hostnames, weird DNS labels, smuggled kernel cmdline tokens, or
(in the worst case) surprising traversal through the on-disk
layout. Not host shell injection — we already avoid shelling out
with the raw name — but a real correctness and supportability bug.
New: model.ValidateVMName. Rules:
- 1..63 chars (DNS label max per RFC 1123; also a comfortable
/etc/hostname cap)
- lowercase ASCII letters, digits, '-' only
- no leading or trailing '-'
- no normalization — the name is the user-visible identifier
(store key, `ssh <name>.vm`, `vm show`); silently rewriting
"MyVM" → "myvm" would hand the user back something different
than they typed
Called from two places:
- internal/cli/commands_vm.go vmCreateParamsFromFlags — rejects
bad `--name` values before any RPC. Empty name still passes
through so the daemon can generate one.
- internal/daemon/vm_create.go reserveVM — defense in depth for
any non-CLI RPC caller (SDK, direct JSON over the socket).
Tests:
- internal/model/vm_name_test.go — exhaustive character-class
matrix (space, newline, tab, dot, slash, colon, equals, quote,
control chars, unicode letters, uppercase, leading/trailing
hyphen, over-length, max-length-exact, digits-only).
- internal/cli TestVMCreateParamsFromFlagsRejectsInvalidName —
CLI wire-through + empty-name passthrough.
- internal/daemon TestReserveVMRejectsInvalidName — daemon
defense-in-depth (including `box/../evil` path-traversal).
- scripts/smoke.sh — end-to-end rejection + no-leaked-row
assertion.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
banger hasn't shipped a public release — every "legacy", "pre-opt-in",
"previously", "migration note", "no longer" reference in the tree is
pinning against a state no real user's install has ever been in.
That scaffolding has weight: it's a coordinate system future readers
have to decode, and it keeps dead code alive.
Removed (code):
- internal/daemon/ssh_client_config.go
- vmSSHConfigIncludeBegin / vmSSHConfigIncludeEnd constants and
every `removeManagedBlock(existing, vm...)` call they enabled
(legacy inline `Host *.vm` block scrub)
- cleanupLegacySSHConfigDir (+ its caller in syncVMSSHClientConfig)
— wiped a pre-opt-in sibling file under $ConfigDir/ssh
- sameDirOrParent + resolvePathForComparison — only ever used
by cleanupLegacySSHConfigDir
- the "also check legacy marker" fallback in
UserSSHIncludeInstalled / UninstallUserSSHInclude
- internal/store/migrations.go
- migrateDropDeadImageColumns (migration 2) + its slice entry
- dropColumnIfExists (orphaned after the above)
- addColumnIfMissing + the whole "columns added across the pre-
versioning lifetime" block at the end of migrateBaseline —
subsumed into the baseline CREATE TABLE
- `packages_path TEXT` column on the images table (the
throwaway migration 2 dropped it, but there was never any
reader)
- internal/daemon/vm.go
- vmDNSRecordName local wrapper — was justified as "avoid
pulling vmdns into every file"; three of four callers already
imported vmdns directly, so inline the one stray call
- internal/cli/cli_test.go
- TestLegacyRemovedCommandIsRejected (`tui` subcommand never
shipped)
Removed / simplified (tests):
- ssh_client_config_test.go: dropped TestSameDirOrParentHandlesSymlinks,
TestSyncVMSSHClientConfigPreservesUserKeyInLegacyDir,
TestSyncVMSSHClientConfigNarrowsCleanupToLegacyFile,
TestSyncVMSSHClientConfigLeavesUnexpectedLegacyContents,
TestInstallUserSSHIncludeMigratesLegacyInlineBlock, plus the
"legacy posture" regression strings in the remaining happy-path
test; TestUninstallUserSSHIncludeRemovesBothMarkerBlocks collapsed
to a single-block test
- migrations_test.go: dropped TestMigrateDropDeadImageColumns_AcrossInstallPaths,
TestDropColumnIfExistsIsIdempotent; TestOpenReadOnlyDoesNotRunMigrations
simplified to test against the baseline marker
Removed (docs):
- README.md "**Migration note.**" blockquote about the SSH-key path move
- docs/advanced.md parenthetical "(the old behaviour)"
Reworded (comments):
- Dropped "Previously this file also contained LogLevel DEBUG3..."
history from vm_disk.go's sshdGuestConfig doc
- Dropped "Call sites that previously read vm.Runtime.{PID,...}"
from vm_handles.go; now documents the current contract
- Dropped "Pre-v0.1 the defaults are" scaffolding in doctor_test.go
- Dropped "no longer does its own git inspection" phrasing in vm_run.go
- Dropped the "(also cleans up legacy inline block from pre-opt-in
builds)" aside on the `ssh-config` CLI docstring
- Renamed test var `legacyKey` → `existingKey` in vm_test.go; its
purpose was "pre-existing authorized_keys line," not banger-legacy
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
--readonly ran `chmod -R a-w` over the workspace after copying, but
every banger guest boots as root, and root bypasses DAC mode checks.
So a user running `vm workspace prepare ... --readonly` got the
mode bits set to 0444 but `echo x >> file` in the guest still
succeeded. The flag promised enforcement it couldn't deliver.
The feature also doesn't match the product model: workspaces are
prepared precisely so the guest CAN edit them, and `workspace
export` exists to pull those edits back as a patch. A
"read-only workspace" contradicts that loop.
Removed:
- CLI flag `--readonly` on `vm workspace prepare`
- api.VMWorkspacePrepareParams.ReadOnly field
- model.WorkspacePrepareResult.ReadOnly field
- daemon chmod dispatch in prepareVMWorkspaceGuestIO
- smoke scenario pinning the (advisory) mode-bit behavior
- misleading "exportbox-readonly" VM name in an unrelated export
test (the test is about not mutating the real git index;
renamed to exportbox-noindex-mutation)
If real enforcement becomes a user need later, the right primitive
is `chattr +i` (immutable bit — root CAN'T write) or a ro bind-mount.
Reintroducing a new flag is cheaper than debugging what the current
one actually guarantees.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five new smoke scenarios layered on top of the existing bare + workspace
vm-run tests:
- exit-code propagation: `sh -c 'exit 42'` must rc=42
- workspace dry-run: --dry-run lists tracked files without a VM
- workspace --include-untracked: opt-in ships files outside the git
index (regression guard on the security-default flip from review 4)
- concurrent vm runs: two --rm invocations in parallel both succeed
(stresses per-VM locks, createVMMu reservation window, tap pool)
- invalid spec rejection: --vcpu 0 must fail with no VM row left
behind (the "cleanup on partial failure" path the review flagged)
The exit-code scenario caught a real bug on first run:
`banger vm run --rm -- sh -c 'exit 42'` returned rc=0, not 42.
Root cause in internal/cli/ssh.go's sshCommandArgs: extra args were
appended to the ssh argv verbatim, relying on ssh(1)'s implicit
space-join to deliver the remote command. That works for single
tokens (echo hello) but re-tokenises multi-word commands on the
remote side: `ssh host sh -c 'exit 42'` becomes remote
`sh -c exit 42`, where `42` is $0 for the already-completed `exit`,
and the exit code the user asked for is lost.
Fix: shell-quote every element of extra (`'sh'` `'-c'` `'exit 42'`)
and join them into a single trailing argv entry. ssh's space-join
then produces exactly the command the user typed on the remote
shell. TestSSHCommandArgs was updated to pin the quoting; the
existing TestRunVMRunCommandModePropagatesExitCode test needed a
one-word quote tweak (`false` → `'false'`).
Smoke run after fix passes all seven scenarios in ~2 min on warm
state. cmd/banger coverage jumped to 100% (the invalid-spec
scenario hits the error-reporting path that wasn't covered
before).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two bugs in the untracked-skipped warning, both surfaced by review.
1. Wrong scope for subdir inputs. The helper accepted any path the
caller had (sourcePath, which may be a user-supplied subdirectory)
and ran `git -C <path> ls-files --others --exclude-standard`. Git
scopes that output to the cwd, so pointing `vm run ./repo/sub` at
a subdir silently underreported untracked files living elsewhere
in the repo — exactly the files the warning exists to surface.
Fix: resolve sourcePath to the repo root inside the helper via
`rev-parse --show-toplevel` before counting.
2. Inconsistent failure handling. The comment said the helper should
be silent when the count can't be determined; the body returned
the error. vm_run.go treated the error as non-fatal (logged a
warning, continued); workspace prepare and --dry-run aborted the
whole operation on the same helper failure. A courtesy notice
shouldn't kill the operation.
Fix: make the helper best-effort in signature and body — no error
return, swallows rev-parse + count failures, emits nothing when
there's nothing to say. All three callers lose their error
branches.
Regression tests:
- subdir input reports the root-level untracked file (the bug case)
- non-repo path produces silence, not a fatal error
- inspector whose runner errors on every call produces silence
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three test seams were still package-level mutable vars, which tests
had to swap before use. That's the classic path to flaky parallel
tests — two goroutines fighting over the same global fake. Push each
down to the struct that owns the behaviour.
internal/daemon/dns_routing.go
lookupExecutableFunc + vmDNSAddrFunc → fields on *HostNetwork,
defaulted at newHostNetwork time. dns_routing_test builds
HostNetwork{..., lookupExecutable: stub, vmDNSAddr: stub} inline,
no more t.Cleanup dance around package-level vars.
internal/daemon/preflight.go + doctor.go
vsockHostDevicePath (mutable string) → vsockHostDevice field on
*VMService, defaulted via defaultVsockHostDevice constant in
newVMService. Preflight reads s.vsockHostDevice; doctor reads
d.vm.vsockHostDevice. Logger test sets d.vm.vsockHostDevice = tmp
after wireServices.
internal/daemon/workspace/workspace.go
HostCommandOutputFunc → *Inspector struct with a Runner field.
Every git-using helper (GitOutput, GitTrimmedOutput,
GitResolvedConfigValue, RunHostCommand, ListSubmodules,
ListOverlayPaths, CountUntrackedPaths, InspectRepo,
ImportRepoToGuest, PrepareRepoCopy) is now a method on *Inspector.
NewInspector() wraps the real host runner for production;
WorkspaceService holds one via repoInspector, CLI deps holds one
too. cli_test.go's submodule-rejection test builds its own
Inspector with a scripted Runner instead of patching a global.
Pure helpers (FinalizeScript, ResolveSourcePath, ParsePrepareMode,
ShellQuote, FormatStepError, GitFileURL, ParseNullSeparatedOutput)
stay free functions since they don't touch the host.
Sentinel: grep for HostCommandOutputFunc, lookupExecutableFunc,
vmDNSAddrFunc, vsockHostDevicePath is now empty across internal/.
make lint test green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Workspace-mode vm run and vm workspace prepare used to copy both
tracked AND untracked non-ignored files into the guest. That silently
catches local .env files, scratch notes, credentials, and any other
working-tree state a developer hasn't explicitly gitignored — a real
data-exposure footgun given the golden image ships Docker and the
usual dev tooling.
Flip the default to tracked-only. Users who actually want the fuller
set opt in with --include-untracked (documented in both commands'
help). Gitignored files are still always excluded regardless of the
flag.
Add --dry-run to both vm run and vm workspace prepare. Dry-run
inspects the repo CLI-side (no VM created, no daemon RPC needed since
the daemon is always local and the inspection is a pure git read),
prints the exact file list + mode, and exits. A byte-level preview of
what would land in the guest.
When running real (non-dry) and untracked files exist in the repo but
are being skipped under the new default, print a one-line notice
pointing to --include-untracked so users aren't surprised when the
guest is missing something they expected.
Signature changes:
- ListOverlayPaths takes an includeUntracked bool (tracked always;
untracked gated by flag).
- InspectRepo takes the same flag and passes it through.
- VMWorkspacePrepareParams gains IncludeUntracked.
- WorkspaceService.workspaceInspectRepo seam signature widened to
match (4 callers in tests updated).
New workspace package tests cover both modes and verify that
gitignored files never leak regardless of the flag.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before this change, every daemon.Open() wrote a Host *.vm stanza into
~/.ssh/config in a marker-fenced block. That's a real footgun for users
who manage their SSH config declaratively (chezmoi, dotfiles, NixOS):
banger was mutating host state outside its own directory on every
daemon start, easy to miss and hard to audit.
New contract: the daemon only ever writes its own ssh_config file at
~/.config/banger/ssh_config. ~/.ssh/config is untouched unless the user
opts in. `banger vm ssh <name>` still works out of the box — the
shortcut only matters for plain `ssh sandbox.vm` from any terminal.
The opt-in surface is `banger ssh-config`:
banger ssh-config # prints path + include-line +
# install/uninstall hints
banger ssh-config --install # adds `Include <bangerConfig>` to
# ~/.ssh/config inside a marker-fenced
# block; idempotent; migrates any
# legacy inline Host *.vm block from
# pre-opt-in builds
banger ssh-config --uninstall # removes the new Include block AND
# any legacy inline block
Doctor gains a gentle warn-level note when banger's ssh_config exists
but the user hasn't wired it in — not a fail, since the shortcut is
convenience and `banger vm ssh` covers the essential case.
Tests cover: daemon writes banger file and does NOT touch ~/.ssh/config,
Install adds the block, Install is idempotent, Install migrates the
legacy inline block cleanly (removing it, preserving unrelated
entries, adding the new Include block), Uninstall removes both marker
variants, Uninstall is a no-op when ~/.ssh/config is absent, and
UserSSHIncludeInstalled detects both marker shapes.
README reframes the feature as optional convenience.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cuts the daemon-managed guest-session machinery (start/list/show/
logs/stop/kill/attach/send). The feature shipped aimed at agent-
orchestration workflows (programmatic stdin piping into a long-lived
guest process) that aren't driving any concrete user today, and the
~2.3K LOC of daemon surface area — attach bridge, FIFO keepalive,
controller registry, sessionstream framing, SQLite persistence — was
locking in an API we'd have to keep through v0.1.0.
Anything session-flavoured that people actually need today can be
done with `vm ssh + tmux` or `vm run -- cmd`.
Deleted:
- internal/cli/commands_vm_session.go
- internal/daemon/{guest_sessions,session_lifecycle,session_attach,session_stream,session_controller}.go
- internal/daemon/session/ (guest-session helpers package)
- internal/sessionstream/ (framing package)
- internal/daemon/guest_sessions_test.go
- internal/store/guest_session_test.go
- GuestSession* types from internal/{api,model}
- Store UpsertGuestSession/GetGuestSession/ListGuestSessionsByVM/DeleteGuestSession + scanner helpers
- guest.session.* RPC dispatch entries
- 5 CLI session tests, 2 completion tests, 2 printer tests
Extracted:
- ShellQuote + FormatStepError lifted to internal/daemon/workspace/util.go
(only non-session consumer); workspace package now self-contained
- internal/daemon/guest_ssh.go keeps guestSSHClient + dialGuest +
waitForGuestSSH — still used by workspace prepare/export
- internal/daemon/fake_firecracker_test.go preserves the test helper
that used to live in guest_sessions_test.go
Store schema: CREATE TABLE guest_sessions and its column migrations
removed. Existing dev DBs keep an orphan table (harmless, pre-v0.1.0).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CLI: introduce internal/cli.deps which owns every RPC/SSH/host-command
seam the tree used to reach through mutable package vars. Command
builders, orchestrators, and the completion helpers become methods on
*deps. Tests construct their own deps per case, so fakes no longer leak
across cases and tests are free to run in parallel.
Daemon: move workspaceInspectRepoFunc + workspaceImportFunc onto the
Daemon struct (workspaceInspectRepo / workspaceImport), mirroring the
existing guestWaitForSSH / guestDial pattern. Workspace-prepare tests
drop t.Parallel() guards now that they no longer mutate process-wide
state.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure code motion — banger.go 3508→240 LOC, same-package
decomposition keeps all identifiers visible without export changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The CLI carried a full second copy of the workspace import
implementation that `vm run` never actually used:
- importVMRunRepoToGuest (no callers — the live flow calls the
daemon's PrepareVMWorkspace RPC instead)
- prepareVMRunRepoCopy, vmRunCheckoutCommit, vmRunCheckoutScript,
gitFileURL, runHostCommand (all reachable only from the dead
importVMRunRepoToGuest)
Plus a duplicated repo-inspection surface that shadowed the
daemon's:
- inspectVMRunRepo ran every git query the daemon re-ran during
workspace.prepare (HEAD, branch, identity, origin, overlay list)
- gitOutput / gitTrimmedOutput / gitResolvedConfigValue /
parseNullSeparatedOutput / listSubmodules / listOverlayPaths /
resolveVMRunSourcePath — all identical to the exported
workspace.* versions
- vmRunRepoSpec — same fields as workspace.RepoSpec
Replaced with a single minimal preflight:
func vmRunPreflightRepo(ctx, rawPath) (absPath, err error)
The preflight only checks what the user can fix locally before
banger creates a VM (path exists, sits in a non-bare git repo, no
submodules). The daemon's workspace.prepare RPC does the full
inspection — and returns RepoRoot + RepoName in the response, which
the CLI now threads into the tooling harness instead of computing
them a second time.
Signature changes:
runVMRun(ctx, ..., *vmRunRepo, ...) // was: *vmRunRepoSpec
startVMRunToolingHarness(ctx, client, repoRoot, repoName, progress)
// was: (ctx, client, spec, progress)
vmRunToolingHarnessScript(plan) // was: (spec, plan)
vmRunToolingHarnessLaunchScript(repoName) // was: (spec)
Tests: the CLI-side git-inspection tests are replaced by a single
TestVMRunPreflightRejectsSubmodules that exercises the preflight.
Everything else (tooling harness script, progress renderer, SSH args,
runVMRun flows) keeps working. The shallow-copy / checkout-script
tests are gone — that code now lives only in
internal/daemon/workspace and is tested there.
Also fixed a latent bug the refactor exposed: vm run's --from flag
defaults to "HEAD", which the daemon reads as "from without branch"
and rejects. CLI now scrubs fromRef when branchName is empty.
Live verified: `banger vm run --name X . -- cmd` boots, workspace
materialises at /root/repo with matching HEAD, exit code propagates.
Guest host-key verification was off in all three SSH paths:
* Go SSH (internal/guest/ssh.go) used ssh.InsecureIgnoreHostKey
* `banger vm ssh` passed StrictHostKeyChecking=no
+ UserKnownHostsFile=/dev/null
* `~/.ssh/config` Host *.vm shipped the same posture into the
user's global config
Now each path verifies against a banger-owned known_hosts file at
`~/.local/state/banger/ssh/known_hosts` with TOFU semantics:
* First dial to a VM pins the key.
* Subsequent dials require an exact match. A mismatch fails with
an explicit "possible MITM" error.
* `vm delete` removes the entries so a future VM reusing the IP
or name re-pins cleanly.
* The user's `~/.ssh/known_hosts` is untouched.
Changes:
internal/guest/known_hosts.go (new) — OpenSSH-compatible parser,
TOFUHostKeyCallback, RemoveKnownHosts. Process-wide mutex
around the file.
internal/guest/ssh.go — Dial and WaitForSSH grew a knownHostsPath
parameter threaded through the callback. Empty path keeps the
insecure callback (tests + throwaway tools only; documented).
internal/daemon/{guest_sessions,session_attach,session_lifecycle,
session_stream}.go — call sites pass d.layout.KnownHostsPath.
internal/daemon/ssh_client_config.go — the ~/.ssh/config Host *.vm
block now points at banger's known_hosts and uses
StrictHostKeyChecking=accept-new. Missing path → fail closed.
internal/daemon/vm_lifecycle.go — deleteVMLocked drops known_hosts
entries for the VM's IP and DNS name via removeVMKnownHosts.
internal/cli/banger.go — sshCommandArgs swaps StrictHostKeyChecking
no + /dev/null for banger's file + accept-new. Path resolution
failure falls through to StrictHostKeyChecking=yes.
internal/paths/paths.go — Layout gains SSHDir + KnownHostsPath;
Ensure creates SSHDir at 0700.
Tests (internal/guest/known_hosts_test.go): pin on first use, accept
matching key on second dial, reject mismatch, empty path skips
checking, RemoveKnownHosts drops the entry, re-pin works after
remove. Existing daemon + cli tests updated to assert the new
posture and regression-guard against the old flags.
Live verified: vm run writes the pin to banger's known_hosts at 0600
inside a 0700 dir; banger vm ssh + ssh root@<vm>.vm both succeed
using the pin; vm delete clears it.
The web UI shipped as "experimental" and was never finished — no nav
off the dashboard, no live updates, no settled design, never a
supported surface. It was opt-in by default already; leaving the code
in the tree for v0.1.0 only invited "does this work?" questions and
kept HostSummary/BangerSummary/SudoStatus types on the public RPC
surface that nothing else uses.
Removed:
internal/webui/ (all Go + templates + assets)
internal/daemon/web.go (server start / Layout / Config / ListVMs / ListImages)
internal/daemon/dashboard.go (DashboardSummary aggregator)
Simplified:
internal/api/types.go drop WebURL on PingResult, drop
HostSummary / SudoStatus / BangerSummary /
DashboardSummary / DashboardSummaryResult
internal/model/types.go drop DaemonConfig.WebListenAddr
internal/config/config.go drop web_listen_addr from fileConfig + Load
internal/daemon/daemon.go drop webListener / webServer / webURL fields +
startWebServer() call + ping WebURL population
internal/cli/banger.go `daemon status` output no longer branches on web
internal/daemon/{doc.go,ARCHITECTURE.md} drop web UI sections
README.md drop web_listen_addr config bullet + security paragraph
Tests updated to reflect the new shape. Coverage 57.3 -> 58.9% (the
webui package was largely untested; its removal lifts the ratio
without moving the numerator). `banger daemon status` output and
--help are web-free. Lint + full suite green.
Previously `banger vm workspace export` ran `git add -A` against the
guest's real `.git/index`, so the observation step left staged
changes behind that users never asked for. Reconnecting later (ssh,
another export) surfaced them and looked like phantom work.
Route `git add -A` through a throwaway index file instead:
tmp_idx=$(mktemp ...)
trap 'rm -f "$tmp_idx"' EXIT
git read-tree <ref> --index-output="$tmp_idx"
GIT_INDEX_FILE="$tmp_idx" git add -A
GIT_INDEX_FILE="$tmp_idx" git diff --cached <ref> --binary|--name-only
The real .git/index, working tree, and refs stay exactly as the user
left them. Same diff content — commits past <ref>, uncommitted edits,
and untracked files (minus .gitignore) all captured.
Regression test locks the invariant: every export script must route
add -A through GIT_INDEX_FILE and clean the temp index on exit. CLI
help text updated to say "non-mutating".
Replaces the static model.Default* constants that drove --vcpu / --memory
/ --disk-size with a three-layer resolver:
1. [vm_defaults] in ~/.config/banger/config.toml (if set)
2. host-derived heuristics (cpus/4 capped at 4; ram/8 capped at 8 GiB)
3. baked-in constants (floor)
Visibility:
- Every `vm run` / `vm create` prints a `spec:` line before progress
begins: `spec: 4 vcpu · 8192 MiB · 8G disk`. Matches the VM that
actually gets created because the CLI is now the single source of
truth — it resolves, populates the flag defaults, and forwards the
explicit values to the daemon.
- `banger doctor` adds a "vm defaults" check showing per-field
provenance (config|auto|builtin) and the config file path for
overrides.
- `--help` shows the resolved defaults (e.g. `--vcpu int (default 4)`
on an 8-core host).
No `banger config init` command, no first-run side effects, no writes
to the user's filesystem behind their back. Users who want explicit
control set the keys; everyone else gets sensible numbers that track
their hardware.
- WebListenAddr default is now "" (empty). The experimental web UI was
running on 127.0.0.1:7777 by default, which surprises users who never
opted in. Users who want it set `web_listen_addr = "127.0.0.1:7777"`
in config.toml.
- `make uninstall` stops the daemon (if any) and removes the installed
binaries. Preserves user data on disk but prints the paths so `rm -rf`
can follow for a full purge. Documented in README next to install.
- docs/kernel-catalog.md: replace the `void-6.12` and `alpine-3.23`
examples (never published) with `generic-6.12` (the only cataloged
kernel today). Updates the versioning-convention example too.
- `banger vm prune` sweeps every non-running VM (stopped, created,
error) with an interactive confirmation; -f/--force skips the prompt.
Partial failures report which VM failed and exit non-zero.
- list commands gain `ls` alias: vm list already had it; added to image
list, kernel list, and vm session list.
- delete commands gain `rm` alias: vm delete and image delete. kernel
rm already aliased delete/remove.
Uses new test seams (vmListFunc) plus the existing vmDeleteFunc so
prune unit-tests without touching the daemon socket.
Re-enable cobra's default `completion` subcommand (`banger completion
bash|zsh|fish|powershell`). Plus live resource-name suggestions that
hit the running daemon via the same RPC the real commands use:
vm start/stop/restart/delete/kill/set → completeVMNames (variadic)
vm ssh/show/logs/stats/ports/... → completeVMNameOnlyAtPos0
vm session list/start → completeVMNameOnlyAtPos0
vm session show/logs/stop/kill/attach/send → completeSessionNames (vm + session)
image show/delete/promote → completeImageNameOnlyAtPos0
kernel show/rm → completeKernelNameOnlyAtPos0
vm run/create --image, image pull/register --kernel-ref → flag-value completion
Design notes in internal/cli/completion.go: completers never auto-start
the daemon (ping-check, bail with NoFileComp on miss), so tab-completion
stays a zero-cost probe. Variadic completers exclude already-entered
args to avoid duplicate suggestions.
README: install recipes for bash / zsh / fish.
Bundle downloads can take 20–60s on a typical connection and the
CLI was going silent between "resolving daemon" and the final image
summary. Users wondered whether banger had wedged.
New `withHeartbeat` helper wraps an RPC call with a dot-every-2s
ticker on stderr. No-op when stderr isn't a terminal, so piped or
scripted invocations stay quiet. Wired into `image pull` and `kernel
pull`, the two commands that actually download bytes.
Example:
$ banger image pull debian-bookworm
[image pull] ..........
id name managed ...
Tests cover the non-TTY short-circuit and error propagation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `internal/opencode` package and the `opencodeCapability` that
consumed it were hard-wired to wait for opencode on guest port 4096
when an image shipped an initrd. After the prune commits (void /
alpine / customize.sh / image build all removed), nothing banger
produces today carries an initrd, so the capability's wait path was
unreachable: every startup short-circuited to the "direct-boot, skip
opencode" branch.
Same logic for `banger vm acp`: it SSHes to `opencode acp --cwd
<path>`, a binary the golden image no longer ships. Users who run
their own image with opencode can still invoke
`ssh vm -- opencode acp --cwd /root/repo` directly — no banger
scaffolding required.
Removed:
- internal/opencode/ (whole package, 255 LOC incl. tests)
- internal/daemon/opencode.go (opencodeCapability)
- cli `vm acp` command + its helpers (runVMACP, sshACPCommandArgs,
vmACPRemoteCommand) + their tests
- The opencodeCapability{} entry in registeredCapabilities() plus
the test that pinned its presence
- `wait_opencode` progress-stage label from the vm-create renderer
- Stale mentions in daemon/doc.go, README, and webui test fixtures
~480 lines gone, 12 added. `banger/internal` is now 25 packages
instead of 26.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The deferred --rm delete fires AFTER runSSHSession returns, but
runSSHSession prints "vm X is still running (stop with ...)" before
returning. Net effect: the user sees the reminder, then the VM gets
deleted behind it — misleading.
Thread a skipReminder bool into runSSHSession. `vm run` passes the
same value as removeOnExit; other callers (`vm ssh`) pass false.
Reinforced by a new assertion in the --rm happy-path test that the
reminder string never appears in stderr.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New `--rm` flag deletes the VM once the ssh session or `-- cmd`
exits, making `vm run` one-shot. Exit code from command mode still
propagates correctly.
Semantics:
- Create fails → no VM to delete, nothing to do.
- SSH-wait timeout → VM intentionally kept alive so `vm logs <name>`
shows why; the timeout error already pointed users at that. Even
with --rm, this path skips delete — a wedged sshd is exactly when
you want post-mortem access.
- Session/command ends (any exit code, any reason) → VM is deleted
via `vm.delete` RPC. Uses a fresh 10s context so Ctrl-C during the
session doesn't abort the cleanup.
New vmDeleteFunc seam at the top of banger.go alongside the other
RPC seams. Two tests cover the happy path (session ends cleanly →
delete fires with correct ref) and the skip-on-timeout path (ssh
wait errors → delete does NOT fire).
README updated with an ephemeral example and a note about the
timeout-skip behaviour.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before: `guestWaitForSSHFunc` loops forever bounded only by context
cancellation, so if sshd fails to start in the guest `vm run` hangs
indefinitely — which burned a long debugging session during the
golden-image bring-up.
After: the ssh wait gets its own 90s deadline. On guest-side timeout
the error names the VM, explains sshd is the likely suspect, points
at `banger vm logs <name>` for the console output, and notes the VM
is still alive for inspection (or `vm delete` to clean up). Parent
context cancellation (Ctrl-C, caller timeout) still surfaces as-is
without the hint.
`vmRunSSHTimeout` is a var rather than a const so tests can shrink
it; the new TestRunVMRunSSHTimeoutReturnsActionableError sets it to
50ms and asserts the error message contains the actionable bits.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `image build` flow spun up a transient Firecracker VM, SSHed in,
and ran a large bash provisioning script to derive a new managed
image from an existing one. It overlapped heavily with the golden-
image Dockerfile flow (same mise/docker/tmux/opencode install logic
duplicated in Go as `imagemgr.BuildProvisionScript`) and had far more
machinery: async op state, RPC begin/status/cancel, webui form +
operation page, preflight checks, API types, tests. For custom
images, writing a Dockerfile is simpler and more reproducible.
Removed end-to-end:
- CLI `image build` subcommand + `absolutizeImageBuildPaths`.
- Daemon: BuildImage method, imagebuild.go (transient-VM orchestration),
image_build_ops.go (async begin/status/cancel), imagemgr/build.go
(the 247-line provisioning script generator and all its append*
helpers), validateImageBuildPrereqs + addImageBuildPrereqs.
- RPC dispatches for image.build / .begin / .status / .cancel.
- opstate registry `imageBuildOps`, daemon seam `imageBuild`,
background pruner call.
- API types: ImageBuildParams, ImageBuildOperation, ImageBuildBeginResult,
ImageBuildStatusParams, ImageBuildStatusResult; model type
ImageBuildRequest.
- Web UI: Backend interface methods, handlers, form, routes, template
branches (images.html build form, operation.html build branch,
dashboard.html Build button).
- Tests that directly exercised BuildImage.
Doctor polish (task C):
- Drop the "image build" preflight section entirely (its raison d'être
is gone).
- Default-image check now accepts "not local but in imagecat" as OK:
vm create auto-pulls on first use. Only flag when the image is
neither locally registered nor in the catalog.
Net: 24 files touched, 1,373 lines deleted, 25 added.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The golden-image Dockerfile + catalog pipeline replaces the entire
manual rootfs-build stack. With that shipped, the per-distro shell
flows are dead code.
Removed:
- scripts/customize.sh, scripts/interactive.sh, scripts/verify.sh
- scripts/make-rootfs{,-void,-alpine}.sh
- scripts/register-{void,alpine}-image.sh
- scripts/make-{void,alpine}-kernel.sh
- internal/imagepreset/ (only consumer was `banger internal packages`,
which fed customize.sh)
- examples/{void,alpine}.config.toml
- Makefile targets: rootfs, rootfs-void, rootfs-alpine, void-kernel,
alpine-kernel, void-register, alpine-register, void-vm, alpine-vm,
verify-void, verify-alpine, plus the ALPINE_RELEASE / *_IMAGE_NAME
/ *_VM_NAME variables
The void-6.12 kernel catalog entry is also gone — golden image pairs
with generic-6.12 and nothing else in the catalog depended on it.
Consolidated: imagemgr now holds the small DebianBasePackages list +
package-hash helper inline, so the `image build --from-image` flow
(still supported) no longer pulls from a separate imagepreset package.
Net: 3,815 lines deleted, 59 added. No runtime functionality removed
beyond the `banger internal packages` subcommand (hidden, used only
by the deleted customize.sh).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
One-command sandbox: `banger vm run` on a fresh host now Just Works.
No prior `banger image pull` or `banger kernel pull` needed.
Changes:
- Default `default_image_name` flips from "default" to "debian-bookworm"
so the golden image is the implicit target when `--image` is omitted.
- `CreateVM` resolves the image via a new `findOrAutoPullImage`: try
the local store first, and on miss fall back to the embedded imagecat
catalog + auto-pull. Emits a vm-create progress stage so the user
sees "pulling from image catalog" in the create output.
- `resolveKernelInputs` gains context + the same pattern via
`readOrAutoPullKernel`: try the local kernelcat, and on miss look up
the embedded kernelcat and auto-pull. Fires whenever a bundle's
manifest references a kernel the user hasn't pulled yet, not just
during image pull — any CreateVM with an image that needs a kernel
not yet local will resolve it.
- `--image` help text updated on both `vm run` and `vm create`.
Six tests cover local-hit-no-pull, auto-pull-on-miss, not-in-catalog
error propagation, and a non-ENOENT kernel read error does NOT trigger
a misleading "not in catalog" claim.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`docker create` drops /.dockerenv into the container's writable layer,
and `docker export` includes it in the tar. When systemd later boots
that rootfs it finds /.dockerenv and flags virtualization=docker,
which disables a bunch of udev device-unit behaviour (device units
never become active, mount units waiting on them hang forever).
Strip /.dockerenv (and /run/.containerenv for podman symmetry) from
the staging tree after FlattenTar and before BuildExt4 so systemd
correctly detects virtualization=kvm.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PullImage now checks the embedded imagecat catalog first. If the
ref matches a catalog entry, it takes the bundle path:
1. Fetch the .tar.zst bundle into a staging dir (rootfs.ext4 +
manifest.json).
2. Strip manifest.json (staging-only metadata).
3. Stage kernel/initrd/modules alongside rootfs.ext4.
4. Publish the staging dir and upsert the image row.
Bundle rootfs is already flattened + ownership-fixed + agent-
injected at build time, so the daemon-side work is strictly I/O —
no flatten, no mkfs, no debugfs.
Kernel resolution in the bundle path: --kernel-ref > entry.kernel_ref
> --kernel/--initrd/--modules.
If the ref doesn't match a catalog entry, PullImage falls through
to the existing OCI path unchanged (extracted into pullFromOCI).
New test seam: d.bundleFetch. Six unit tests cover happy path,
--kernel-ref override, existing-name rejection, kernel-required
error, fetch-failure cleanup, and the catalog → OCI fallthrough.
CLI help updated: image pull now documents both forms and takes
<name-or-oci-ref> instead of requiring an OCI ref.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the OCI-push flow with a bundle-based one that mirrors the
kernel catalog (publish-kernel.sh / kernelcat).
- scripts/make-golden-bundle.sh: docker build → docker create → docker
export | banger internal make-bundle → .tar.zst. Defaults target
debian-bookworm / generic-6.12 / x86_64; pinned --size 4G to leave
headroom for first-boot installs and in-VM apt use.
- scripts/publish-golden-image.sh: rewritten to call make-golden-bundle,
rclone upload to R2 (banger-images bucket, images.thaloco.com), and
jq-patch internal/imagecat/catalog.json with URL / sha256 / size.
--skip-upload stops after bundle build and copies to dist/.
make-bundle default ext4 sizing also bumped from +25% to +50% headroom
(mkfs.ext4 needs room for inode tables, block-group metadata, journal,
and the default 5% reserved-blocks margin). The old 25% was too tight
for the ~950 MB golden rootfs and aborted with "Could not allocate
block".
End-to-end smoke (local): golden Dockerfile → 286 MB tar.zst bundle
with correct manifest, valid ext4, and all banger units + vsock agent
present.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
main.go previously unwrapped *any* error implementing `ExitCode() int`
into the process exit status, which matched *exec.ExitError too. So
whenever a CLI command ran a subprocess (mkfs.ext4, debugfs, ssh to a
daemon preflight, etc.) and that subprocess failed, the CLI would
silently exit with the subprocess's code — no error message printed.
Surfaced while bringing up `banger internal make-bundle`: mkfs.ext4
was failing on an undersized ext4 and the user saw only `EXIT=1`.
Fix: export the type as `cli.ExitCodeError` and unwrap against the
concrete type in main.go. The `ExitCode()` method is gone — only the
explicit wrap at the `vm run` command-mode call site produces this
error now.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New hidden subcommand that turns a `docker export`-style rootfs tar
into a banger bundle (`rootfs.ext4` + `manifest.json`, tar+zstd):
1. FlattenTar (new in imagepull) extracts the stream into a staging
dir while capturing per-file uid/gid/mode into a Metadata record.
2. imagepull.BuildExt4 produces the ext4 via `mkfs.ext4 -d`.
3. imagepull.ApplyOwnership re-applies the captured metadata with
`debugfs sif` so setuid/root-owned files keep their identity.
4. imagepull.InjectGuestAgents drops the vsock agent + network
bootstrap + first-boot service into the ext4.
5. manifest.json is written with name/distro/arch/kernel_ref.
6. Both files are packaged as .tar.zst with max compression.
Flags: --rootfs-tar (file or '-' for stdin), --name, --distro, --arch,
--kernel-ref, --description, --size, --out. Stdout prints bundle path,
sha256, and size so callers can patch the catalog.
Unit tests cover flag registration, required-arg validation, the
bundle tar round-trip, sha256HexFile, and dirSize. An end-to-end test
runs the full pipeline against a synthesized tiny rootfs tar; skips
gracefully when mkfs.ext4 / debugfs / companion binaries are missing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`vm run` now covers bare sandbox (no args), workspace sandbox (path),
and workspace+command (path -- cmd) in a single entry point. Replaces
the old print-next-steps-and-exit behaviour: bare and workspace modes
drop into interactive ssh, command mode execs via ssh and propagates
the remote exit code through banger's own exit status.
- path argument is optional; --branch / --from still require a path.
- workspace prep and mise tooling bootstrap only run when a path is
given; command mode skips the bootstrap.
- remote command exit status is wrapped as exitCodeError so main() can
propagate it instead of collapsing every failure to 1.
- README: promote vm run with three-mode examples; demote vm create
to a scripting primitive.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
newImagePullCommand mirrors newImageRegisterCommand with a positional
<oci-ref> arg, the same kernel-ref / direct-paths flag set + mutual
exclusion, plus --size that parses human-friendly values via
model.ParseSize before crossing the RPC boundary.
Calls "image.pull" RPC, prints the resulting image summary on success.
Long help warns about the Phase A bootability gap (ownership not
preserved; suitable as `image build` base, not yet directly bootable).
CLI test confirms image pull is registered with the expected flags.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduces the headline feature of the kernel catalog: pulling a kernel
bundle over HTTP without any local build step.
Catalog format (internal/kernelcat/catalog.go):
- Catalog { Version, Entries } + CatEntry { Name, Distro, Arch,
KernelVersion, TarballURL, TarballSHA256, SizeBytes, Description }.
- catalog.json is embedded via go:embed and ships with each banger
binary. It starts empty (Phase 5's CI pipeline will populate it).
- Lookup(name) returns the matching entry or os.ErrNotExist.
Fetch (internal/kernelcat/fetch.go):
- HTTP GET with streaming SHA256 over the response body.
- zstd-decode (github.com/klauspost/compress/zstd) -> tar extract into
<kernelsDir>/<name>/.
- Hardens against path-traversal tarball entries (members whose
normalised path escapes the target dir, and unsafe symlink
targets) and sha256-mismatch downloads; any failure removes the
partially-populated target dir.
- Regular files, directories, and safe symlinks are supported; other
tar types (hardlinks, devices, fifos) are silently skipped.
- After extraction, recomputes sha256 over the on-disk vmlinux and
writes the manifest with Source="pull:<url>".
Daemon methods (internal/daemon/kernels.go):
- KernelPull(ctx, {Name, Force}) - lookup in embedded catalog, refuse
overwrite unless Force, delegate to kernelcat.Fetch.
- KernelCatalog(ctx) - return the embedded catalog annotated per-entry
with whether it has been pulled locally.
RPC: kernel.pull, kernel.catalog dispatch cases.
CLI:
- `banger kernel pull <name> [--force]`.
- `banger kernel list --available` prints the catalog with a
pulled/available STATE column and a human-readable size.
Tests: fetch round-trip (extract + manifest + sha256), sha256 mismatch
rejection with cleanup, missing-vmlinux rejection, path-traversal
rejection, HTTP error propagation, catalog parsing, lookup,
pulled-status reconciliation. All 20 packages green.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
`banger kernel import <name> --from <dir>` copies a staged kernel
bundle into the local catalog. <dir> is the output of
`make void-kernel` or `make alpine-kernel` (build/manual/void-kernel/
or build/manual/alpine-kernel/).
kernelcat.DiscoverPaths locates artifacts under <dir>:
1. Prefers metadata.json (written by make-void-kernel.sh).
2. Falls back to globbing: boot/vmlinux-* or vmlinuz-* (Alpine
fallback), boot/initramfs-*, lib/modules/<latest>.
The daemon's KernelImport copies kernel + optional initrd via
system.CopyFilePreferClone and modules via system.CopyDirContents
(no-sudo mode — catalog lives under ~/.local/state), computes SHA256
over the kernel, and writes the manifest via kernelcat.WriteLocal.
While wiring this up, fixed a latent bug in system.CopyDirContents:
filepath.Join(sourceDir, ".") silently drops the trailing dot, so
`cp -a source source/contents target/` was copying the whole source
directory (including its basename) instead of just its contents.
Replaced the join with a manual "/." suffix. imagemgr.StageBootArtifacts
(the only existing caller) silently benefits.
scripts/register-void-image.sh and scripts/register-alpine-image.sh
are rewritten to use `banger kernel import … && banger image register
--kernel-ref …` instead of the find-and-pass-paths dance. Preserves
the same user-facing commands and env vars.
Tests cover: metadata.json preference, glob fallback, Alpine vmlinuz
fallback, kernel-missing error, round-trip copy into the catalog, and
the --from required flag.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
`banger image register --kernel-ref <name>` now substitutes for the
--kernel/--initrd/--modules triple. The daemon looks the name up via
kernelcat.ReadLocal under d.layout.KernelsDir, populates the three
paths from the resolved entry, then continues through the existing
validate/persist flow unchanged.
Passing both --kernel-ref and any of --kernel/--initrd/--modules is
rejected — at the CLI layer (before starting the daemon) and
defensively at the RPC layer. A missing catalog entry produces a clear
"run 'banger kernel list'" message.
Once registered, the image stores the resolved absolute paths, so
deleting the catalog entry later does not invalidate already-registered
images — managed image build still copies the kernel into its artifact
dir per imagemgr.StageBootArtifacts.
Tests cover: resolution success (absolute KernelPath populated from
catalog), mutual-exclusion rejection, and missing-entry error.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>