Commit graph

30 commits

Author SHA1 Message Date
759fa20602
docs: add release-process runbook
Captures the cut-and-publish workflow currently encoded only in
scripts/publish-banger-release.sh and the CHANGELOG patterns. Covers:

- Release artefacts + R2 paths + the install.sh-at-bucket-root
  contract.
- Trust model recap (cosign pubkey pinned in both verify_signature.go
  and scripts/install.sh; drift check enforced by the publish script).
- Pre-flight checklist: green smoke, CHANGELOG entry with the right
  Keep-a-Changelog headings, link-table bump, explicit callout when
  unit files changed (banger update swaps binaries, not units).
- Cut order: publish first, tag after, verify from a clean machine.
- Verification-release rule: any fix to runUpdate / unit templates /
  helper-daemon restart sequencing requires an immediate no-op +1
  release so a host on the buggy version can update to it and observe
  the fix live with the new binary in the driver seat. v0.1.3 and
  v0.1.5 are the existing examples.
- Patch vs minor: minor = exposed API/contract change (vsock guest-
  agent protocol, CLI flag removal, RPC shape, non-forward-compatible
  store schema); everything else is patch.
- Sibling catalogs: kernel + golden-image entries are go:embed-ed,
  so they piggyback on the next banger release.
- Mid-release recovery for signature drift, partial rclone, re-cut,
  and bad-tag cleanup (never reuse a version).

AGENTS.md gets a one-liner pointer so the maintainer guide surfaces
the runbook without duplicating it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 12:25:36 -03:00
dea655ce95
docs: fix paths from local to system install 2026-04-30 10:49:10 -03:00
fae28e3d8b
update: docs + publish script for the self-update feature
README gets a top-level Updating section; docs/privileges.md gains
a step-by-step trust-model writeup of `banger update`. The new
scripts/publish-banger-release.sh drives the manual release cut:
build, tar, sha256sum, cosign sign-blob, verify against the embedded
public key, jq-merge into manifest.json, rclone upload to the R2
bucket. Refuses outright if the embedded key is still the placeholder
so we can't accidentally publish an unverifiable release. Also folds
in gofmt drift accumulated across the updater package and a few
sibling files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 12:43:46 -03:00
f7a6832ebf
Merge model,cli,docs polish for v0.1.0
# Conflicts:
#	internal/cli/commands_image.go
2026-04-28 17:36:47 -03:00
d0997fd3b5
model,cli,docs: medium-effort polish for v0.1.0
* model.ParseSize / FormatSizeBytes: pinned with table tests in
    internal/model/types_test.go (TestParseSize 22 cases,
    TestFormatSizeBytes 11 cases, TestParseSizeFormatRoundTrip 7
    boundaries). Fixed the long-suffix regression: "4GiB", "512MiB",
    "4KiB" now parse correctly (parser strips trailing IB before
    inspecting the unit byte). Pinned current behaviour for
    no-suffix input ("1024" treated as MiB) and FormatSizeBytes(0).
    commands_image.go --size flag-help updated to show 4GiB now
    that the parser accepts it.
  * vm ports --json: matches the JSON-vs-table inconsistency between
    vm stats (always JSON) and vm ports (always table). --json on
    vm ports flips to the same printJSON path as vm stats. Default
    table output unchanged. Other vm subcommands (show, stats,
    logs, health, ping) didn't fit the identical pattern; left
    alone.
  * docs/oci-import.md architecture section moved to a new
    docs/oci-import-internals.md (precedent: internal/daemon/
    ARCHITECTURE.md). User-facing oci-import.md keeps a one-line
    pointer for advanced reading.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 17:36:03 -03:00
33639efe0c
docs: fix three security-sensitive doc/code mismatches
A pre-release audit caught three places where the docs misrepresent
the trust model. Each is a claim users would read while auditing
banger and reach the wrong conclusion.

  * docs/privileges.md:140, 194 — bridge default was documented as
    "banger0" but the code default (model.DefaultBridgeName) is
    "br-fc". A user following the manual-removal recipe would `ip
    link del banger0` against a non-existent interface.
  * docs/privileges.md:192 — uninstall recipe said "stop your VMs
    first via `banger vm stop --all`". That flag doesn't exist; vm
    stop is a per-name action. Replaced with the actual options:
    `banger vm prune` (bulk) or per-VM `banger vm stop <name>`.
  * docs/privileges.md:255 and README.md:78-79 — helper unit's
    CapabilityBoundingSet was listed as 5 caps; the actual set in
    commands_system.go:370 is 11 (we added FOWNER/KILL/MKNOD/SETGID/
    SETUID/SYS_CHROOT during Phase B and never updated the docs).
    Updated both lists; the "what's NOT included" rationale stays
    accurate against the new positive list.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 17:30:58 -03:00
4d8dca6b72
image: add banger image cache prune for OCI cache cleanup
OCI layer blobs accumulate forever — every pull writes layers to
~/.cache/banger/oci/blobs/sha256/<hex> via go-containerregistry's
filesystem cache, and nothing ever evicts them. The cache is purely
a re-pull-avoidance (every flattened image is independent of the
blobs that sourced it), so it's a perfect candidate for an opt-in
operator-driven prune.

New surface:
  * api: ImageCachePruneParams{DryRun}, ImageCachePruneResult
    {BytesFreed, BlobsFreed, DryRun, CacheDir}.
  * daemon: ImageService.PruneOCICache walks layout.OCICacheDir for
    a (bytes, blobs) tally, then — outside dry-run — atomically
    renames the cache aside, recreates it empty, and rm -rf's the
    aside dir. The rename-then-rm avoids leaving the cache in a
    half-removed state if a pull starts mid-prune (the in-flight
    pull's open files survive the rename via standard Linux
    semantics; it just sees a fresh empty cache afterwards). Missing
    cache dir is treated as zero — fresh installs that have never
    pulled an OCI image don't error.
  * dispatch: image.cache.prune RPC (paramHandler-wrapped, mirroring
    every other image RPC). Documented-methods test list updated.
  * cli: `banger image cache` group with a `prune` subcommand
    (--dry-run flag). Output is a single line: "freed 1.2 GiB
    across 47 blob(s) in /var/cache/banger/oci" or "would free …".
    formatBytes helper for the size pretty-print.

docs/oci-import.md: replaced the "Tech debt: cache eviction" bullet
with a "Cache lifecycle" section describing the new command and
the in-flight-pull caveat.

Tests: PruneOCICache covers the happy path (real prune empties the
cache, recreates an empty dir, doesn't leak the .pruning- aside),
the dry-run path (returns size, leaves blobs intact), and the
fresh-install path (cache dir absent → zero result, no error).
Smoke at JOBS=4 still green; live exercise against an empty cache
on a system install prints the expected zero summary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 16:32:57 -03:00
182bccf8af
roothelper: pin bridge name + IP + CIDR to a banger-managed shape
priv.ensure_bridge / priv.create_tap accepted the daemon's network
config triple (BridgeName, BridgeIP, CIDR) and forwarded it straight
to `ip link` / `ip addr` / `ip link set master`. Argv-style exec
ruled out shell injection, but the kernel happily honours those
commands against any iface a compromised owner-uid daemon names —
including eth0/docker0/lo. Concretely:

  * priv.ensure_bridge could `ip link set <iface> up` against any
    host interface and `ip addr add` arbitrary IP/CIDR to it.
  * priv.create_tap could `ip link set <new-tap> master <iface>`,
    bridging the per-VM tap into the host's primary LAN so the
    guest sees host-local broadcast traffic.
  * priv.sync_resolver_routing / priv.clear_resolver_routing only
    enforced "name shaped like a Linux iface" — no banger constraint.

New validators (single chokepoint via validateNetworkConfig):
  * validateBangerBridgeName: name must equal "br-fc" or start with
    "br-fc-". Stops a compromised daemon from naming any host iface
    in these RPCs. Users with a custom bridge keep the prefix.
  * validateCIDRPrefix: numeric in [8, 32]. Wider prefixes would
    silently widen the bridge subnet beyond what the daemon intends.
  * validateNetworkConfig bundles bridge-name + validateIPv4 +
    validateCIDRPrefix so every helper RPC that takes the triple
    stays in lockstep.

Wired into methodEnsureBridge, methodCreateTap, and the resolver-
routing pair (replacing the older validateLinuxIfaceName-only check
with the stricter banger-bridge check).

docs/privileges.md updated: the helper-RPC table rows now spell out
the banger-managed bridge constraint, and the trust list includes
the new validators.

Tests: TestValidateBangerBridgeName (default + suffixed accepted,
host ifaces / wrong prefix / oversized rejected), TestValidate
CIDRPrefix (boundary + non-numeric + IPv6-style 64 rejected),
TestValidateNetworkConfig (happy path + each-field-bad cases).
Smoke at JOBS=4 still green — banger's defaults sail through the
new gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 16:19:28 -03:00
45826f0db0
docs: add config.md reference for the daemon TOML schema
README previously punted on the config schema with a "full key list in
internal/config/config.go" pointer. New docs/config.md walks every
TOML key the daemon reads — top-level, [vm_defaults], [[file_sync]] —
with type, default, and a one-sentence description per row, plus a
copy-pasteable example at the bottom.

Sourced 1:1 from internal/config/config.go's fileConfig (and the
defaults in load() + internal/model/types.go), so it stays accurate
as long as those structs are the schema source of truth.

README's existing config section now points at docs/config.md, and
the "Further reading" list gets it as the first bullet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 15:11:18 -03:00
7d7c15a370
docs: fix config-file path in privileges.md
The filesystem-mutations table referred to `~/.config/banger/banger.toml`,
but the daemon reads `~/.config/banger/config.toml` (per
internal/config/config.go and README.md). Bring privileges.md in line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 15:11:06 -03:00
853249dec2
roothelper: tighten input validation across privileged RPCs
Defence-in-depth pass over every helper method that touches the host
as root. Each fix narrows what a compromised owner-uid daemon could
ask the helper to do; many close concrete file-ownership and DoS
primitives that the previous validators didn't reach.

Path / identifier validation:
  * priv.fsck_snapshot now requires /dev/mapper/fc-rootfs-* (was
    "is the string non-empty"). e2fsck -fy on /dev/sda1 was the
    motivating exploit.
  * priv.kill_process and priv.signal_process now read
    /proc/<pid>/cmdline and require a "firecracker" substring before
    sending the signal. Killing arbitrary host PIDs (sshd, init, …)
    is no longer a one-RPC primitive.
  * priv.read_ext4_file and priv.write_ext4_files now require the
    image path to live under StateDir or be /dev/mapper/fc-rootfs-*.
  * priv.cleanup_dm_snapshot validates every non-empty Handles field:
    DM name fc-rootfs-*, DM device /dev/mapper/fc-rootfs-*, loops
    /dev/loopN.
  * priv.remove_dm_snapshot accepts only fc-rootfs-* names or
    /dev/mapper/fc-rootfs-* paths.
  * priv.ensure_nat now requires a parsable IPv4 address and a
    banger-prefixed tap.
  * priv.sync_resolver_routing and priv.clear_resolver_routing now
    require a Linux iface-name-shaped bridge name (1–15 chars, no
    whitespace/'/'/':') and, for sync, a parsable resolver address.

Symlink defence:
  * priv.ensure_socket_access now validates the socket path is under
    RuntimeDir and not a symlink. The fcproc layer's chown/chmod
    moves to unix.Open(O_PATH|O_NOFOLLOW) + Fchownat(AT_EMPTY_PATH)
    + Fchmodat via /proc/self/fd, so even a swap of the leaf into a
    symlink between validation and the syscall is refused. The
    local-priv (non-root) fallback uses `chown -h`.
  * priv.cleanup_jailer_chroot rejects symlinks at both the leaf
    (os.Lstat) and intermediate path components (filepath.EvalSymlinks
    + clean-equality). The umount sweep was rewritten from shell
    `umount --recursive --lazy` to direct unix.Unmount(MNT_DETACH |
    UMOUNT_NOFOLLOW) per child mount, deepest-first; the findmnt
    guard remains as the rm-rf safety net. Local-priv mode falls
    back to `sudo umount --lazy`.

Binary validation:
  * validateRootExecutable now opens with O_PATH|O_NOFOLLOW and
    Fstats through the resulting fd. Rejects path-level symlinks and
    narrows the TOCTOU window between validation and the SDK's exec
    to fork+exec time on a healthy host.

Daemon socket:
  * The owner daemon now reads SO_PEERCRED on every accepted
    connection and refuses any UID that isn't 0 or the registered
    owner. Filesystem perms (0600 + ownerUID) already enforced this;
    the check is belt-and-braces in case the socket FD is ever
    leaked to a non-owner process.

Docs:
  * docs/privileges.md walked end-to-end. Each helper RPC's
    Validation gate row reflects what the code actually enforces.
    New section "Running outside the system install" calls out the
    looser dev-mode trust model (NOPASSWD sudoers, helper hardening
    bypassed) so users don't deploy that path on shared hosts.
    Trust list updated to include every new validator.

Tests added: validators (DM-loop, DM-remove-target, DM-handles,
ext4-image-path, iface-name, IPv4, resolver-addr, not-symlink,
firecracker-PID, root-executable variants), the daemon's authorize
path (non-unix conn rejection + unix conn happy path), the umount2
ordering contract (deepest-first + --lazy on the sudo branch), and
positive/negative cases for the chown-no-follow fallback.

Verified end-to-end via `make smoke JOBS=4` on a KVM host.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 14:39:41 -03:00
b0b1300314
docs: add the privilege model document
Explain what runs as the owner user vs root, every helper RPC method
and its validation gate, the on-disk paths banger writes, network
mutations, and how install/uninstall work end to end. The aim is to
give a reader enough information to grant or refuse the privileges
banger asks for during system install with their eyes open.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 12:55:18 -03:00
59e48e830b
daemon: split owner daemon from root helper
Move the supported systemd path to two services: an owner-user bangerd for
orchestration and a narrow root helper for bridge/tap, NAT/resolver, dm/loop,
and Firecracker ownership. This removes repeated sudo from daily vm and image
flows without leaving the general daemon running as root.

Add install metadata, system install/status/restart/uninstall commands, and a
system-owned runtime layout. Keep user SSH/config material in the owner home,
lock file_sync to the owner home, and move daemon known_hosts handling out of
the old root-owned control path.

Route privileged lifecycle steps through typed privilegedOps calls, harden the
two systemd units, and rewrite smoke plus docs around the supported service
model.

Verified with make build, make test, make lint, and make smoke on the
supported systemd host path.
2026-04-26 12:43:17 -03:00
d743a8ba4b
daemon: persist teardown fallbacks and reject unsafe import paths
Preserve cleanup after daemon restarts and harden OCI and tar imports
against filenames that debugfs cannot encode safely.

Mirror tap, loop, and dm teardown identity onto VM.Runtime, teach
cleanup and reconcile to fall back to those persisted fields when
handles.json is missing or corrupt, and clear the recovery state on
stop, error, and delete paths.

Reject debugfs-hostile entry names during flattening and in
ApplyOwnership itself, then add regression coverage for corrupt
handles.json recovery and unsafe import paths.

Verified with targeted go tests, make lint-go, make lint-shell, and
make build.
2026-04-23 16:21:59 -03:00
700a1e6e60
cleanup: drop pre-v0.1 migration scaffolding + legacy-behavior refs
banger hasn't shipped a public release — every "legacy", "pre-opt-in",
"previously", "migration note", "no longer" reference in the tree is
pinning against a state no real user's install has ever been in.
That scaffolding has weight: it's a coordinate system future readers
have to decode, and it keeps dead code alive.

Removed (code):
- internal/daemon/ssh_client_config.go
    - vmSSHConfigIncludeBegin / vmSSHConfigIncludeEnd constants and
      every `removeManagedBlock(existing, vm...)` call they enabled
      (legacy inline `Host *.vm` block scrub)
    - cleanupLegacySSHConfigDir (+ its caller in syncVMSSHClientConfig)
      — wiped a pre-opt-in sibling file under $ConfigDir/ssh
    - sameDirOrParent + resolvePathForComparison — only ever used
      by cleanupLegacySSHConfigDir
    - the "also check legacy marker" fallback in
      UserSSHIncludeInstalled / UninstallUserSSHInclude
- internal/store/migrations.go
    - migrateDropDeadImageColumns (migration 2) + its slice entry
    - dropColumnIfExists (orphaned after the above)
    - addColumnIfMissing + the whole "columns added across the pre-
      versioning lifetime" block at the end of migrateBaseline —
      subsumed into the baseline CREATE TABLE
    - `packages_path TEXT` column on the images table (the
      throwaway migration 2 dropped it, but there was never any
      reader)
- internal/daemon/vm.go
    - vmDNSRecordName local wrapper — was justified as "avoid
      pulling vmdns into every file"; three of four callers already
      imported vmdns directly, so inline the one stray call
- internal/cli/cli_test.go
    - TestLegacyRemovedCommandIsRejected (`tui` subcommand never
      shipped)

Removed / simplified (tests):
- ssh_client_config_test.go: dropped TestSameDirOrParentHandlesSymlinks,
  TestSyncVMSSHClientConfigPreservesUserKeyInLegacyDir,
  TestSyncVMSSHClientConfigNarrowsCleanupToLegacyFile,
  TestSyncVMSSHClientConfigLeavesUnexpectedLegacyContents,
  TestInstallUserSSHIncludeMigratesLegacyInlineBlock, plus the
  "legacy posture" regression strings in the remaining happy-path
  test; TestUninstallUserSSHIncludeRemovesBothMarkerBlocks collapsed
  to a single-block test
- migrations_test.go: dropped TestMigrateDropDeadImageColumns_AcrossInstallPaths,
  TestDropColumnIfExistsIsIdempotent; TestOpenReadOnlyDoesNotRunMigrations
  simplified to test against the baseline marker

Removed (docs):
- README.md "**Migration note.**" blockquote about the SSH-key path move
- docs/advanced.md parenthetical "(the old behaviour)"

Reworded (comments):
- Dropped "Previously this file also contained LogLevel DEBUG3..."
  history from vm_disk.go's sshdGuestConfig doc
- Dropped "Call sites that previously read vm.Runtime.{PID,...}"
  from vm_handles.go; now documents the current contract
- Dropped "Pre-v0.1 the defaults are" scaffolding in doctor_test.go
- Dropped "no longer does its own git inspection" phrasing in vm_run.go
- Dropped the "(also cleans up legacy inline block from pre-opt-in
  builds)" aside on the `ssh-config` CLI docstring
- Renamed test var `legacyKey` → `existingKey` in vm_test.go; its
  purpose was "pre-existing authorized_keys line," not banger-legacy

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 13:56:32 -03:00
80ae4d6667
docs: resync package docs, AGENTS, and kernel-catalog with current code
Four drift fixes from a doc sweep.

internal/daemon/doc.go
  Replace the capability-hook description that still said "Hook
  methods take *Daemon; VMService reaches them through a
  capabilityHooks seam." Current reality: every capability is a
  plain struct carrying its own service pointers
  (workDiskCapability{vm,ws,store}, dnsCapability{net},
  natCapability{vm,net,logger}); wireServices builds the default
  list; no hook reaches *Daemon.

internal/daemon/ARCHITECTURE.md
  The VMService field list still claimed guestWaitForSSH and
  guestDial were "per-instance fields." Those were deleted as
  refactor residue. Update the note to say the seams live on
  *Daemon (reached by WorkspaceService via closures wired at
  construction) and document the vsockHostDevice field that
  replaced the old package-global vsockHostDevicePath.

AGENTS.md
  Drop the "experimental web UI" mention (removed) and the
  `session` subpackage (removed). Mention banger-vsock-agent as
  the third cmd/ binary while we're here — AGENTS hadn't listed
  it.

docs/kernel-catalog.md
  The trust-model section still read as if upstream kernel sources
  were fetched by HTTPS alone. Add a paragraph covering the PGP
  verification make-generic-kernel.sh now does against the
  detached .tar.sign and the three kernel.org release signing keys.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 13:01:11 -03:00
2a7f55f028
vm run: ship tracked files only by default; add --include-untracked + --dry-run
Workspace-mode vm run and vm workspace prepare used to copy both
tracked AND untracked non-ignored files into the guest. That silently
catches local .env files, scratch notes, credentials, and any other
working-tree state a developer hasn't explicitly gitignored — a real
data-exposure footgun given the golden image ships Docker and the
usual dev tooling.

Flip the default to tracked-only. Users who actually want the fuller
set opt in with --include-untracked (documented in both commands'
help). Gitignored files are still always excluded regardless of the
flag.

Add --dry-run to both vm run and vm workspace prepare. Dry-run
inspects the repo CLI-side (no VM created, no daemon RPC needed since
the daemon is always local and the inspection is a pure git read),
prints the exact file list + mode, and exits. A byte-level preview of
what would land in the guest.

When running real (non-dry) and untracked files exist in the repo but
are being skipped under the new default, print a one-line notice
pointing to --include-untracked so users aren't surprised when the
guest is missing something they expected.

Signature changes:
- ListOverlayPaths takes an includeUntracked bool (tracked always;
  untracked gated by flag).
- InspectRepo takes the same flag and passes it through.
- VMWorkspacePrepareParams gains IncludeUntracked.
- WorkspaceService.workspaceInspectRepo seam signature widened to
  match (4 callers in tests updated).

New workspace package tests cover both modes and verify that
gitignored files never leak regardless of the flag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 19:53:17 -03:00
2b6437d1b4
remove vm session feature
Cuts the daemon-managed guest-session machinery (start/list/show/
logs/stop/kill/attach/send). The feature shipped aimed at agent-
orchestration workflows (programmatic stdin piping into a long-lived
guest process) that aren't driving any concrete user today, and the
~2.3K LOC of daemon surface area — attach bridge, FIFO keepalive,
controller registry, sessionstream framing, SQLite persistence — was
locking in an API we'd have to keep through v0.1.0.

Anything session-flavoured that people actually need today can be
done with `vm ssh + tmux` or `vm run -- cmd`.

Deleted:
- internal/cli/commands_vm_session.go
- internal/daemon/{guest_sessions,session_lifecycle,session_attach,session_stream,session_controller}.go
- internal/daemon/session/ (guest-session helpers package)
- internal/sessionstream/ (framing package)
- internal/daemon/guest_sessions_test.go
- internal/store/guest_session_test.go
- GuestSession* types from internal/{api,model}
- Store UpsertGuestSession/GetGuestSession/ListGuestSessionsByVM/DeleteGuestSession + scanner helpers
- guest.session.* RPC dispatch entries
- 5 CLI session tests, 2 completion tests, 2 printer tests

Extracted:
- ShellQuote + FormatStepError lifted to internal/daemon/workspace/util.go
  (only non-session consumer); workspace package now self-contained
- internal/daemon/guest_ssh.go keeps guestSSHClient + dialGuest +
  waitForGuestSSH — still used by workspace prepare/export
- internal/daemon/fake_firecracker_test.go preserves the test helper
  that used to live in guest_sessions_test.go

Store schema: CREATE TABLE guest_sessions and its column migrations
removed. Existing dev DBs keep an orphan table (harmless, pre-v0.1.0).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 12:47:58 -03:00
78ff482bfa
release prep: opt-in web UI, make uninstall, fix stale kernel-catalog docs
- WebListenAddr default is now "" (empty). The experimental web UI was
  running on 127.0.0.1:7777 by default, which surprises users who never
  opted in. Users who want it set `web_listen_addr = "127.0.0.1:7777"`
  in config.toml.
- `make uninstall` stops the daemon (if any) and removes the installed
  binaries. Preserves user data on disk but prints the paths so `rm -rf`
  can follow for a full purge. Documented in README next to install.
- docs/kernel-catalog.md: replace the `void-6.12` and `alpine-3.23`
  examples (never published) with `generic-6.12` (the only cataloged
  kernel today). Updates the versioning-convention example too.
2026-04-19 12:43:58 -03:00
221fb03d68
cli QoL: vm prune, list→ls aliases, delete→rm aliases
- `banger vm prune` sweeps every non-running VM (stopped, created,
  error) with an interactive confirmation; -f/--force skips the prompt.
  Partial failures report which VM failed and exit non-zero.
- list commands gain `ls` alias: vm list already had it; added to image
  list, kernel list, and vm session list.
- delete commands gain `rm` alias: vm delete and image delete. kernel
  rm already aliased delete/remove.

Uses new test seams (vmListFunc) plus the existing vmDeleteFunc so
prune unit-tests without touching the daemon socket.
2026-04-19 12:17:46 -03:00
88425fb857
docs: DNS routing guide; README aimed at common users
Adds docs/dns-routing.md covering how `<vm>.vm` resolution works:
auto-configuration on systemd-resolved hosts (what the daemon
already does), and per-resolver recipes for dnsmasq /
NetworkManager+dnsmasq / /etc/resolv.conf / macOS `/etc/resolver/`
/ WSL. Plus verification via `dig @127.0.0.1 -p 42069` and
troubleshooting for the common failure modes.

README reshape: lead with the three things a common user needs —
quick start, what `vm run` does, where to put hostnames + image +
config — and push the rest to docs. `vm create` / OCI `image pull`
/ `image register` / workspace-and-session primitives are all still
documented, just under docs/advanced.md where they're not in the
first-time reader's way. Web UI and unnecessary implementation
notes dropped; the "further reading" section at the bottom
enumerates the five docs pages so nothing becomes hard to find.

README shrinks from 208 → 158 lines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 17:24:50 -03:00
ac7974f5b9
Remove image build --from-image; doctor treats catalog images as OK
The `image build` flow spun up a transient Firecracker VM, SSHed in,
and ran a large bash provisioning script to derive a new managed
image from an existing one. It overlapped heavily with the golden-
image Dockerfile flow (same mise/docker/tmux/opencode install logic
duplicated in Go as `imagemgr.BuildProvisionScript`) and had far more
machinery: async op state, RPC begin/status/cancel, webui form +
operation page, preflight checks, API types, tests. For custom
images, writing a Dockerfile is simpler and more reproducible.

Removed end-to-end:
- CLI `image build` subcommand + `absolutizeImageBuildPaths`.
- Daemon: BuildImage method, imagebuild.go (transient-VM orchestration),
  image_build_ops.go (async begin/status/cancel), imagemgr/build.go
  (the 247-line provisioning script generator and all its append*
  helpers), validateImageBuildPrereqs + addImageBuildPrereqs.
- RPC dispatches for image.build / .begin / .status / .cancel.
- opstate registry `imageBuildOps`, daemon seam `imageBuild`,
  background pruner call.
- API types: ImageBuildParams, ImageBuildOperation, ImageBuildBeginResult,
  ImageBuildStatusParams, ImageBuildStatusResult; model type
  ImageBuildRequest.
- Web UI: Backend interface methods, handlers, form, routes, template
  branches (images.html build form, operation.html build branch,
  dashboard.html Build button).
- Tests that directly exercised BuildImage.

Doctor polish (task C):
- Drop the "image build" preflight section entirely (its raison d'être
  is gone).
- Default-image check now accepts "not local but in imagecat" as OK:
  vm create auto-pulls on first use. Only flag when the image is
  neither locally registered nor in the catalog.

Net: 24 files touched, 1,373 lines deleted, 25 added.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 15:54:29 -03:00
6083e2dde5
Prune legacy void/alpine + customize.sh flows
The golden-image Dockerfile + catalog pipeline replaces the entire
manual rootfs-build stack. With that shipped, the per-distro shell
flows are dead code.

Removed:
- scripts/customize.sh, scripts/interactive.sh, scripts/verify.sh
- scripts/make-rootfs{,-void,-alpine}.sh
- scripts/register-{void,alpine}-image.sh
- scripts/make-{void,alpine}-kernel.sh
- internal/imagepreset/ (only consumer was `banger internal packages`,
  which fed customize.sh)
- examples/{void,alpine}.config.toml
- Makefile targets: rootfs, rootfs-void, rootfs-alpine, void-kernel,
  alpine-kernel, void-register, alpine-register, void-vm, alpine-vm,
  verify-void, verify-alpine, plus the ALPINE_RELEASE / *_IMAGE_NAME
  / *_VM_NAME variables

The void-6.12 kernel catalog entry is also gone — golden image pairs
with generic-6.12 and nothing else in the catalog depended on it.

Consolidated: imagemgr now holds the small DebianBasePackages list +
package-hash helper inline, so the `image build --from-image` flow
(still supported) no longer pulls from a separate imagepreset package.

Net: 3,815 lines deleted, 59 added. No runtime functionality removed
beyond the `banger internal packages` subcommand (hidden, used only
by the deleted customize.sh).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 15:39:53 -03:00
8029b2e1bc
docs: promote vm run + image catalog as the happy path
Lead the README with `banger vm run` (one command, auto-pull default
image + kernel from the catalogs), move `image register` / `image
build` / OCI-pull to a "power-user flows" section. Golden-image
content from customize.sh moves to the golden-image Dockerfile story.

New `docs/image-catalog.md` mirrors `docs/kernel-catalog.md` — the
bundle format, content-addressed filenames, publish flow, trust
model, R2 hosting. Cross-links with oci-import.md.

`docs/oci-import.md` refactored to document the OCI-pull path as the
fallthrough for arbitrary registry refs (it's the secondary path now
that the catalog covers the headline debian-bookworm case). Phase A
caveats removed — ownership fixup, agent injection, and first-boot
sshd install all landed.

AGENTS.md: promotes `vm run` as the smoke-test primitive, notes the
default-image auto-pull behaviour, and points at both catalog docs.

README shrinks 330 → 198 lines, mostly by removing the experimental
void/alpine sections (those flows still work as advanced scripts but
the README no longer advertises them).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 15:33:30 -03:00
8f4be112c2
Generic kernel + init= boot path for OCI-pulled images
Closes the full arc: banger kernel pull + image pull + vm create + vm ssh
now works end-to-end against docker.io/library/debian:bookworm with zero
manual image building.

Generic kernel:
 - New scripts/make-generic-kernel.sh builds vmlinux from upstream
   kernel.org sources using Firecracker's official minimal config
   (configs/firecracker-x86_64-6.1.config). All critical drivers
   (virtio_blk, virtio_net, ext4, vsock) compiled in — no modules,
   no initramfs needed.
 - Published as generic-6.12 in the catalog (kernels.thaloco.com).
 - catalog.json updated with the new entry.

Direct-boot init= override (vm_lifecycle.go):
 - For images without an initrd (direct-boot / OCI-pulled), banger now
   passes init=/usr/local/libexec/banger-first-boot on the kernel
   cmdline. The script runs as PID 1, mounts /proc /sys /dev /run,
   checks for systemd — if present execs it immediately; if not
   (container images), installs systemd-sysv + openssh-server via the
   guest's package manager, then execs systemd.
 - Also passes kernel-level ip= parameter via BuildBootArgsWithKernelIP
   so the kernel configures the network interface before init runs
   (container images don't ship iproute2, so the userspace bootstrap
   script can't call ip(8)).
 - Masks dev-ttyS0.device and dev-vdb.device systemd units that
   otherwise wait 90s for udev events that never fire in Firecracker
   guests started from container rootfses.

first-boot.sh rewritten as universal init wrapper:
 - Works as PID 1 (mounts essential filesystems) OR as a systemd
   oneshot (existing behavior).
 - Installs both systemd-sysv AND openssh-server (container images
   have neither).
 - Dispatch updated: debian, alpine, fedora, arch, opensuse families
   + ID_LIKE fallback. All tests updated.

Opencode capability skip for direct-boot images:
 - The opencode readiness check (WaitReady on vsock port 4096) now
   returns nil for images without an initrd, since pulled container
   images don't ship the opencode service. Without this, the VM
   would be marked as error for lacking an opinionated add-on.

Docs: README and kernel-catalog.md updated to recommend generic-6.12
as the default kernel for OCI-pulled images. AGENTS.md notes the new
build script.

Verified live:
 - banger kernel pull generic-6.12
 - banger image pull docker.io/library/debian:bookworm --kernel-ref generic-6.12
 - banger vm create --image debian-bookworm --name testbox --nat
 - banger vm ssh testbox -- "id; uname -r; systemctl is-active banger-vsock-agent"
 → uid=0(root), kernel 6.12.8, Debian bookworm, vsock-agent active,
   sshd running, SSH working.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 20:12:56 -03:00
2478fe3cc3
Phase B-4: docs for Phase B completion
docs/oci-import.md: removed the "Phase A acquisition-only" framing
and the bootability-gap warnings. Expanded architecture section
with ApplyOwnership + InjectGuestAgents. Added a "guest-side boot
sequence" diagram-in-prose showing network → first-boot → vsock-
agent unit ordering. Added a "how to add distro support" section
pointing at the ID-case dispatch in first-boot.sh.

README.md: replaced the experimental-caveat block with an honest
"boots as a banger VM directly, no image build step required"
description. Pointer to the docs for distro support details.

Tech-debt list trimmed — ownership fixup and first-boot install
are no longer planned work, they shipped. What remains: private-
registry auth (authn.DefaultKeychain), cache eviction, first-boot
timeout UX (retry still works but could be smoother with a
FirstBootPending flag), non-systemd distros.

All 20 packages green. make lint clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 19:06:37 -03:00
2e4d4b14da
Phase 4: OCI import docs
New docs/oci-import.md covers the full Phase A story:
 - end-user flow (kernel pull + image pull + image list)
 - what works now (layer replay + whiteouts, path-traversal
   hardening, content-aware sizing, layer caching, composition
   with image build)
 - what does not work yet (direct boot due to ownership
   caveat, private registries, non-amd64 platforms)
 - architecture of internal/imagepull + the daemon orchestrator
 - path layout (OCI cache, staging, published)
 - tech debt: the three plausible ownership-fixup approaches
   (debugfs, hcsshim/tar2ext4, user namespaces) with honest
   trade-offs for Phase B to choose from later
 - trust model (digest chain covers transport; signature
   verification out of scope)

README.md gains an image pull example alongside image register
+ --kernel-ref, with a pointer to the docs and an honest "pulled
images are a base for image build, not yet directly bootable"
warning.

AGENTS.md gets the one-line note pointing at the new doc.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 17:37:07 -03:00
da4a6bf45b
Add lint targets, fix gofmt drift, broaden Makefile build inputs
Three small operational improvements.

1. Makefile build dependencies now cover everything under cmd/ and
   internal/, not just *.go. The previous GO_SOURCES find pattern
   missed embedded assets (catalog.json today, anything else added
   later), so editing a JSON manifest didn't trigger a rebuild and
   left the binary stale. New BUILD_INPUTS covers all files; go's own
   build cache absorbs any redundant invocations. GO_SOURCES is kept
   for fmt/lint targets which still want only Go files.

2. New `make lint` (default + lint-go + lint-shell):
   - lint-go: gofmt -l (fail if any output) and go vet ./...
   - lint-shell: shellcheck --severity=error on scripts/*.sh
   The shell floor is set at error-level for now; the legacy
   make-rootfs-*.sh / make-*-kernel.sh / customize.sh scripts have
   warning-level findings (sudo-cat redirects, heredoc quoting) that
   would block landing this if we tightened immediately. Documented
   as tech debt in docs/kernel-catalog.md alongside a note about
   eventually replacing the per-distro bash with a uniform Go tool.

3. gofmt drift fixed in internal/daemon/imagemgr/build.go,
   session/session.go, and vm_create_ops.go (trailing newline +
   gofmt's preferred function-definition wrapping). Now
   `make lint` passes cleanly; future drift will fail CI/local lint
   instead of accumulating.

AGENTS.md gains a one-line note on make lint.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 16:49:17 -03:00
fa95849f5a
Phase 5: kernel catalog publish flow + docs
Manual publish flow for the kernel catalog, designed for the current
no-CI, private-repo state of banger.

scripts/publish-kernel.sh <name>:
 - Reads $BANGER_KERNELS_DIR/<name>/ (the canonical layout produced by
   `banger kernel import`).
 - Pulls distro / arch / kernel_version from the local manifest.
 - Packages vmlinux + optional initrd.img + optional modules/ as
   <name>-<arch>.tar.zst with zstd -19.
 - Computes sha256 + size.
 - rclone copyto -> r2:banger-kernels/<file>.
 - HEAD-checks https://kernels.thaloco.com/<file> to catch
   public-access misconfig before declaring success.
 - jq-patches internal/kernelcat/catalog.json: replaces any prior
   entry with the same name, then sorts entries by name.
 - Prints next-step git+make commands; does not commit or rebuild
   automatically.

Environment overrides RCLONE_REMOTE / RCLONE_BUCKET / BASE_URL /
BANGER_KERNELS_DIR for non-default setups.

docs/kernel-catalog.md covers the architecture (embedded JSON +
external tarballs), end-user flow, the add/update/remove playbook,
naming and tarball-layout conventions, the trust model (sha256 in
embedded catalog catches transport/swap; no signing yet), and where
the bucket lives.

README.md gains a kernel-catalog example next to the existing image
register example. AGENTS.md points at publish-kernel.sh and the docs.

.gitignore now excludes .env so accidental drops of R2 credentials
don't follow into commits.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 15:56:56 -03:00
01c7cb5e65
Reorganize the source checkout layout
Separate tracked source from generated artifacts so the repo root stops accumulating helper scripts, manifests, and local runtime outputs.

Move manual shell entrypoints under scripts/, manifests under config/, and the Firecracker API reference under docs/reference/. Make build and runtimebundle now target build/bin, build/runtime, and build/dist as the canonical source-checkout paths.

Update runtime discovery, helper scripts, tests, and docs to follow the new layout while keeping legacy source-checkout runtime fallbacks for existing local bundles during migration.

Validated with bash -n on the moved scripts, make build, and GOCACHE=/tmp/banger-gocache go test ./....
2026-03-21 17:22:57 -03:00