banger/docs/image-catalog.md
Thales Maciel 8029b2e1bc
docs: promote vm run + image catalog as the happy path
Lead the README with `banger vm run` (one command, auto-pull default
image + kernel from the catalogs), move `image register` / `image
build` / OCI-pull to a "power-user flows" section. Golden-image
content from customize.sh moves to the golden-image Dockerfile story.

New `docs/image-catalog.md` mirrors `docs/kernel-catalog.md` — the
bundle format, content-addressed filenames, publish flow, trust
model, R2 hosting. Cross-links with oci-import.md.

`docs/oci-import.md` refactored to document the OCI-pull path as the
fallthrough for arbitrary registry refs (it's the secondary path now
that the catalog covers the headline debian-bookworm case). Phase A
caveats removed — ownership fixup, agent injection, and first-boot
sshd install all landed.

AGENTS.md: promotes `vm run` as the smoke-test primitive, notes the
default-image auto-pull behaviour, and points at both catalog docs.

README shrinks 330 → 198 lines, mostly by removing the experimental
void/alpine sections (those flows still work as advanced scripts but
the README no longer advertises them).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 15:33:30 -03:00

# Image catalog
The image catalog ships pre-built banger rootfs bundles so users don't
have to register or build anything. It's the fast path behind
`banger vm run` (auto-pull) and `banger image pull <name>`. The
catalog is embedded into the banger binary and updated with each release.

End-user flow:
```bash
banger image pull debian-bookworm # explicit
banger vm run --name sandbox # implicit (auto-pulls)
```
## Architecture
Two parts, the same shape as the kernel catalog:

1. **`internal/imagecat/catalog.json`**: JSON manifest embedded into
   the banger binary via `go:embed`. Each entry records name, distro,
   arch, kernel_ref (a `kernelcat` entry name), tarball URL, tarball
   sha256, and size.
2. **Tarballs at `https://images.thaloco.com/`**: Cloudflare R2
   bucket `banger-images`, fronted by a public custom domain. Each
   tarball is named `<name>-<arch>-<sha256-prefix>.tar.zst`. The
   content-addressed filename means a CDN edge cache can never serve
   stale bytes for the URL the catalog points at. Contents at the
   archive root: `rootfs.ext4` (finalized: flattened, ownership-fixed,
   and agent-injected at build time) and `manifest.json`.

The `banger image pull` bundle path streams the tarball, verifies its
sha256 against the catalog entry, extracts both files into a staging
dir, resolves the kernel via `kernel_ref` (auto-pulling from
`kernelcat` if the user hasn't pulled it yet), stages boot artifacts
alongside, and registers the result as a managed image.

The same `image pull` command transparently falls through to the
existing OCI-pull path when `<name>` doesn't match a catalog entry.
See [`docs/oci-import.md`](oci-import.md).
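
The routing rule can be pictured with a small shell sketch. It is only an illustration of "catalog match first, OCI fallthrough otherwise". The helper name `pull_path` and the example registry ref are hypothetical, and banger's real lookup lives in Go, not in a script like this.

```bash
# Hypothetical helper: report which pull path a name would take.
# Arguments: the requested name, followed by the catalog entry names.
pull_path() {
  local want=$1
  shift
  local entry
  for entry in "$@"; do
    if [ "$entry" = "$want" ]; then
      echo bundle   # name matches a catalog entry
      return 0
    fi
  done
  echo oci          # no match: treat the name as a registry ref
}

pull_path debian-bookworm debian-bookworm
pull_path docker.io/library/alpine:3.20 debian-bookworm
```

The first call prints `bundle` and the second prints `oci`, because the registry ref matches no catalog entry name.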
## Adding or updating an entry
The repo has no CI for bundle publishing yet. Catalog updates are
manual.
```bash
# 1. Build the bundle + upload + patch catalog.json in one shot.
scripts/publish-golden-image.sh
# 2. Review and commit the catalog change.
git diff -- internal/imagecat/catalog.json
git add internal/imagecat/catalog.json
git commit -m 'imagecat: publish debian-bookworm'
# 3. Rebuild so the new catalog is embedded.
make build
```
`scripts/publish-golden-image.sh` wraps `scripts/make-golden-bundle.sh`
(which runs `docker build` on `images/golden/Dockerfile`, then pipes
`docker export` into `banger internal make-bundle`), computes the
bundle's sha256, uses the first 12 hex chars as a cache-busting
filename suffix, uploads via `rclone` to R2, HEAD-checks the public
URL, and patches `internal/imagecat/catalog.json`.

Environment overrides, if the defaults need to change:
`RCLONE_REMOTE`, `RCLONE_BUCKET`, `BASE_URL`.

`--skip-upload` builds the bundle into `dist/` and stops, which is
useful for local testing without touching R2 or the catalog.
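
To make the naming rule concrete, here is a minimal sketch of how a content-addressed filename could be derived from a bundle's bytes. The helper `bundle_filename` is hypothetical and is not taken from `publish-golden-image.sh`. It assumes GNU `sha256sum` is available and uses a throwaway file in place of a real bundle.

```bash
# Hypothetical helper: build <name>-<arch>-<sha256-prefix>.tar.zst
# from a name, an arch, and the path to the bundle file.
bundle_filename() {
  local name=$1 arch=$2 file=$3
  local digest
  digest=$(sha256sum "$file" | cut -d' ' -f1)
  # An empty digest means sha256sum could not read the file.
  [ -n "$digest" ] || return 1
  # The first 12 hex chars of the digest become the filename suffix.
  printf '%s-%s-%s.tar.zst\n' "$name" "$arch" "${digest:0:12}"
}

# Demo with a stand-in file whose content is the string "hello".
tmp=$(mktemp)
printf 'hello' > "$tmp"
bundle_filename debian-bookworm x86_64 "$tmp"
# prints: debian-bookworm-x86_64-2cf24dba5fb0.tar.zst
rm -f "$tmp"
```

Because the suffix is derived from the bytes, any change to the bundle yields a new filename and therefore a new URL.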
## Bundle format
A bundle is a tar+zstd archive with exactly two entries at the root:
```
rootfs.ext4 # finalized banger rootfs
manifest.json # {name, distro, arch, kernel_ref, description}
```
`rootfs.ext4` is fully prepared at build time: ownership fixed via
`debugfs sif`, banger guest agents (vsock agent, network bootstrap,
first-boot unit) already injected and enabled in
`multi-user.target.wants`. The pull path only has to place the file
and register the image — no mkfs, no ownership pass, no injection on
the daemon host.
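
A quick way to sanity-check a locally built bundle against this layout is to compare its tar listing with the two expected names. The sketch below assumes GNU tar with zstd support. `check_bundle_listing` is a hypothetical helper, not part of banger.

```bash
# Hypothetical helper: read a tar listing on stdin and succeed only if
# it names exactly rootfs.ext4 and manifest.json at the archive root.
check_bundle_listing() {
  local listing
  # Drop a leading "./" (some tar invocations emit it), then sort so
  # the comparison does not depend on entry order.
  listing=$(sed 's|^\./||' | LC_ALL=C sort)
  [ "$listing" = "$(printf 'manifest.json\nrootfs.ext4')" ]
}

# Demo with a canned listing. For a real bundle the listing would come
# from: tar --zstd -tf <bundle>.tar.zst
printf 'rootfs.ext4\nmanifest.json\n' | check_bundle_listing && echo "bundle layout ok"
```

A listing with extra entries, missing entries, or nested paths makes the helper return non-zero.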
## Removing an entry
1. Remove the entry from `internal/imagecat/catalog.json` and commit.
2. Delete the tarball from R2:
`rclone delete banger-images:banger-images/<name>-<arch>-<hash>.tar.zst`.
3. Rebuild banger.

Already-pulled local images are not invalidated. Users keep using
them until they run `banger image delete <name>`.
## Versioning conventions
- **Entry names**: `<distro>-<release>` (e.g. `debian-bookworm`).
Per-release names make it trivial to publish `debian-trixie`
alongside without collisions.
- **Content-addressed filenames**: the `-<sha256-prefix>` suffix is
mandatory (set by `publish-golden-image.sh`). Never reuse a URL for
different bytes.
- **Architecture**: `x86_64` only today. The `arch` field is additive
— adding `arm64` is a config change, not a schema change.
## Trust model
Same as the kernel catalog: the embedded `catalog.json` carries each
bundle's sha256, and `imagecat.Fetch` rejects any download whose hash
doesn't match. This protects against transport corruption and against
an attacker swapping an R2 object without landing a commit in the
banger repo. GPG/sigstore signing is deferred until banger is public
and the threat model justifies the operational overhead.
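
The same check can be reproduced by hand on a downloaded tarball. This is a shell sketch of the idea, not the code in `imagecat.Fetch`. `verify_sha256` is a hypothetical helper, and the expected digest would normally be copied from the catalog entry.

```bash
# Hypothetical helper: succeed only if the file's sha256 equals the
# expected lowercase hex digest.
verify_sha256() {
  local file=$1 expected=$2
  local actual
  actual=$(sha256sum "$file" | cut -d' ' -f1)
  if [ "$actual" != "$expected" ]; then
    echo "sha256 mismatch: want $expected, got $actual" >&2
    return 1
  fi
}

# Demo with a stand-in file whose content is the string "hello".
tmp=$(mktemp)
printf 'hello' > "$tmp"
verify_sha256 "$tmp" 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 && echo "hash ok"
rm -f "$tmp"
```

A mismatch returns non-zero before anything is extracted or registered, which is what makes a swapped R2 object detectable.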
## Hosting
Tarballs live in Cloudflare R2 (bucket `banger-images`), served at
`images.thaloco.com`. The bucket is publicly readable; writes require
the R2 API token configured on the `banger-images` rclone remote.