banger/docs/oci-import.md
Thales Maciel 8029b2e1bc
2026-04-18 15:33:30 -03:00


OCI import (banger image pull)

banger image pull has two paths. The primary one — catalog bundle — is documented in docs/image-catalog.md. This doc covers the fallthrough: OCI-registry pull for arbitrary container images.

When to use it

Use the OCI path when you need a distro or image that isn't in the catalog. The catalog covers the common happy path (debian-bookworm); anything else (alpine, fedora, ubuntu, custom corporate images) goes through OCI pull.

banger image pull docker.io/library/alpine:3.20 --kernel-ref generic-6.12
banger image pull ghcr.io/myorg/devimg:v2        --kernel-ref generic-6.12

banger image pull dispatches based on the reference:

  • banger image pull debian-bookworm → catalog (fast path).
  • banger image pull docker.io/library/foo:bar → OCI (anything not in the catalog).
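The dispatch rule above can be sketched in a few lines of Go. This is an illustration only: `choosePath`, `pullPath`, and the in-memory catalog set are hypothetical names, and the sketch assumes an exact name match against the catalog, which this document does not confirm.

```go
package main

import "fmt"

// pullPath names the two routes described above (hypothetical type).
type pullPath string

const (
	pathCatalog pullPath = "catalog"
	pathOCI     pullPath = "oci"
)

// choosePath returns pathCatalog when ref exactly matches a catalog entry
// and pathOCI for everything else. How banger really loads and matches its
// catalog is not shown in this document; the map is a stand-in.
func choosePath(ref string, catalog map[string]bool) pullPath {
	if catalog[ref] {
		return pathCatalog
	}
	return pathOCI
}

func main() {
	catalog := map[string]bool{"debian-bookworm": true}
	for _, ref := range []string{"debian-bookworm", "docker.io/library/alpine:3.20"} {
		fmt.Printf("%s -> %s\n", ref, choosePath(ref, catalog))
	}
}
```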

What works

  • Any public OCI image that exposes a linux/amd64 manifest.
  • Correct layer replay with whiteout semantics (.wh.* deletes, .wh..wh..opq opaque-dir markers).
  • Path-traversal and relative-symlink-escape protection.
  • Content-aware default sizing (content × 1.5, floor 1 GiB).
  • Layer caching on disk, keyed by blob sha256.
  • Ownership preservation — tar-header uid/gid/mode captured during flatten, applied to the ext4 via a debugfs pass, so setuid binaries (sudo, passwd) and root-owned config (/etc/shadow, /etc/sudoers) end up correctly owned.
  • Pre-injected banger agents — the pulled ext4 ships with banger-vsock-agent, banger-network.service, and the banger-first-boot unit already enabled.
  • First-boot sshd install: a one-shot systemd service installs openssh-server via the guest's package manager on first boot. It dispatches on /etc/os-release → apt-get / apk / dnf / pacman / zypper. Subsequent boots skip the install.
  • Composition with image build --from-image.
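The default sizing rule in the list above (content × 1.5, floor 1 GiB) works out as follows. This is a minimal sketch: `defaultImageSize` is a hypothetical name, and any rounding to filesystem block boundaries that the real code may perform is left out.

```go
package main

import "fmt"

const gib = int64(1) << 30

// defaultImageSize returns contentBytes × 1.5, but never less than 1 GiB.
// Integer arithmetic is used, so odd inputs round down by at most one byte.
func defaultImageSize(contentBytes int64) int64 {
	size := contentBytes + contentBytes/2
	if size < gib {
		return gib
	}
	return size
}

func main() {
	for _, content := range []int64{100 << 20, 2 * gib} {
		fmt.Printf("content=%d bytes -> image=%d bytes\n", content, defaultImageSize(content))
	}
}
```

A 100 MiB rootfs therefore gets the 1 GiB floor, while 2 GiB of content yields a 3 GiB image.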

What doesn't yet work

  • Private registries. Anonymous pulls only. Docker Hub, GHCR (public), quay.io (public) all work. Adding auth via authn.DefaultKeychain (from go-containerregistry) is a cheap follow-up when someone needs it.
  • Non-linux/amd64. The kernel catalog is x86_64-only, so pulled rootfses match. arm64 is additive in the schema.
  • Non-systemd rootfses. The injected units assume systemd as PID 1. Alpine ≥3.20 ships systemd; older Alpine, Void, and busybox-init images won't honour the banger-* units.
  • First boot needs network access. The first-boot sshd install reaches out to the distro's package repo. VMs without NAT, or whose bridge cannot reach the internet, time out during the install. The marker file stays in place so a later restart retries.

Architecture

internal/imagepull/ owns the mechanics:

  • Pull wraps go-containerregistry's remote.Image with the linux/amd64 platform pinned. Layer blobs cache under ~/.cache/banger/oci/blobs/ and populate lazily during flatten.
  • Flatten replays layers oldest-first into a staging directory, applies whiteouts, rejects unsafe paths. Returns a Metadata map of per-file uid/gid/mode from tar headers.
  • BuildExt4 runs mkfs.ext4 -F -d <staging> -E root_owner=0:0 at the size of the pre-truncated file — no mount, no sudo, no loopback. Requires e2fsprogs ≥ 1.43.
  • ApplyOwnership streams a batched set_inode_field script to debugfs -w to rewrite per-file uid/gid/mode to the captured tar-header values.
  • InjectGuestAgents uses the same debugfs scripting to drop banger's guest assets into the ext4 with root ownership: vsock agent binary, network bootstrap + unit, first-boot script + unit, multi-user.target.wants symlinks, vsock modules-load config, /var/lib/banger/first-boot-pending marker.
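To make the Flatten step more concrete, the sketch below classifies tar entry names the way OCI layer whiteouts are defined (`.wh.<name>` deletes a sibling, `.wh..wh..opq` hides a directory's earlier contents) and rejects names that would escape the staging directory. The function names are hypothetical, rejecting absolute paths is an assumption, and symlink-escape checks are not covered here.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

const (
	whiteoutPrefix = ".wh."
	opaqueMarker   = ".wh..wh..opq"
)

type entryKind string

const (
	kindRegular  entryKind = "regular"
	kindWhiteout entryKind = "whiteout"
	kindOpaque   entryKind = "opaque"
)

// safeRel cleans a tar entry name and rejects anything that is absolute or
// climbs above the staging root via "..".
func safeRel(name string) (string, error) {
	clean := path.Clean(name)
	if path.IsAbs(clean) || clean == ".." || strings.HasPrefix(clean, "../") {
		return "", fmt.Errorf("unsafe path %q", name)
	}
	return clean, nil
}

// classify reports what an already-cleaned entry name means and which path
// inside the staging tree it affects.
func classify(rel string) (entryKind, string) {
	dir, base := path.Split(rel)
	switch {
	case base == opaqueMarker: // must be checked before the generic prefix
		return kindOpaque, path.Clean(dir)
	case strings.HasPrefix(base, whiteoutPrefix):
		return kindWhiteout, path.Join(dir, strings.TrimPrefix(base, whiteoutPrefix))
	default:
		return kindRegular, rel
	}
}

func main() {
	names := []string{"etc/hosts", "etc/.wh.motd", "var/cache/.wh..wh..opq", "../../etc/shadow"}
	for _, name := range names {
		rel, err := safeRel(name)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		kind, target := classify(rel)
		fmt.Printf("%-24s %-8s %s\n", name, kind, target)
	}
}
```

A replay loop would delete `target` for a whiteout, clear everything previously staged under `target` for an opaque marker, and extract the entry normally otherwise.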

internal/daemon/images_pull.go orchestrates pullFromOCI:

  1. Parse + validate the OCI ref, derive a default name when --name is omitted (debian-bookworm from docker.io/library/debian:bookworm).
  2. Resolve kernel info via resolveKernelInputs (auto-pulls from kernelcat if --kernel-ref names a catalog entry that isn't yet local).
  3. Stage at <ImagesDir>/<id>.staging; extract layers to a temp tree under $TMPDIR.
  4. BuildExt4 → ApplyOwnership → InjectGuestAgents.
  5. imagemgr.StageBootArtifacts stages the kernel triple alongside.
  6. Atomic os.Rename publishes the artifact dir.
  7. Persist a model.Image{Managed: true, …} record.
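The stage-then-rename pattern in steps 3 and 6 can be sketched with the standard library alone. `publish` and `runDemo` are hypothetical names, and the build callback stands in for steps 4 and 5; this is not banger's actual orchestration code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// publish builds an artifact in <imagesDir>/<id>.staging and renames it to
// <imagesDir>/<id> only after build succeeds, so readers never observe a
// half-written image directory.
func publish(imagesDir, id string, build func(stagingDir string) error) (string, error) {
	staging := filepath.Join(imagesDir, id+".staging")
	final := filepath.Join(imagesDir, id)

	// Clear leftovers from an earlier failed attempt.
	if err := os.RemoveAll(staging); err != nil {
		return "", err
	}
	if err := os.MkdirAll(staging, 0o755); err != nil {
		return "", err
	}
	if err := build(staging); err != nil {
		os.RemoveAll(staging)
		return "", err
	}
	if err := os.Rename(staging, final); err != nil {
		os.RemoveAll(staging)
		return "", err
	}
	return final, nil
}

// runDemo exercises publish in a throwaway directory.
func runDemo() error {
	dir, err := os.MkdirTemp("", "publish-demo-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	final, err := publish(dir, "demo", func(staging string) error {
		return os.WriteFile(filepath.Join(staging, "rootfs.ext4"), []byte("placeholder"), 0o644)
	})
	if err != nil {
		return err
	}
	if _, err := os.Stat(filepath.Join(final, "rootfs.ext4")); err != nil {
		return fmt.Errorf("published artifact missing: %w", err)
	}
	if _, err := os.Stat(filepath.Join(dir, "demo.staging")); !os.IsNotExist(err) {
		return fmt.Errorf("staging dir still present")
	}
	return nil
}

func main() {
	if err := runDemo(); err != nil {
		fmt.Println("demo failed:", err)
		os.Exit(1)
	}
	fmt.Println("published via rename")
}
```

Keeping the staging directory next to the final directory matters: a rename is only atomic within a single filesystem, which is why the staging dir lives under the images directory rather than under $TMPDIR.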

Guest-side boot sequence

On first boot of a pulled image:

  1. banger-network.service — brings the guest interface up with the IP assigned by banger's VM-create lifecycle.
  2. banger-first-boot.service (first boot only) — reads /etc/os-release, dispatches to the native package manager, installs openssh-server, enables ssh.service.
  3. banger-vsock-agent.service — the health-check daemon banger uses to confirm the VM is alive.

Subsequent boots skip step 2.

Adding distro support to first-boot

internal/imagepull/assets/first-boot.sh is the POSIX sh dispatch script. To support a new distro, add a branch for its ID= value with the matching install command, then rebuild banger (the asset is go:embed-ed).

Supported ID values today: debian, ubuntu, kali, raspbian, linuxmint, pop, alpine, fedora, rhel, centos, rocky, almalinux, arch, archlinux, manjaro, opensuse*, suse. Unknown distros fall back to ID_LIKE, then error cleanly.
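The real dispatch is a POSIX sh script; the Go sketch below only illustrates the lookup order described above (ID first, then each ID_LIKE entry, then a clean error). The function names are hypothetical, and the table covers just a subset of the IDs listed.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// managers maps a subset of os-release IDs to a package manager.
// Illustrative only; the authoritative list lives in first-boot.sh.
var managers = map[string]string{
	"debian": "apt-get",
	"ubuntu": "apt-get",
	"alpine": "apk",
	"fedora": "dnf",
	"rhel":   "dnf",
	"arch":   "pacman",
	"suse":   "zypper",
}

// parseOSRelease extracts ID and ID_LIKE from os-release style content.
func parseOSRelease(content string) (id string, idLike []string) {
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		key, val, ok := strings.Cut(strings.TrimSpace(sc.Text()), "=")
		if !ok {
			continue
		}
		val = strings.Trim(val, `"'`)
		switch key {
		case "ID":
			id = val
		case "ID_LIKE":
			idLike = strings.Fields(val)
		}
	}
	return id, idLike
}

// pickManager tries ID, then each ID_LIKE entry in order, then gives up.
func pickManager(id string, idLike []string) (string, error) {
	for _, candidate := range append([]string{id}, idLike...) {
		if m, ok := managers[candidate]; ok {
			return m, nil
		}
	}
	return "", fmt.Errorf("unsupported distro: ID=%q ID_LIKE=%q", id, idLike)
}

func main() {
	// "exampleos" is a made-up distro that is only recognised via ID_LIKE.
	id, like := parseOSRelease("NAME=\"Example OS\"\nID=exampleos\nID_LIKE=\"rhel fedora\"\n")
	m, err := pickManager(id, like)
	fmt.Println(m, err)
}
```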

Paths

| What | Where |
| --- | --- |
| Layer blob cache | ~/.cache/banger/oci/blobs/sha256/<hex> |
| Staging dir | ~/.local/state/banger/images/<id>.staging/ |
| Extraction scratch | $TMPDIR/banger-pull-<rand>/ |
| Published image | ~/.local/state/banger/images/<id>/rootfs.ext4 |

Tech debt

  • Auth. When we add private-registry support, the natural path is authn.DefaultKeychain, which honours ~/.docker/config.json and the standard credential helpers.
  • Cache eviction. OCI layer blobs accumulate forever. A banger image cache prune command is a cheap follow-up when disk usage becomes a complaint.
  • Non-systemd rootfses. The guest agents assume systemd. Adding openrc / s6 / busybox-init variants means keeping parallel unit trees keyed on /etc/os-release.

Trust model

image pull (OCI path) delegates trust to the registry the user selected. go-containerregistry verifies layer digests against the manifest during download, so a tampered mirror can't ship modified layers without breaking the sha256 chain. Banger does not verify OCI image signatures (cosign/sigstore) — users who care should verify references out-of-band.
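The digest check that this trust model relies on amounts to hashing each blob and comparing the result with the digest named in the manifest. The sketch below shows that idea with the standard library; it is not go-containerregistry's code, and `verifyBlob`, `verifyBytes`, and `digestOf` are hypothetical helpers.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// digestOf returns the "sha256:<hex>" digest of data.
func digestOf(data []byte) string {
	sum := sha256.Sum256(data)
	return "sha256:" + hex.EncodeToString(sum[:])
}

// verifyBlob streams r through sha256 and compares the result with the
// expected digest, which must be in "sha256:<hex>" form.
func verifyBlob(r io.Reader, digest string) error {
	algo, want, ok := strings.Cut(digest, ":")
	if !ok || algo != "sha256" {
		return fmt.Errorf("unsupported digest %q", digest)
	}
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return err
	}
	if got := hex.EncodeToString(h.Sum(nil)); got != want {
		return fmt.Errorf("digest mismatch: got sha256:%s, want %s", got, digest)
	}
	return nil
}

// verifyBytes is a convenience wrapper for in-memory blobs.
func verifyBytes(data []byte, digest string) error {
	return verifyBlob(bytes.NewReader(data), digest)
}

func main() {
	layer := []byte("layer contents")
	expected := digestOf(layer) // in practice this comes from the manifest

	fmt.Println("intact:  ", verifyBytes(layer, expected))
	fmt.Println("tampered:", verifyBytes([]byte("modified contents"), expected))
}
```

This check proves only that a layer matches its manifest. It says nothing about who published the manifest, which is the gap that signature verification would fill.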