# OCI import (`banger image pull`)

`banger image pull ` downloads a container image from any OCI-compatible registry (Docker Hub, GHCR, quay.io, self-hosted, …), flattens its layers into an ext4 rootfs, and registers the result as a managed banger image.

Paired with the kernel catalog, this dissolves the "where do I get a rootfs" bottleneck for most users: any distro that ships an official container image can now boot (eventually) as a banger VM.

```bash
banger kernel pull void-6.12
banger image pull docker.io/library/debian:bookworm --kernel-ref void-6.12
banger image list    # debian-bookworm appears, Managed=true
```

## What works

- Pulling any public OCI image that exposes a `linux/amd64` manifest.
- Correct layer replay with whiteout semantics (`.wh.*` deletes, `.wh..wh..opq` opaque-dir markers).
- Path-traversal and relative-symlink-escape protection.
- Content-aware default sizing (`content × 1.25`, floor 1 GiB).
- Layer caching on disk, keyed by blob SHA256.
- **File ownership preservation.** Tar-header uid/gid/mode is captured during flatten and applied to the resulting ext4 via a `debugfs` pass, so setuid binaries (`sudo`, `passwd`) and root-owned config files (`/etc/shadow`, `/etc/sudoers`) end up correctly owned.
- **Banger guest agents pre-injected.** The pulled ext4 ships with `/usr/local/bin/banger-vsock-agent`, `banger-network.service`, and `banger-vsock-agent.service` already in place and enabled.
- **First-boot sshd install.** A one-shot systemd service installs `openssh-server` via the guest's package manager on first boot, dispatching to apt-get / apk / dnf / pacman / zypper based on `/etc/os-release`. Subsequent boots skip the install.
- Piping pulled images into the existing `banger image build --from-image` flow.

## What doesn't yet work

- **Private registries**. Auth is not implemented; anonymous pulls only. Docker Hub, GHCR (public), quay.io (public), etc. all work.
- **Non-`linux/amd64` platforms**.
  The kernel catalog is x86_64-only, so pulled rootfses match. `arm64` is additive in the schema; wire-up lands when a user needs it.
- **Non-systemd distros.** The injected units assume systemd as PID 1. Alpine ≥3.20 ships systemd; older alpine + void + busybox-init images won't honour the banger-network / banger-first-boot units.
- **First boot needs network access.** The provisioning step reaches out to the distro's package repo to install openssh-server. VMs without NAT or without the bridge reaching the internet will time out on first boot. The marker file stays in place so a later boot retries.

## Architecture

`internal/imagepull/` owns the pure mechanics:

- **`Pull`** (`imagepull.go`) wraps `go-containerregistry`'s `remote.Image` with the `linux/amd64` platform pinned. Layer blobs are cached on disk via `cache.NewFilesystemCache` under `/blobs/`. `Pull` itself does not drain the layer streams; that happens lazily during `Flatten`, and the cache populates on read.
- **`Flatten`** (`flatten.go`) replays layers oldest-first into a staging directory, applying whiteouts and rejecting unsafe paths. Returns a `Metadata` map capturing per-file uid/gid/mode from each tar header.
- **`BuildExt4`** (`ext4.go`) runs `mkfs.ext4 -F -d -E root_owner=0:0` to populate the image file at create time: no mount, no sudo, no loopback. Requires `e2fsprogs ≥ 1.43` (`mkfs.ext4 -d` is the populate-at-create flag; nearly all modern distros ship it).
- **`ApplyOwnership`** (`ownership.go`) streams a batched `set_inode_field` script to `debugfs -w -f -` to rewrite per-file uid/gid/mode to the captured tar-header values. Without this pass the ext4 would carry the runner's on-disk uids.
- **`InjectGuestAgents`** (`inject.go`) uses the same `debugfs` scripting to drop banger's guest-side assets into the pulled ext4 with root ownership:
  - `/usr/local/bin/banger-vsock-agent`
  - `/usr/local/libexec/banger-network-bootstrap`
  - `/usr/local/libexec/banger-first-boot`
  - `/etc/systemd/system/banger-{network,vsock-agent,first-boot}.service`
  - enable-at-boot symlinks under `multi-user.target.wants/`
  - `/etc/modules-load.d/banger-vsock.conf`
  - `/var/lib/banger/first-boot-pending` (marker file)

`internal/daemon/images_pull.go` orchestrates:

1. Parse + validate the OCI ref.
2. Derive a friendly default name (`debian-bookworm` for `docker.io/library/debian:bookworm`) when `--name` is omitted.
3. Resolve kernel info via the shared `resolveKernelInputs` helper (the same code path as `image register --kernel-ref`).
4. Stage at `/.staging`; extract layers to a temp tree under `os.TempDir` (bulk transient data stays off the persistent state filesystem).
5. `imagepull.BuildExt4` produces `/rootfs.ext4`.
6. `ApplyOwnership` + `InjectGuestAgents` run in one finalize step.
7. `imagemgr.StageBootArtifacts` stages the kernel triple alongside.
8. Atomic `os.Rename(, )` publishes the artifact dir.
9. Persist a `model.Image{Managed: true, …}` record.

Any failure removes the staging dir. Post-rename failures remove the final dir and roll back the store write.

## Guest-side boot sequence

On the first boot of a pulled image, systemd starts three banger units in order:

1. **`banger-network.service`**: runs the bootstrap script that parses `/etc/banger-network.conf` (written by banger's VM-create lifecycle) and brings the guest interface up with the assigned IP.
2. **`banger-first-boot.service`** (only on first boot; removes its own trigger file on success): reads `/etc/os-release`, dispatches to the native package manager, installs `openssh-server`, enables `ssh.service` / `sshd.service`.
3. **`banger-vsock-agent.service`**: runs the health-check daemon banger uses to confirm the VM is alive.

After first boot completes, subsequent boots skip the install step entirely. Banger's host-side SSH polling (`guest.WaitForSSH`) naturally retries until sshd is listening.

## Adding distro support

`internal/imagepull/assets/first-boot.sh` is the POSIX-sh dispatch. Add a new `ID=` branch and its install command to the `case` block, then rebuild banger; the asset is `go:embed`-ed into the binary.

Supported `ID` values today: `debian`, `ubuntu`, `kali`, `raspbian`, `linuxmint`, `pop`, `alpine`, `fedora`, `rhel`, `centos`, `rocky`, `almalinux`, `arch`, `archlinux`, `manjaro`, `opensuse*`, `suse`. Unknown distros fall back to `ID_LIKE`, then error clearly with a pointer to edit the script.

## Paths

| What | Where | Purpose |
|------|-------|---------|
| Layer blob cache | `~/.cache/banger/oci/blobs/sha256/` | Re-pulls of the same image digest are local-only |
| Staging dir | `~/.local/state/banger/images/.staging/` | Short-lived; atomic-renamed to `/` on success |
| Staging rootfs tree | `$TMPDIR/banger-pull-/` | Extraction scratch space; removed after ext4 build |
| Published image | `~/.local/state/banger/images//rootfs.ext4` | Managed artifact stored alongside the kernel triple |

## Composition with `image build`

A pulled image boots as-is: ownership is correct, sshd installs on first boot, banger's agents are in place. That means the existing `image build --from-image` pipeline composes on top:

```bash
banger image build --from-image debian-bookworm --name debian-dev --docker
```

`image build` spins up a transient VM using the base image, runs `scripts/customize.sh` over it, and saves the result as a new managed image with the opinionated tooling (mise, opencode, claude, pi, tmux plugins, optionally docker) layered on top.

## Tech debt

- **Auth**.
  When we add private-registry support, the natural path is `authn.DefaultKeychain` from `go-containerregistry`, which already honours `~/.docker/config.json` and the standard credential helpers. No banger-specific config needed.
- **Cache eviction**. Layer blobs under `OCICacheDir` accumulate forever. A `banger image cache prune` command is a cheap follow-up when disk usage becomes a complaint.
- **First-boot timeout UX**. If you run `banger vm ssh` immediately after `banger vm create`, the package install for `openssh-server` may still be running and SSH will fail. Current mitigation: retry. Better: a per-image `FirstBootPending` flag that tells the daemon to extend its SSH wait timeout for the first boot, cleared on success. Tracked but not implemented.
- **Non-systemd distros**. The guest agents assume systemd. Adding openrc / s6 / busybox-init variants means keeping parallel unit trees in `inject.go` keyed on `/etc/os-release`. Only pick up when a user actually wants it.

## Trust model

`image pull` delegates trust to the OCI registry the user selected. `go-containerregistry` verifies layer digests against the manifest during download, so a tampered mirror can't ship modified layers without breaking the sha256 chain. Beyond that, banger does not verify OCI image signatures (cosign/sigstore); users who care should verify their references out-of-band.