banger/docs/oci-import.md
Thales Maciel 2478fe3cc3
Phase B-4: docs for Phase B completion
docs/oci-import.md: removed the "Phase A acquisition-only" framing
and the bootability-gap warnings. Expanded architecture section
with ApplyOwnership + InjectGuestAgents. Added a "guest-side boot
sequence" diagram-in-prose showing network → first-boot → vsock-
agent unit ordering. Added a "how to add distro support" section
pointing at the ID-case dispatch in first-boot.sh.

README.md: replaced the experimental-caveat block with an honest
"boots as a banger VM directly, no image build step required"
description. Pointer to the docs for distro support details.

Tech-debt list trimmed — ownership fixup and first-boot install
are no longer planned work, they shipped. What remains: private-
registry auth (authn.DefaultKeychain), cache eviction, first-boot
timeout UX (retry still works but could be smoother with a
FirstBootPending flag), non-systemd distros.

All 20 packages green. make lint clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 19:06:37 -03:00

# OCI import (`banger image pull`)
`banger image pull <oci-ref>` downloads a container image from any
OCI-compatible registry (Docker Hub, GHCR, quay.io, self-hosted, …),
flattens its layers into an ext4 rootfs, and registers the result as
a managed banger image.

Paired with the kernel catalog, this removes the "where do I get a
rootfs" bottleneck for most users: any distro that ships an official
container image can now boot as a banger VM, subject to the
limitations listed under "What doesn't yet work".
```bash
banger kernel pull void-6.12
banger image pull docker.io/library/debian:bookworm --kernel-ref void-6.12
banger image list # debian-bookworm appears, Managed=true
```
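The `debian-bookworm` name in the listing above is the default chosen when `--name` is omitted. The exact derivation rule is not spelled out in this document; the sketch below is one plausible reading that reproduces the documented example. The function name `default_image_name` is hypothetical, and refs without a tag or pinned by digest are deliberately not handled.

```bash
# Hypothetical sketch: derive a friendly image name from a tagged OCI ref.
# Assumption: the name is "<last repo component>-<tag>".
default_image_name() {
    ref=$1
    last=${ref##*/}     # drop registry and namespace -> "debian:bookworm"
    repo=${last%%:*}    # text before the tag separator -> "debian"
    tag=${last#*:}      # text after the tag separator  -> "bookworm"
    echo "${repo}-${tag}"
}

default_image_name docker.io/library/debian:bookworm   # prints debian-bookworm
```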
## What works
- Pulling any public OCI image that exposes a `linux/amd64` manifest.
- Correct layer replay with whiteout semantics (`.wh.*` deletes,
`.wh..wh..opq` opaque-dir markers).
- Path-traversal and relative-symlink-escape protection.
- Content-aware default sizing (`content × 1.25`, floor 1 GiB).
- Layer caching on disk, keyed by blob SHA256.
- **File ownership preservation.** Tar-header uid/gid/mode is captured
during flatten and applied to the resulting ext4 via a `debugfs`
pass, so setuid binaries (`sudo`, `passwd`) and root-owned config
files (`/etc/shadow`, `/etc/sudoers`) end up correctly owned.
- **Banger guest agents pre-injected.** The pulled ext4 ships with
`/usr/local/bin/banger-vsock-agent`, `banger-network.service`, and
`banger-vsock-agent.service` already in place and enabled.
- **First-boot sshd install.** A one-shot systemd service installs
`openssh-server` via the guest's package manager on first boot —
apt-get / apk / dnf / pacman / zypper dispatch based on
`/etc/os-release`. Subsequent boots skip the install.
- Piping pulled images into the existing `banger image build
--from-image` flow.
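
The content-aware sizing rule from the list above (`content × 1.25`, floor 1 GiB) can be written out as plain integer arithmetic. This is a minimal sketch, not banger's code: the function name `default_image_size` is hypothetical, and the rounding (integer truncation, no block alignment) is an assumption.

```bash
# Hypothetical helper: default ext4 image size in bytes for a given content size.
default_image_size() {
    content_bytes=$1
    floor_bytes=$((1024 * 1024 * 1024))            # 1 GiB floor
    sized=$((content_bytes + content_bytes / 4))   # content x 1.25 (assumed truncating)
    if [ "$sized" -lt "$floor_bytes" ]; then
        sized=$floor_bytes
    fi
    echo "$sized"
}

default_image_size 104857600     # 100 MiB of content: prints 1073741824 (floor applies)
default_image_size 4294967296    # 4 GiB of content:   prints 5368709120
```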
## What doesn't yet work
- **Private registries**. Auth is not implemented, so only anonymous
pulls work. Public images on Docker Hub, GHCR, quay.io, etc. are fine.
- **Non-`linux/amd64` platforms**. The kernel catalog is x86_64-only,
so pulled rootfses match. `arm64` is additive in the schema; wire-
up lands when a user needs it.
- **Non-systemd distros.** The injected units assume systemd as PID 1.
Alpine ≥3.20 ships systemd; older alpine + void + busybox-init
images won't honour the banger-network / banger-first-boot units.
- **First boot needs network access.** The provisioning step reaches
out to the distro's package repo to install openssh-server. VMs
without NAT or without the bridge reaching the internet will time
out on first boot. The marker file stays in place so a later boot
retries.
## Architecture
`internal/imagepull/` owns the pure mechanics:
- **`Pull`** (`imagepull.go`) wraps `go-containerregistry`'s
`remote.Image` with the `linux/amd64` platform pinned. Layer
blobs are cached on disk via `cache.NewFilesystemCache` under
`<OCICacheDir>/blobs/` — Pull itself does not drain the layer
streams; that happens lazily during `Flatten`, and the cache
populates on read.
- **`Flatten`** (`flatten.go`) replays layers oldest-first into a
staging directory, applying whiteouts and rejecting unsafe paths.
Returns a `Metadata` map capturing per-file uid/gid/mode from
each tar header.
- **`BuildExt4`** (`ext4.go`) runs `mkfs.ext4 -F -d <staging>
-E root_owner=0:0` to populate the image file at create time —
no mount, no sudo, no loopback. Requires `e2fsprogs ≥ 1.43`
(`mkfs.ext4 -d` is the populate-at-create flag; nearly all
modern distros ship it).
- **`ApplyOwnership`** (`ownership.go`) streams a batched
`set_inode_field` script to `debugfs -w -f -` to rewrite per-file
uid/gid/mode to the captured tar-header values. Without this pass
the ext4 would carry the runner's on-disk uids.
- **`InjectGuestAgents`** (`inject.go`) uses the same `debugfs`
  scripting to drop banger's guest-side assets into the pulled ext4
  with root ownership:
  - `/usr/local/bin/banger-vsock-agent`
  - `/usr/local/libexec/banger-network-bootstrap`
  - `/usr/local/libexec/banger-first-boot`
  - `/etc/systemd/system/banger-{network,vsock-agent,first-boot}.service`
  - enable-at-boot symlinks under `multi-user.target.wants/`
  - `/etc/modules-load.d/banger-vsock.conf`
  - `/var/lib/banger/first-boot-pending` (marker file)
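
To make the whiteout handling in `Flatten` concrete, here is a small sketch of how a single tar entry path can be classified under the OCI whiteout convention (`.wh.*` deletes, `.wh..wh..opq` opaque-dir markers). It only illustrates the naming rule; `classify_entry` is a hypothetical name, and the real Go code in `flatten.go` may be structured quite differently.

```bash
# Hypothetical sketch: classify one tar entry path.
# Prints "opaque <dir>", "delete <path>", or "file <path>".
classify_entry() {
    path=$1
    dir=$(dirname "$path")
    base=$(basename "$path")
    case "$base" in
        .wh..wh..opq)
            # Opaque marker: hide everything lower layers placed in this directory.
            # Must be tested before the generic .wh.* pattern, which it also matches.
            echo "opaque $dir" ;;
        .wh.*)
            # Plain whiteout: delete the sibling named after the prefix.
            echo "delete $dir/${base#.wh.}" ;;
        *)
            echo "file $path" ;;
    esac
}

classify_entry etc/.wh.motd             # prints: delete etc/motd
classify_entry var/cache/.wh..wh..opq   # prints: opaque var/cache
classify_entry usr/bin/sudo             # prints: file usr/bin/sudo
```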

`internal/daemon/images_pull.go` orchestrates:
1. Parse + validate the OCI ref.
2. Derive a friendly default name (`debian-bookworm` for
`docker.io/library/debian:bookworm`) when `--name` is omitted.
3. Resolve kernel info via the shared `resolveKernelInputs` helper
(the same code path as `image register --kernel-ref`).
4. Stage at `<ImagesDir>/<id>.staging`; extract layers to a temp
tree under `os.TempDir` (bulk transient data stays off the
persistent state filesystem).
5. `imagepull.BuildExt4` produces `<staging>/rootfs.ext4`.
6. `ApplyOwnership` + `InjectGuestAgents` run in one finalize step.
7. `imagemgr.StageBootArtifacts` stages the kernel triple alongside.
8. Atomic `os.Rename(<staging>, <final>)` publishes the artifact dir.
9. Persist a `model.Image{Managed: true, …}` record.

Any failure before the rename removes the staging dir. Failures after
the rename remove the final dir and roll back the store write.
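
The stage-then-rename pattern in the steps above can be sketched in a few lines of shell. This is an illustration of the idea only: `publish_image` and `fake_build` are hypothetical names, the refusal to overwrite an existing image is an assumed policy, and the daemon itself does this in Go with `os.Rename`.

```bash
# Hypothetical sketch: build into <images_dir>/<id>.staging, then publish by rename.
# Remaining arguments are the build command; the staging dir is appended to it.
publish_image() {
    images_dir=$1
    id=$2
    shift 2
    staging="$images_dir/$id.staging"
    final="$images_dir/$id"

    # Assumed policy: refuse to clobber an already-published image.
    [ ! -e "$final" ] || return 1
    mkdir -p "$staging" || return 1

    if ! "$@" "$staging"; then
        rm -rf "$staging"    # any failure removes the staging dir
        return 1
    fi

    # Same-filesystem rename: readers see either no image dir or a complete one.
    mv "$staging" "$final"
}

fake_build() { echo "placeholder" > "$1/rootfs.ext4"; }
images=$(mktemp -d)
publish_image "$images" demo fake_build && ls "$images/demo"   # prints rootfs.ext4
```

Keeping the staging dir next to the final dir matters here: a rename is only atomic when both paths live on the same filesystem.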
## Guest-side boot sequence
On the first boot of a pulled image, systemd starts three banger
units in order:
1. **`banger-network.service`** — runs the bootstrap script that
parses `/etc/banger-network.conf` (written by banger's VM-create
lifecycle) and brings the guest interface up with the assigned IP.
2. **`banger-first-boot.service`** (only on first boot; removes its
own trigger file on success) — reads `/etc/os-release`, dispatches
to the native package manager, installs `openssh-server`, enables
`ssh.service` / `sshd.service`.
3. **`banger-vsock-agent.service`** — runs the health-check daemon
banger uses to confirm the VM is alive.

After first boot completes, subsequent boots skip the install step
entirely. While the first-boot install is still running, banger's
host-side SSH polling (`guest.WaitForSSH`) keeps retrying until sshd
is listening or the wait times out.
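
The run-once behaviour of `banger-first-boot.service` boils down to a marker-file check. The sketch below shows that pattern in isolation, with the marker path and the provisioning command passed in as arguments so it can be tried safely outside a guest. `first_boot` is a hypothetical name; the real logic lives in the embedded first-boot script and may differ in detail.

```bash
# Hypothetical sketch of the marker-file pattern.
# $1 is the marker path; the remaining arguments are the provisioning command.
first_boot() {
    marker=$1
    shift
    # No marker: provisioning already succeeded on an earlier boot.
    [ -e "$marker" ] || return 0
    # On failure the marker stays in place, so the next boot retries.
    "$@" || return 1
    rm -f "$marker"
}

marker=$(mktemp)
first_boot "$marker" echo "installing openssh-server"   # runs the command, removes the marker
first_boot "$marker" echo "installing openssh-server"   # marker gone: prints nothing
```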
## Adding distro support
`internal/imagepull/assets/first-boot.sh` is the POSIX-sh dispatch.
Add a new `ID=` branch and its install command to the `case` block,
then rebuild banger — the asset is `go:embed`-ed into the binary.

Supported `ID` values today: `debian`, `ubuntu`, `kali`, `raspbian`,
`linuxmint`, `pop`, `alpine`, `fedora`, `rhel`, `centos`, `rocky`,
`almalinux`, `arch`, `archlinux`, `manjaro`, `opensuse*`, `suse`.
An unknown `ID` falls back to `ID_LIKE`; if that does not match
either, the script exits with a clear error that points at the script
to edit.
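
For orientation, an `ID` / `ID_LIKE` dispatch of this shape might look like the sketch below. It is not a copy of `first-boot.sh`: the function names are hypothetical, and the grouping of `ID` values by package manager is inferred from the supported list and the usual tool for each distro family. Adding a distro would mean adding one more pattern line to the `case` block.

```bash
# Hypothetical sketch: map an os-release ID to a package manager name.
pkg_manager_for() {
    case "$1" in
        debian|ubuntu|kali|raspbian|linuxmint|pop) echo apt-get ;;
        alpine)                                    echo apk ;;
        fedora|rhel|centos|rocky|almalinux)        echo dnf ;;
        arch|archlinux|manjaro)                    echo pacman ;;
        opensuse*|suse)                            echo zypper ;;
        # A new distro would get its own line here, e.g.:
        #   mydistro) echo mypkg ;;
        *) return 1 ;;
    esac
}

# Try ID first, then each word of ID_LIKE. Runs in a subshell so that
# sourcing the os-release file does not leak variables into the caller.
resolve_pkg_manager() (
    ID= ID_LIKE=
    . "$1"
    pkg_manager_for "$ID" && exit 0
    for like in $ID_LIKE; do
        pkg_manager_for "$like" && exit 0
    done
    echo "unsupported distro: ID=$ID ID_LIKE=$ID_LIKE" >&2
    exit 1
)

tmp=$(mktemp)
printf 'ID=rocky\nID_LIKE="rhel centos fedora"\n' > "$tmp"
resolve_pkg_manager "$tmp"   # prints dnf
rm -f "$tmp"
```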
## Paths
| What | Where | Purpose |
|------|-------|---------|
| Layer blob cache | `~/.cache/banger/oci/blobs/sha256/<hex>` | Re-pulls of the same image digest are local-only |
| Staging dir | `~/.local/state/banger/images/<id>.staging/` | Short-lived; atomic-renamed to `<id>/` on success |
| Staging rootfs tree | `$TMPDIR/banger-pull-<rand>/` | Extraction scratch space; removed after ext4 build |
| Published image | `~/.local/state/banger/images/<id>/rootfs.ext4` | Managed artifact stored alongside the kernel triple |
## Composition with `image build`
A pulled image boots as-is — ownership is correct, sshd installs on
first boot, banger's agents are in place. That means the existing
`image build --from-image` pipeline composes on top:
```bash
banger image build --from-image debian-bookworm --name debian-dev --docker
```
`image build` spins up a transient VM using the base image, runs
`scripts/customize.sh` over it, and saves the result as a new managed
image with the opinionated tooling (mise, opencode, claude, pi, tmux
plugins, optionally docker) layered on top.
## Tech debt
- **Auth**. When we add private-registry support, the natural path is
`authn.DefaultKeychain` from `go-containerregistry`, which already
honours `~/.docker/config.json` and the standard credential
helpers. No banger-specific config needed.
- **Cache eviction**. Layer blobs under `OCICacheDir` accumulate
forever. A `banger image cache prune` command is a cheap follow-up
when disk usage becomes a complaint.
- **First-boot timeout UX**. If you run `banger vm ssh` immediately
after `banger vm create`, the package install for `openssh-server`
may still be running and SSH will fail. Current mitigation: retry.
Better: a per-image `FirstBootPending` flag that tells the daemon
to extend its SSH wait timeout for the first boot, cleared on
success. Tracked but not implemented.
- **Non-systemd distros**. The guest agents assume systemd. Adding
openrc / s6 / busybox-init variants means keeping parallel unit
trees in `inject.go` keyed on `/etc/os-release`. Worth picking up
only when a user actually asks for it.
## Trust model
`image pull` delegates trust to the OCI registry the user selected.
`go-containerregistry` verifies layer digests against the manifest
during download, so a tampered mirror can't ship modified layers
without breaking the sha256 chain. That chain starts at the manifest,
so it is only anchored when the reference is pinned by digest; a tag
resolves to whatever manifest the registry serves. Beyond that, banger
does not verify OCI image signatures (cosign/sigstore); users who care
should verify their references out-of-band.
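
The digest check that the library performs during download is conceptually simple: hash the bytes and compare against the digest the manifest names. The sketch below shows that comparison by hand. `verify_blob` is a hypothetical helper, it assumes `sha256sum` from GNU coreutils is available, and it says nothing about how banger lays out or names its cached blobs.

```bash
# Hypothetical sketch: succeed only if the file's sha256 matches the given digest.
verify_blob() {
    file=$1
    want=${2#sha256:}                            # accept "sha256:<hex>" or bare hex
    got=$(sha256sum "$file" | cut -d' ' -f1)
    [ "$got" = "$want" ]
}

empty=$(mktemp)
# sha256 of zero bytes is a well-known constant.
verify_blob "$empty" sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 \
    && echo "digest ok"
rm -f "$empty"
```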