# OCI import (`banger image pull`)
`banger image pull` has two paths. The primary one — catalog bundle —
is documented in [`docs/image-catalog.md`](image-catalog.md). This
doc covers the fallthrough: OCI-registry pull for arbitrary container
images.
## When to use it
Use the OCI path when you need a distro or image that isn't in the
catalog. The catalog covers the common happy path
(`debian-bookworm`); anything else (`alpine`, `fedora`, `ubuntu`,
custom corporate images) goes through OCI pull.
```bash
banger image pull docker.io/library/alpine:3.20 --kernel-ref generic-6.12
banger image pull ghcr.io/myorg/devimg:v2 --kernel-ref generic-6.12
```
`banger image pull` dispatches based on the reference:
- `banger image pull debian-bookworm` → catalog (fast path).
- `banger image pull docker.io/library/foo:bar` → OCI (anything not
in the catalog).
## What works
- Any public OCI image that exposes a `linux/amd64` manifest.
- Correct layer replay with whiteout semantics (`.wh.*` deletes,
`.wh..wh..opq` opaque-dir markers).
- Path-traversal, debugfs-hostile filename, and relative-symlink-escape protection.
- Content-aware default sizing (`content × 1.5`, floor 1 GiB).
- Layer caching on disk, keyed by blob sha256.
- **Ownership preservation** — tar-header uid/gid/mode captured
during flatten, applied to the ext4 via a `debugfs` pass, so
setuid binaries (`sudo`, `passwd`) and root-owned config
(`/etc/shadow`, `/etc/sudoers`) end up correctly owned.
- **Pre-injected banger agents** — the pulled ext4 ships with
`banger-vsock-agent`, `banger-network.service`, and the
`banger-first-boot` unit already enabled.
- **First-boot sshd install** — a one-shot systemd service installs
`openssh-server` via the guest's package manager on first boot.
Dispatches on `/etc/os-release` to `apt-get` / `apk` / `dnf` /
`pacman` / `zypper`. Subsequent boots skip the install.
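
As a worked example of the sizing rule in the list above, the following sketch computes a default size from a content size in bytes. `default_size_bytes` is a hypothetical helper, not banger's code, and it assumes plain integer arithmetic; banger may round differently (for instance to a filesystem block boundary).

```bash
#!/bin/sh
# Hypothetical helper: default image size in bytes for a given content size.
# Rule from the list above: content x 1.5, with a floor of 1 GiB.
default_size_bytes() {
    content=$1
    floor=$((1024 * 1024 * 1024))   # 1 GiB
    size=$((content * 3 / 2))       # content x 1.5 in integer math
    if [ "$size" -lt "$floor" ]; then
        size=$floor                 # small images are bumped to the floor
    fi
    echo "$size"
}

default_size_bytes 209715200    # 200 MiB of content: floor applies, 1073741824
default_size_bytes 2147483648   # 2 GiB of content: 3221225472 (3 GiB)
```

Under this rule, any image with less than roughly 683 MiB of content lands on the 1 GiB floor.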
## What doesn't yet work
- **Private registries**. Anonymous pulls only. Docker Hub, GHCR
(public), quay.io (public) all work. Adding auth via
`authn.DefaultKeychain` (from `go-containerregistry`) is a cheap
follow-up when someone needs it.
- **Non-`linux/amd64`**. The kernel catalog is x86_64-only, so pulled
rootfses have to match. Adding `arm64` would be an additive schema change.
- **Non-systemd rootfses**. The injected units assume systemd as
PID 1. Alpine ≥3.20 ships systemd; older Alpine, Void, and
busybox-init images won't honour the banger-* units.
- **First boot needs network access**. The first-boot sshd install
reaches out to the distro's package repo. VMs without NAT, or whose
bridge cannot reach the internet, time out. The marker file
stays in place so a later restart retries.
## Architecture
> Implementation details live in [`docs/oci-import-internals.md`](oci-import-internals.md).
## Guest-side boot sequence
On first boot of a pulled image:
1. **`banger-network.service`** — brings the guest interface up with
the IP assigned by banger's VM-create lifecycle.
2. **`banger-first-boot.service`** (first boot only) — reads
`/etc/os-release`, dispatches to the native package manager,
installs `openssh-server`, enables `ssh.service`.
3. **`banger-vsock-agent.service`** — the health-check daemon banger
uses to confirm the VM is alive.

Subsequent boots skip step 2.
## Adding distro support to first-boot
`internal/imagepull/assets/first-boot.sh` is the POSIX-sh dispatch.
Add a new `ID=` branch and its install command, then rebuild banger
(the asset is `go:embed`-ed).
Supported `ID` values today: `debian`, `ubuntu`, `kali`, `raspbian`,
`linuxmint`, `pop`, `alpine`, `fedora`, `rhel`, `centos`, `rocky`,
`almalinux`, `arch`, `archlinux`, `manjaro`, `opensuse*`, `suse`.
Unknown distros fall back to `ID_LIKE`, then error cleanly.
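
The dispatch pattern can be sketched as follows. This is an illustration, not the contents of `first-boot.sh`: `pick_installer` is a hypothetical name, and the grouping of `ID` values under each package manager is inferred from the lists in this document.

```bash
#!/bin/sh
# Illustrative sketch only; the real first-boot.sh may be structured differently.
# pick_installer ID [ID_LIKE] prints a package-manager name, or fails.
# ID is tried first, then each space-separated word of ID_LIKE.
pick_installer() {
    for id in "$1" ${2:-}; do       # ID_LIKE is left unquoted to split on spaces
        case "$id" in
            debian|ubuntu|kali|raspbian|linuxmint|pop) echo apt-get; return 0 ;;
            alpine)                                    echo apk;     return 0 ;;
            fedora|rhel|centos|rocky|almalinux)        echo dnf;     return 0 ;;
            arch|archlinux|manjaro)                    echo pacman;  return 0 ;;
            opensuse*|suse)                            echo zypper;  return 0 ;;
        esac
    done
    echo "unsupported distro: ID=$1 ID_LIKE=${2:-}" >&2
    return 1
}

# In a guest, the inputs would come from os-release, for example:
#   . /etc/os-release && pick_installer "$ID" "${ID_LIKE:-}"
```

Adding a distro in this shape means adding its `ID` to an existing branch, or adding a new branch with its own install command.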
## Paths
Paths below assume the system install (`banger system install`). When
running `bangerd` directly without the helper, the same files live
under `~/.cache/banger/` and `~/.local/state/banger/` instead.
| What | Where |
|------|-------|
| Layer blob cache | `/var/cache/banger/oci/blobs/sha256/<hex>` |
| Staging dir | `/var/lib/banger/images/<id>.staging/` |
| Extraction scratch | `$TMPDIR/banger-pull-<rand>/` |
| Published image | `/var/lib/banger/images/<id>/rootfs.ext4` |
## Cache lifecycle
OCI layer blobs accumulate as you pull images. Banger flattens every
pull into a self-contained ext4, so the cache exists only to avoid
re-downloading layers; losing it costs nothing but network round-trips
on the next pull of the same image. Reclaim disk with:
```bash
banger image cache prune --dry-run # report size only
banger image cache prune # remove every cached blob
```
Run with the daemon idle; an in-flight pull racing against prune may
fail and need a retry.
## Tech debt
- **Auth**. When we add private-registry support, the natural path
is `authn.DefaultKeychain`, which honours `~/.docker/config.json`
and the standard credential helpers.
- **Non-systemd rootfses**. The guest agents assume systemd. Adding
openrc / s6 / busybox-init variants means keeping parallel unit
trees keyed on `/etc/os-release`.
## Trust model
`image pull` (OCI path) delegates trust to the registry the user
selected. `go-containerregistry` verifies layer digests against the
manifest during download, so a tampered mirror can't ship modified
layers without breaking the sha256 chain. Banger does not verify OCI
image signatures (cosign/sigstore) — users who care should verify
references out-of-band.
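
The blob cache layout (`.../oci/blobs/sha256/<hex>`) suggests a local spot-check along the same lines. This is a minimal sketch, assuming each cached file holds the blob bytes exactly as downloaded, so that its sha256 equals its filename. `verify_blob` is a hypothetical helper, not a banger command.

```bash
#!/bin/sh
# Hypothetical helper: succeed if a file's sha256 digest equals its basename.
# Assumes cached blobs are stored unmodified under their hex digest.
verify_blob() {
    want=$(basename "$1")
    got=$(sha256sum "$1" | cut -d' ' -f1)   # first field is the hex digest
    [ "$got" = "$want" ]
}
```

For example, `verify_blob /var/cache/banger/oci/blobs/sha256/<hex>` exits non-zero if the file on disk no longer matches its name. This checks integrity of the cache only; it says nothing about who published the image.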