banger/docs/oci-import.md
Thales Maciel d743a8ba4b
daemon: persist teardown fallbacks and reject unsafe import paths
Preserve cleanup after daemon restarts and harden OCI and tar imports
against filenames that debugfs cannot encode safely.

Mirror tap, loop, and dm teardown identity onto VM.Runtime, teach
cleanup and reconcile to fall back to those persisted fields when
handles.json is missing or corrupt, and clear the recovery state on
stop, error, and delete paths.

Reject debugfs-hostile entry names during flattening and in
ApplyOwnership itself, then add regression coverage for corrupt
handles.json recovery and unsafe import paths.

Verified with targeted go tests, make lint-go, make lint-shell, and
make build.
2026-04-23 16:21:59 -03:00

# OCI import (`banger image pull`)
`banger image pull` has two paths. The primary one — catalog bundle —
is documented in [`docs/image-catalog.md`](image-catalog.md). This
doc covers the fallthrough: OCI-registry pull for arbitrary container
images.
## When to use it
Use the OCI path when you need a distro or image that isn't in the
catalog. The catalog covers the common happy path
(`debian-bookworm`); anything else (`alpine`, `fedora`, `ubuntu`,
custom corporate images) goes through OCI pull.
```bash
banger image pull docker.io/library/alpine:3.20 --kernel-ref generic-6.12
banger image pull ghcr.io/myorg/devimg:v2 --kernel-ref generic-6.12
```
`banger image pull` dispatches based on the reference:
- `banger image pull debian-bookworm` → catalog (fast path).
- `banger image pull docker.io/library/foo:bar` → OCI (anything not
in the catalog).
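
The dispatch rule above can be sketched in Go. This is a minimal illustration, assuming the catalog is a simple set of names; `choosePath`, `pullPath`, and the `catalog` map are hypothetical and stand in for banger's real catalog lookup.

```go
package main

import "fmt"

// pullPath names the two routes `banger image pull` can take.
type pullPath string

const (
	pathCatalog pullPath = "catalog"
	pathOCI     pullPath = "oci"
)

// choosePath returns pathCatalog when ref exactly matches a catalog entry
// and pathOCI for everything else. The catalog set is a hypothetical
// stand-in for the real catalog lookup.
func choosePath(ref string, catalog map[string]bool) pullPath {
	if catalog[ref] {
		return pathCatalog
	}
	return pathOCI
}

func main() {
	catalog := map[string]bool{"debian-bookworm": true}
	refs := []string{"debian-bookworm", "docker.io/library/alpine:3.20"}
	for _, ref := range refs {
		fmt.Printf("%s -> %s\n", ref, choosePath(ref, catalog))
	}
}
```
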
## What works
- Any public OCI image that exposes a `linux/amd64` manifest.
- Correct layer replay with whiteout semantics (`.wh.*` deletes,
`.wh..wh..opq` opaque-dir markers).
- Path-traversal, debugfs-hostile filename, and relative-symlink-escape protection.
- Content-aware default sizing (`content × 1.5`, floor 1 GiB).
- Layer caching on disk, keyed by blob sha256.
- **Ownership preservation** — tar-header uid/gid/mode captured
during flatten, applied to the ext4 via a `debugfs` pass, so
setuid binaries (`sudo`, `passwd`) and root-owned config
(`/etc/shadow`, `/etc/sudoers`) end up correctly owned.
- **Pre-injected banger agents** — the pulled ext4 ships with
`banger-vsock-agent`, `banger-network.service`, and the
`banger-first-boot` unit already enabled.
- **First-boot sshd install** — a one-shot systemd service installs
`openssh-server` via the guest's package manager on first boot.
Dispatches on `/etc/os-release` → `apt-get` / `apk` / `dnf` /
`pacman` / `zypper`. Subsequent boots skip the install.
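
As a rough illustration of the content-aware sizing rule in the list above (`content × 1.5`, floor 1 GiB), here is a small Go sketch. The function name `defaultImageSize` is hypothetical, and the integer rounding is an assumption; the real implementation may round or align differently.

```go
package main

import "fmt"

const gib = int64(1) << 30

// defaultImageSize returns 1.5 times the flattened content size,
// but never less than 1 GiB. Integer division truncates the half,
// which is an assumption of this sketch.
func defaultImageSize(contentBytes int64) int64 {
	size := contentBytes + contentBytes/2
	if size < gib {
		return gib
	}
	return size
}

func main() {
	for _, content := range []int64{100 << 20, 2 * gib} {
		fmt.Printf("content=%d bytes -> image=%d bytes\n", content, defaultImageSize(content))
	}
}
```
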
## What doesn't yet work
- **Private registries**. Anonymous pulls only. Docker Hub, GHCR
(public), quay.io (public) all work. Adding auth via
`authn.DefaultKeychain` (from `go-containerregistry`) is a cheap
follow-up when someone needs it.
- **Non-`linux/amd64`**. The kernel catalog is x86_64-only, so pulled
rootfses match. Adding `arm64` later would be an additive schema change.
- **Non-systemd rootfses**. The injected units assume systemd as
PID 1. Alpine ≥3.20 ships systemd; older alpine + void + busybox-
init images won't honour the banger-* units.
- **First boot needs network access**. The first-boot sshd install
reaches out to the distro's package repo. VMs without NAT or
without the bridge reaching the internet time out. The marker file
stays in place so a later restart retries.
## Architecture
`internal/imagepull/` owns the mechanics:
- **`Pull`** wraps `go-containerregistry`'s `remote.Image` with the
`linux/amd64` platform pinned. Layer blobs cache under
`~/.cache/banger/oci/blobs/` and populate lazily during flatten.
- **`Flatten`** replays layers oldest-first into a staging directory,
applies whiteouts, and rejects unsafe paths as well as filenames that
banger's debugfs ownership fixup cannot encode safely. It returns a
`Metadata` map of per-file uid/gid/mode from tar headers.
- **`BuildExt4`** runs `mkfs.ext4 -F -d <staging> -E root_owner=0:0`
at the size of the pre-truncated file — no mount, no sudo, no
loopback. Requires `e2fsprogs ≥ 1.43`.
- **`ApplyOwnership`** streams a batched `set_inode_field` script to
`debugfs -w` to rewrite per-file uid/gid/mode to the captured tar-
header values.
- **`InjectGuestAgents`** uses the same `debugfs` scripting to drop
banger's guest assets into the ext4 with root ownership:
vsock agent binary, network bootstrap + unit, first-boot script +
unit, `multi-user.target.wants` symlinks, vsock modules-load
config, `/var/lib/banger/first-boot-pending` marker.
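
To make the whiteout handling in `Flatten` more concrete, here is a minimal Go sketch of how a tar entry name can be classified under the standard OCI layer conventions (`.wh.<name>` deletes a sibling, `.wh..wh..opq` marks a directory opaque). The `classify` function and `entryKind` type are hypothetical and are not taken from `internal/imagepull/`.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

const (
	whiteoutPrefix = ".wh."
	opaqueMarker   = ".wh..wh..opq"
)

type entryKind int

const (
	kindRegular  entryKind = iota // ordinary file, dir, or link to extract
	kindWhiteout                  // delete the returned path from lower layers
	kindOpaque                    // hide lower-layer contents of the returned dir
)

// classify inspects a tar entry name and returns its kind plus the path it
// acts on: the cleaned name for regular entries, the path to delete for a
// whiteout, or the directory to clear for an opaque marker.
func classify(name string) (entryKind, string) {
	clean := path.Clean(name)
	dir, base := path.Split(clean)
	dir = path.Clean(dir) // "" becomes "." for top-level entries
	switch {
	case base == opaqueMarker: // must be checked before the generic prefix
		return kindOpaque, dir
	case strings.HasPrefix(base, whiteoutPrefix):
		return kindWhiteout, path.Join(dir, strings.TrimPrefix(base, whiteoutPrefix))
	default:
		return kindRegular, clean
	}
}

func main() {
	for _, name := range []string{"etc/passwd", "etc/.wh.motd", "usr/share/.wh..wh..opq"} {
		kind, target := classify(name)
		fmt.Printf("%-26s kind=%d target=%s\n", name, kind, target)
	}
}
```
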
`internal/daemon/images_pull.go` orchestrates `pullFromOCI`:
1. Parse + validate the OCI ref, derive a default name when `--name`
is omitted (`debian-bookworm` from
`docker.io/library/debian:bookworm`).
2. Resolve kernel info via `resolveKernelInputs` (auto-pulls from
`kernelcat` if `--kernel-ref` names a catalog entry that isn't
yet local).
3. Stage at `<ImagesDir>/<id>.staging`; extract layers to a temp
tree under `$TMPDIR`.
4. `BuildExt4` → `ApplyOwnership` → `InjectGuestAgents`.
5. `imagemgr.StageBootArtifacts` stages the kernel triple alongside.
6. Atomic `os.Rename` publishes the artifact dir.
7. Persist a `model.Image{Managed: true, …}` record.
## Guest-side boot sequence
On first boot of a pulled image:
1. **`banger-network.service`** — brings the guest interface up with
the IP assigned by banger's VM-create lifecycle.
2. **`banger-first-boot.service`** (first boot only) — reads
`/etc/os-release`, dispatches to the native package manager,
installs `openssh-server`, enables `ssh.service`.
3. **`banger-vsock-agent.service`** — the health-check daemon banger
uses to confirm the VM is alive.
Subsequent boots skip step 2.
## Adding distro support to first-boot
`internal/imagepull/assets/first-boot.sh` is the POSIX-sh dispatch.
Add a new `ID=` branch and its install command, then rebuild banger
(the asset is `go:embed`-ed).
Supported `ID` values today: `debian`, `ubuntu`, `kali`, `raspbian`,
`linuxmint`, `pop`, `alpine`, `fedora`, `rhel`, `centos`, `rocky`,
`almalinux`, `arch`, `archlinux`, `manjaro`, `opensuse*`, `suse`.
Unknown distros fall back to `ID_LIKE`, then error cleanly.
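
The dispatch order (`ID` first, then `ID_LIKE`, then a clean error) can be illustrated in Go. The real dispatcher is the POSIX-sh script named above, so this is only a sketch of the logic: `packageManager`, `lookup`, and `parseOSRelease` are hypothetical names, and the grouping of IDs by package manager is an assumption based on the distros' usual tooling.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// managers groups os-release IDs by package manager (assumed grouping).
var managers = map[string]string{
	"debian": "apt-get", "ubuntu": "apt-get", "kali": "apt-get",
	"raspbian": "apt-get", "linuxmint": "apt-get", "pop": "apt-get",
	"alpine": "apk",
	"fedora": "dnf", "rhel": "dnf", "centos": "dnf", "rocky": "dnf", "almalinux": "dnf",
	"arch": "pacman", "archlinux": "pacman", "manjaro": "pacman",
	"suse": "zypper",
}

// parseOSRelease extracts KEY=value pairs, stripping surrounding quotes.
func parseOSRelease(content string) map[string]string {
	fields := map[string]string{}
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if key, val, ok := strings.Cut(line, "="); ok {
			fields[key] = strings.Trim(val, `"'`)
		}
	}
	return fields
}

// lookup resolves one ID, treating any "opensuse*" value as zypper.
func lookup(id string) (string, bool) {
	if m, ok := managers[id]; ok {
		return m, true
	}
	if strings.HasPrefix(id, "opensuse") {
		return "zypper", true
	}
	return "", false
}

// packageManager tries ID, then each word of ID_LIKE, then gives up.
func packageManager(osRelease string) (string, error) {
	f := parseOSRelease(osRelease)
	if m, ok := lookup(f["ID"]); ok {
		return m, nil
	}
	for _, like := range strings.Fields(f["ID_LIKE"]) {
		if m, ok := lookup(like); ok {
			return m, nil
		}
	}
	return "", fmt.Errorf("unsupported distro: ID=%q ID_LIKE=%q", f["ID"], f["ID_LIKE"])
}

func main() {
	fmt.Println(packageManager("ID=alpine\n"))
	fmt.Println(packageManager("ID=nobara\nID_LIKE=\"rhel centos fedora\"\n"))
	fmt.Println(packageManager("ID=void\n"))
}
```
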
## Paths
| What | Where |
|------|-------|
| Layer blob cache | `~/.cache/banger/oci/blobs/sha256/<hex>` |
| Staging dir | `~/.local/state/banger/images/<id>.staging/` |
| Extraction scratch | `$TMPDIR/banger-pull-<rand>/` |
| Published image | `~/.local/state/banger/images/<id>/rootfs.ext4` |
## Tech debt
- **Auth**. When we add private-registry support, the natural path
is `authn.DefaultKeychain`, which honours `~/.docker/config.json`
and the standard credential helpers.
- **Cache eviction**. OCI layer blobs accumulate forever. A
`banger image cache prune` command is a cheap follow-up when disk
usage becomes a complaint.
- **Non-systemd rootfses**. The guest agents assume systemd. Adding
openrc / s6 / busybox-init variants means keeping parallel unit
trees keyed on `/etc/os-release`.
## Trust model
`image pull` (OCI path) delegates trust to the registry the user
selected. `go-containerregistry` verifies layer digests against the
manifest during download, so a tampered mirror can't ship modified
layers without breaking the sha256 chain. Banger does not verify OCI
image signatures (cosign/sigstore) — users who care should verify
references out-of-band.