imagepull: update stale package + BuildExt4 docs

The package doc in internal/imagepull/imagepull.go still described
a three-step Pull + Flatten + BuildExt4 pipeline and warned that the
resulting image was "suitable as input to `image build` but not
directly bootable" because ownership preservation was deferred.
That's been wrong for a while: ApplyOwnership
(internal/imagepull/ownership.go) restores tar-header uid/gid/mode
via a debugfs set_inode_field batch, and InjectGuestAgents
(internal/imagepull/inject.go) writes banger's guest-side assets
into the image. `image pull` now produces a directly bootable
rootfs end-to-end.
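
For readers unfamiliar with the debugfs approach, here is a minimal
sketch of how a set_inode_field batch could be rendered from captured
tar-header values. The `entry` type and `ownershipScript` function are
hypothetical stand-ins, not the code in ownership.go; only the
`set_inode_field` command and its uid/gid/mode fields come from debugfs
itself. The sketch assumes paths without whitespace and modes that
already include the file-type bits.

```go
package main

import (
	"fmt"
	"strings"
)

// entry is a hypothetical stand-in for one record of the Metadata that
// Flatten captures; the real type in internal/imagepull may differ.
type entry struct {
	Path string // absolute path inside the image, e.g. "/usr/bin/sudo"
	UID  int
	GID  int
	Mode uint32 // full inode mode, file-type bits included (e.g. 0o104755)
}

// ownershipScript renders three set_inode_field commands per entry:
// one each for uid, gid, and mode. The mode is written in octal with a
// leading zero so debugfs parses it as an octal number.
func ownershipScript(entries []entry) string {
	var b strings.Builder
	for _, e := range entries {
		fmt.Fprintf(&b, "set_inode_field %s uid %d\n", e.Path, e.UID)
		fmt.Fprintf(&b, "set_inode_field %s gid %d\n", e.Path, e.GID)
		fmt.Fprintf(&b, "set_inode_field %s mode 0%o\n", e.Path, e.Mode)
	}
	return b.String()
}

func main() {
	// A setuid-root regular file: S_IFREG | setuid | 0755.
	fmt.Print(ownershipScript([]entry{
		{Path: "/usr/bin/sudo", UID: 0, GID: 0, Mode: 0o104755},
	}))
}
```

A script like this can be applied with `debugfs -w -f <script> <image>`,
which needs neither a mount nor root as long as the image file itself is
writable.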

Updated:
  - imagepull.go package doc — describes the full
    Pull → Flatten → BuildExt4 → ApplyOwnership → InjectGuestAgents
    pipeline and drops the "Phase A limitations" list that spoke
    of deferred ownership.
  - ext4.go BuildExt4 doc — notes that the filesystem is root-owned
    via `-E root_owner=0:0` and points at ApplyOwnership as the
    step that handles per-file ownership, instead of the previous
    "see the package doc for the implications" handwave.

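The call order in the updated package doc can be sketched as a list of
named steps that stops at the first failure. Everything below is
illustrative: `step`, `runPipeline`, and `stubSteps` are hypothetical
names, and the stubs stand in for the real functions, whose signatures
(apart from BuildExt4's) are not shown in this commit.

```go
package main

import (
	"errors"
	"fmt"
)

// step pairs a pipeline stage name with the work it performs.
type step struct {
	name string
	run  func() error
}

// runPipeline runs steps in order and stops at the first error,
// returning the names of the steps that completed before it.
func runPipeline(steps []step) ([]string, error) {
	var done []string
	for _, s := range steps {
		if err := s.run(); err != nil {
			return done, fmt.Errorf("%s: %w", s.name, err)
		}
		done = append(done, s.name)
	}
	return done, nil
}

// errStub is the error the stubbed step returns when asked to fail.
var errStub = errors.New("stub failure")

// stubSteps builds the five stages in documented call order. Each stage
// is a no-op stub, except that the one named failAt returns errStub.
func stubSteps(failAt string) []step {
	names := []string{"Pull", "Flatten", "BuildExt4", "ApplyOwnership", "InjectGuestAgents"}
	steps := make([]step, 0, len(names))
	for _, n := range names {
		n := n // capture a per-iteration copy for the closure
		steps = append(steps, step{name: n, run: func() error {
			if n == failAt {
				return errStub
			}
			return nil
		}})
	}
	return steps
}

func main() {
	done, err := runPipeline(stubSteps("ApplyOwnership"))
	fmt.Println(done, err)
}
```

The point of the ordering is that ApplyOwnership and InjectGuestAgents
both operate on the ext4 file, so they can only run after BuildExt4 has
produced it.
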
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Thales Maciel 2026-04-23 14:34:25 -03:00
parent 5eceebe49f
commit 2ebd2b64bb
GPG key ID: 33112E6833C34679
2 changed files with 32 additions and 21 deletions


@@ -17,12 +17,14 @@ const MinExt4Size int64 = 1 << 20 * 64 // 64 MiB
 // BuildExt4 creates outFile as a sparse ext4 image of sizeBytes and
 // populates it from srcDir using `mkfs.ext4 -F -d`. No mount, no sudo.
 //
-// sizeBytes must be at least MinExt4Size. Callers are expected to size
-// the file with headroom over the staged tree (the daemon orchestrator
-// does this; this function only enforces a sanity floor).
+// sizeBytes must be at least MinExt4Size. Callers size the file with
+// headroom over the staged tree (the daemon orchestrator does this;
+// this function only enforces a sanity floor).
 //
-// The resulting image's file ownership reflects srcDir's on-disk
-// ownership — see the package doc for the implications.
+// The filesystem itself is root-owned via `-E root_owner=0:0`, but
+// the per-file uid/gid/mode inside srcDir are the runner's — Go's
+// unprivileged tar extraction can't preserve them. The pipeline's
+// next step, ApplyOwnership, restores the tar-header values.
 func BuildExt4(ctx context.Context, runner system.CommandRunner, srcDir, outFile string, sizeBytes int64) error {
 	if sizeBytes < MinExt4Size {
 		return fmt.Errorf("ext4 size %d below minimum %d", sizeBytes, MinExt4Size)
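
The headroom sizing that the BuildExt4 doc leaves to callers could look
roughly like the sketch below. `sizeWithHeadroom`, the 25% ratio, and
the MiB rounding are assumptions for illustration; the daemon
orchestrator's actual policy is not part of this diff. MinExt4Size is
copied from the hunk header above. In Go, `<<` and `*` share the same
precedence and associate left to right, so `1 << 20 * 64` is
`(1 << 20) * 64`, i.e. 64 MiB.

```go
package main

import "fmt"

// MinExt4Size mirrors the constant shown in ext4.go.
const MinExt4Size int64 = 1 << 20 * 64 // 64 MiB

// sizeWithHeadroom is a hypothetical helper: it adds 25% headroom to
// the staged tree size, clamps to MinExt4Size, and rounds up to a
// whole MiB.
func sizeWithHeadroom(treeBytes int64) int64 {
	const mib = int64(1) << 20
	size := treeBytes + treeBytes/4 // assumed headroom ratio
	if size < MinExt4Size {
		size = MinExt4Size
	}
	return (size + mib - 1) / mib * mib
}

func main() {
	for _, tree := range []int64{0, 10 << 20, 400 << 20} {
		fmt.Printf("tree=%d bytes -> image=%d bytes\n", tree, sizeWithHeadroom(tree))
	}
}
```

Small trees land on the 64 MiB floor, so BuildExt4's own check only
trips when a caller bypasses this kind of sizing.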


@@ -1,26 +1,35 @@
 // Package imagepull pulls OCI container images from registries and lays
-// them down as banger-ready ext4 rootfs files. The package is a primitive:
-// it produces an ext4 file plus per-file ownership metadata. Higher layers
-// (the daemon's PullImage orchestrator) decide where the file lands and
-// how it gets registered.
+// them down as banger-ready, directly-bootable ext4 rootfs files. The
+// package is a primitive: each step does one thing and returns. The
+// daemon's PullImage orchestrator (internal/daemon/images_pull.go)
+// drives the pipeline and decides where the output lands.
 //
+// Pipeline, in call order:
+//
-// Three concerns:
 // - Pull resolves an OCI reference, selects the linux/amd64 platform,
-// and returns a v1.Image whose layer blobs are cached on disk so
-// re-pulls are cheap.
+// and returns a v1.Image whose layer blobs are cached on disk under
+// cacheDir/blobs/sha256/<hex> so re-pulls are local.
 // - Flatten replays the layers in order into a staging directory,
-// applies whiteouts, and rejects unsafe paths/symlinks.
-// - BuildExt4 turns that staging directory into an ext4 file via
-// `mkfs.ext4 -d` (no mount, no sudo).
+// applies whiteouts, rejects unsafe paths/symlinks, and returns
+// Metadata capturing the original tar-header uid/gid/mode for
+// every entry.
+// - BuildExt4 turns the staging directory into an ext4 file via
+// `mkfs.ext4 -F -d` (no mount, no sudo). Root-owns the filesystem
+// via `-E root_owner=0:0`.
+// - ApplyOwnership streams a debugfs `set_inode_field` script to
+// rewrite per-file uid/gid/mode from the captured Metadata —
+// restores setuid bits, root-owned configs, etc. that `mkfs.ext4
+// -d` would have left as the runner's uid/gid.
+// - InjectGuestAgents writes banger's guest-side assets (vsock
+// agent binary + systemd unit, network bootstrap script + unit,
+// vsock module load) into the image in a single debugfs -w batch.
 //
-// Limitations (Phase A v1):
+// The result is a bootable rootfs. The daemon registers it with the
+// image store; from then on, `vm run` uses it like any other image.
+//
+// Limitations:
 // - Anonymous registry pulls only. Auth is deferred.
 // - Hardcoded linux/amd64. Other platforms reject at Pull time.
-// - File ownership in the resulting ext4 is the runner's uid/gid;
-// setuid binaries and root-owned config files lose their original
-// ownership. Phase B will add a debugfs- or tar2ext4-based fixup
-// pass; until then the produced image is suitable as input to
-// `image build` but not directly bootable.
 package imagepull
 import (