Commit graph

2 commits

Author SHA1 Message Date
bddfa75feb
imagepull.Pull: don't eager-open layer readers
The eager "fetch once to surface network errors" loop in Pull was
opening each layer's Compressed() stream and immediately closing it
without draining. The go-containerregistry filesystem cache populates
lazily via tee-on-read — opening and closing without reading wrote
ZERO-BYTE blobs into the cache. Every subsequent pull of the same
digest then served those corrupted blobs, producing a 1 GiB ext4
containing nothing but banger's injected files.

Symptom caught during B-4 live verification: real debian:bookworm
pulls had 43 used inodes (out of 65536) and /usr contained only
/usr/local — the debian content was silently missing.

Fix: remove the eager-fetch loop entirely. Flatten naturally drains
layers when it reads them, and the cache populates correctly on that
path. Network errors now surface from Flatten instead of Pull, which
is fine — they surface at the same place they always had to.

Test TestPullCachesLayersAndReturnsImage → TestPullResolvesImageAnd
FlattenPopulatesCache, reworded to assert the new contract: Pull
resolves the image; Flatten is what populates the cache with
non-empty blobs.

Users with a corrupted cache from a pre-fix pull must clear it:
  rm -rf ~/.cache/banger/oci

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 19:03:52 -03:00
78376ba6ec
Phase 1: imagepull package — pull, flatten, ext4
New internal/imagepull/ subpackage. Three concerns, each
independently testable:

Pull (imagepull.go):
 - github.com/google/go-containerregistry's remote.Image with the
   linux/amd64 platform pinned. Anonymous pulls only for v1.
 - Layer blobs cached on disk via cache.NewFilesystemCache under
   <cacheDir>/blobs/sha256/<hex> — OCI-standard layout so
   skopeo/crane could co-exist later.
 - Eagerly touches every layer once so network errors surface at
   Pull time, not deep in Flatten.

Flatten (flatten.go):
 - Replays layers oldest-first into destDir.
 - Whiteout-aware: .wh.<name> deletes the named entry,
   .wh..wh..opq wipes the parent directory's contents from prior
   layers.
 - Path-traversal hardening mirrored from kernelcat extractTar:
   reject .., absolute paths, and symlinks/hardlinks whose
   resolved target escapes destDir.
 - Handles tar.TypeReg, TypeDir, TypeSymlink, TypeLink. Skips
   device/fifo nodes silently (need privilege; udev/devtmpfs
   handles them in the guest).

BuildExt4 (ext4.go):
 - Truncates outFile to sizeBytes, then runs `mkfs.ext4 -F -d
   <srcDir> -E root_owner=0:0`. No mount, no sudo, no loopback.
 - 64 MiB floor; callers handle real sizing with content-aware
   headroom.
 - File ownership in the resulting ext4 reflects srcDir's on-disk
   ownership — runner's uid/gid since extraction was unprivileged.
   Documented in package doc as a Phase A v1 limitation; Phase B
   will add a debugfs- or tar2ext4-based ownership fixup.

paths.Layout gains OCICacheDir at $XDG_CACHE_HOME/banger/oci/,
ensured at startup alongside the other dirs.

Tests use go-containerregistry's in-process registry to push and
pull synthetic multi-layer images. Cover: layer caching round-trip,
whiteout + opaque-marker handling, path-traversal rejection, unsafe
symlink rejection, real mkfs.ext4 round-trip (skipped if mkfs.ext4
absent), and tiny-size rejection.

go-containerregistry v0.21.5 added as a direct dep, plus its
transitive closure (containerd/stargz, opencontainers/go-digest,
docker/cli config helpers, etc).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 17:22:13 -03:00