imagepull.Pull: don't eager-open layer readers

The eager "fetch once to surface network errors" loop in Pull was
opening each layer's Compressed() stream and immediately closing it
without draining. The go-containerregistry filesystem cache populates
lazily via tee-on-read — opening and closing without reading wrote
ZERO-BYTE blobs into the cache. Every subsequent pull of the same
digest then served those corrupted blobs, producing a 1 GiB ext4
containing nothing but banger's injected files.

Symptom caught during B-4 live verification: real debian:bookworm
pulls had 43 used inodes (out of 65536) and /usr contained only
/usr/local — the debian content was silently missing.

Fix: remove the eager-fetch loop entirely. Flatten naturally drains
layers when it reads them, and the cache populates correctly on that
path. Network errors now surface from Flatten instead of Pull, which
is fine — they surface at the same place they always had to.

Test TestPullCachesLayersAndReturnsImage → TestPullResolvesImageAnd
FlattenPopulatesCache, reworded to assert the new contract: Pull
resolves the image; Flatten is what populates the cache with
non-empty blobs.

Users with a corrupted cache from a pre-fix pull must clear it:
  rm -rf ~/.cache/banger/oci

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Thales Maciel 2026-04-16 19:03:52 -03:00
parent c3fb4ccc3e
commit bddfa75feb
No known key found for this signature in database
GPG key ID: 33112E6833C34679
2 changed files with 27 additions and 22 deletions

View file

@ -78,20 +78,11 @@ func Pull(ctx context.Context, ref, cacheDir string) (PulledImage, error) {
return PulledImage{}, fmt.Errorf("resolve digest for %q: %w", ref, err)
}
// Touch the layers once so they are guaranteed present in the cache
// before Flatten runs; surfaces network errors here, not deep inside
// Flatten's hot loop.
layers, err := cached.Layers()
if err != nil {
return PulledImage{}, fmt.Errorf("read layers for %q: %w", ref, err)
}
for i, layer := range layers {
rc, err := layer.Compressed()
if err != nil {
return PulledImage{}, fmt.Errorf("fetch layer %d for %q: %w", i, ref, err)
}
_ = rc.Close()
}
// The filesystem cache populates lazily: blobs only land on disk once
// Flatten drains them via layer.Uncompressed() / Compressed(). We
// deliberately do NOT eagerly open layers here — opening without
// draining writes a zero-byte blob to the cache, which then poisons
// every subsequent pull of the same digest.
return PulledImage{
Reference: parsed.String(),