Preserve cleanup after daemon restarts and harden OCI and tar imports
against filenames that debugfs cannot encode safely.
Mirror tap, loop, and dm teardown identity onto VM.Runtime, teach
cleanup and reconcile to fall back to those persisted fields when
handles.json is missing or corrupt, and clear the recovery state on
stop, error, and delete paths.
Reject debugfs-hostile entry names during flattening and in
ApplyOwnership itself, then add regression coverage for corrupt
handles.json recovery and unsafe import paths.
Verified with targeted go tests, make lint-go, make lint-shell, and
make build.
The eager "fetch once to surface network errors" loop in Pull was
opening each layer's Compressed() stream and immediately closing it
without draining. The go-containerregistry filesystem cache populates
lazily via tee-on-read — opening and closing without reading wrote
ZERO-BYTE blobs into the cache. Every subsequent pull of the same
digest then served those corrupted blobs, producing a 1 GiB ext4
containing nothing but banger's injected files.
Symptom caught during B-4 live verification: real debian:bookworm
pulls had 43 used inodes (out of 65536) and /usr contained only
/usr/local — the debian content was silently missing.
Fix: remove the eager-fetch loop entirely. Flatten naturally drains
layers when it reads them, and the cache populates correctly on that
path. Network errors now surface from Flatten instead of Pull, which
is fine — they surface at the same place they always had to.
Test TestPullCachesLayersAndReturnsImage → TestPullResolvesImageAnd
FlattenPopulatesCache, reworded to assert the new contract: Pull
resolves the image; Flatten is what populates the cache with
non-empty blobs.
Users with a corrupted cache from a pre-fix pull must clear it:
rm -rf ~/.cache/banger/oci
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
imagepull.Flatten now captures per-file uid/gid/mode/type from the
tar headers as it walks layers, returning a Metadata map alongside
the extracted tree. Whiteouts correctly drop the victim's metadata.
The returned Metadata feeds the new imagepull.ApplyOwnership, which
pipes a batched `set_inode_field` script to `debugfs -w -f -`.
Why: mkfs.ext4 -d copies the runner's on-disk uids verbatim, so
without this pass setuid binaries become setuid-nonroot and sshd
refuses to start on the resulting image. With the pass, a pulled
debian:bookworm has /usr/bin/sudo with uid=0 + setuid bit surviving
intact.
imagepull.BuildExt4 signature unchanged; ownership is applied as a
separate step by the daemon orchestrator between BuildExt4 and
StageBootArtifacts, keeping each helper focused. The seam
(d.pullAndFlatten) now returns (Metadata, error) for test stubs to
feed synthetic metadata.
StdinRunner is a new duck-typed extension next to CommandRunner;
the real system.Runner implements RunStdin, test mocks don't need
to unless they exercise stdin. Prevents every existing mock from
growing a new method.
Tests:
- TestFlattenCapturesHeaderMetadata: setuid bit + mode survive the
tar-header walk
- TestApplyOwnershipRewritesUidGidMode: real debugfs round-trip —
create ext4 with runner's uid, apply synthetic metadata setting
uid=0 + setuid mode, verify via `debugfs -R stat` that the
inode now has uid=0 and mode 04755
- TestBuildOwnershipScriptDeterministic: sorted, well-formed
sif script output
Debugfs and mkfs.ext4 tests skip if the binaries aren't on PATH.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Container (and kernel) layers routinely ship symlinks with absolute
targets — /usr/bin/mawk, /lib/modules/<ver>/build, etc. Those are
interpreted relative to the rootfs at runtime (`/` inside the VM),
not against the host filesystem, so they are rooted inside dest by
construction and need no escape check at write time.
The previous logic resolved absolute Linknames literally (against
the host root), compared to the staging dir, and rejected everything
that didn't happen to live under it. That made `banger image pull
docker.io/library/debian:bookworm` fail on the very first symlink
("etc/alternatives/awk -> /usr/bin/mawk").
Relative targets still get the traversal check — a relative
Linkname with ../s can genuinely escape dest at write time even if
in-VM resolution would be safe — so the defense against malicious
relative chains is intact.
Tests:
- TestFlattenAcceptsAbsoluteSymlink replaces the old overly-strict
test, using the exact etc/alternatives/awk -> /usr/bin/mawk case
that broke debian:bookworm.
- TestFlattenRejectsRelativeSymlinkEscape confirms relative-with-
traversal is still rejected with the same "unsafe symlink"
error.
Same fix applied in internal/kernelcat/fetch.go for consistency;
future kernel bundles with absolute symlinks in the modules tree
would otherwise hit the same wall.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New internal/imagepull/ subpackage. Three concerns, each
independently testable:
Pull (imagepull.go):
- github.com/google/go-containerregistry's remote.Image with the
linux/amd64 platform pinned. Anonymous pulls only for v1.
- Layer blobs cached on disk via cache.NewFilesystemCache under
<cacheDir>/blobs/sha256/<hex> — OCI-standard layout so
skopeo/crane could co-exist later.
- Eagerly touches every layer once so network errors surface at
Pull time, not deep in Flatten.
Flatten (flatten.go):
- Replays layers oldest-first into destDir.
- Whiteout-aware: .wh.<name> deletes the named entry,
.wh..wh..opq wipes the parent directory's contents from prior
layers.
- Path-traversal hardening mirrored from kernelcat extractTar:
reject .., absolute paths, and symlinks/hardlinks whose
resolved target escapes destDir.
- Handles tar.TypeReg, TypeDir, TypeSymlink, TypeLink. Skips
device/fifo nodes silently (need privilege; udev/devtmpfs
handles them in the guest).
BuildExt4 (ext4.go):
- Truncates outFile to sizeBytes, then runs `mkfs.ext4 -F -d
<srcDir> -E root_owner=0:0`. No mount, no sudo, no loopback.
- 64 MiB floor; callers handle real sizing with content-aware
headroom.
- File ownership in the resulting ext4 reflects srcDir's on-disk
ownership — runner's uid/gid since extraction was unprivileged.
Documented in package doc as a Phase A v1 limitation; Phase B
will add a debugfs- or tar2ext4-based ownership fixup.
paths.Layout gains OCICacheDir at $XDG_CACHE_HOME/banger/oci/,
ensured at startup alongside the other dirs.
Tests use go-containerregistry's in-process registry to push and
pull synthetic multi-layer images. Cover: layer caching round-trip,
whiteout + opaque-marker handling, path-traversal rejection, unsafe
symlink rejection, real mkfs.ext4 round-trip (skipped if mkfs.ext4
absent), and tiny-size rejection.
go-containerregistry v0.21.5 added as a direct dep, plus its
transitive closure (containerd/stargz, opencontainers/go-digest,
docker/cli config helpers, etc).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>