Commit graph

6 commits

Author SHA1 Message Date
bddfa75feb
imagepull.Pull: don't eager-open layer readers
The eager "fetch once to surface network errors" loop in Pull was
opening each layer's Compressed() stream and immediately closing it
without draining. The go-containerregistry filesystem cache populates
lazily via tee-on-read — opening and closing without reading wrote
ZERO-BYTE blobs into the cache. Every subsequent pull of the same
digest then served those corrupted blobs, producing a 1 GiB ext4
containing nothing but banger's injected files.
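
To illustrate the failure mode, here is a minimal sketch of a tee-on-read
cache. It is not go-containerregistry's implementation: teeBlob, openCached,
cachedSize and demoSizes are hypothetical names, and the only assumption
carried over from above is that the cache file is created at open time and
filled as the caller reads.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// teeBlob is a hypothetical stand-in for a lazily populated cache entry:
// bytes reach the cache file only as the caller reads them.
type teeBlob struct {
	io.Reader           // tee of the source stream into the cache file
	src       io.Closer // underlying layer stream
	file      *os.File  // cache file, created empty at open time
}

// Close closes both the source stream and the cache file.
func (t *teeBlob) Close() error {
	srcErr := t.src.Close()
	if err := t.file.Close(); err != nil {
		return err
	}
	return srcErr
}

// openCached creates the cache file up front and tees reads of src into it.
func openCached(src io.ReadCloser, cachePath string) (io.ReadCloser, error) {
	f, err := os.Create(cachePath)
	if err != nil {
		return nil, err
	}
	return &teeBlob{Reader: io.TeeReader(src, f), src: src, file: f}, nil
}

// cachedSize opens content through the cache, optionally drains it,
// closes it, and reports how many bytes ended up in the cache file.
func cachedSize(dir, name, content string, drain bool) (int64, error) {
	cachePath := filepath.Join(dir, name)
	rc, err := openCached(io.NopCloser(strings.NewReader(content)), cachePath)
	if err != nil {
		return 0, err
	}
	if drain {
		if _, err := io.Copy(io.Discard, rc); err != nil {
			rc.Close()
			return 0, err
		}
	}
	if err := rc.Close(); err != nil {
		return 0, err
	}
	info, err := os.Stat(cachePath)
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

// demoSizes compares open-then-close against open-drain-close.
func demoSizes() (eager, drained int64, err error) {
	dir, err := os.MkdirTemp("", "teecache")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)
	if eager, err = cachedSize(dir, "eager", "layer-bytes", false); err != nil {
		return 0, 0, err
	}
	if drained, err = cachedSize(dir, "drained", "layer-bytes", true); err != nil {
		return 0, 0, err
	}
	return eager, drained, nil
}

func main() {
	eager, drained, err := demoSizes()
	if err != nil {
		panic(err)
	}
	fmt.Printf("open+close: %d bytes cached; drained: %d bytes cached\n", eager, drained)
}
```

In this sketch the open-then-close path leaves an empty file behind, while
the drained path caches the full content.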

Symptom caught during B-4 live verification: real debian:bookworm
pulls had 43 used inodes (out of 65536) and /usr contained only
/usr/local — the debian content was silently missing.

Fix: remove the eager-fetch loop entirely. Flatten naturally drains
layers when it reads them, and the cache populates correctly on that
path. Network errors now surface from Flatten instead of Pull, which
is fine: Flatten is where layer bytes are actually read, so a failed
download was always going to surface there.

Test renamed from TestPullCachesLayersAndReturnsImage to
TestPullResolvesImageAndFlattenPopulatesCache and reworded to assert
the new contract: Pull resolves the image; Flatten is what populates
the cache with non-empty blobs.

Users with a corrupted cache from a pre-fix pull must clear it:
  rm -rf ~/.cache/banger/oci

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 19:03:52 -03:00
c3fb4ccc3e
Phase B-3: first-boot sshd install
New internal/imagepull/assets/first-boot.sh: POSIX-sh oneshot that
detects the guest distro from /etc/os-release (ID + ID_LIKE
fallback), installs openssh-server via the native package manager,
and enables/starts sshd. Covers debian/ubuntu/kali/raspbian/pop,
alpine, fedora/rhel/centos/rocky/almalinux, arch/manjaro, and
opensuse/suse. Unknown distros fail clearly with a pointer at
editing the script to add a branch.
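
The ID + ID_LIKE fallback can be sketched as follows. The real dispatch
lives in the POSIX-sh script; this Go version only mirrors the idea. The
family labels, the families table and the detectFamily / parseOSRelease
names are hypothetical, and the grouping follows the distro list above.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// families maps an os-release ID (or ID_LIKE token) to a distro family.
// The family labels are illustrative, not taken from the script.
var families = map[string]string{
	"debian": "debian", "ubuntu": "debian", "kali": "debian",
	"raspbian": "debian", "pop": "debian",
	"alpine": "alpine",
	"fedora": "fedora", "rhel": "fedora", "centos": "fedora",
	"rocky": "fedora", "almalinux": "fedora",
	"arch": "arch", "manjaro": "arch",
	"opensuse": "suse", "suse": "suse",
}

// parseOSRelease extracts ID and the space-separated ID_LIKE tokens.
func parseOSRelease(content string) (id string, idLike []string) {
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		key, val, ok := strings.Cut(strings.TrimSpace(sc.Text()), "=")
		if !ok {
			continue
		}
		val = strings.ToLower(strings.Trim(val, `"'`))
		switch key {
		case "ID":
			id = val
		case "ID_LIKE":
			idLike = strings.Fields(val)
		}
	}
	return id, idLike
}

// detectFamily tries ID first, then each ID_LIKE token in order, and
// fails clearly for distros it does not know.
func detectFamily(content string) (string, error) {
	id, idLike := parseOSRelease(content)
	for _, candidate := range append([]string{id}, idLike...) {
		if fam, ok := families[candidate]; ok {
			return fam, nil
		}
	}
	return "", fmt.Errorf("unsupported distro %q (ID_LIKE %v): add a branch for it", id, idLike)
}

func main() {
	fmt.Println(detectFamily("ID=debian\n"))
	fmt.Println(detectFamily("ID=linuxmint\nID_LIKE=\"ubuntu debian\"\n"))
	fmt.Println(detectFamily("ID=plan9\n"))
}
```

A derivative whose ID is unknown but whose ID_LIKE names a known distro
resolves through the fallback; anything else returns an error.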

Marker-driven: the service has
ConditionPathExists=/var/lib/banger/first-boot-pending, and the
script removes the marker on success. Subsequent boots no-op.

Testability seams in the script: RUN_PLAN=1 skips the
sshd-already-present short-circuit and makes the dispatch echo the
planned command instead of executing it. OS_RELEASE_FILE and
BANGER_FIRST_BOOT_MARKER env vars override paths so the Go tests
exercise the real dispatch logic in a tempdir without touching
/etc or /var/lib on the host.

Embedding: internal/imagepull/firstboot.go go:embeds both the
script and the systemd unit; exposes FirstBootScript() and
FirstBootUnit() plus the FirstBootScriptPath /
FirstBootMarkerPath / FirstBootUnitName constants.

Injection: InjectGuestAgents now drops:

  /usr/local/libexec/banger-first-boot            (script, 0755)
  /etc/systemd/system/banger-first-boot.service   (unit, 0644)
  /var/lib/banger/first-boot-pending              (empty marker, 0644)

  plus the multi-user.target.wants enable symlink.

All uid=0, gid=0.

Tests: eight-case dispatch-by-distro (debian, ubuntu, alpine,
fedora, arch, opensuse, plus ID_LIKE fallbacks for unusual
derivatives). Script syntax check via `sh -n`. Check that the unit
contains the expected fields. Existing inject round-trip test
extended to assert the first-boot bits land in the ext4.

Deferred: per-image FirstBootPending flag + extended SSH wait
timeout at VM start. Will add if live verification (B-4) shows
the naive retry UX is unacceptable.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 18:20:33 -03:00
491c8e1ebb
Phase B-2: pre-inject banger guest agents into pulled rootfs
New imagepull.InjectGuestAgents writes banger's guest-side assets
straight into the pulled ext4 so systemd will start them at first boot:

  /usr/local/bin/banger-vsock-agent             (binary, 0755)
  /usr/local/libexec/banger-network-bootstrap   (script, 0755)
  /etc/systemd/system/banger-network.service    (unit, 0644)
  /etc/systemd/system/banger-vsock-agent.service (unit, 0644)
  /etc/modules-load.d/banger-vsock.conf         (modules, 0644)

  plus enable-at-boot symlinks under
  /etc/systemd/system/multi-user.target.wants/

All writes + ownership + symlinks go through one `debugfs -w -f -`
invocation. No sudo required because the caller owns the ext4 file.
Script is deterministic: shallow-first mkdir, then write, then sif
(set_inode_field), then symlink. "File exists" errors from mkdir on
already-present dirs are tolerated (debugfs keeps going past them
with -f, and we filter them out of the output scan).
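
The ordering described above can be sketched like this. It is a sketch
under assumptions, not the package's code: asset, link, ancestors and
buildInjectScript are hypothetical names, and the S_IFREG file-type bits
are OR-ed into the mode on the assumption that sif sets the full inode
mode. The debugfs commands used (mkdir, write, sif, symlink) are real
ones.

```go
package main

import (
	"fmt"
	"path"
	"sort"
	"strings"
)

// asset is a hypothetical description of one file to inject.
type asset struct {
	HostPath  string // file on the host to copy in
	GuestPath string // absolute destination inside the ext4
	Perm      uint32 // permission bits, e.g. 0o755
}

// link is a hypothetical description of one symlink to create.
type link struct {
	Path   string // absolute symlink path inside the ext4
	Target string // what the symlink points at
}

// ancestors returns every parent directory of the given absolute paths,
// de-duplicated and ordered shallow-first (ties broken lexically).
func ancestors(guestPaths []string) []string {
	seen := map[string]bool{}
	var dirs []string
	for _, p := range guestPaths {
		for d := path.Dir(p); d != "/" && d != "."; d = path.Dir(d) {
			if !seen[d] {
				seen[d] = true
				dirs = append(dirs, d)
			}
		}
	}
	sort.Slice(dirs, func(i, j int) bool {
		di, dj := strings.Count(dirs[i], "/"), strings.Count(dirs[j], "/")
		if di != dj {
			return di < dj
		}
		return dirs[i] < dirs[j]
	})
	return dirs
}

// buildInjectScript emits mkdir, then write, then sif, then symlink.
func buildInjectScript(assets []asset, links []link) string {
	var guestPaths []string
	for _, a := range assets {
		guestPaths = append(guestPaths, a.GuestPath)
	}
	for _, l := range links {
		guestPaths = append(guestPaths, l.Path)
	}
	var b strings.Builder
	for _, d := range ancestors(guestPaths) {
		fmt.Fprintf(&b, "mkdir %s\n", d)
	}
	for _, a := range assets {
		fmt.Fprintf(&b, "write %s %s\n", a.HostPath, a.GuestPath)
	}
	for _, a := range assets {
		// 0o100000 is S_IFREG (assumption: mode carries the type bits).
		fmt.Fprintf(&b, "sif %s mode 0%o\n", a.GuestPath, 0o100000|a.Perm)
		fmt.Fprintf(&b, "sif %s uid 0\n", a.GuestPath)
		fmt.Fprintf(&b, "sif %s gid 0\n", a.GuestPath)
	}
	for _, l := range links {
		fmt.Fprintf(&b, "symlink %s %s\n", l.Path, l.Target)
	}
	return b.String()
}

func main() {
	fmt.Print(buildInjectScript(
		[]asset{{"/host/agent", "/usr/local/bin/banger-vsock-agent", 0o755}},
		[]link{{
			"/etc/systemd/system/multi-user.target.wants/banger-vsock-agent.service",
			"/etc/systemd/system/banger-vsock-agent.service",
		}},
	))
}
```

Sorting by depth guarantees a parent is created before any child, and
sorting ties lexically keeps the script byte-for-byte reproducible.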

Asset content reuses the existing guestnet.BootstrapScript /
SystemdServiceUnit / ConfigPath and vsockagent.ServiceUnit /
ModulesLoadConfig / GuestInstallPath — one source of truth, no
duplicated systemd unit strings.

Daemon wiring: new d.finalizePulledRootfs seam runs both
ApplyOwnership (B-1) and InjectGuestAgents as one phase between
BuildExt4 and StageBootArtifacts. The companion vsock-agent binary
is resolved via paths.CompanionBinaryPath. Existing daemon tests
stub the seam with a no-op to avoid needing a real companion
binary + debugfs in the test harness.

Tests: real-ext4 round-trip that builds a minimal ext4, runs
InjectGuestAgents, then verifies every expected path is present
via `debugfs stat`, plus uid=0 and mode 0755 on the vsock-agent
binary. Also: missing-binary rejection, ancestor-collection order
test. debugfs/mkfs.ext4 tests skip on hosts without the binaries.

After B-1+B-2, any OCI image that already ships sshd boots with
banger-network and banger-vsock-agent running; image pull is
one step from "useful rootfs primitive". B-3 (first-boot sshd
install) unlocks images that don't ship sshd.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 18:08:56 -03:00
43982a4ae3
Phase B-1: ownership fixup via debugfs pass
imagepull.Flatten now captures per-file uid/gid/mode/type from the
tar headers as it walks layers, returning a Metadata map alongside
the extracted tree. Whiteouts correctly drop the victim's metadata.
The returned Metadata feeds the new imagepull.ApplyOwnership, which
pipes a batched `set_inode_field` script to `debugfs -w -f -`.

Why: mkfs.ext4 -d copies the runner's on-disk uids verbatim, so
without this pass setuid binaries become setuid-nonroot and sshd
refuses to start on the resulting image. With the pass, a pulled
debian:bookworm has /usr/bin/sudo with uid=0 + setuid bit surviving
intact.
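
A minimal sketch of how such a batched script might be built. fileMeta and
buildOwnershipScript are hypothetical names, not the package's actual
types, and the Mode field is assumed to carry the full inode mode
including file-type bits. Paths containing whitespace would need quoting,
which this sketch does not handle.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// fileMeta is a hypothetical per-file record captured from a tar header.
type fileMeta struct {
	UID, GID int
	Mode     uint32 // assumed full inode mode, e.g. 0o104755 for setuid 4755
}

// buildOwnershipScript emits set_inode_field commands in sorted path
// order so the same metadata always yields the same script.
func buildOwnershipScript(meta map[string]fileMeta) string {
	paths := make([]string, 0, len(meta))
	for p := range meta {
		paths = append(paths, p)
	}
	sort.Strings(paths)

	var b strings.Builder
	for _, p := range paths {
		m := meta[p]
		fmt.Fprintf(&b, "set_inode_field %s uid %d\n", p, m.UID)
		fmt.Fprintf(&b, "set_inode_field %s gid %d\n", p, m.GID)
		fmt.Fprintf(&b, "set_inode_field %s mode 0%o\n", p, m.Mode)
	}
	return b.String()
}

func main() {
	fmt.Print(buildOwnershipScript(map[string]fileMeta{
		"/usr/bin/sudo": {UID: 0, GID: 0, Mode: 0o104755},
		"/etc/passwd":   {UID: 0, GID: 0, Mode: 0o100644},
	}))
}
```

The resulting text is what would be piped to `debugfs -w -f -` on stdin.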

imagepull.BuildExt4 signature unchanged; ownership is applied as a
separate step by the daemon orchestrator between BuildExt4 and
StageBootArtifacts, keeping each helper focused. The seam
(d.pullAndFlatten) now returns (Metadata, error) for test stubs to
feed synthetic metadata.

StdinRunner is a new duck-typed extension next to CommandRunner;
the real system.Runner implements RunStdin, test mocks don't need
to unless they exercise stdin. Prevents every existing mock from
growing a new method.
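
The duck-typed extension pattern, sketched with assumed signatures. The
method sets shown for CommandRunner and StdinRunner are guesses for
illustration, and runScript, plainMock and stdinMock are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// CommandRunner is the base interface every runner and mock satisfies.
// The signature is an assumption for this sketch.
type CommandRunner interface {
	Run(name string, args ...string) ([]byte, error)
}

// StdinRunner is the optional extension; only runners that can feed
// stdin implement it. The signature is an assumption for this sketch.
type StdinRunner interface {
	RunStdin(stdin io.Reader, name string, args ...string) ([]byte, error)
}

// runScript upgrades a CommandRunner to a StdinRunner via a type
// assertion and fails cleanly when the runner lacks the extension.
func runScript(r CommandRunner, script, name string, args ...string) ([]byte, error) {
	sr, ok := r.(StdinRunner)
	if !ok {
		return nil, errors.New("runner does not implement RunStdin")
	}
	return sr.RunStdin(strings.NewReader(script), name, args...)
}

// plainMock models an existing test mock: it only knows Run.
type plainMock struct{}

func (plainMock) Run(name string, args ...string) ([]byte, error) { return nil, nil }

// stdinMock opts in to the extension and echoes back what it was fed.
type stdinMock struct{ plainMock }

func (stdinMock) RunStdin(stdin io.Reader, name string, args ...string) ([]byte, error) {
	return io.ReadAll(stdin)
}

func main() {
	_, err := runScript(plainMock{}, "stat /\n", "debugfs")
	fmt.Println("plain mock:", err)
	out, err := runScript(stdinMock{}, "stat /\n", "debugfs")
	fmt.Printf("stdin mock: %q %v\n", out, err)
}
```

Existing mocks keep compiling unchanged; only tests that exercise stdin
need a mock with the extra method.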

Tests:
 - TestFlattenCapturesHeaderMetadata: setuid bit + mode survive the
   tar-header walk
 - TestApplyOwnershipRewritesUidGidMode: real debugfs round-trip —
   create ext4 with runner's uid, apply synthetic metadata setting
   uid=0 + setuid mode, verify via `debugfs -R stat` that the
   inode now has uid=0 and mode 04755
 - TestBuildOwnershipScriptDeterministic: sorted, well-formed
   sif script output

Debugfs and mkfs.ext4 tests skip if the binaries aren't on PATH.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 18:04:22 -03:00
fdaf7cce0f
imagepull + kernelcat: allow absolute symlink targets
Container (and kernel) layers routinely ship symlinks with absolute
targets — /usr/bin/mawk, /lib/modules/<ver>/build, etc. Those are
interpreted relative to the rootfs at runtime (`/` inside the VM),
not against the host filesystem, so they are rooted inside dest by
construction and need no escape check at write time.

The previous logic resolved absolute Linknames literally (against
the host root), compared to the staging dir, and rejected everything
that didn't happen to live under it. That made `banger image pull
docker.io/library/debian:bookworm` fail on the very first symlink
("etc/alternatives/awk -> /usr/bin/mawk").

Relative targets still get the traversal check — a relative
Linkname with ../s can genuinely escape dest at write time even if
in-VM resolution would be safe — so the defense against malicious
relative chains is intact.
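
The rule can be sketched as below. checkSymlink is a hypothetical helper,
not the function in flatten.go, and its error wording only borrows the
"unsafe symlink" phrase quoted in the tests.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// checkSymlink decides whether a tar symlink entry may be written under
// destDir. entryName is the entry's path inside the archive and
// linkname is the symlink target from the tar header.
func checkSymlink(destDir, entryName, linkname string) error {
	if filepath.IsAbs(linkname) {
		// Absolute targets resolve against the guest's "/" at runtime,
		// so there is nothing to escape at write time.
		return nil
	}
	// Relative targets resolve against the entry's own directory.
	resolved := filepath.Join(destDir, filepath.Dir(entryName), linkname)
	rel, err := filepath.Rel(destDir, resolved)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return fmt.Errorf("unsafe symlink %s -> %s", entryName, linkname)
	}
	return nil
}

func main() {
	dest := "/tmp/stage"
	fmt.Println(checkSymlink(dest, "etc/alternatives/awk", "/usr/bin/mawk"))
	fmt.Println(checkSymlink(dest, "usr/bin/awk", "../lib/awk"))
	fmt.Println(checkSymlink(dest, "etc/evil", "../../outside"))
}
```

Only the last call returns an error: its relative target climbs above the
staging directory.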

Tests:
 - TestFlattenAcceptsAbsoluteSymlink replaces the old overly-strict
   test, using the exact etc/alternatives/awk -> /usr/bin/mawk case
   that broke debian:bookworm.
 - TestFlattenRejectsRelativeSymlinkEscape confirms that a relative
   target with traversal is still rejected with the same
   "unsafe symlink" error.

Same fix applied in internal/kernelcat/fetch.go for consistency;
future kernel bundles with absolute symlinks in the modules tree
would otherwise hit the same wall.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 17:33:16 -03:00
78376ba6ec
Phase 1: imagepull package — pull, flatten, ext4
New internal/imagepull/ subpackage. Three concerns, each
independently testable:

Pull (imagepull.go):
 - github.com/google/go-containerregistry's remote.Image with the
   linux/amd64 platform pinned. Anonymous pulls only for v1.
 - Layer blobs cached on disk via cache.NewFilesystemCache under
   <cacheDir>/blobs/sha256/<hex> — OCI-standard layout so
   skopeo/crane could co-exist later.
 - Eagerly touches every layer once so network errors surface at
   Pull time, not deep in Flatten.

Flatten (flatten.go):
 - Replays layers oldest-first into destDir.
 - Whiteout-aware: .wh.<name> deletes the named entry,
   .wh..wh..opq wipes the parent directory's contents from prior
   layers.
 - Path-traversal hardening mirrored from kernelcat extractTar:
   reject .., absolute paths, and symlinks/hardlinks whose
   resolved target escapes destDir.
 - Handles tar.TypeReg, TypeDir, TypeSymlink, TypeLink. Skips
   device/fifo nodes silently (need privilege; udev/devtmpfs
   handles them in the guest).
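
A minimal sketch of the whiteout naming rules listed above. classify and
the whiteoutKind constants are hypothetical names; the real Flatten also
has to apply the deletion, which is omitted here.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

type whiteoutKind int

const (
	notWhiteout whiteoutKind = iota // ordinary entry, extract as usual
	removeEntry                     // .wh.<name>: delete <name> from prior layers
	opaqueDir                       // .wh..wh..opq: wipe the parent dir's prior contents
)

const (
	whiteoutPrefix = ".wh."
	opaqueMarker   = ".wh..wh..opq"
)

// classify inspects a tar entry name and returns what kind of entry it
// is plus the path the action applies to.
func classify(name string) (whiteoutKind, string) {
	clean := path.Clean(name)
	dir, base := path.Split(clean)
	switch {
	case base == opaqueMarker:
		// Checked first: the opaque marker also starts with ".wh.".
		return opaqueDir, path.Clean(dir)
	case strings.HasPrefix(base, whiteoutPrefix):
		return removeEntry, path.Join(dir, strings.TrimPrefix(base, whiteoutPrefix))
	default:
		return notWhiteout, clean
	}
}

func main() {
	for _, n := range []string{"etc/.wh.passwd", "var/cache/.wh..wh..opq", "usr/bin/awk"} {
		kind, target := classify(n)
		fmt.Println(n, kind, target)
	}
}
```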

BuildExt4 (ext4.go):
 - Truncates outFile to sizeBytes, then runs `mkfs.ext4 -F -d
   <srcDir> -E root_owner=0:0`. No mount, no sudo, no loopback.
 - 64 MiB floor; callers handle real sizing with content-aware
   headroom.
 - File ownership in the resulting ext4 reflects srcDir's on-disk
   ownership — runner's uid/gid since extraction was unprivileged.
   Documented in package doc as a Phase A v1 limitation; Phase B
   will add a debugfs- or tar2ext4-based ownership fixup.
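
The truncate-then-mkfs sequence, sketched under assumptions: buildExt4 and
mkfsArgs here are illustrative and may not match the real signatures in
ext4.go. The flags are the ones named above, with the output file
assumed to be the final argument.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// minExt4Size is the 64 MiB floor mentioned above.
const minExt4Size = 64 << 20

// mkfsArgs returns the mkfs.ext4 arguments for populating outFile from
// srcDir with the root directory owned by 0:0.
func mkfsArgs(srcDir, outFile string) []string {
	return []string{"-F", "-d", srcDir, "-E", "root_owner=0:0", outFile}
}

// buildExt4 sizes outFile as a sparse file, then formats it in place.
// No mount, sudo, or loop device is involved.
func buildExt4(srcDir, outFile string, sizeBytes int64) error {
	if sizeBytes < minExt4Size {
		return errors.New("ext4 image size below the 64 MiB floor")
	}
	f, err := os.Create(outFile)
	if err != nil {
		return err
	}
	if err := f.Truncate(sizeBytes); err != nil {
		f.Close()
		return err
	}
	if err := f.Close(); err != nil {
		return err
	}
	out, err := exec.Command("mkfs.ext4", mkfsArgs(srcDir, outFile)...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("mkfs.ext4: %w: %s", err, out)
	}
	return nil
}

func main() {
	fmt.Println(mkfsArgs("/tmp/rootfs", "/tmp/rootfs.ext4"))
	// A too-small size is rejected before any file is touched.
	fmt.Println(buildExt4("/tmp/rootfs", "/tmp/rootfs.ext4", 1<<20))
}
```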

paths.Layout gains OCICacheDir at $XDG_CACHE_HOME/banger/oci/,
ensured at startup alongside the other dirs.

Tests use go-containerregistry's in-process registry to push and
pull synthetic multi-layer images. Cover: layer caching round-trip,
whiteout + opaque-marker handling, path-traversal rejection, unsafe
symlink rejection, real mkfs.ext4 round-trip (skipped if mkfs.ext4
absent), and tiny-size rejection.

go-containerregistry v0.21.5 added as a direct dep, plus its
transitive closure (containerd/stargz, opencontainers/go-digest,
docker/cli config helpers, etc).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 17:22:13 -03:00