imagecat,kernelcat: bound staged download, hash before extract

Both Fetch flows previously streamed resp.Body straight into
zstd → tar → on-disk extractor, with the SHA256 check tacked on at
the END. A bad mirror, or an attacker who has compromised the catalog
host, could ship a multi-gigabyte tarball that banger expands to disk
in full, only THEN printing the helpful "sha256 mismatch" message,
after the host filesystem has already filled up.

Reorder the operations: stage the compressed tarball to a temp file
under the destination directory through an io.LimitReader set to
cap+1 bytes (the extra byte is what reveals an over-cap body), hash
on the way in, and refuse to decompress if either the cap trips or
the SHA mismatches. Worst-case disk use for an unverified download
is bounded by the cap, not by the source.

Cap is exposed as a package var (MaxFetchedBundleBytes,
MaxFetchedKernelBytes) so callers can tune per-deployment and tests
can squeeze it down to provoke the rejection. Default 8 GiB —
generous enough for a 4 GiB rootfs (which compresses to ~1-2 GiB),
tight enough to make a "fill the host disk" attack expensive.

The temp file lives in the destination dir so extraction stays on
the same filesystem and we don't pay for cross-FS rename. A deferred
os.Remove cleans up the temp file; the existing per-package cleanup()
handler still removes any partial extraction on hash mismatch or
extraction failure.

Tests: each package gets a
TestFetchRejectsOversizedTarballBeforeExtraction that sets the cap to
64 bytes, points Fetch at a multi-KB tarball, and asserts (a) the
error mentions "cap", (b) the destination dir is left clean (no
leaked rootfs / manifest / kernel tree). All existing tests still
pass: happy path, hash mismatch, missing files, path traversal, HTTP
error, etc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Thales Maciel 2026-04-28 16:09:55 -03:00
parent 3805b093b4
commit 4004ce2e7e
4 changed files with 172 additions and 28 deletions

@@ -130,6 +130,44 @@ func TestFetchRejectsSHA256Mismatch(t *testing.T) {
	}
}

// TestFetchRejectsOversizedTarballBeforeExtraction pins the new
// disk-bound cap: by setting MaxFetchedBundleBytes very low, the
// staged-tarball download must trip the limit and refuse to even
// decompress, leaving the destination dir clean. This is the
// "compromised mirror floods the host" scenario.
func TestFetchRejectsOversizedTarballBeforeExtraction(t *testing.T) {
	manifest := Manifest{Name: "debian-bookworm"}
	bundle, sum := makeBundle(t, manifest, bytes.Repeat([]byte("x"), 4096))
	srv := serveBundle(t, bundle)
	t.Cleanup(srv.Close)
	prev := MaxFetchedBundleBytes
	MaxFetchedBundleBytes = 64
	t.Cleanup(func() { MaxFetchedBundleBytes = prev })
	dest := t.TempDir()
	_, err := Fetch(context.Background(), srv.Client(), dest, CatEntry{
		Name:          "debian-bookworm",
		TarballURL:    srv.URL + "/bundle.tar.zst",
		TarballSHA256: sum,
	})
	if err == nil {
		t.Fatal("Fetch succeeded against an oversized tarball; want size-cap rejection")
	}
	if !strings.Contains(err.Error(), "cap") {
		t.Fatalf("err = %v, want size-cap message", err)
	}
	// dest must be untouched: no rootfs, no manifest, no leftover tmp.
	entries, _ := os.ReadDir(dest)
	if len(entries) != 0 {
		var names []string
		for _, e := range entries {
			names = append(names, e.Name())
		}
		t.Fatalf("dest left dirty after size-cap rejection: %v", names)
	}
}

func TestFetchRejectsUnexpectedTarEntry(t *testing.T) {
	// Hand-roll a bundle with a third, disallowed entry.
	var rawTar bytes.Buffer