Three load-bearing fixes that together let `banger update` (and its
auto-rollback path) restart the helper + daemon without killing every
running VM. New smoke scenarios prove the property end-to-end.

Bug fixes:

1. Disable the firecracker SDK's signal-forwarding goroutine. The
   default ForwardSignals = [SIGINT, SIGQUIT, SIGTERM, SIGHUP,
   SIGABRT] installs a handler in the helper that propagates the
   helper's SIGTERM (sent by systemd on
   `systemctl stop bangerd-root.service`) to every running
   firecracker child. Set ForwardSignals to an empty (non-nil) slice
   so setupSignals short-circuits at len()==0.

2. Add SendSIGKILL=no to bangerd-root.service. KillMode=process
   limits the initial SIGTERM to the helper main, but systemd still
   SIGKILLs leftover cgroup processes during the FinalKillSignal
   stage unless SendSIGKILL=no.

3. Route restart-helper / restart-daemon / wait-daemon-ready failures
   through rollbackAndRestart instead of rollbackAndWrap.
   rollbackAndWrap restored .previous binaries but didn't re-restart
   the failed unit, leaving the helper dead with the rolled-back
   binary on disk after a failed update.

Testing infrastructure (production binaries unaffected):

- Hidden --manifest-url and --pubkey-file flags on `banger update`
  let the smoke harness redirect the updater at locally-built release
  artefacts. Marked Hidden in cobra; not advertised in --help.
- FetchManifestFrom / VerifyBlobSignatureWithKey /
  FetchAndVerifySignatureWithKey export the existing logic against
  caller-supplied URL / pubkey. The default entry points still call
  them with the embedded canonical values.
Smoke scenarios:

- update_check: --check against fake manifest reports update
  available
- update_to_unknown: --to v9.9.9 fails before any host mutation
- update_no_root: refuses without sudo, install untouched
- update_dry_run: stages + verifies, no swap, version unchanged
- update_keeps_vm_alive: real swap to v0.smoke.0; same VM (same
  boot_id) answers SSH after the daemon restart
- update_rollback_keeps_vm_alive: v0.smoke.broken-bangerd ships a
  bangerd that passes --check-migrations but exits 1 as the daemon.
  The post-swap `systemctl restart bangerd` fails, rollbackAndRestart
  fires, the .previous binaries are restored and re-restarted; the
  same VM still answers SSH afterwards
- daemon_admin (separate prep): covers `banger daemon socket`,
  `bangerd --check-migrations --system`, `sudo banger daemon stop`

The smoke release builder generates a fresh ECDSA P-256 keypair with
openssl, signs SHA256SUMS cosign-compatibly, and serves artefacts from
a backgrounded python http.server. verify_smoke_check_test.go pins the
openssl/cosign signature equivalence so the smoke release builder
can't silently drift.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
177 lines
6.9 KiB
Go
// Package updater drives `banger update`: discover a new release,
// download + verify it, swap binaries atomically, restart the systemd
// units, run doctor, roll back on failure. The package is split across
// files by responsibility: manifest.go owns the release-discovery
// shape, and each remaining concern lives in its own file.
package updater

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// manifestURL is the canonical URL of banger's release manifest on
// the Cloudflare R2 bucket. Hardcoded (rather than pulling from
// config) so a compromised daemon config can't redirect the updater
// to a different bucket. Var (not const) only because tests need to
// point at an httptest.Server; production never mutates it.
//
// The bucket lives at releases.thaloco.com; the path /banger/ scopes
// it inside the bucket so the same host can serve other projects'
// release artifacts later.
var manifestURL = "https://releases.thaloco.com/banger/manifest.json"

// ManifestURL exposes the configured URL for callers that want to
// surface it in user-facing output (e.g. `banger update --check`).
func ManifestURL() string { return manifestURL }

// MaxManifestBytes caps the manifest download size. The manifest is
// JSON with a small bounded shape (10s of releases × ~200 bytes
// each); 1 MiB is generous and protects us from a server that
// accidentally serves an arbitrary file.
const MaxManifestBytes int64 = 1 << 20

// MaxSHA256SumsBytes caps the SHA256SUMS download. One line per
// release artifact (today: one line for the tarball); 16 KiB is
// orders of magnitude over what we'd ever publish.
const MaxSHA256SumsBytes int64 = 16 * 1024

// MaxTarballBytes caps the release-tarball download. Banger's three
// binaries plus a SHA256SUMS file fit comfortably under this; if a
// future release approaches the cap, bump intentionally and ship a
// note in CHANGELOG.
const MaxTarballBytes int64 = 256 * 1024 * 1024

// Manifest is the top-level shape of releases.thaloco.com/banger/manifest.json.
// SchemaVersion lets us evolve the structure without breaking older
// CLIs: a CLI that doesn't recognise the manifest's SchemaVersion
// refuses to update rather than guessing.
type Manifest struct {
	SchemaVersion int       `json:"schema_version"`
	LatestStable  string    `json:"latest_stable"`
	Releases      []Release `json:"releases"`
}

// Release describes one published banger build. The tarball + the
// SHA256SUMS file (and optionally its cosign signature) live at the
// URLs listed here; the actual binary hashes come from SHA256SUMS,
// not from the manifest, so manifest tampering can't substitute a
// hash for a known-good tarball.
type Release struct {
	Version          string    `json:"version"`
	TarballURL       string    `json:"tarball_url"`
	SHA256SumsURL    string    `json:"sha256sums_url"`
	SHA256SumsSigURL string    `json:"sha256sums_sig_url,omitempty"`
	ReleasedAt       time.Time `json:"released_at"`
}

// ManifestSchemaVersion is the SchemaVersion this CLI knows how to
// parse. Bumped together with any breaking change in Manifest /
// Release.
const ManifestSchemaVersion = 1

// FetchManifest downloads the release manifest from the embedded
// canonical URL and validates its shape. Returns an error if the
// server is unreachable, returns non-2xx, exceeds the size cap, or
// the schema_version is newer than this CLI knows.
func FetchManifest(ctx context.Context, client *http.Client) (Manifest, error) {
	return FetchManifestFrom(ctx, client, manifestURL)
}

// FetchManifestFrom is FetchManifest against an explicit URL. Used by
// the smoke suite (via `banger update --manifest-url …`) to drive the
// updater against a locally-served fake manifest. Production callers
// stick with FetchManifest.
func FetchManifestFrom(ctx context.Context, client *http.Client, url string) (Manifest, error) {
	if client == nil {
		client = http.DefaultClient
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return Manifest{}, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return Manifest{}, fmt.Errorf("fetch manifest: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
		return Manifest{}, fmt.Errorf("fetch manifest: HTTP %s", resp.Status)
	}
	if resp.ContentLength > MaxManifestBytes {
		return Manifest{}, fmt.Errorf("manifest is %d bytes, exceeds %d-byte cap", resp.ContentLength, MaxManifestBytes)
	}
	body, err := io.ReadAll(io.LimitReader(resp.Body, MaxManifestBytes+1))
	if err != nil {
		return Manifest{}, fmt.Errorf("read manifest: %w", err)
	}
	if int64(len(body)) > MaxManifestBytes {
		return Manifest{}, fmt.Errorf("manifest body exceeded %d-byte cap", MaxManifestBytes)
	}
	return ParseManifest(body)
}

// ParseManifest unmarshals manifest bytes and validates the schema
// version. Exposed as a separate function so tests can drive it
// without an HTTP server.
func ParseManifest(body []byte) (Manifest, error) {
	var m Manifest
	if err := json.Unmarshal(body, &m); err != nil {
		return Manifest{}, fmt.Errorf("parse manifest: %w", err)
	}
	if m.SchemaVersion == 0 {
		return Manifest{}, fmt.Errorf("manifest missing schema_version")
	}
	if m.SchemaVersion > ManifestSchemaVersion {
		return Manifest{}, fmt.Errorf("manifest schema_version %d is newer than this CLI knows (%d); upgrade banger to read it", m.SchemaVersion, ManifestSchemaVersion)
	}
	if strings.TrimSpace(m.LatestStable) == "" && len(m.Releases) > 0 {
		return Manifest{}, fmt.Errorf("manifest missing latest_stable")
	}
	for i, r := range m.Releases {
		if strings.TrimSpace(r.Version) == "" {
			return Manifest{}, fmt.Errorf("release[%d]: empty version", i)
		}
		if strings.TrimSpace(r.TarballURL) == "" {
			return Manifest{}, fmt.Errorf("release[%d] (%s): empty tarball_url", i, r.Version)
		}
		if strings.TrimSpace(r.SHA256SumsURL) == "" {
			return Manifest{}, fmt.Errorf("release[%d] (%s): empty sha256sums_url", i, r.Version)
		}
	}
	return m, nil
}

// LookupRelease finds the release with the given version (e.g.
// "v0.1.0") in the manifest. Returns an error listing the available
// versions when no match exists, which helps when a user passes
// `--to v9.9.9` against a manifest that hasn't seen v9.9.9 yet.
func (m Manifest) LookupRelease(version string) (Release, error) {
	wanted := strings.TrimSpace(version)
	if wanted == "" {
		return Release{}, fmt.Errorf("version is required")
	}
	for _, r := range m.Releases {
		if r.Version == wanted {
			return r, nil
		}
	}
	available := make([]string, 0, len(m.Releases))
	for _, r := range m.Releases {
		available = append(available, r.Version)
	}
	return Release{}, fmt.Errorf("release %q not found in manifest (available: %s)", wanted, strings.Join(available, ", "))
}

// Latest returns the release matching the manifest's latest_stable
// pointer. Errors when the pointer doesn't reference any listed
// release; that's a manifest publishing bug worth surfacing rather
// than silently picking some other release.
func (m Manifest) Latest() (Release, error) {
	return m.LookupRelease(m.LatestStable)
}