daemon: persist teardown fallbacks and reject unsafe import paths
Preserve cleanup after daemon restarts and harden OCI and tar imports against filenames that debugfs cannot encode safely. Mirror tap, loop, and dm teardown identity onto VM.Runtime, teach cleanup and reconcile to fall back to those persisted fields when handles.json is missing or corrupt, and clear the recovery state on stop, error, and delete paths. Reject debugfs-hostile entry names during flattening and in ApplyOwnership itself, then add regression coverage for corrupt handles.json recovery and unsafe import paths. Verified with targeted go tests, make lint-go, make lint-shell, and make build.
parent 86a56fedb3
commit d743a8ba4b
15 changed files with 272 additions and 81 deletions
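The commit message says entry names that debugfs cannot encode safely are rejected during flattening and again in ApplyOwnership. The exact rules are not shown on this page, so the following is only a minimal sketch. It assumes that "hostile" means an empty name, any control character (including newline), or a double quote; `validateEntryName` is a hypothetical helper name, not the function from the commit.

```go
package main

import (
	"fmt"
	"unicode"
)

// validateEntryName is a hypothetical check. It assumes that names
// containing control characters (newline included) or double quotes
// cannot be passed safely through a line-oriented debugfs command.
func validateEntryName(name string) error {
	if name == "" {
		return fmt.Errorf("empty entry name")
	}
	for _, r := range name {
		if unicode.IsControl(r) || r == '"' {
			return fmt.Errorf("entry name %q is not safe to pass to debugfs", name)
		}
	}
	return nil
}
```

Running the same check at both the flattening step and the ownership step, as the message describes, means a name that bypasses the first gate is still rejected before it reaches debugfs.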
@@ -89,11 +89,10 @@ type VMSpec struct {
 // VMRuntime holds the durable runtime state that the daemon needs
 // to reach a VM: identity, declared state, and deterministic derived
-// paths. Transient kernel/process handles (PID, tap, loop devices,
-// dm-snapshot names) live on VMHandles, NOT here — the daemon keeps
-// them in an in-memory cache backed by a per-VM handles.json scratch
-// file, so a daemon restart rebuilds them from OS state rather than
-// trusting whatever was last written into a SQLite column.
+// paths. The authoritative live handle set still lives on VMHandles,
+// but teardown-critical storage/network identifiers are mirrored here
+// as recovery fallbacks so restart-time cleanup still works when
+// handles.json is missing or corrupt.
 //
 // Everything in VMRuntime is safe to persist: the paths are
 // deterministic from (VM ID, layout) and survive restart unchanged;
@@ -110,14 +109,15 @@ type VMRuntime struct {
 	MetricsPath string `json:"metrics_path,omitempty"`
 	DNSName string `json:"dns_name,omitempty"`
 	VMDir string `json:"vm_dir"`
-	// TapDevice mirrors VMHandles.TapDevice but persists across
-	// daemon restarts / handle-cache loss. NAT teardown needs the
-	// exact tap name to delete the FORWARD rules; if we only had
-	// the handle cache, a crash between tap acquire and handles.json
-	// write — or a corrupt handles.json on the next daemon start —
-	// would silently leak the rules. Storing it on the VM record
-	// makes cleanup correct as long as the VM row exists.
+	// Teardown fallback fields mirror the handle cache onto the VM row.
+	// They are recovery-only: while the daemon is alive, VMHandles stays
+	// authoritative. On restart, cleanup can fall back to these values if
+	// handles.json is missing or corrupt.
 	TapDevice string `json:"tap_device,omitempty"`
+	BaseLoop string `json:"base_loop,omitempty"`
+	COWLoop string `json:"cow_loop,omitempty"`
+	DMName string `json:"dm_name,omitempty"`
+	DMDev string `json:"dm_dev,omitempty"`
 	SystemOverlay string `json:"system_overlay_path"`
 	WorkDiskPath string `json:"work_disk_path"`
 	LastError string `json:"last_error,omitempty"`
@@ -3,11 +3,11 @@ package model
 // VMHandles captures the transient, per-boot kernel/process handles
 // that banger obtains while starting a VM and releases when stopping
 // it. Unlike VMRuntime (durable spec + identity + derived paths),
-// nothing in VMHandles survives a daemon restart in authoritative
-// form: each value is either rediscovered from the OS (PID from the
-// firecracker api socket, DM name deterministically from the VM ID)
-// or read from a per-VM scratch file that the daemon rebuilds at
-// every start.
+// VMHandles is the authoritative live-handle view while the daemon is
+// up. On restart, the daemon rebuilds it from the OS plus the per-VM
+// scratch file; teardown-critical fields are also mirrored onto
+// VMRuntime so cleanup can still proceed if that scratch file is
+// missing or corrupt.
 //
 // The daemon keeps an in-memory cache keyed by VM ID. Lifecycle
 // transitions update the cache and a small `handles.json` scratch
@@ -16,10 +16,9 @@ package model
 // OS state. If anything is stale the VM is marked stopped and the
 // cache entry is dropped.
 //
-// VMHandles never appears in the `vms` SQLite rows. Keeping it off
-// the durable schema was the whole point of the split — persistent
-// records describe what a VM SHOULD be; handles describe what is
-// currently true about it.
+// VMHandles itself never appears in the `vms` SQLite rows. Some fields
+// are mirrored onto VMRuntime as crash-recovery fallback state, but the
+// cache + scratch file remain the canonical live source.
 type VMHandles struct {
 	// PID is the firecracker process PID. Zero means "not running
 	// (from our perspective)". Always verifiable via
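The mirroring and clearing that the commit message describes (mirror on acquire, clear on stop, error, and delete) might look roughly like the sketch below. The helper names `mirrorHandles`, `clearTeardownFallbacks`, and `encodeRow` are hypothetical, the `VMHandles` fields other than `TapDevice` are assumed, and the structs are trimmed to the fields relevant here.

```go
package main

import "encoding/json"

// Trimmed VMRuntime: one always-present field plus the mirrored ones.
type VMRuntime struct {
	VMDir     string `json:"vm_dir"`
	TapDevice string `json:"tap_device,omitempty"`
	BaseLoop  string `json:"base_loop,omitempty"`
	COWLoop   string `json:"cow_loop,omitempty"`
	DMName    string `json:"dm_name,omitempty"`
	DMDev     string `json:"dm_dev,omitempty"`
}

// VMHandles is an assumed shape for illustration only.
type VMHandles struct {
	TapDevice, BaseLoop, COWLoop, DMName, DMDev string
}

// mirrorHandles copies teardown-critical identifiers onto the runtime
// record so they are persisted with the VM row.
func mirrorHandles(rt *VMRuntime, h VMHandles) {
	rt.TapDevice, rt.BaseLoop, rt.COWLoop = h.TapDevice, h.BaseLoop, h.COWLoop
	rt.DMName, rt.DMDev = h.DMName, h.DMDev
}

// clearTeardownFallbacks resets the mirror once resources are released,
// so a later restart does not try to tear down devices that no longer exist.
func clearTeardownFallbacks(rt *VMRuntime) {
	mirrorHandles(rt, VMHandles{})
}

// encodeRow shows what would be persisted; thanks to omitempty, cleared
// fallback fields drop out of the JSON entirely.
func encodeRow(rt VMRuntime) (string, error) {
	b, err := json.Marshal(rt)
	return string(b), err
}
```

Because the fallback fields are tagged `omitempty`, a stopped VM's persisted record carries no stale tap, loop, or dm identifiers after clearing.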