daemon: persist tap device on VM.Runtime so NAT teardown survives handle-cache loss

Cleanup identity for kernel objects was split across two sources of
truth: vm.Runtime (DB-backed, durable) held paths and the guest IP,
but the TAP name lived only in the in-process handle cache and the
best-effort handles.json scratch file next to the VM dir. Every
other cleanup-identifying datum has a fallback: the firecracker PID
can be rediscovered via `pgrep -f <apiSock>`, loop devices via
losetup, and the dm name from the deterministic ShortID(vm.ID). The
tap is the one truly cache-only datum (allocated from a pool, not
derivable).

That made NAT teardown fragile in three cases:
- daemon crash between `acquireTap` and the handles.json write
- handles.json corrupt on the next daemon start
- partial cleanup that already zeroed the cache

In any of those cases natCapability.Cleanup short-circuited
("skipping nat cleanup without runtime network handles") and the
per-VM POSTROUTING MASQUERADE rule plus the two FORWARD rules keyed
off the tap would leak. The VM row in the DB still existed, but a
retry couldn't close the loop because the tap name was simply gone.

Fix: mirror TapDevice onto model.VMRuntime (serialised via the
existing runtime_json column, omitempty so existing rows upgrade
cleanly). Set it in startVMLocked right next to the
s.setVMHandles call that seeds the in-memory cache; clear it at
every post-cleanup reset site:
- stop: normal path and stale branch
- kill: normal path and stale branch
- cleanupOnErr in start
- reconcile's stale-vm branch
- the stats poller's auto-stop path
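
The "existing rows upgrade cleanly" claim can be illustrated with a minimal sketch. It assumes a trimmed, hypothetical stand-in struct (`runtimeSketch`) rather than the real model.VMRuntime, and the helper names `decodeRuntime` and `encodeRuntime` are invented for illustration. A row written before this change has no `tap_device` key and decodes to an empty TapDevice; an empty TapDevice is omitted again on encode.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// runtimeSketch is a hypothetical, trimmed stand-in for model.VMRuntime.
// Only two of the real fields are reproduced here.
type runtimeSketch struct {
	GuestIP   string `json:"guest_ip"`
	TapDevice string `json:"tap_device,omitempty"`
}

// decodeRuntime parses a runtime_json-style payload (helper name is an assumption).
func decodeRuntime(raw string) (runtimeSketch, error) {
	var rt runtimeSketch
	err := json.Unmarshal([]byte(raw), &rt)
	return rt, err
}

// encodeRuntime serialises the struct back to JSON (helper name is an assumption).
func encodeRuntime(rt runtimeSketch) (string, error) {
	b, err := json.Marshal(rt)
	return string(b), err
}

func main() {
	// A payload written before the field existed: no tap_device key.
	old, err := decodeRuntime(`{"guest_ip":"10.0.0.2"}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("decoded tap: %q\n", old.TapDevice)

	// Empty TapDevice is dropped on encode because of omitempty.
	out, _ := encodeRuntime(old)
	fmt.Println(out)

	// Once a tap is recorded, the key appears.
	old.TapDevice = "tap7"
	out, _ = encodeRuntime(old)
	fmt.Println(out)
}
```

The same property is what makes clearing the field at the reset sites cheap: setting it back to the empty string removes the key from the stored JSON.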

Fallbacks now cascade:
- natCapability.Cleanup: handles cache → Runtime.TapDevice
- cleanupRuntime (releaseTap): handles cache → Runtime.TapDevice

Both surfaces refuse gracefully (old behaviour) only when neither
source has a value, which really does mean "no tap was ever
allocated for this VM" rather than "we lost track of it."
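
The cascade described above might look roughly like the following sketch. `resolveTap` and `errNoTap` are hypothetical names, not the daemon's actual functions; the sketch only shows the lookup order (handle cache first, persisted Runtime.TapDevice second, refusal when both are empty).

```go
package main

import (
	"errors"
	"fmt"
)

// errNoTap is a hypothetical sentinel for "no tap was ever allocated".
var errNoTap = errors.New("no tap recorded for this VM")

// resolveTap is an illustrative helper, not the daemon's real code.
// It prefers the in-process handle cache and falls back to the value
// persisted on the VM's runtime record.
func resolveTap(cached, persisted string) (string, error) {
	if cached != "" {
		return cached, nil
	}
	if persisted != "" {
		return persisted, nil
	}
	return "", errNoTap
}

func main() {
	cases := []struct{ cached, persisted string }{
		{"tap1", "tap1"}, // normal case: cache is warm
		{"", "tap1"},     // cache lost, DB record still has the name
		{"", ""},         // no tap was ever allocated
	}
	for _, c := range cases {
		tap, err := resolveTap(c.cached, c.persisted)
		fmt.Printf("cached=%q persisted=%q -> tap=%q err=%v\n",
			c.cached, c.persisted, tap, err)
	}
}
```

Preferring the cache keeps the hot path unchanged; the persisted value only matters in the recovery cases listed earlier.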

Test: TestNATCapabilityCleanup_FallsBackToRuntimeTapDevice clears
the handle cache, sets vm.Runtime.TapDevice, and asserts Cleanup
reaches the runner. This is the exact scenario the review flagged
as a plausible leak, and the test exercises the code path that now
prevents it.
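
The shape of that scenario can be sketched with a recording fake. Everything here is hypothetical: `fakeRunner`, `cleanupNAT`, and the `teardown-nat` placeholder command are stand-ins, not the real runner interface or the real iptables invocations.

```go
package main

import (
	"fmt"
	"strings"
)

// fakeRunner is a hypothetical test double that records each command
// instead of executing it.
type fakeRunner struct {
	calls []string
}

// Run records the joined arguments and always succeeds.
func (r *fakeRunner) Run(args ...string) error {
	r.calls = append(r.calls, strings.Join(args, " "))
	return nil
}

// cleanupNAT is an illustrative stand-in for the capability's Cleanup.
// It reports whether teardown was attempted. "teardown-nat" is a
// placeholder, not the real rule-deletion command line.
func cleanupNAT(r *fakeRunner, cachedTap, runtimeTap string) (bool, error) {
	tap := cachedTap
	if tap == "" {
		tap = runtimeTap // fall back to the persisted name
	}
	if tap == "" {
		return false, nil // nothing was ever allocated; skip
	}
	if err := r.Run("teardown-nat", tap); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	// Handle cache cleared, persisted runtime value still set.
	r := &fakeRunner{}
	ran, err := cleanupNAT(r, "", "tap3")
	fmt.Println(ran, err, r.calls)
}
```

Asserting on the recorded calls, rather than on a return value alone, is what shows that teardown actually reached the runner.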

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent 1850904d9c
commit 5eceebe49f
7 changed files with 72 additions and 16 deletions

@@ -101,18 +101,26 @@ type VMSpec struct {
 // LastError carries the last failure message for debugging. State
 // mirrors VMRecord.State.
 type VMRuntime struct {
-	State         VMState `json:"state"`
-	GuestIP       string  `json:"guest_ip"`
-	APISockPath   string  `json:"api_sock_path,omitempty"`
-	VSockPath     string  `json:"vsock_path,omitempty"`
-	VSockCID      uint32  `json:"vsock_cid,omitempty"`
-	LogPath       string  `json:"log_path,omitempty"`
-	MetricsPath   string  `json:"metrics_path,omitempty"`
-	DNSName       string  `json:"dns_name,omitempty"`
-	VMDir         string  `json:"vm_dir"`
-	SystemOverlay string  `json:"system_overlay_path"`
-	WorkDiskPath  string  `json:"work_disk_path"`
-	LastError     string  `json:"last_error,omitempty"`
+	State       VMState `json:"state"`
+	GuestIP     string  `json:"guest_ip"`
+	APISockPath string  `json:"api_sock_path,omitempty"`
+	VSockPath   string  `json:"vsock_path,omitempty"`
+	VSockCID    uint32  `json:"vsock_cid,omitempty"`
+	LogPath     string  `json:"log_path,omitempty"`
+	MetricsPath string  `json:"metrics_path,omitempty"`
+	DNSName     string  `json:"dns_name,omitempty"`
+	VMDir       string  `json:"vm_dir"`
+	// TapDevice mirrors VMHandles.TapDevice but persists across
+	// daemon restarts / handle-cache loss. NAT teardown needs the
+	// exact tap name to delete the FORWARD rules; if we only had
+	// the handle cache, a crash between tap acquire and handles.json
+	// write — or a corrupt handles.json on the next daemon start —
+	// would silently leak the rules. Storing it on the VM record
+	// makes cleanup correct as long as the VM row exists.
+	TapDevice     string `json:"tap_device,omitempty"`
+	SystemOverlay string `json:"system_overlay_path"`
+	WorkDiskPath  string `json:"work_disk_path"`
+	LastError     string `json:"last_error,omitempty"`
 }
 
 type VMStats struct {