banger/internal
Thales Maciel c9358ab390
daemon: sync guest over ssh before stop to preserve workspace writes
VM stop has been quietly losing data freshly written via
`vm workspace prepare`: stop+start of a workspace-prepared VM would
come back with /root/repo wiped on the work disk.

Root cause is firecracker + Debian's systemd defaults. FC's
SendCtrlAltDel (the only "graceful shutdown" action FC exposes) just
delivers the keystroke; what the guest does with it is its choice.
Debian routes ctrl-alt-del.target -> reboot.target, so the guest
reboots, FC stays alive, the daemon's 10s wait_for_exit window
expires, and the SIGKILL fallback drops anything still in FC's
userspace I/O path. For an idle VM that's invisible. For one that
just took 100s of small writes through a workspace prepare, it's
data loss.

Fix is to dial the guest over SSH inside StopVM and run
`sync; systemctl --no-block poweroff || /sbin/poweroff -f &` before
the existing SendCtrlAltDel path. The synchronous `sync` is the
load-bearing piece — it blocks until every dirty page hits virtio-blk
and lands in the on-host root.ext4. Whether poweroff completes
before SIGKILL fires is incidental; sync has already run. SSH
unreachable falls back to the old SendCtrlAltDel behaviour so a
broken-network guest can't make stop hang.

Bounded by a 5s SSH-dial timeout so a half-broken guest can't extend
the overall stop window past gracefulShutdownWait.

Also adds two smoke scenarios:
- `workspace + stop/start`: prepare -> stop -> start -> assert
  marker survives. This is the regression that caught the bug.
- `vm exec`: end-to-end coverage for d59425a — auto-cd into the
  prepared workspace, exit-code propagation, dirty-host warning,
  --auto-prepare resync, refusal on stopped VM.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 15:41:32 -03:00
..
api images: remove the docker field 2026-04-26 20:28:40 -03:00
buildinfo Stamp shared build metadata into banger binaries 2026-03-22 17:14:06 -03:00
cli feat(vm): add vm exec command with workspace dirty detection 2026-04-26 23:53:45 -03:00
config daemon: split owner daemon from root helper 2026-04-26 12:43:17 -03:00
daemon daemon: sync guest over ssh before stop to preserve workspace writes 2026-04-27 15:41:32 -03:00
firecracker daemon: split owner daemon from root helper 2026-04-26 12:43:17 -03:00
guest ssh: trust-on-first-use host key pinning everywhere 2026-04-19 16:46:03 -03:00
guestconfig Refactor VM lifecycle around capabilities 2026-03-18 19:28:26 -03:00
guestnet Stop using kernel IP autoconfig for runtime VMs 2026-03-21 21:54:18 -03:00
hostnat coverage: medium batch — hostnat runner, store guest-sessions, daemon helpers 2026-04-18 18:03:37 -03:00
imagecat publish-golden-image: content-addressed tarball names 2026-04-18 15:26:57 -03:00
imagepull system: mkfs work disks with lazy_itable_init + lazy_journal_init 2026-04-26 21:32:57 -03:00
installmeta daemon: split owner daemon from root helper 2026-04-26 12:43:17 -03:00
kernelcat Prune legacy void/alpine + customize.sh flows 2026-04-18 15:39:53 -03:00
model feat(vm): add vm exec command with workspace dirty detection 2026-04-26 23:53:45 -03:00
namegen coverage: make targets + close zero-cov gaps (namegen, sessionstream) 2026-04-18 17:44:37 -03:00
paths daemon: split owner daemon from root helper 2026-04-26 12:43:17 -03:00
policy Add vsock-backed VM port inspection 2026-03-19 15:52:11 -03:00
roothelper daemon: thread per-RPC op_id end-to-end 2026-04-26 22:13:44 -03:00
rpc daemon: thread per-RPC op_id end-to-end 2026-04-26 22:13:44 -03:00
store feat(vm): add vm exec command with workspace dirty detection 2026-04-26 23:53:45 -03:00
system system: mkfs work disks with lazy_itable_init + lazy_journal_init 2026-04-26 21:32:57 -03:00
toolingplan coverage: easy-wins batch across cli, system, paths, vmdns, toolingplan 2026-04-18 17:57:05 -03:00
vmdns coverage: easy-wins batch across cli, system, paths, vmdns, toolingplan 2026-04-18 17:57:05 -03:00
vsockagent Add vsock-backed VM port inspection 2026-03-19 15:52:11 -03:00