Two promises the doc was making that the code doesn't keep:

1. "Helpers moved out so the package stays focused on orchestration." The package still has ~29 files and ~130 func (d *Daemon) methods wiring VM lifecycle, image management, host networking, background reconciliation, and JSON-RPC dispatch. Calling it "just orchestration" sets readers up for surprise. Rewrite the subpackages preamble to say so, and flag the service split as a post-v0.1.0 project.

2. "vmLocks[id] is held only across short synchronous state validation and DB mutations." That's what workspace.prepare does; regular lifecycle ops (start/stop/delete/set) go through withVMLockByRef and hold the lock across the whole callback body, which for `start` means preflight + bridge + firecracker spawn + post-boot wiring. Rewrite the vmLocks bullet and the lock-ordering section to say that explicitly, so readers don't build "surely my long flow under the lock can't be what the doc means" reasoning on top of a false premise.

Doc-only change. Code behaviour is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
internal/daemon architecture
This document describes the current daemon package layout: the Daemon
composition root, the subpackages that own stateless helpers and shared
primitives, and the lock ordering every caller must respect.
Composition
Daemon is the composition root. Subsystem state and locks live on their
owning types:
- `Layout`, `config`, `store`, `runner`, `logger`, `pid`: infrastructure handles.
- `vmLocks vmLockSet`: per-VM `*sync.Mutex`, one per VM ID. Held for the entire lifecycle op on that VM: a `start` holds it across preflight, bridge setup, firecracker spawn, and post-boot wiring (seconds to tens of seconds). Two `start`/`stop`/`delete`/`set` calls against the same VM therefore serialise; calls against different VMs run independently. If you need a slow guest-side operation to NOT block lifecycle ops on the same VM, scope it out of the lock explicitly the way `workspace.prepare` does (see below).
- `workspaceLocks vmLockSet`: per-VM mutex scoped to `workspace.prepare` / `workspace.export`. These ops acquire `vmLocks[id]` only long enough to validate VM state + snapshot the fields they need, release it, then acquire `workspaceLocks[id]` for the slow guest I/O phase. That keeps `vm stop`/`delete`/`restart` from queueing behind a running tar import.
- `handles *handleCache`: in-memory map of per-VM transient kernel/process handles (PID, tap device, loop devices, DM target). The cache is rebuildable: each VM directory holds a small `handles.json` scratch file that the daemon reads at startup to reconstruct the cache and verify processes against `/proc` via pgrep. Nothing in the durable `vms` SQLite row describes transient kernel state. See `internal/daemon/vm_handles.go`.
- `createVMMu sync.Mutex`: serialises `CreateVM` (guards name uniqueness + guest IP allocation window).
- `imageOpsMu sync.Mutex`: serialises image-registry mutations (`PullImage`, `RegisterImage`, `PromoteImage`, `DeleteImage`).
- `createOps opstate.Registry[*vmCreateOperationState]`: in-flight VM create operations; owns its own lock.
- `tapPool tapPool`: TAP interface pool; owns its own lock.
- `listener`, `vmDNS`: networking.
- `vmCaps`: registered VM capability hooks.
- `pullAndFlatten`, `finalizePulledRootfs`, `bundleFetch`, `requestHandler`, `guestWaitForSSH`, `guestDial`, `workspaceInspectRepo`, `workspaceImport`: injectable seams used by tests.
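
To make the per-VM locking behaviour concrete, here is a minimal sketch of how a lock set keyed by VM ID could work. It is an illustration only: the field names, `lockFor`, `withLock`, and `runConcurrent` are hypothetical, and the real `vmLockSet` and `withVMLockByID` in this package may be shaped differently.

```go
package main

import (
	"fmt"
	"sync"
)

// vmLockSet hands out one mutex per VM ID.
// Hypothetical sketch; not the package's actual definition.
type vmLockSet struct {
	mu    sync.Mutex             // guards the map only, never held while fn runs
	locks map[string]*sync.Mutex // VM ID -> per-VM lock
}

// lockFor returns the mutex for id, creating it on first use.
func (s *vmLockSet) lockFor(id string) *sync.Mutex {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.locks == nil {
		s.locks = make(map[string]*sync.Mutex)
	}
	l, ok := s.locks[id]
	if !ok {
		l = &sync.Mutex{}
		s.locks[id] = l
	}
	return l
}

// withLock runs fn while holding the per-VM lock for id.
// The whole callback body is the critical section.
func (s *vmLockSet) withLock(id string, fn func() error) error {
	l := s.lockFor(id)
	l.Lock()
	defer l.Unlock()
	return fn()
}

// runConcurrent starts n goroutines that all operate on the same VM and
// returns how many callbacks ran. The counter is protected only by the
// per-VM lock, so a correct result shows the callbacks were serialised.
func runConcurrent(s *vmLockSet, id string, n int) int {
	var wg sync.WaitGroup
	count := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = s.withLock(id, func() error {
				count++
				return nil
			})
		}()
	}
	wg.Wait()
	return count
}

func main() {
	var s vmLockSet
	fmt.Println(runConcurrent(&s, "vm-1", 100))
}
```

Callbacks for different IDs use different mutexes, which is why operations on different VMs do not queue behind each other in this sketch.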
Subpackages
Stateless helpers that don't need the Daemon composition root have
been lifted into subpackages. Lifecycle orchestration, image-registry
orchestration, host networking bootstrap, background reconciliation,
and the JSON-RPC dispatch all still live in this package — it is not
"just orchestration." ~29 files and ~130 func (d *Daemon) methods
share the root struct today. A future project would be to split VM
lifecycle, image management, and the background reconciler into
services with explicit interfaces; that's out of scope for v0.1.0.
Each subpackage takes explicit dependencies (typically a
system.Runner-compatible interface) and holds no global state beyond
small test seams.
| Subpackage | Purpose |
|---|---|
| `internal/daemon/opstate` | Generic `Registry[T AsyncOp]` for async-operation bookkeeping. |
| `internal/daemon/dmsnap` | Device-mapper COW snapshot create/cleanup/remove. |
| `internal/daemon/fcproc` | Firecracker process primitives (bridge, tap, binary, PID, kill, wait). |
| `internal/daemon/imagemgr` | Image subsystem pure helpers: validators, staging, build script gen. |
| `internal/daemon/workspace` | Workspace helpers: git inspection, copy prep, guest import script. |
All subpackages are leaves — no intra-daemon subpackage imports another.
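
For readers unfamiliar with the generic-registry idea behind `opstate`, here is one possible shape for a registry that owns its own lock. The `AsyncOp` method set (`OpID`, `Done`) and the `Add`/`Get`/`Prune` methods are guesses for illustration, not the package's real API.

```go
package main

import (
	"fmt"
	"sync"
)

// AsyncOp is an assumed constraint: anything with an ID and a done flag.
type AsyncOp interface {
	OpID() string
	Done() bool
}

// Registry tracks in-flight operations and owns its own lock.
type Registry[T AsyncOp] struct {
	mu  sync.Mutex
	ops map[string]T
}

// Add stores op under its ID, replacing any previous entry.
func (r *Registry[T]) Add(op T) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.ops == nil {
		r.ops = make(map[string]T)
	}
	r.ops[op.OpID()] = op
}

// Get looks up an operation by ID.
func (r *Registry[T]) Get(id string) (T, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	op, ok := r.ops[id]
	return op, ok
}

// Prune drops finished operations and returns how many were removed.
func (r *Registry[T]) Prune() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	removed := 0
	for id, op := range r.ops {
		if op.Done() {
			delete(r.ops, id)
			removed++
		}
	}
	return removed
}

// createOp is a toy operation type used only in this example.
type createOp struct {
	id   string
	done bool
}

func (o *createOp) OpID() string { return o.id }
func (o *createOp) Done() bool   { return o.done }

func main() {
	var reg Registry[*createOp]
	reg.Add(&createOp{id: "op-1"})
	reg.Add(&createOp{id: "op-2", done: true})
	fmt.Println(reg.Prune())
	_, ok := reg.Get("op-1")
	fmt.Println(ok)
}
```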
Lock ordering
Acquire in this order, release in reverse. Never acquire in the opposite direction.
vmLocks[id] → workspaceLocks[id] → {createVMMu, imageOpsMu} → subsystem-local locks
vmLocks[id] and workspaceLocks[id] are NEVER held at the same
time. workspace.prepare acquires vmLocks[id] just long enough to
validate VM state, releases it, then acquires workspaceLocks[id]
for the guest I/O phase. Regular lifecycle ops (start, stop,
delete, set) do NOT do this split — they hold vmLocks[id]
across the whole flow.
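
The contrast between the two shapes can be sketched as follows. This is a simplified model with a single VM and hypothetical names (`daemon`, `prepareWorkspace`, `stop`, the `vm` fields); it only demonstrates that a lifecycle op can take the VM lock while the slow workspace phase is still running.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// vm holds the small amount of state this example needs.
type vm struct {
	running bool
	guestIP string
}

// daemon models one VM; the two mutexes stand in for vmLocks[id] and
// workspaceLocks[id]. All names here are illustrative.
type daemon struct {
	vmLock        sync.Mutex
	workspaceLock sync.Mutex
	vm            vm
}

// prepareWorkspace uses the two-phase shape: a short critical section under
// the VM lock, then the slow phase under the workspace lock only.
func (d *daemon) prepareWorkspace(slowIO func(guestIP string) error) error {
	d.vmLock.Lock()
	if !d.vm.running {
		d.vmLock.Unlock()
		return errors.New("vm is not running")
	}
	ip := d.vm.guestIP // snapshot what the slow phase needs
	d.vmLock.Unlock()

	d.workspaceLock.Lock()
	defer d.workspaceLock.Unlock()
	return slowIO(ip)
}

// stop is a lifecycle op: it holds the VM lock across its whole body.
func (d *daemon) stop() {
	d.vmLock.Lock()
	defer d.vmLock.Unlock()
	d.vm.running = false
}

func main() {
	d := &daemon{vm: vm{running: true, guestIP: "10.0.0.2"}}
	err := d.prepareWorkspace(func(ip string) error {
		// The VM lock is free here, so stop() does not block or deadlock.
		d.stop()
		fmt.Println("importing into", ip)
		return nil
	})
	fmt.Println(err, d.vm.running)
}
```

If `prepareWorkspace` held the VM lock across `slowIO`, the nested `stop()` call in this example would deadlock, which is the queueing problem the split is meant to avoid.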
Subsystem-local locks (tapPool.mu, opstate.Registry mu) are leaves.
They do not contend with each other.
Notes:
- `vmLocks[id]` is the outer lock for any operation scoped to a single VM. Acquired via `withVMLockByID` / `withVMLockByRef`. The callback runs under the lock; treat the whole function body as a critical section.
- `createVMMu` and `imageOpsMu` are narrow: each guards one family of mutations and is released before any blocking guest I/O.
- Holding a subsystem-local lock while calling into guest SSH is discouraged; copy needed state out under the lock and release before blocking I/O.
External API
Only internal/cli imports this package. The surface is:
- `daemon.Open(ctx) (*Daemon, error)`
- `(*Daemon).Serve(ctx) error`
- `(*Daemon).Close() error`
- `daemon.Doctor(...)`: host diagnostics (no receiver).
All other *Daemon methods are reached only through the RPC dispatch
switch in daemon.go and are free to move/rename during refactoring.