Previously withVMLockByRef held the per-VM mutex across InspectRepo,
waitForGuestSSH, dialGuest, ImportRepoToGuest (the tar stream!), and
the readonly chmod. A large repo could block `vm stop` / `vm delete`
/ `vm restart` on the same VM for however long the import took.

Split into two phases:

  1. VM mutex held briefly to validate state (running + PID alive)
     and snapshot the fields needed for SSH (guest IP, api sock).
  2. VM mutex released. Acquire workspaceLocks[id], a separate
     per-VM mutex scoped to workspace.prepare / workspace.export,
     for the guest I/O phase.

Lifecycle ops (stop/delete/restart/set) only take vmLocks, so they
no longer queue behind a slow import. Two concurrent prepares on the
same VM still serialise via workspaceLocks so tar streams don't
interleave. ExportVMWorkspace also acquires workspaceLocks to avoid
snapshotting a half-streamed import.

Two regression tests (sequential, because they swap package-level seams):

  - ReleasesVMLockDuringGuestIO: stall the import fake, assert the VM
    mutex is acquirable from another goroutine during the stall.
  - SerialisesConcurrentPreparesOnSameVM: 3 concurrent prepares, assert
    Import is only ever invoked 1-at-a-time per VM.

ARCHITECTURE.md documents the split + updated lock ordering.

# `internal/daemon` architecture

This document describes the current daemon package layout: the `Daemon`
composition root, the subpackages that own stateless helpers and shared
primitives, and the lock ordering every caller must respect.

## Composition

`Daemon` is the composition root. Subsystem state and locks live on their
owning types:

- Layout, config, store, runner, logger, pid: infrastructure handles.
- `vmLocks vmLockSet`: per-VM `*sync.Mutex`, one per VM ID. Held only
  across short, synchronous state validation and DB mutations so slow
  guest I/O does not block lifecycle ops on the same VM.
- `workspaceLocks vmLockSet`: per-VM mutex scoped to
  `workspace.prepare` / `workspace.export`. Serialises concurrent
  workspace operations on a single VM (two simultaneous tar imports
  would clobber each other) without touching `vmLocks`, so
  `vm stop` / `delete` / `restart` never queue behind a slow import.
- `createVMMu sync.Mutex`: serialises `CreateVM` (guards name uniqueness
  + guest IP allocation window).
- `imageOpsMu sync.Mutex`: serialises image-registry mutations
  (`PullImage`, `RegisterImage`, `PromoteImage`, `DeleteImage`).
- `createOps opstate.Registry[*vmCreateOperationState]`: in-flight VM
  create operations; owns its own lock.
- `tapPool tapPool`: TAP interface pool; owns its own lock.
- `sessions sessionRegistry`: active guest session controllers; owns
  its own lock.
- `listener`, `webListener`, `webServer`, `webURL`, `vmDNS`: networking.
- `vmCaps`: registered VM capability hooks.
- `pullAndFlatten`, `finalizePulledRootfs`, `bundleFetch`,
  `requestHandler`, `guestWaitForSSH`, `guestDial`,
  `waitForGuestSessionReady`: injectable seams used by tests.

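A per-VM lock set like `vmLocks` and `workspaceLocks` could look roughly like the sketch below. This is only an illustration: the `lockSet` type and its `get` / `with` methods are hypothetical names, and the real `vmLockSet` may be structured differently. The sketch assumes mutexes are created lazily per VM ID and that the map guard is held only while looking up the mutex, never during the caller's work.

```go
package main

import (
	"fmt"
	"sync"
)

// lockSet is a hypothetical stand-in for vmLockSet: one mutex per VM ID,
// created lazily on first use.
type lockSet struct {
	mu    sync.Mutex             // guards the map only
	locks map[string]*sync.Mutex // VM ID -> per-VM mutex
}

func newLockSet() *lockSet {
	return &lockSet{locks: make(map[string]*sync.Mutex)}
}

// get returns the mutex for id, creating it if it does not exist yet.
func (s *lockSet) get(id string) *sync.Mutex {
	s.mu.Lock()
	defer s.mu.Unlock()
	m, ok := s.locks[id]
	if !ok {
		m = &sync.Mutex{}
		s.locks[id] = m
	}
	return m
}

// with runs fn while holding the per-VM mutex for id.
func (s *lockSet) with(id string, fn func() error) error {
	m := s.get(id)
	m.Lock()
	defer m.Unlock()
	return fn()
}

func main() {
	vmLocks, workspaceLocks := newLockSet(), newLockSet()

	// The same ID always maps to the same mutex within one set.
	fmt.Println(vmLocks.get("vm1") == vmLocks.get("vm1"))

	// The two sets are independent: same ID, different mutexes.
	fmt.Println(vmLocks.get("vm1") != workspaceLocks.get("vm1"))

	// Different VMs never contend with each other.
	fmt.Println(vmLocks.get("vm1") != vmLocks.get("vm2"))
}
```

Because the two sets hand out distinct mutexes for the same VM ID, holding the workspace mutex for a VM says nothing about whether its lifecycle mutex is free.
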
## Subpackages

Pure helpers have moved into subpackages so the daemon package itself stays
focused on orchestration. Each subpackage takes explicit dependencies
(typically a `system.Runner`-compatible interface) and holds no global
state beyond small test seams.

| Subpackage                        | Purpose                                                                |
| --------------------------------- | ---------------------------------------------------------------------- |
| `internal/daemon/opstate`         | Generic `Registry[T AsyncOp]` for async-operation bookkeeping.         |
| `internal/daemon/dmsnap`          | Device-mapper COW snapshot create/cleanup/remove.                      |
| `internal/daemon/fcproc`          | Firecracker process primitives (bridge, tap, binary, PID, kill, wait). |
| `internal/daemon/imagemgr`        | Image subsystem pure helpers: validators, staging, build script gen.   |
| `internal/daemon/session`         | Guest-session helpers: state paths, scripts, parsing, utilities.       |
| `internal/daemon/workspace`       | Workspace helpers: git inspection, copy prep, guest import script.     |

`workspace` imports `session` for `ShellQuote` and `FormatStepError`; all
other subpackages are leaves (no other intra-daemon subpackage imports).

## Lock ordering

Acquire in this order, release in reverse. Never acquire in the opposite
direction.

```
vmLocks[id] → workspaceLocks[id] → {createVMMu, imageOpsMu} → subsystem-local locks
```

`vmLocks[id]` and `workspaceLocks[id]` are NEVER held at the same
time. `workspace.prepare` acquires `vmLocks[id]` just long enough to
validate VM state, releases it, then acquires `workspaceLocks[id]`
for the guest I/O phase.

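The hand-off between the two locks can be sketched as follows. This is a simplified model, not the daemon's code: the `daemon`, `vm`, and `guestTarget` types, the `prepare` method, and the sample address and socket path are all hypothetical, and a single mutex stands in for each of `vmLocks[id]` and `workspaceLocks[id]`. The `main` function mimics the idea of stalling an import and checking that a lifecycle operation can still take the VM lock.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// vm and guestTarget are hypothetical; the real VM record has more fields.
type vm struct {
	running bool
	guestIP string
	apiSock string
}

type guestTarget struct{ guestIP, apiSock string }

type daemon struct {
	vmLock        sync.Mutex // stands in for vmLocks[id]
	workspaceLock sync.Mutex // stands in for workspaceLocks[id]
	vm            vm
}

// prepare validates state under the VM lock, releases it, and only then
// takes the workspace lock for the slow guest I/O.
func (d *daemon) prepare(importRepo func(guestTarget) error) error {
	// Phase 1: short critical section, validate and snapshot.
	d.vmLock.Lock()
	if !d.vm.running {
		d.vmLock.Unlock()
		return errors.New("vm is not running")
	}
	target := guestTarget{guestIP: d.vm.guestIP, apiSock: d.vm.apiSock}
	d.vmLock.Unlock()

	// Phase 2: guest I/O, serialised per VM; the VM lock is not held here.
	d.workspaceLock.Lock()
	defer d.workspaceLock.Unlock()
	return importRepo(target)
}

func main() {
	d := &daemon{vm: vm{running: true, guestIP: "10.0.0.2", apiSock: "/tmp/fc.sock"}}
	started, release := make(chan struct{}), make(chan struct{})
	done := make(chan error, 1)

	go func() {
		done <- d.prepare(func(_ guestTarget) error {
			close(started) // the import is now in flight
			<-release      // simulate a slow tar stream
			return nil
		})
	}()

	<-started
	// A lifecycle op on the same VM takes the VM lock without waiting.
	d.vmLock.Lock()
	d.vm.running = false
	d.vmLock.Unlock()
	fmt.Println("stop finished while the import was still stalled")

	close(release)
	fmt.Println("prepare error:", <-done)
}
```

The import works from the snapshot taken in phase 1, so it does not need to read the VM record again while the lifecycle operation mutates it.
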
Subsystem-local locks (`tapPool.mu`, `sessionRegistry.mu`,
`opstate.Registry` mu, `guestSessionController.attachMu` /
`writeMu`) are leaves. They do not contend with each other.

Notes:

- `vmLocks[id]` is the outer lock for any operation scoped to a single VM.
  Acquired via `withVMLockByID` / `withVMLockByRef`.
- `createVMMu` and `imageOpsMu` are narrow: each guards one family of
  mutations and is released before any blocking guest I/O.
- Holding a subsystem-local lock while calling into guest SSH is
  discouraged; copy needed state out under the lock and release before
  blocking I/O.

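The last note, copying state out under a subsystem-local lock before blocking, might look like the sketch below. The `registry` type, its `addrs` field, and the `snapshot` / `pingAll` methods are hypothetical and are not the real `sessionRegistry` API; `ping` stands in for any blocking guest call.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical subsystem that owns its own leaf lock.
type registry struct {
	mu    sync.Mutex
	addrs map[string]string // session ID -> guest address
}

// snapshot copies the addresses out under the lock and returns the copy.
func (r *registry) snapshot() map[string]string {
	r.mu.Lock()
	defer r.mu.Unlock()
	out := make(map[string]string, len(r.addrs))
	for id, addr := range r.addrs {
		out[id] = addr
	}
	return out
}

// pingAll does the blocking work on the copy, with no lock held.
func (r *registry) pingAll(ping func(addr string) error) map[string]error {
	results := make(map[string]error)
	for id, addr := range r.snapshot() {
		results[id] = ping(addr)
	}
	return results
}

func main() {
	r := &registry{addrs: map[string]string{"s1": "10.0.0.2:22"}}
	results := r.pingAll(func(addr string) error {
		// Other goroutines could register or remove sessions here,
		// because r.mu is not held during the call.
		return nil
	})
	fmt.Println(len(results), "session(s) pinged")
}
```

The cost of this pattern is that the blocking call sees a possibly stale view, so callers should tolerate a session that disappeared after the snapshot was taken.
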
## External API

Only `internal/cli` imports this package. The surface is:

- `daemon.Open(ctx) (*Daemon, error)`
- `(*Daemon).Serve(ctx) error`
- `(*Daemon).Close() error`
- `daemon.Doctor(...)`: host diagnostics (no receiver).

All other `*Daemon` methods are reached only through the RPC `dispatch`
switch in `daemon.go` and are free to move/rename during refactoring.

## Web UI

The optional web UI served at `web_listen_addr` is experimental. It is
enabled by default for local observability but is not considered a stable
or supported interface. Set `web_listen_addr = ""` in config to disable.