daemon split (5/5): document the service composition

Phase 5 of the daemon god-struct refactor. Code motion landed in
phases 1-4; this commit retells the architecture so the docs match
the structure.

ARCHITECTURE.md loses the "deferred v0.2 project" hedge about
splitting services. The Composition section now describes the four
services (HostNetwork, ImageService, WorkspaceService, VMService)
that own behaviour, the consumer-defined seam pattern for
cross-service calls, and the lazy-init getter pattern that keeps
existing test literals compiling.

doc.go inventories which methods live on which service, and the
lock-ordering section gains the service prefixes (e.g.
VMService.vmLocks instead of bare vmLocks) so readers don't have to
guess which type owns which mutex.

No code changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Thales Maciel 2026-04-20 20:58:53 -03:00
parent 466a7c30c4
commit 0cfd8a5451
2 changed files with 180 additions and 131 deletions


@ -1,94 +1,135 @@
# `internal/daemon` architecture
This document describes the current daemon package layout: the `Daemon` This document describes the current daemon package layout: the `Daemon`
composition root, the subpackages that own stateless helpers and shared composition root, the four services it wires together, the subpackages
primitives, and the lock ordering every caller must respect. that own stateless helpers, and the lock ordering every caller must
respect.
## Composition
`Daemon` is the composition root. Subsystem state and locks live on their `Daemon` is a thin composition root. It holds shared infrastructure
owning types: (store, runner, logger, layout, config, listener) plus pointers to
four focused services. RPC dispatch is a pure forwarder into those
services; no lifecycle / image / workspace / networking behaviour
lives on `*Daemon` itself.
```
Daemon
├── *HostNetwork — bridge, tap pool, NAT, DNS, firecracker process,
│ DM snapshots, vsock readiness
├── *ImageService — register, promote, delete, pull (bundle + OCI),
│ kernel catalog, managed-seed refresh
├── *WorkspaceService — workspace.prepare / workspace.export, auth-key
│ + git-identity sync onto the work disk
└── *VMService — VM lifecycle (create/start/stop/restart/kill/
delete/set), stats polling, ports query,
handle cache, per-VM lock set, create-op
registry, preflight validation
```
Each service owns its own state. Cross-service calls go through narrow
consumer-defined seams:
- `WorkspaceService` does not hold a `*VMService` pointer. It takes
function-typed deps (`vmResolver`, `aliveChecker`, `withVMLockByRef`,
`imageResolver`, `imageWorkSeed`) so it sees exactly the operations
it needs and nothing more. Those deps are captured as closures so
construction-order cycles don't recur.
- `VMService` holds direct pointers to `*HostNetwork`, `*ImageService`,
and `*WorkspaceService`. Orchestrating a VM start really does compose
all three (bridge + tap + image resolution + work-disk sync), and
declaring a function-typed interface for every call would balloon
the surface for no win — services are unexported, so package-external
code can never reach them.
- Capability hooks still take `*Daemon` as their receiver argument,
but `VMService` calls into them through a `capabilityHooks` struct
(function-typed bag) populated at construction. The service has no
`*Daemon` pointer.
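
The consumer-defined seam described above can be sketched roughly as follows. This is a minimal illustration, not the daemon's actual code: the `VM` type, the dep signatures, and the `newDaemon` wiring are assumptions made for the example, and only two of the five named deps are shown.

```go
package main

import (
	"errors"
	"fmt"
)

// VM is a hypothetical stand-in for the daemon's VM record.
type VM struct {
	ID    string
	Alive bool
}

// WorkspaceService sees only the operations it needs, as function-typed deps.
// The signatures here are assumptions for illustration.
type WorkspaceService struct {
	vmResolver   func(ref string) (VM, error)
	aliveChecker func(vm VM) bool
}

// Prepare validates the VM through the seams; it never touches VMService directly.
func (w *WorkspaceService) Prepare(ref string) error {
	vm, err := w.vmResolver(ref)
	if err != nil {
		return err
	}
	if !w.aliveChecker(vm) {
		return fmt.Errorf("vm %s is not running", vm.ID)
	}
	return nil
}

// VMService is a hypothetical owner of VM state.
type VMService struct {
	vms map[string]VM
}

// FindVM looks a VM up by reference.
func (s *VMService) FindVM(ref string) (VM, error) {
	vm, ok := s.vms[ref]
	if !ok {
		return VM{}, errors.New("vm not found: " + ref)
	}
	return vm, nil
}

// Daemon wires the services together.
type Daemon struct {
	vm *VMService
	ws *WorkspaceService
}

// newDaemon captures d in closures, so WorkspaceService can be built
// before VMService exists; d.vm is only read when a dep is called.
func newDaemon() *Daemon {
	d := &Daemon{}
	d.ws = &WorkspaceService{
		vmResolver:   func(ref string) (VM, error) { return d.vm.FindVM(ref) },
		aliveChecker: func(vm VM) bool { return vm.Alive },
	}
	d.vm = &VMService{vms: map[string]VM{"dev": {ID: "dev", Alive: true}}}
	return d
}

func main() {
	d := newDaemon()
	fmt.Println(d.ws.Prepare("dev"))
	fmt.Println(d.ws.Prepare("missing"))
}
```

Because the closures read `d.vm` at call time rather than at construction time, the order in which the two services are built does not matter in this sketch.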
Lazy-init getters (`d.hostNet()`, `d.imageSvc()`, `d.workspaceSvc()`,
`d.vmSvc()`) let existing test literals (`&Daemon{store: db, runner: r}`)
keep working — the getter constructs the service from whatever is on
the `Daemon` if nothing was pre-wired.
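
A lazy-init getter of this kind might look like the sketch below. The `Store` type and the field names are placeholders, and the sketch deliberately ignores concurrency; whether the real getters guard construction (for example with `sync.Once`) is not stated here.

```go
package main

import "fmt"

// Store is a placeholder for the daemon's shared store handle.
type Store struct{ name string }

// VMService is a trimmed-down stand-in that only keeps the store.
type VMService struct{ store *Store }

// Daemon holds shared infrastructure plus an optional pre-wired service.
type Daemon struct {
	store *Store
	vm    *VMService // nil until pre-wired or first use
}

// vmSvc returns the pre-wired service, or builds one from whatever
// fields the Daemon literal happened to set.
func (d *Daemon) vmSvc() *VMService {
	if d.vm == nil {
		d.vm = &VMService{store: d.store}
	}
	return d.vm
}

func main() {
	// A test-style literal that predates the service split still works.
	d := &Daemon{store: &Store{name: "test-db"}}
	fmt.Println(d.vmSvc().store.name)
	fmt.Println(d.vmSvc() == d.vmSvc())
}
```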
## Service state
### `HostNetwork` (`host_network.go`, `nat.go`, `dns_routing.go`, `tap_pool.go`, `snapshot.go`)
- `tapPool` — TAP interface pool, owns its own lock.
- `vmDNS *vmdns.Server` — in-process DNS server for `.vm` names.
- No direct VM-state access. Where an operation needs a VM's tap name
(e.g. `ensureNAT`), the signature takes `guestIP` + `tap` string so
the caller (VMService) resolves them first.
### `ImageService` (`image_service.go`, `images.go`, `images_pull.go`, `image_seed.go`, `kernels.go`)
- `imageOpsMu sync.Mutex` — the publication-window lock. Held only
across the recheck-name + atomic-rename + UpsertImage commit atom.
Slow work (network fetch, ext4 build, SSH-key seeding) runs unlocked.
- Test seams `pullAndFlatten`, `finalizePulledRootfs`, `bundleFetch`
are struct fields (not package globals), so tests inject per-instance
fakes.
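
As a rough sketch of the publication window, assuming an in-memory map in place of the store and hypothetical `publish` / `pull` helpers, the slow work runs unlocked and only the recheck, rename, and record steps sit under `imageOpsMu`:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// ImageService is a hypothetical, trimmed-down stand-in.
type ImageService struct {
	imageOpsMu sync.Mutex
	images     map[string]string // name -> final dir; stands in for the store
	root       string
}

// publish is the commit atom: recheck the name, rename, record.
func (s *ImageService) publish(name, staging string) error {
	s.imageOpsMu.Lock()
	defer s.imageOpsMu.Unlock()

	if _, taken := s.images[name]; taken {
		_ = os.RemoveAll(staging) // the loser cleans up its staging dir
		return fmt.Errorf("image %q already exists", name)
	}
	final := filepath.Join(s.root, name)
	if err := os.Rename(staging, final); err != nil {
		return err
	}
	s.images[name] = final
	return nil
}

// pull does its slow work unlocked, then takes the lock only to publish.
func (s *ImageService) pull(name string) error {
	staging, err := os.MkdirTemp(s.root, "staging-")
	if err != nil {
		return err
	}
	// Slow work (fetch, ext4 build, key seeding) would happen here, unlocked.
	stub := filepath.Join(staging, "rootfs.ext4")
	if err := os.WriteFile(stub, []byte("stub"), 0o644); err != nil {
		return err
	}
	return s.publish(name, staging)
}

// demo pulls the same name twice and returns both results.
func demo() (first, second error) {
	root, err := os.MkdirTemp("", "images-")
	if err != nil {
		return err, err
	}
	defer os.RemoveAll(root)
	s := &ImageService{images: map[string]string{}, root: root}
	first = s.pull("debian")
	second = s.pull("debian")
	return first, second
}

func main() {
	first, second := demo()
	fmt.Println(first)
	fmt.Println(second)
}
```

Staging under the same root as the final directory keeps the rename on one filesystem, which is what makes it usable as a commit step in this sketch.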
### `WorkspaceService` (`workspace_service.go`, `workspace.go`, `vm_authsync.go`)
- Layout, config, store, runner, logger, pid — infrastructure handles.
- `vmLocks vmLockSet` — per-VM `*sync.Mutex`, one per VM ID. Held for
the **entire lifecycle op** on that VM: a `start` holds it across
preflight, bridge setup, firecracker spawn, and post-boot wiring
(seconds to tens of seconds). Two `start`/`stop`/`delete`/`set` calls
against the same VM therefore serialise; calls against different VMs
run independently. If you need a slow guest-side operation to NOT
block lifecycle ops on the same VM, scope it out of the lock
explicitly the way `workspace.prepare` does (see below).
- `workspaceLocks vmLockSet` — per-VM mutex scoped to - `workspaceLocks vmLockSet` — per-VM mutex scoped to
`workspace.prepare` / `workspace.export`. These ops acquire `workspace.prepare` / `workspace.export`. These ops acquire
`vmLocks[id]` only long enough to validate VM state + snapshot the `vmLocks[id]` (on VMService) only long enough to validate VM state
fields they need, release it, then acquire `workspaceLocks[id]` for and snapshot the fields they need, then release it and acquire
the slow guest I/O phase. That keeps `vm stop` / `delete` / `restart` `workspaceLocks[id]` for the slow guest I/O phase. That keeps
from queueing behind a running tar import. `vm stop` / `delete` / `restart` from queueing behind a running tar
- `handles *handleCache` — in-memory map of per-VM transient kernel/ import.
process handles (PID, tap device, loop devices, DM target). The - Test seams `workspaceInspectRepo`, `workspaceImport` are per-instance
cache is rebuildable: each VM directory holds a small fields.
`handles.json` scratch file that the daemon reads at startup to
reconstruct the cache and verify processes against `/proc` via ### `VMService` (`vm_service.go`, `vm_lifecycle.go`, `vm_create.go`, `vm_create_ops.go`, `vm_stats.go`, `vm_set.go`, `vm_disk.go`, `vm_handles.go`, `vm_authsync.go` (via WorkspaceService), `preflight.go`, `ports.go`, `vm.go`)
pgrep. Nothing in the durable `vms` SQLite row describes transient
kernel state. See `internal/daemon/vm_handles.go`. - `vmLocks vmLockSet` — per-VM `*sync.Mutex`, one per VM ID. Held for
the **entire lifecycle op** on that VM: `start` holds it across
preflight, bridge setup, firecracker spawn, and post-boot wiring
(seconds to tens of seconds). Two `start`/`stop`/`delete`/`set`
calls against the same VM therefore serialise; calls against
different VMs run independently.
- `createVMMu sync.Mutex` — narrow **reservation** mutex. `CreateVM` - `createVMMu sync.Mutex` — narrow **reservation** mutex. `CreateVM`
resolves the image (possibly auto-pulling, which self-locks on resolves the image (possibly auto-pulling, which self-locks on
`imageOpsMu`) and parses sizing flags outside this lock, then holds `imageOpsMu`) and parses sizing flags outside this lock, then holds
`createVMMu` only to re-check that the requested VM name is still `createVMMu` only to re-check that the requested VM name is still
free, allocate the next guest IP, and insert the initial "created" free, allocate the next guest IP, and insert the initial "created"
row. The subsequent boot flow runs under the per-VM lock only. row. The subsequent boot flow runs under the per-VM lock only.
Parallel `vm create` calls therefore overlap on image resolution and - `createOps opstate.Registry[*vmCreateOperationState]` — in-flight
boot; they contend only across the millisecond-scale name+IP claim. async create operations; owns its own lock.
- `imageOpsMu sync.Mutex` — narrow **publication** mutex. `PullImage` - `handles *handleCache` — in-memory map of per-VM transient kernel/
(both bundle and OCI paths), `RegisterImage`, `PromoteImage`, and process handles (PID, tap device, loop devices, DM target). Each
`DeleteImage` do their slow work (network fetch, ext4 build, VM directory holds a small `handles.json` scratch file so the
ownership fixup, file copy, SSH-key seeding) without this lock and cache can be rebuilt at daemon startup.
acquire it only for the commit atom: recheck name free, atomic - Test seams `guestWaitForSSH`, `guestDial` are per-instance fields.
rename of the staging dir to its final home, upsert the store row.
Two pulls for different images run fully in parallel; two pulls that
race to the same name are resolved at the recheck — the loser fails
fast and its staging dir is cleaned up.
- `createOps opstate.Registry[*vmCreateOperationState]` — in-flight VM
create operations; owns its own lock.
- `tapPool tapPool` — TAP interface pool; owns its own lock.
- `listener`, `vmDNS` — networking.
- `vmCaps` — registered VM capability hooks.
- `pullAndFlatten`, `finalizePulledRootfs`, `bundleFetch`,
`requestHandler`, `guestWaitForSSH`, `guestDial`,
`workspaceInspectRepo`, `workspaceImport` — injectable seams used by tests.
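
The narrow reservation mutex can be illustrated with the sketch below. The in-memory name set, the IP allocator, and the `createMany` driver are assumptions for the example; the point is only that concurrent creates contend on the name and IP claim and nothing else.

```go
package main

import (
	"fmt"
	"sync"
)

// VMService is a hypothetical stand-in holding only reservation state.
type VMService struct {
	createVMMu sync.Mutex
	names      map[string]bool
	nextHost   int // last octet of the next guest IP (made-up allocator)
}

// reserve is the only part of create that holds createVMMu.
func (s *VMService) reserve(name string) (string, error) {
	s.createVMMu.Lock()
	defer s.createVMMu.Unlock()
	if s.names[name] {
		return "", fmt.Errorf("vm name %q is taken", name)
	}
	s.names[name] = true // stands in for inserting the "created" row
	ip := fmt.Sprintf("172.16.0.%d", s.nextHost)
	s.nextHost++
	return ip, nil
}

// createMany runs one create per name concurrently and returns the
// guest IP claimed by each name that won its reservation.
func createMany(names []string) map[string]string {
	s := &VMService{names: map[string]bool{}, nextHost: 2}
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex
		ips = map[string]string{}
	)
	for _, n := range names {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			// Image resolution and flag parsing would run here, unlocked.
			ip, err := s.reserve(name)
			if err != nil {
				return
			}
			// Boot would continue under the per-VM lock only.
			mu.Lock()
			ips[name] = ip
			mu.Unlock()
		}(n)
	}
	wg.Wait()
	return ips
}

func main() {
	ips := createMany([]string{"a", "b", "a"})
	fmt.Println(len(ips))
}
```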
## Subpackages
Stateless helpers that don't need the `Daemon` composition root have Stateless helpers with no need for a service pointer live in
been lifted into subpackages. Lifecycle orchestration, image-registry subpackages. Each takes explicit dependencies (typically a
orchestration, host networking bootstrap, background reconciliation,
and the JSON-RPC dispatch all still live in this package — it is not
"just orchestration." ~29 files and ~130 `func (d *Daemon)` methods
share the root struct today. A future project would be to split VM
lifecycle, image management, and the background reconciler into
services with explicit interfaces; that's out of scope for v0.1.0.
Each subpackage takes explicit dependencies (typically a
`system.Runner`-compatible interface) and holds no global state beyond `system.Runner`-compatible interface) and holds no global state beyond
small test seams. small test seams.
| Subpackage                   | Purpose                                                                |
| ---------------------------- | ---------------------------------------------------------------------- |
| `internal/daemon/opstate`    | Generic `Registry[T AsyncOp]` for async-operation bookkeeping.         |
| `internal/daemon/dmsnap`     | Device-mapper COW snapshot create/cleanup/remove.                      |
| `internal/daemon/fcproc`     | Firecracker process primitives (bridge, tap, binary, PID, kill, wait). |
| `internal/daemon/imagemgr`   | Image subsystem pure helpers: validators, staging, build script gen.   |
| `internal/daemon/workspace`  | Workspace helpers: git inspection, copy prep, guest import script.     |
All subpackages are leaves — no intra-daemon subpackage imports another. All subpackages are leaves — no intra-daemon subpackage imports another.
## Lock ordering
Acquire in this order, release in reverse. Never acquire in the
opposite direction.
``` ```
vmLocks[id] → workspaceLocks[id] → {createVMMu, imageOpsMu} → subsystem-local locks VMService.vmLocks[id] → WorkspaceService.workspaceLocks[id]
→ {VMService.createVMMu, ImageService.imageOpsMu}
→ subsystem-local locks
``` ```
`vmLocks[id]` and `workspaceLocks[id]` are NEVER held at the same `vmLocks[id]` and `workspaceLocks[id]` are NEVER held at the same
@ -98,14 +139,15 @@ for the guest I/O phase. Regular lifecycle ops (`start`, `stop`,
`delete`, `set`) do NOT do this split — they hold `vmLocks[id]` `delete`, `set`) do NOT do this split — they hold `vmLocks[id]`
across the whole flow. across the whole flow.
Subsystem-local locks (`tapPool.mu`, `opstate.Registry` mu) are leaves. Subsystem-local locks (`tapPool.mu`, `opstate.Registry` mu,
They do not contend with each other. `handleCache.mu`) are leaves. They do not contend with each other.
Notes:
- `vmLocks[id]` is the outer lock for any operation scoped to a single VM. - `vmLocks[id]` is the outer lock for any operation scoped to a single
Acquired via `withVMLockByID` / `withVMLockByRef`. The callback runs VM. Acquired via `VMService.withVMLockByID` / `withVMLockByRef`. The
under the lock — treat the whole function body as critical section. callback runs under the lock — treat the whole function body as
critical section.
- `createVMMu` is held only across the VM-name reservation + IP
  allocation + initial UpsertVM. Image resolution and the full boot
  flow happen outside it.
@ -117,6 +159,14 @@ Notes:
  discouraged; copy needed state out under the lock and release before
  blocking I/O.
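
The "copy state out, release, then block" rule, combined with the `vmLocks[id]` / `workspaceLocks[id]` split, might be sketched like this. The `vmLockSet` shown is a hypothetical reimplementation, and the VM table and guest IP are invented for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// vmLockSet hands out one mutex per VM ID (hypothetical reimplementation).
type vmLockSet struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

// get returns the mutex for id, creating it on first use.
func (s *vmLockSet) get(id string) *sync.Mutex {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.locks == nil {
		s.locks = make(map[string]*sync.Mutex)
	}
	l, ok := s.locks[id]
	if !ok {
		l = &sync.Mutex{}
		s.locks[id] = l
	}
	return l
}

// vmState is the small snapshot copied out under the lock.
type vmState struct {
	running bool
	guestIP string
}

var (
	vmLocks        vmLockSet
	workspaceLocks vmLockSet
	// vms is made-up data standing in for the store.
	vms = map[string]vmState{"dev": {running: true, guestIP: "172.16.0.2"}}
)

// prepareWorkspace validates under vmLocks[id], releases it, and only
// then takes workspaceLocks[id] for the slow phase.
func prepareWorkspace(id string) (string, error) {
	// Phase 1: short critical section; copy the needed fields out.
	l := vmLocks.get(id)
	l.Lock()
	st, ok := vms[id]
	l.Unlock() // released before any blocking I/O

	if !ok || !st.running {
		return "", fmt.Errorf("vm %s is not running", id)
	}

	// Phase 2: slow guest I/O under workspaceLocks[id] only.
	w := workspaceLocks.get(id)
	w.Lock()
	defer w.Unlock()
	return "imported workspace into " + st.guestIP, nil
}

func main() {
	fmt.Println(prepareWorkspace("dev"))
	fmt.Println(prepareWorkspace("ghost"))
}
```

The two per-VM locks are never held together in this sketch, so a lifecycle op on the same VM can take `vmLocks[id]` while phase 2 is still running.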
## Reconcile and background work
`Daemon.reconcile(ctx)` is the orchestrator run at startup. It
rehydrates the handle cache, reaps stale VMs, and republishes DNS
records. `Daemon.backgroundLoop()` is the ticker fan-out —
`VMService.pollStats`, `VMService.stopStaleVMs`, and
`VMService.pruneVMCreateOperations` run on independent tickers.
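
A ticker fan-out along these lines could be sketched as below. The `runEvery` helper, the intervals, and the callback signatures are assumptions; the sketch only shows three jobs running on independent tickers until the context is cancelled.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runEvery calls fn on its own ticker until ctx is cancelled.
func runEvery(ctx context.Context, wg *sync.WaitGroup, every time.Duration, fn func()) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		t := time.NewTicker(every)
		defer t.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-t.C:
				fn()
			}
		}
	}()
}

// backgroundLoop fans out to independent tickers and blocks until all
// of them have stopped. The intervals are made up for the example.
func backgroundLoop(ctx context.Context, pollStats, stopStale, pruneOps func()) {
	var wg sync.WaitGroup
	runEvery(ctx, &wg, 10*time.Millisecond, pollStats)
	runEvery(ctx, &wg, 25*time.Millisecond, stopStale)
	runEvery(ctx, &wg, 40*time.Millisecond, pruneOps)
	wg.Wait()
}

// demo runs the loop briefly and reports how often each job fired.
func demo() (polls, reaps, prunes int64) {
	var p, r, q atomic.Int64
	ctx, cancel := context.WithTimeout(context.Background(), 200*time.Millisecond)
	defer cancel()
	backgroundLoop(ctx,
		func() { p.Add(1) },
		func() { r.Add(1) },
		func() { q.Add(1) },
	)
	return p.Load(), r.Load(), q.Load()
}

func main() {
	polls, reaps, prunes := demo()
	fmt.Println(polls > 0, reaps > 0, prunes > 0)
}
```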
## External API
Only `internal/cli` imports this package. The surface is:
@ -126,5 +176,6 @@ Only `internal/cli` imports this package. The surface is:
- `(*Daemon).Close() error`
- `daemon.Doctor(...)` — host diagnostics (no receiver). - `daemon.Doctor(...)` — host diagnostics (no receiver).
All other `*Daemon` methods are reached only through the RPC `dispatch` All other methods live on the four services and are reached only
switch in `daemon.go` and are free to move/rename during refactoring. through the RPC `dispatch` switch in `daemon.go`. They are free to
move/rename during refactoring.


@ -1,76 +1,74 @@
// Package daemon hosts the Banger daemon process.
//
// The daemon exposes a JSON-RPC endpoint over a Unix socket. It owns VM // The daemon exposes a JSON-RPC endpoint over a Unix socket. The
// lifecycle, image management, host networking bootstrap, and state // *Daemon type is a thin composition root: it holds shared
// persistence via internal/store. // infrastructure (store, runner, logger, layout, config, listener)
// plus pointers to four focused services and forwards RPCs to them.
// //
// The package is organised into cohesive groups. Pure stateless helpers for // Services:
// each group have been lifted into subpackages; orchestrator methods
// (Daemon receivers) stay here and compose them.
// //
// Subpackages: // *HostNetwork Bridge / tap pool / NAT / DNS / firecracker
// process / DM snapshots / vsock readiness.
// Owns tapPool and vmDNS.
// *ImageService Register / promote / delete / pull (bundle +
// OCI) / kernel catalog / managed-seed refresh.
// Owns imageOpsMu.
// *WorkspaceService workspace.prepare / workspace.export + the
// per-VM authorised-key and git-identity sync
// that runs at start. Owns workspaceLocks.
// *VMService VM lifecycle (create/start/stop/restart/kill/
// delete/set), stats, ports, preflight. Owns
// vmLocks, createVMMu, createOps, handles.
// //
// internal/daemon/opstate Generic Registry[T AsyncOp] for async // Subpackages (stateless helpers):
// operations (VM create). //
// internal/daemon/opstate Generic Registry[T AsyncOp].
// internal/daemon/dmsnap Device-mapper COW snapshot lifecycle. // internal/daemon/dmsnap Device-mapper COW snapshot lifecycle.
// internal/daemon/fcproc Firecracker process helpers: bridge/tap, // internal/daemon/fcproc Firecracker process helpers.
// binary resolution, PID lookup, wait/kill. // internal/daemon/imagemgr Image subsystem helpers.
// internal/daemon/imagemgr Image subsystem helpers: path validation, // internal/daemon/workspace Workspace helpers.
// artifact staging, guest provisioning script
// generator, metadata.
// internal/daemon/workspace Workspace helpers: git repo inspection,
// shallow copy prep, guest-side import,
// finalize script generation, shell quoting.
// //
// VM lifecycle (in this package): // File inventory:
// //
// vm_create.go CreateVM and create-time disk provisioning // daemon.go Composition root, Open/Close/Serve, dispatch,
// vm_lifecycle.go Start/Stop/Restart/Kill/Delete // reconcile orchestrator, backgroundLoop.
// vm_set.go SetVM mutation // host_network.go HostNetwork struct + constructor.
// vm_stats.go stats, health, ping, stale reaper // image_service.go ImageService struct + constructor + FindImage.
// vm_disk.go system overlay, work disk provisioning // workspace_service.go WorkspaceService struct + constructor.
// vm_authsync.go per-VM authorized_key, git identity, auth file sync // vm_service.go VMService struct + constructor + FindVM,
// vm_create_ops.go async begin/status/cancel (uses opstate.Registry) // TouchVM, withVMLock* family, lockVMID.
// vm_locks.go vmLockSet: per-VM mutex set
// vm.go fcproc forwarders, DNS helpers, small utilities
// capabilities.go pluggable capability hooks executed at VM start
// preflight.go prereq validation for VM start
// snapshot.go dmsnap forwarders + dmSnapshotHandles type alias
// ports.go port forwarding inspection
// //
// Image management (in this package): // nat.go, dns_routing.go, tap_pool.go, snapshot.go HostNetwork methods.
// images.go, images_pull.go, image_seed.go, kernels.go ImageService methods.
// workspace.go, vm_authsync.go WorkspaceService methods.
// vm_lifecycle.go, vm_create.go, vm_create_ops.go,
// vm_stats.go, vm_set.go, vm_disk.go, vm_handles.go,
// ports.go, preflight.go VMService methods.
// //
// images.go register, promote, delete, find, list // vm.go Cross-service constants, rebuildDNS /
// images_pull.go image pull: catalog (bundle) + OCI paths // cleanupRuntime / generateName (*VMService),
// image_seed.go managed work-seed SSH fingerprint refresh // and small stateless utilities.
// // capabilities.go Pluggable capability hooks executed at VM
// Guest interaction (in this package): // start. Hook methods take *Daemon; VMService
// // reaches them through a capabilityHooks seam.
// guest_ssh.go guestSSHClient, dialGuest, waitForGuestSSH // vm_locks.go vmLockSet primitive.
// ssh_client_config.go daemon-managed SSH client key material // guest_ssh.go guestSSHClient, dialGuest, waitForGuestSSH.
// workspace.go ExportVMWorkspace, PrepareVMWorkspace // ssh_client_config.go Daemon-managed SSH client key material.
// // doctor.go Host diagnostics.
// Host bootstrap (in this package): // logger.go slog configuration.
// // runtime_assets.go Companion-binary paths.
// nat.go NAT prereq registration
// dns_routing.go systemd-resolved per-interface routing
// tap_pool.go TAP interface pool (state in tapPool type)
//
// Core (in this package):
//
// daemon.go Daemon struct, Open/Close/Serve, dispatch
// doctor.go host diagnostics
// logger.go slog configuration
// runtime_assets.go paths to bundled companion binaries
// //
// Lock ordering: // Lock ordering:
// //
// vmLocks[id] → workspaceLocks[id] → {createVMMu, imageOpsMu} → subsystem-local locks // VMService.vmLocks[id] → WorkspaceService.workspaceLocks[id]
// → {VMService.createVMMu, ImageService.imageOpsMu}
// → subsystem-local locks
// //
// vmLocks[id] is held across entire lifecycle ops (start/stop/delete/set), // vmLocks[id] and workspaceLocks[id] are NEVER held at the same
// not just a validation window — callers that want to avoid blocking // time. workspace.prepare acquires vmLocks[id] only long enough to
// lifecycle on slow guest I/O must explicitly split off to // validate VM state, releases it, then acquires workspaceLocks[id]
// workspaceLocks[id] the way workspace.prepare does. Subsystem-local // for the slow guest I/O phase. Lifecycle ops (start/stop/delete/
// locks (tapPool.mu, opstate.Registry mu) are leaves and do not contend // set) hold vmLocks[id] across the whole flow. Subsystem-local
// with each other. See ARCHITECTURE.md for details. // locks (tapPool.mu, opstate.Registry mu, handleCache.mu) are
// leaves. See ARCHITECTURE.md for details.
package daemon