banger/internal/daemon/ARCHITECTURE.md
Thales Maciel 2b6437d1b4
remove vm session feature
Cuts the daemon-managed guest-session machinery (start/list/show/
logs/stop/kill/attach/send). The feature shipped aimed at agent-
orchestration workflows (programmatic stdin piping into a long-lived
guest process) that aren't driving any concrete user today, and the
~2.3K LOC of daemon surface area — attach bridge, FIFO keepalive,
controller registry, sessionstream framing, SQLite persistence — was
locking in an API we'd have to keep through v0.1.0.

Anything session-flavoured that people actually need today can be
done with `vm ssh + tmux` or `vm run -- cmd`.

Deleted:
- internal/cli/commands_vm_session.go
- internal/daemon/{guest_sessions,session_lifecycle,session_attach,session_stream,session_controller}.go
- internal/daemon/session/ (guest-session helpers package)
- internal/sessionstream/ (framing package)
- internal/daemon/guest_sessions_test.go
- internal/store/guest_session_test.go
- GuestSession* types from internal/{api,model}
- Store UpsertGuestSession/GetGuestSession/ListGuestSessionsByVM/DeleteGuestSession + scanner helpers
- guest.session.* RPC dispatch entries
- 5 CLI session tests, 2 completion tests, 2 printer tests

Extracted:
- ShellQuote + FormatStepError lifted to internal/daemon/workspace/util.go
  (only non-session consumer); workspace package now self-contained
- internal/daemon/guest_ssh.go keeps guestSSHClient + dialGuest +
  waitForGuestSSH — still used by workspace prepare/export
- internal/daemon/fake_firecracker_test.go preserves the test helper
  that used to live in guest_sessions_test.go

Store schema: CREATE TABLE guest_sessions and its column migrations
removed. Existing dev DBs keep an orphan table (harmless, pre-v0.1.0).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 12:47:58 -03:00


# `internal/daemon` architecture

This document describes the current daemon package layout: the `Daemon`
composition root, the subpackages that own stateless helpers and shared
primitives, and the lock ordering every caller must respect.

## Composition

`Daemon` is the composition root. Subsystem state and locks live on their
owning types:
- Layout, config, store, runner, logger, pid — infrastructure handles.
- `vmLocks vmLockSet` — per-VM `*sync.Mutex`, one per VM ID. Held only
across short, synchronous state validation and DB mutations so slow
guest I/O does not block lifecycle ops on the same VM.
- `workspaceLocks vmLockSet` — per-VM mutex scoped to
`workspace.prepare` / `workspace.export`. Serialises concurrent
workspace operations on a single VM (two simultaneous tar imports
would clobber each other) without touching `vmLocks`, so
`vm stop` / `delete` / `restart` never queue behind a slow import.
- `handles *handleCache` — in-memory map of per-VM transient kernel/
process handles (PID, tap device, loop devices, DM target). The
cache is rebuildable: each VM directory holds a small
`handles.json` scratch file that the daemon reads at startup to
reconstruct the cache and verify processes against `/proc` via
pgrep. Nothing in the durable `vms` SQLite row describes transient
kernel state. See `internal/daemon/vm_handles.go`.
- `createVMMu sync.Mutex` — serialises `CreateVM` (guards name uniqueness
+ guest IP allocation window).
- `imageOpsMu sync.Mutex` — serialises image-registry mutations
(`PullImage`, `RegisterImage`, `PromoteImage`, `DeleteImage`).
- `createOps opstate.Registry[*vmCreateOperationState]` — in-flight VM
create operations; owns its own lock.
- `tapPool tapPool` — TAP interface pool; owns its own lock.
- `listener`, `vmDNS` — networking.
- `vmCaps` — registered VM capability hooks.
- `pullAndFlatten`, `finalizePulledRootfs`, `bundleFetch`,
`requestHandler`, `guestWaitForSSH`, `guestDial`,
`workspaceInspectRepo`, `workspaceImport` — injectable seams used by tests.
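
The per-VM lock sets described above could be sketched roughly as follows. This is a minimal illustration, not the daemon's actual `vmLockSet`: the `lockSet` type, its `lockFor` and `withLock` methods, and the lazily populated map are all assumptions made for the example. It shows why a slow workspace import does not delay a lifecycle operation on the same VM when the two use separate lock sets.

```go
package main

import (
	"fmt"
	"sync"
)

// lockSet is a hypothetical stand-in for vmLockSet: one mutex per VM ID,
// created on first use. The real type's fields and methods may differ.
type lockSet struct {
	mu    sync.Mutex             // guards the map itself, not any VM
	locks map[string]*sync.Mutex // VM ID -> that VM's mutex
}

// lockFor returns the mutex for id, creating it if it does not exist yet.
func (s *lockSet) lockFor(id string) *sync.Mutex {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.locks == nil {
		s.locks = make(map[string]*sync.Mutex)
	}
	l, ok := s.locks[id]
	if !ok {
		l = &sync.Mutex{}
		s.locks[id] = l
	}
	return l
}

// withLock runs fn while holding the mutex for id.
func (s *lockSet) withLock(id string, fn func() error) error {
	l := s.lockFor(id)
	l.Lock()
	defer l.Unlock()
	return fn()
}

func main() {
	var vmLocks, workspaceLocks lockSet

	held := make(chan struct{})
	release := make(chan struct{})
	done := make(chan struct{})

	// Simulate a slow workspace import holding the workspace lock for vm-1.
	go func() {
		defer close(done)
		_ = workspaceLocks.withLock("vm-1", func() error {
			close(held)
			<-release
			return nil
		})
	}()
	<-held

	// A lifecycle op on the same VM uses the other lock set, so it does not wait.
	err := vmLocks.withLock("vm-1", func() error {
		fmt.Println("lifecycle op ran while the import was in flight")
		return nil
	})
	fmt.Println("err:", err)

	close(release)
	<-done
}
```
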

## Subpackages

Pure helpers have moved into subpackages so the daemon package itself stays
focused on orchestration. Each subpackage takes explicit dependencies
(typically a `system.Runner`-compatible interface) and holds no global
state beyond small test seams.

| Subpackage                  | Purpose                                                                |
| --------------------------- | ---------------------------------------------------------------------- |
| `internal/daemon/opstate`   | Generic `Registry[T AsyncOp]` for async-operation bookkeeping.         |
| `internal/daemon/dmsnap`    | Device-mapper COW snapshot create/cleanup/remove.                      |
| `internal/daemon/fcproc`    | Firecracker process primitives (bridge, tap, binary, PID, kill, wait). |
| `internal/daemon/imagemgr`  | Image subsystem pure helpers: validators, staging, build script gen.   |
| `internal/daemon/workspace` | Workspace helpers: git inspection, copy prep, guest import script.     |

All subpackages are leaves: no intra-daemon subpackage imports another.
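
To make the "owns its own lock" idea behind `opstate.Registry[T AsyncOp]` concrete, here is a hedged sketch of a generic registry. The method set of `AsyncOp` is not given in this document, so the single `ID() string` method below is an assumption, as are the `Insert`, `Get`, and `Remove` names and the `createOp` type.

```go
package main

import (
	"fmt"
	"sync"
)

// AsyncOp is an assumed constraint. The real opstate.AsyncOp interface
// is not described here and may require different methods.
type AsyncOp interface {
	ID() string
}

// Registry tracks in-flight operations by ID and guards them with its own mutex.
type Registry[T AsyncOp] struct {
	mu  sync.Mutex
	ops map[string]T
}

// Insert records op under its ID, replacing any previous entry.
func (r *Registry[T]) Insert(op T) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.ops == nil {
		r.ops = make(map[string]T)
	}
	r.ops[op.ID()] = op
}

// Get returns the operation with the given ID, if present.
func (r *Registry[T]) Get(id string) (T, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	op, ok := r.ops[id]
	return op, ok
}

// Remove forgets the operation with the given ID. Unknown IDs are ignored.
func (r *Registry[T]) Remove(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.ops, id)
}

// createOp is a hypothetical operation type used only for this example.
type createOp struct {
	id     string
	vmName string
}

func (c *createOp) ID() string { return c.id }

func main() {
	var reg Registry[*createOp]
	reg.Insert(&createOp{id: "op-1", vmName: "dev"})

	if op, ok := reg.Get("op-1"); ok {
		fmt.Println("in flight:", op.vmName)
	}

	reg.Remove("op-1")
	_, ok := reg.Get("op-1")
	fmt.Println("still tracked:", ok)
}
```

Because the mutex is private to the registry and no method calls out while holding it, callers cannot nest another lock inside it, which is what makes it a leaf in the lock ordering.
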

## Lock ordering

Acquire in this order, release in reverse. Never acquire in the opposite
direction.
```
vmLocks[id] → workspaceLocks[id] → {createVMMu, imageOpsMu} → subsystem-local locks
```
`vmLocks[id]` and `workspaceLocks[id]` are NEVER held at the same
time, so the first arrow denotes sequence rather than nesting.
`workspace.prepare` acquires `vmLocks[id]` just long enough to
validate VM state, releases it, then acquires `workspaceLocks[id]`
for the guest I/O phase.
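
The validate-then-release pattern can be sketched as below. This is an illustration under stated assumptions, not the daemon's code: the `daemon` and `vm` types, the `prepareWorkspace` method, and `errNotRunning` are hypothetical, and two plain mutexes stand in for `vmLocks[id]` and `workspaceLocks[id]`.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// vm and daemon are hypothetical stand-ins; only the locking shape matters.
type vm struct {
	running bool
}

type daemon struct {
	vmLock        sync.Mutex // stands in for vmLocks[id]
	workspaceLock sync.Mutex // stands in for workspaceLocks[id]
	vm            vm
}

var errNotRunning = errors.New("vm is not running")

// prepareWorkspace validates under the VM lock, releases it, and only then
// takes the workspace lock for the slow phase. The two are never held together.
func (d *daemon) prepareWorkspace(importFn func() error) error {
	// Phase 1: short, synchronous state validation.
	d.vmLock.Lock()
	running := d.vm.running
	d.vmLock.Unlock()
	if !running {
		return errNotRunning
	}

	// Phase 2: slow guest I/O, serialised per VM by the workspace lock.
	d.workspaceLock.Lock()
	defer d.workspaceLock.Unlock()
	return importFn()
}

func main() {
	d := &daemon{vm: vm{running: true}}

	err := d.prepareWorkspace(func() error {
		// While the import runs, the VM lock is free for lifecycle ops.
		if d.vmLock.TryLock() {
			defer d.vmLock.Unlock()
			fmt.Println("vm lock is free during the import")
		}
		return nil
	})
	fmt.Println("err:", err)
}
```

One consequence of releasing the VM lock before the I/O phase is that the VM's state may change in between, so the slow phase has to tolerate a VM that was stopped after validation.
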

Subsystem-local locks (`tapPool.mu`, the `opstate.Registry` mutex) are
leaves: no other lock is acquired while one is held, so they do not
contend with each other.

Notes:

- `vmLocks[id]` is the outer lock for any operation scoped to a single VM.
Acquired via `withVMLockByID` / `withVMLockByRef`.
- `createVMMu` and `imageOpsMu` are narrow: each guards one family of
mutations and is released before any blocking guest I/O.
- Holding a subsystem-local lock while calling into guest SSH is
discouraged; copy needed state out under the lock and release before
blocking I/O.
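
The last note, copying state out under the lock before blocking, might look like the following sketch. The `pool` type, its `tapFor` method, and `probeGuest` are hypothetical names chosen for the example; they are loosely modelled on a TAP pool but do not reflect the real `tapPool` API.

```go
package main

import (
	"fmt"
	"sync"
)

// pool is a hypothetical subsystem with a leaf lock.
type pool struct {
	mu   sync.Mutex
	taps map[string]string // VM ID -> TAP device name
}

// tapFor copies the needed value out under the lock and returns it,
// so callers never hold p.mu across blocking work.
func (p *pool) tapFor(vmID string) (string, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	name, ok := p.taps[vmID]
	return name, ok
}

// probeGuest stands in for blocking guest I/O such as an SSH round trip.
func probeGuest(tap string) string {
	return "probed " + tap
}

func main() {
	p := &pool{taps: map[string]string{"vm-1": "tap0"}}

	// The lock is taken and released inside tapFor.
	tap, ok := p.tapFor("vm-1")
	if !ok {
		fmt.Println("no tap assigned")
		return
	}

	// The blocking call runs with no subsystem lock held.
	fmt.Println(probeGuest(tap))
}
```
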

## External API

Only `internal/cli` imports this package. The surface is:
- `daemon.Open(ctx) (*Daemon, error)`
- `(*Daemon).Serve(ctx) error`
- `(*Daemon).Close() error`
- `daemon.Doctor(...)` — host diagnostics (no receiver).

All other `*Daemon` methods are reached only through the RPC `dispatch`
switch in `daemon.go`, so they can be moved or renamed freely during
refactoring.