workspace prepare: release VM mutex before guest I/O
Previously withVMLockByRef held the per-VM mutex across InspectRepo,
waitForGuestSSH, dialGuest, ImportRepoToGuest (the tar stream!), and
the readonly chmod. A large repo could block `vm stop` / `vm delete`
/ `vm restart` on the same VM for however long the import took.
Split into two phases:
1. VM mutex held briefly to validate state (running + PID alive)
and snapshot the fields needed for SSH (guest IP, api sock).
2. VM mutex released. Acquire workspaceLocks[id] — a separate
per-VM mutex scoped to workspace.prepare / workspace.export —
for the guest I/O phase.
Lifecycle ops (stop/delete/restart/set) only take vmLocks, so they
no longer queue behind a slow import. Two concurrent prepares on the
same VM still serialise via workspaceLocks so tar streams don't
interleave. ExportVMWorkspace also acquires workspaceLocks to avoid
snapshotting a half-streamed import.
Two regression tests (sequential — they swap package-level seams):

- ReleasesVMLockDuringGuestIO: stall the import fake, assert the VM
  mutex is acquirable from another goroutine during the stall.
- SerialisesConcurrentPreparesOnSameVM: three concurrent prepares,
  assert Import is only ever invoked one at a time per VM.
ARCHITECTURE.md documents the split + updated lock ordering.
parent 99de42385f
commit 6cd52d12f4
4 changed files with 265 additions and 22 deletions
ARCHITECTURE.md

@@ -10,7 +10,14 @@ primitives, and the lock ordering every caller must respect.
 owning types:
 
 - Layout, config, store, runner, logger, pid — infrastructure handles.
-- `vmLocks vmLockSet` — per-VM `*sync.Mutex`, one per VM ID.
+- `vmLocks vmLockSet` — per-VM `*sync.Mutex`, one per VM ID. Held only
+  across short, synchronous state validation and DB mutations so slow
+  guest I/O does not block lifecycle ops on the same VM.
+- `workspaceLocks vmLockSet` — per-VM mutex scoped to
+  `workspace.prepare` / `workspace.export`. Serialises concurrent
+  workspace operations on a single VM (two simultaneous tar imports
+  would clobber each other) without touching `vmLocks`, so
+  `vm stop` / `delete` / `restart` never queue behind a slow import.
 - `createVMMu sync.Mutex` — serialises `CreateVM` (guards name uniqueness
   + guest IP allocation window).
 - `imageOpsMu sync.Mutex` — serialises image-registry mutations

@@ -51,9 +58,14 @@ Acquire in this order, release in reverse. Never acquire in the opposite
 direction.
 
 ```
-vmLocks[id] → {createVMMu, imageOpsMu} → subsystem-local locks
+vmLocks[id] → workspaceLocks[id] → {createVMMu, imageOpsMu} → subsystem-local locks
 ```
 
+`vmLocks[id]` and `workspaceLocks[id]` are NEVER held at the same
+time. `workspace.prepare` acquires `vmLocks[id]` just long enough to
+validate VM state, releases it, then acquires `workspaceLocks[id]`
+for the guest I/O phase.
+
 Subsystem-local locks (`tapPool.mu`, `sessionRegistry.mu`,
 `opstate.Registry` mu, `guestSessionController.attachMu` /
 `writeMu`) are leaves. They do not contend with each other.