Harden runtime diagnostics for milestone 3

Make the milestone 3 runtime story predictable instead of treating doctor, self-check, and startup failures as loosely related surfaces.

Split doctor and self-check into distinct read-only flows, add tri-state diagnostic status with stable IDs and next steps, and reuse that wording in CLI output, service logs, and tray-triggered diagnostics. Add non-mutating config/model probes, a make runtime-check gate, and public recovery/validation docs for the X11 GA roadmap.

Validation: make runtime-check; PYTHONPATH=src python3 -m unittest discover -s tests -p 'test_*.py'; python3 -m py_compile src/*.py tests/*.py; PYTHONPATH=src python3 -m aman doctor --help; PYTHONPATH=src python3 -m aman self-check --help. Leave milestone 3 open in the roadmap until the manual X11 validation rows are filled.
Thales Maciel 2026-03-12 17:41:23 -03:00
parent a3368056ff
commit ed1b59240b
16 changed files with 1298 additions and 248 deletions

@@ -103,6 +103,31 @@ When Aman does not behave as expected, use this order:
3. Inspect `journalctl --user -u aman -f`.
4. Re-run Aman in the foreground with `aman run --config ~/.config/aman/config.json --verbose`.

See [`docs/runtime-recovery.md`](docs/runtime-recovery.md) for the failure IDs,
example output, and the common recovery branches behind this sequence.

## Diagnostics

- `aman doctor` is the fast, read-only preflight for config, X11 session,
audio runtime, input resolution, hotkey availability, injection backend
selection, and service prerequisites.
- `aman self-check` is the deeper, still read-only installed-system readiness
check. It includes every `doctor` check plus managed model cache, cache
writability, service unit/state, and startup readiness.
- The tray `Run Diagnostics` action runs the same deeper `self-check` path and
logs any non-`ok` results.
- Exit code `0` means every check finished as `ok` or `warn`. Exit code `2`
means at least one check finished as `fail`.

Example output:

```text
[OK] config.load: loaded config from /home/user/.config/aman/config.json
[WARN] model.cache: managed editor model is not cached at /home/user/.cache/aman/models/Qwen2.5-1.5B-Instruct-Q4_K_M.gguf | next_step: start Aman once on a networked connection so it can download the managed editor model, then rerun `aman self-check --config /home/user/.config/aman/config.json`
[FAIL] service.state: user service is installed but failed to start | next_step: inspect `journalctl --user -u aman -f` to see why aman.service is failing
overall: fail
```
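
The exit-code rule and the line format shown above can be sketched in Python as follows. This is a minimal illustration, not Aman's implementation: `Check`, `format_check`, `overall_status`, and `exit_code` are hypothetical names, and only the `[STATUS] id: message | next_step: ...` shape and the `0`/`2` exit codes are taken from the text above.

```python
from dataclasses import dataclass
from typing import Sequence

# Hypothetical severity ranking for the three statuses; higher is worse.
STATUS_ORDER = {"ok": 0, "warn": 1, "fail": 2}


@dataclass(frozen=True)
class Check:
    """One diagnostic result (illustrative type, not Aman's real class)."""
    id: str              # stable diagnostic ID, e.g. "service.state"
    status: str          # one of "ok", "warn", "fail"
    message: str
    next_step: str = ""  # optional recovery hint


def format_check(check: Check) -> str:
    """Render one check as `[STATUS] id: message | next_step: ...`."""
    line = f"[{check.status.upper()}] {check.id}: {check.message}"
    if check.next_step:
        line += f" | next_step: {check.next_step}"
    return line


def overall_status(checks: Sequence[Check]) -> str:
    """Return the worst status seen; an empty report counts as ok."""
    worst = "ok"
    for check in checks:
        if STATUS_ORDER[check.status] > STATUS_ORDER[worst]:
            worst = check.status
    return worst


def exit_code(checks: Sequence[Check]) -> int:
    """Exit 2 if any check failed; ok and warn both exit 0."""
    return 2 if overall_status(checks) == "fail" else 0
```

With this shape, a report containing only `ok` and `warn` results still exits `0`, so scripts that gate on the exit code are not blocked by a missing model cache alone.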
## Runtime Dependencies

- X11
@@ -319,6 +344,8 @@ Service notes:
setup, support, or debugging.
- Start recovery with `aman doctor`, then `aman self-check`, before inspecting
`systemctl --user status aman` and `journalctl --user -u aman -f`.
- See [`docs/runtime-recovery.md`](docs/runtime-recovery.md) for the expected
diagnostic IDs and next steps.
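
The recovery order in the note above (`aman doctor` first, then `aman self-check`) could be scripted roughly as below. This is a sketch under assumptions: `DIAGNOSTIC_STEPS`, `run_command`, and `first_failing_step` are hypothetical helpers, the commands are run without extra flags, and stopping at the first non-zero exit is a design choice of this example rather than documented Aman behavior.

```python
import subprocess
from typing import Callable, Optional, Sequence, Tuple

# The two read-only diagnostics, in the order recommended above.
DIAGNOSTIC_STEPS: Tuple[Tuple[str, ...], ...] = (
    ("aman", "doctor"),
    ("aman", "self-check"),
)


def run_command(argv: Sequence[str]) -> int:
    """Run a command without a shell and return its exit code."""
    return subprocess.run(list(argv), check=False).returncode


def first_failing_step(
    run: Callable[[Sequence[str]], int] = run_command,
) -> Optional[Tuple[str, ...]]:
    """Return the first diagnostic command that exits non-zero, else None."""
    for argv in DIAGNOSTIC_STEPS:
        if run(argv) != 0:
            return argv
    return None
```

If `first_failing_step()` returns `None`, both diagnostics passed and the next place to look is `systemctl --user status aman` and `journalctl --user -u aman -f`. The `run` parameter exists only so the sequence can be exercised without Aman installed.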
## Usage
@@ -354,6 +381,7 @@ make package
make package-portable
make package-deb
make package-arch
make runtime-check
make release-check
```
@@ -398,6 +426,7 @@ make run
make run config.example.json
make doctor
make self-check
make runtime-check
make eval-models
make sync-default-model
make check-default-model