Harden runtime diagnostics for milestone 3

Make the milestone 3 runtime story predictable instead of treating doctor, self-check, and startup failures as loosely related surfaces.

Split doctor and self-check into distinct read-only flows, add tri-state diagnostic status with stable IDs and next steps, and reuse that wording in CLI output, service logs, and tray-triggered diagnostics. Add non-mutating config/model probes, a make runtime-check gate, and public recovery/validation docs for the X11 GA roadmap.

Validation: make runtime-check; PYTHONPATH=src python3 -m unittest discover -s tests -p 'test_*.py'; python3 -m py_compile src/*.py tests/*.py; PYTHONPATH=src python3 -m aman doctor --help; PYTHONPATH=src python3 -m aman self-check --help. Leave milestone 3 open in the roadmap until the manual X11 validation rows are filled.

2026-03-12 17:41:23 -03:00

aman

Local amanuensis

Python X11 STT daemon that records audio, runs Whisper, applies local AI cleanup, and injects text.

Target User

The canonical Aman user is a desktop professional who wants dictation and rewriting features without learning Python tooling.

End-user path: portable X11 release bundle for mainstream distros.
Alternate package channels: Debian/Ubuntu .deb and Arch packaging inputs.
Developer path: Python/uv workflows.

Persona details and distribution policy are documented in docs/persona-and-distribution.md.

Release Channels

Aman is not GA yet for X11 users across distros. The maintained release channels are:

Portable X11 bundle: current canonical end-user channel.
Debian/Ubuntu .deb: secondary packaged channel.
Arch PKGBUILD plus source tarball: secondary maintainer and power-user channel.
Python wheel and sdist: current developer and integrator channel.

GA Support Matrix

| Surface | Contract |
| --- | --- |
| Desktop session | X11 only |
| Runtime dependencies | Installed from the distro package manager |
| Supported daily-use mode | `systemd --user` service |
| Manual foreground mode | `aman run` for setup, support, and debugging |
| Canonical recovery sequence | `aman doctor` -> `aman self-check` -> `journalctl --user -u aman` -> `aman run --verbose` |
| Representative GA validation families | Debian/Ubuntu, Arch, Fedora, openSUSE |
| Portable installer prerequisite | System CPython `3.10`, `3.11`, or `3.12` |

Install (Portable Bundle)

Download aman-x11-linux-<version>.tar.gz and aman-x11-linux-<version>.tar.gz.sha256, install the runtime dependencies for your distro, then install the bundle:

sha256sum -c aman-x11-linux-<version>.tar.gz.sha256
tar -xzf aman-x11-linux-<version>.tar.gz
cd aman-x11-linux-<version>
./install.sh

The installer writes the user service, updates ~/.local/bin/aman, and runs systemctl --user enable --now aman automatically. On first service start, Aman opens the graphical settings window if ~/.config/aman/config.json does not exist yet.

Upgrade by extracting the newer bundle and running its install.sh again. Config and cache are preserved by default.

Uninstall with:

~/.local/share/aman/current/uninstall.sh

Add --purge if you also want to remove ~/.config/aman/ and ~/.cache/aman/.

Detailed install, upgrade, uninstall, and conflict guidance lives in docs/portable-install.md.

Secondary Channels

Debian/Ubuntu (`.deb`)

Download a release artifact and install it:

sudo apt install ./aman_<version>_<arch>.deb
systemctl --user daemon-reload
systemctl --user enable --now aman

Arch Linux

Use the generated packaging inputs (PKGBUILD + source tarball) in dist/arch/ or your own packaging pipeline.

Daily-Use And Support Modes

Supported daily-use path: install Aman, then run it as a systemd --user service.
Supported manual path: use aman run in the foreground while setting up, debugging, or collecting support logs.

Recovery Sequence

When Aman does not behave as expected, use this order:

Run aman doctor --config ~/.config/aman/config.json.
Run aman self-check --config ~/.config/aman/config.json.
Inspect journalctl --user -u aman -f.
Re-run Aman in the foreground with aman run --config ~/.config/aman/config.json --verbose.

See docs/runtime-recovery.md for the failure IDs, example output, and the common recovery branches behind this sequence.
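
The first two steps of this sequence can be scripted. The following is a minimal sketch and not part of Aman: it assumes `aman` is on PATH and relies only on the documented exit codes (non-zero means at least one check failed). The helper name `first_failing_step` and the injectable `run` parameter are illustrative.

```python
import os
import subprocess


def first_failing_step(config_path="~/.config/aman/config.json", run=subprocess.run):
    """Run doctor, then self-check, and return the first command that failed.

    Returns None when both commands exit with code 0.
    `run` is injectable so the flow can be exercised without Aman installed.
    """
    config = os.path.expanduser(config_path)
    steps = [
        ["aman", "doctor", "--config", config],
        ["aman", "self-check", "--config", config],
    ]
    for command in steps:
        result = run(command)
        if result.returncode != 0:
            # Stop here; the remaining steps are the journal and `aman run --verbose`.
            return command
    return None
```

If the function returns a command, continue with the journal and the verbose foreground run as listed above.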

Diagnostics

aman doctor is the fast, read-only preflight for config, X11 session, audio runtime, input resolution, hotkey availability, injection backend selection, and service prerequisites.
aman self-check is the deeper, still read-only installed-system readiness check. It includes every doctor check plus managed model cache, cache writability, service unit/state, and startup readiness.
The tray Run Diagnostics action runs the same deeper self-check path and logs any non-ok results.
Exit code 0 means every check finished as ok or warn. Exit code 2 means at least one check finished as fail.

Example output:

[OK] config.load: loaded config from /home/user/.config/aman/config.json
[WARN] model.cache: managed editor model is not cached at /home/user/.cache/aman/models/Qwen2.5-1.5B-Instruct-Q4_K_M.gguf | next_step: start Aman once on a networked connection so it can download the managed editor model, then rerun `aman self-check --config /home/user/.config/aman/config.json`
[FAIL] service.state: user service is installed but failed to start | next_step: inspect `journalctl --user -u aman -f` to see why aman.service is failing
overall: fail
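
The tri-state status and exit-code rule can be sketched as follows. This is a minimal illustration, not Aman's implementation: the `Check` record and the function names are hypothetical, and treating the `overall:` line as the worst status seen is an assumption based on the example output.

```python
from dataclasses import dataclass


# Hypothetical record for one diagnostic result; Aman's real types may differ.
@dataclass(frozen=True)
class Check:
    check_id: str       # stable ID such as "config.load"
    status: str         # one of "ok", "warn", "fail"
    message: str
    next_step: str = ""


_SEVERITY = {"ok": 0, "warn": 1, "fail": 2}


def overall_status(checks):
    """Return the worst status seen (assumed ordering: fail > warn > ok)."""
    worst = max((_SEVERITY[c.status] for c in checks), default=0)
    return {0: "ok", 1: "warn", 2: "fail"}[worst]


def exit_code(checks):
    """Exit 0 when every check is ok or warn, 2 when any check failed."""
    return 2 if any(c.status == "fail" for c in checks) else 0


def format_line(check):
    """Render one line in the shape of the example output."""
    line = f"[{check.status.upper()}] {check.check_id}: {check.message}"
    if check.next_step:
        line += f" | next_step: {check.next_step}"
    return line
```

Note that a run with only warnings still exits 0, so scripts that care about warnings need to read the per-check status rather than the exit code.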

Runtime Dependencies

X11
PortAudio runtime (libportaudio2 or distro equivalent)
GTK3 and AppIndicator runtime (gtk3, libayatana-appindicator3)
Python GTK and X11 bindings (python3-gi/python-gobject, python-xlib)

Ubuntu/Debian

sudo apt install -y libportaudio2 python3-gi python3-xlib gir1.2-gtk-3.0 libayatana-appindicator3-1

Arch Linux

sudo pacman -S --needed portaudio gtk3 libayatana-appindicator python-gobject python-xlib

Fedora

sudo dnf install -y portaudio gtk3 libayatana-appindicator-gtk3 python3-gobject python3-xlib

openSUSE

sudo zypper install -y portaudio gtk3 libayatana-appindicator3-1 python3-gobject python3-python-xlib

Quickstart (Portable Bundle)

For supported daily use on the portable bundle:

Install the runtime dependencies for your distro.
Download and extract the portable release bundle.
Run ./install.sh from the extracted bundle.
Save the first-run settings window.
Validate the install:

aman self-check --config ~/.config/aman/config.json

If you need the manual foreground path for setup or support:

aman run --config ~/.config/aman/config.json

On first launch, Aman opens a graphical settings window automatically. It includes sections for:

microphone input
hotkey
output backend
writing profile
output safety policy
runtime strategy (managed vs custom Whisper path)
help/about actions

Config

Create ~/.config/aman/config.json (or let Aman create it automatically on first start if it is missing):

{
  "config_version": 1,
  "daemon": { "hotkey": "Cmd+m" },
  "recording": { "input": "0" },
  "stt": {
    "provider": "local_whisper",
    "model": "base",
    "device": "cpu",
    "language": "auto"
  },
  "models": {
    "allow_custom_models": false,
    "whisper_model_path": ""
  },
  "injection": {
    "backend": "clipboard",
    "remove_transcription_from_clipboard": false
  },
  "safety": {
    "enabled": true,
    "strict": false
  },
  "ux": {
    "profile": "default",
    "show_notifications": true
  },
  "advanced": {
    "strict_startup": true
  },
  "vocabulary": {
    "replacements": [
      { "from": "Martha", "to": "Marta" },
      { "from": "docker", "to": "Docker" }
    ],
    "terms": ["Systemd", "Kubernetes"]
  }
}

config_version is required and currently must be 1. Legacy unversioned configs are migrated automatically on load.

Recording input can be a device index (preferred) or a substring of the device name. If recording.input is explicitly set and cannot be resolved, startup fails instead of falling back to a default device.
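
The resolution rule can be sketched like this. It is an illustration under stated assumptions, not Aman's code: `device_names` is a hypothetical list in which the position is the device index, substring matching is assumed to be case-insensitive, the first match wins, and an empty value is treated as "not set".

```python
def resolve_input(spec, device_names):
    """Resolve a recording.input value to a device index.

    Returns None when the value is empty (use the default device) and raises
    ValueError when an explicit value cannot be resolved, instead of falling back.
    """
    spec = spec.strip()
    if not spec:
        return None
    if spec.isdigit():
        index = int(spec)
        if index < len(device_names):
            return index
        raise ValueError(f"recording.input: no device with index {index}")
    needle = spec.lower()
    for index, name in enumerate(device_names):
        if needle in name.lower():
            return index
    raise ValueError(f"recording.input: no device name contains {spec!r}")
```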

Config validation is strict: unknown fields are rejected with a startup error. Validation errors include the exact field and an example fix snippet.
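
A minimal sketch of strict unknown-field rejection follows. The field table is copied from the example config above and may not match Aman's full schema; the function name and error wording are hypothetical, and the check is assumed to run after any legacy migration.

```python
# Known fields, taken from the example config; the real schema may differ.
KNOWN_FIELDS = {
    "config_version": set(),
    "daemon": {"hotkey"},
    "recording": {"input"},
    "stt": {"provider", "model", "device", "language"},
    "models": {"allow_custom_models", "whisper_model_path"},
    "injection": {"backend", "remove_transcription_from_clipboard"},
    "safety": {"enabled", "strict"},
    "ux": {"profile", "show_notifications"},
    "advanced": {"strict_startup"},
    "vocabulary": {"replacements", "terms"},
}


def validate(config):
    """Return a list of error strings, each naming the exact offending field."""
    errors = []
    if config.get("config_version") != 1:
        errors.append("config_version: must be 1")
    for section, value in config.items():
        if section not in KNOWN_FIELDS:
            errors.append(f"{section}: unknown field")
            continue
        if isinstance(value, dict):
            for key in value:
                if key not in KNOWN_FIELDS[section]:
                    errors.append(f"{section}.{key}: unknown field")
    return errors
```

An empty list means the config passed; with fail-fast startup, any entry would stop the daemon before it reports ready.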

Profile options:

ux.profile=default: baseline cleanup behavior.
ux.profile=fast: lower-latency AI generation settings.
ux.profile=polished: same cleanup depth as default.
safety.enabled=true: enables fact-preservation checks (names/numbers/IDs/URLs).
safety.strict=false: fall back to the safer draft when fact checks fail.
safety.strict=true: reject output when fact checks fail.
advanced.strict_startup=true: keep fail-fast startup validation behavior.

Transcription language:

stt.language=auto (default) enables Whisper auto-detection.
You can pin language with Whisper codes (for example en, es, pt, ja, zh) or common names like English/Spanish.
If a pinned language hint is rejected by the runtime, Aman logs a warning and retries with auto-detect.
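
The retry behavior can be sketched as a small wrapper. This is illustrative only: `transcribe` is a hypothetical callable, `language=None` stands in for auto-detection, and a rejected hint is assumed to surface as `ValueError`.

```python
import logging

logger = logging.getLogger("aman.sketch")


def transcribe_with_fallback(transcribe, audio, language="auto"):
    """Call `transcribe`, retrying with auto-detect if a pinned language is rejected."""
    if language == "auto":
        return transcribe(audio, language=None)
    try:
        return transcribe(audio, language=language)
    except ValueError as exc:
        # Log a warning and retry once without the hint.
        logger.warning("language hint %r rejected (%s); retrying with auto-detect", language, exc)
        return transcribe(audio, language=None)
```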

Hotkey notes:

Use one key plus optional modifiers (for example Cmd+m, Super+m, Ctrl+space).
Super and Cmd are equivalent aliases for the same modifier.
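
A hotkey string of this shape could be parsed as shown below. This is a sketch, not Aman's parser: the alias table is an assumption (only Cmd, Super, and Ctrl appear in this README; Alt and Shift are added for illustration), and the return shape is hypothetical.

```python
# Assumed modifier names; Cmd and Super map to the same canonical modifier.
MODIFIER_ALIASES = {
    "cmd": "super",
    "super": "super",
    "ctrl": "ctrl",
    "alt": "alt",
    "shift": "shift",
}


def parse_hotkey(spec):
    """Parse 'Mod+Mod+key' into (sorted modifier tuple, key)."""
    parts = [part.strip().lower() for part in spec.split("+")]
    if any(not part for part in parts):
        raise ValueError(f"hotkey {spec!r}: empty component")
    *modifiers, key = parts
    if key in MODIFIER_ALIASES:
        raise ValueError(f"hotkey {spec!r}: needs one non-modifier key")
    canonical = set()
    for modifier in modifiers:
        if modifier not in MODIFIER_ALIASES:
            raise ValueError(f"hotkey {spec!r}: unknown modifier {modifier!r}")
        canonical.add(MODIFIER_ALIASES[modifier])
    return tuple(sorted(canonical)), key
```

With this normalization, `Cmd+m` and `Super+m` compare equal, which is what "equivalent aliases" means in practice.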

AI cleanup is always enabled and uses the locked local Qwen2.5-1.5B GGUF model downloaded to ~/.cache/aman/models/ during daemon initialization. Prompts are structured with semantic XML tags for both system and user messages to improve instruction adherence and output consistency. Cleanup runs in two local passes:

pass 1 drafts cleaned text and labels ambiguity decisions (correction/literal/spelling/filler)
pass 2 audits those decisions conservatively and emits final cleaned_text

This keeps Aman in dictation mode: it does not execute editing instructions embedded in transcript text.

Before Aman reports ready, local llama runs a tiny warmup completion so the first real transcription is faster. If warmup fails and advanced.strict_startup=true, startup fails fast. With advanced.strict_startup=false, Aman logs a warning and continues.

Model downloads use a network timeout and SHA256 verification before activation. Cached models are checksum-verified on startup; mismatches trigger a forced redownload.
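
The cached-model checksum test can be sketched with the standard library. This is an illustration, not Aman's code: the function names are hypothetical, and where the expected digest comes from is not modeled here.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_model_is_valid(path, expected_sha256):
    """Return True only when the file exists and its digest matches."""
    path = Path(path)
    return path.is_file() and sha256_of(path) == expected_sha256.lower()
```

A `False` result corresponds to the case where the cached file would be discarded and downloaded again.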

Provider policy:

Aman-managed mode (recommended) is the canonical supported UX: Aman handles model lifecycle and safe defaults for you.
Expert mode is opt-in and exposes a custom Whisper model path for advanced users.
Editor model/provider configuration is intentionally not exposed in config.
Custom Whisper paths are only active with models.allow_custom_models=true.

Use -v/--verbose to enable DEBUG logs, including recognized/processed transcript text and llama.cpp logs (llama:: prefix). Without -v, logs are INFO level.

Vocabulary correction:

vocabulary.replacements is deterministic correction (from -> to).
vocabulary.terms is a preferred spelling list used as hinting context.
Wildcards are intentionally rejected (*, ?, [, ], {, }) to avoid ambiguous rules.
Rules are deduplicated case-insensitively; conflicting replacements are rejected.
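
These replacement rules can be sketched as a normalization pass. This is a hedged illustration: the function name is hypothetical, and treating two rules with the same case-folded `from` but different `to` values as a conflict is an assumption about what "conflicting" means.

```python
WILDCARDS = set("*?[]{}")


def normalize_replacements(rules):
    """Validate and deduplicate replacement rules of the form {"from": ..., "to": ...}."""
    seen = {}      # case-folded "from" -> "to"
    result = []
    for rule in rules:
        source = rule["from"].strip()
        target = rule["to"].strip()
        if WILDCARDS & set(source) or WILDCARDS & set(target):
            raise ValueError(f"vocabulary.replacements: wildcard in rule {source!r} -> {target!r}")
        key = source.casefold()
        if key in seen:
            if seen[key] != target:
                raise ValueError(f"vocabulary.replacements: conflicting rules for {source!r}")
            continue  # duplicate of an earlier rule, keep the first
        seen[key] = target
        result.append({"from": source, "to": target})
    return result
```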

STT hinting:

Vocabulary is passed to Whisper as compact hotwords only when that argument is supported by the installed faster-whisper runtime.
Aman enables word_timestamps when supported and runs a conservative alignment heuristic pass (self-correction/restart detection) before the editor stage.

Fact guard:

Aman runs a deterministic fact-preservation verifier after editor output.
If facts are changed/invented and safety.strict=false, Aman falls back to the safer aligned draft.
If facts are changed/invented and safety.strict=true, processing fails and output is not injected.
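
The strict and non-strict branches can be sketched as below. This is a deliberately narrow illustration: it only compares numbers and URLs extracted with a simple regular expression, whereas the real verifier also covers names and IDs. The function names and the use of `ValueError` are assumptions.

```python
import re

# Matches URLs first, then standalone numbers such as 5, 3.14, or 1,000.
FACT_PATTERN = re.compile(r"https?://\S+|\d+(?:[.,]\d+)*")


def extract_facts(text):
    """Return the sorted list of number and URL tokens found in text."""
    return sorted(FACT_PATTERN.findall(text))


def guard(draft, edited, strict):
    """Return the edited text if its facts match the draft, else fall back or fail."""
    if extract_facts(draft) == extract_facts(edited):
        return edited
    if strict:
        raise ValueError("fact guard: edited text changed or invented facts")
    return draft  # non-strict: fall back to the safer draft
```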

systemd user service

make install-service

Service notes:

The supported daily-use path is the user service.
The portable installer writes and enables the user unit automatically.
The local developer unit launched by make install-service still resolves aman from PATH.
Package installs should provide the aman command automatically.
Use aman run --config ~/.config/aman/config.json in the foreground for setup, support, or debugging.
Start recovery with aman doctor, then aman self-check, before inspecting systemctl --user status aman and journalctl --user -u aman -f.
See docs/runtime-recovery.md for the expected diagnostic IDs and next steps.

Usage

Press the hotkey once to start recording.
Press it again to stop and run STT.
Press Esc while recording to cancel without processing.
Esc is only captured during active recording.
Recording start is aborted if the cancel listener cannot be armed.
Transcript contents are logged only when -v/--verbose is used.
Tray menu includes: Settings..., Help, About, Pause/Resume Aman, Reload Config, Run Diagnostics, Open Config Path, and Quit.
If required settings are not saved, Aman enters a Settings Required tray mode and does not capture audio.
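
The hotkey and Esc behavior described above amounts to a small state machine. The sketch below is illustrative only: the state and action names are hypothetical, and it leaves out the transcription stage, the paused mode, and the Settings Required mode.

```python
def next_state(state, event):
    """Return (new_state, action) for a hotkey or Esc event.

    States: "idle", "recording". Events: "hotkey", "esc".
    """
    if state == "idle" and event == "hotkey":
        return "recording", "start_recording"
    if state == "recording" and event == "hotkey":
        return "idle", "stop_and_transcribe"
    if state == "recording" and event == "esc":
        return "idle", "cancel"
    # Esc is only captured during active recording, so nothing happens here.
    return state, None
```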

Wayland note:

Running under Wayland currently exits with a message explaining that it is not supported yet.

Injection backends:

clipboard: copy to clipboard and inject via Ctrl+Shift+V (GTK clipboard + XTest)
injection: type the text with simulated keypresses (XTest)
injection.remove_transcription_from_clipboard: when true and backend is clipboard, restores/clears the clipboard after paste so the transcript is not kept there

Editor stage:

Canonical local llama.cpp editor model (managed by Aman).
Runtime flow is explicit: ASR -> Alignment Heuristics -> Editor -> Fact Guard -> Vocabulary -> Injection.

Build and packaging (maintainers):

make build
make package
make package-portable
make package-deb
make package-arch
make runtime-check
make release-check

make package-portable builds dist/aman-x11-linux-<version>.tar.gz plus its .sha256 file.

make package-deb installs Python dependencies while creating the package. For offline packaging, set AMAN_WHEELHOUSE_DIR to a directory containing the required wheels.

Benchmarking (STT bypass, always dry):

aman bench --text "draft a short email to Marta confirming lunch" --repeat 10 --warmup 2
aman bench --text-file ./bench-input.txt --repeat 20 --json

bench does not capture audio and never injects text to desktop apps. It runs the processing path from input transcript text through alignment/editor/fact-guard/vocabulary cleanup and prints timing summaries.

Model evaluation lab (dataset + matrix sweep):

aman build-heuristic-dataset --input benchmarks/heuristics_dataset.raw.jsonl --output benchmarks/heuristics_dataset.jsonl
aman eval-models --dataset benchmarks/cleanup_dataset.jsonl --matrix benchmarks/model_matrix.small_first.json --heuristic-dataset benchmarks/heuristics_dataset.jsonl --heuristic-weight 0.25 --output benchmarks/results/latest.json
aman sync-default-model --report benchmarks/results/latest.json --artifacts benchmarks/model_artifacts.json --constants src/constants.py

eval-models runs a structured model/parameter sweep over a JSONL dataset and outputs latency and quality metrics (including hybrid score, pass-1/pass-2 latency breakdown, and correction safety metrics for "I mean" and spelling-disambiguation cases). When --heuristic-dataset is provided, the report also includes alignment-heuristic quality metrics (exact match, token-F1, rule precision/recall, per-tag breakdown). sync-default-model promotes the report winner to the managed default model constants using the artifact registry and can be run in --check mode for CI/release gates.

Control:

make run
make run config.example.json
make doctor
make self-check
make runtime-check
make eval-models
make sync-default-model
make check-default-model
make check

Developer setup (optional, uv workflow):

uv sync --extra x11
uv run aman run --config ~/.config/aman/config.json

Developer setup (optional, pip workflow):

make install-local
aman run --config ~/.config/aman/config.json

CLI (support and developer workflows):

aman doctor --config ~/.config/aman/config.json --json
aman self-check --config ~/.config/aman/config.json --json
aman run --config ~/.config/aman/config.json
aman bench --text "example transcript" --repeat 5 --warmup 1
aman build-heuristic-dataset --input benchmarks/heuristics_dataset.raw.jsonl --output benchmarks/heuristics_dataset.jsonl --json
aman eval-models --dataset benchmarks/cleanup_dataset.jsonl --matrix benchmarks/model_matrix.small_first.json --heuristic-dataset benchmarks/heuristics_dataset.jsonl --heuristic-weight 0.25 --json
aman sync-default-model --check --report benchmarks/results/latest.json --artifacts benchmarks/model_artifacts.json --constants src/constants.py
aman version
aman init --config ~/.config/aman/config.json --force
