aman/docs/config-reference.md
Thales Maciel 359b5fbaf4 Land milestone 4 first-run docs and media
Make the X11 user path visible on first contact instead of burying it under config and maintainer detail.

Rewrite the README around the supported quickstart, expected tray and dictation result, install validation, troubleshooting, and linked follow-on docs. Split deep config and developer material into separate docs, add checked-in screenshots plus a short WebM walkthrough, and add a generator so the media assets stay reproducible.

Also fix the CLI discovery gap by letting `aman --help` show the top-level command surface while keeping implicit foreground `run` behavior, and align the settings, help, and about copy with the supported service-plus-diagnostics model.

Validation: `PYTHONPATH=src python3 -m unittest tests.test_aman_cli tests.test_config_ui`; `PYTHONPATH=src python3 -m unittest discover -s tests -p 'test_*.py'`; `python3 -m py_compile src/*.py tests/*.py scripts/generate_docs_media.py`; `PYTHONPATH=src python3 -m aman --help`.

Milestone 4 stays open in the roadmap because `docs/x11-ga/first-run-review-notes.md` still needs a real non-implementer walkthrough.
2026-03-12 18:30:34 -03:00


# Config Reference
Use this document when you need the full Aman config shape and the advanced
behavior notes that are intentionally kept out of the first-run README path.
## Example config
```json
{
  "config_version": 1,
  "daemon": { "hotkey": "Cmd+m" },
  "recording": { "input": "0" },
  "stt": {
    "provider": "local_whisper",
    "model": "base",
    "device": "cpu",
    "language": "auto"
  },
  "models": {
    "allow_custom_models": false,
    "whisper_model_path": ""
  },
  "injection": {
    "backend": "clipboard",
    "remove_transcription_from_clipboard": false
  },
  "safety": {
    "enabled": true,
    "strict": false
  },
  "ux": {
    "profile": "default",
    "show_notifications": true
  },
  "advanced": {
    "strict_startup": true
  },
  "vocabulary": {
    "replacements": [
      { "from": "Martha", "to": "Marta" },
      { "from": "docker", "to": "Docker" }
    ],
    "terms": ["Systemd", "Kubernetes"]
  }
}
```
`config_version` is required and currently must be `1`. Legacy unversioned
configs are migrated automatically on load.
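
The version rule above can be pictured with a minimal sketch. The function name `load_versioned` and the error wording are hypothetical, and the real migration may rewrite more than the version field; this only illustrates "stamp legacy configs, reject unknown versions".

```python
from typing import Any

CURRENT_CONFIG_VERSION = 1  # the only value the document lists as valid


def load_versioned(raw: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of raw that carries a supported config_version.

    Hypothetical helper: a config without the field is treated as legacy
    and stamped with the current version; any other value is rejected.
    """
    cfg = dict(raw)
    if "config_version" not in cfg:
        # Legacy unversioned config: migrate by stamping the current version.
        cfg["config_version"] = CURRENT_CONFIG_VERSION
        return cfg
    version = cfg["config_version"]
    if type(version) is not int or version != CURRENT_CONFIG_VERSION:
        raise ValueError(
            f"config_version: unsupported value {version!r}; "
            f"expected {CURRENT_CONFIG_VERSION}"
        )
    return cfg
```

Returning a copy keeps the caller's raw data untouched, which makes it easy to compare the file on disk with the migrated result.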
## Recording and validation
- `recording.input` can be a device index (preferred) or a substring of the
device name.
- If `recording.input` is explicitly set and cannot be resolved, startup fails
instead of falling back to a default device.
- Config validation is strict: unknown fields are rejected with a startup
error.
- Validation errors include the exact field and an example fix snippet.
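
As an illustration of the rules above, the sketch below resolves `recording.input` and rejects unknown fields. Both helper names are hypothetical, case-insensitive substring matching is an assumption, and the device list is passed in as plain strings rather than queried from an audio backend.

```python
def resolve_input(spec: str, device_names: list[str]) -> int:
    """Resolve an explicit recording.input value to a device index.

    Hypothetical helper: raises instead of falling back to a default device.
    """
    spec = spec.strip()
    if not spec:
        raise ValueError("recording.input: value is empty")
    if spec.isdigit():
        index = int(spec)
        if index < len(device_names):
            return index
        raise ValueError(f"recording.input: no device at index {index}")
    needle = spec.lower()  # assumption: matching ignores case
    for index, name in enumerate(device_names):
        if needle in name.lower():
            return index
    raise ValueError(f"recording.input: no device name contains {spec!r}")


def reject_unknown_fields(section: str, data: dict, allowed: set[str]) -> None:
    """Raise on the first unknown field, naming the exact config path."""
    unknown = sorted(set(data) - allowed)
    if unknown:
        raise ValueError(f"{section}.{unknown[0]}: unknown field")
```

For example, `resolve_input("usb", ["Built-in Audio", "USB Microphone"])` picks index 1, while a misspelled key such as `recording.inptu` is reported by path instead of being ignored.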
## Profiles and runtime behavior
- `ux.profile=default`: baseline cleanup behavior.
- `ux.profile=fast`: lower-latency AI generation settings.
- `ux.profile=polished`: same cleanup depth as default.
- `safety.enabled=true`: enables fact-preservation checks
(names/numbers/IDs/URLs).
- `safety.strict=false`: fall back to the safer aligned draft when fact checks
fail.
- `safety.strict=true`: reject output when fact checks fail.
- `advanced.strict_startup=true`: keep fail-fast startup validation behavior.

Transcription language:
- `stt.language=auto` enables Whisper auto-detection.
- You can pin the language with Whisper codes such as `en`, `es`, `pt`, `ja`,
or `zh`, or with common names such as `English` / `Spanish`.
- If a pinned language hint is rejected by the runtime, Aman logs a warning and
retries with auto-detect.
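
The language handling above could look roughly like the following sketch. The name table is a small illustrative subset, `transcribe` is a stand-in callable rather than a specific runtime API, and treating a rejected hint as a `ValueError` is an assumption.

```python
import logging
from typing import Callable, Optional

# Illustrative subset only; the real list of accepted names is not shown here.
LANGUAGE_NAMES = {
    "english": "en",
    "spanish": "es",
    "portuguese": "pt",
    "japanese": "ja",
    "chinese": "zh",
}
KNOWN_CODES = set(LANGUAGE_NAMES.values())


def normalize_language(value: str) -> Optional[str]:
    """Map an stt.language value to a Whisper code; None means auto-detect."""
    key = value.strip().lower()
    if key == "auto":
        return None
    if key in KNOWN_CODES:
        return key
    if key in LANGUAGE_NAMES:
        return LANGUAGE_NAMES[key]
    raise ValueError(f"stt.language: unknown language {value!r}")


def transcribe_with_fallback(
    transcribe: Callable[..., str], audio: bytes, language: Optional[str]
) -> str:
    """Try the pinned language first, then retry once with auto-detect."""
    try:
        return transcribe(audio, language=language)
    except ValueError:  # assumption: the runtime rejects bad hints this way
        if language is None:
            raise
        logging.warning("language hint %r rejected; retrying with auto-detect", language)
        return transcribe(audio, language=None)
```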

Hotkey notes:
- Use one key plus optional modifiers, for example `Cmd+m`, `Super+m`, or
`Ctrl+space`.
- `Super` and `Cmd` are equivalent aliases for the same modifier.
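
To make the hotkey shape concrete, here is a minimal parsing sketch. `parse_hotkey` is a hypothetical name, and the `Alt` and `Shift` entries are assumptions added for illustration; only `Cmd`, `Super`, and `Ctrl` appear in the notes above.

```python
# Canonical modifier names; "cmd" and "super" map to the same modifier.
MODIFIER_ALIASES = {
    "cmd": "super",
    "super": "super",
    "ctrl": "ctrl",
    "alt": "alt",      # assumption: not named in this document
    "shift": "shift",  # assumption: not named in this document
}


def parse_hotkey(spec: str) -> tuple[frozenset[str], str]:
    """Split 'Mod+Mod+key' into (modifiers, key); exactly one key is required."""
    parts = [part.strip().lower() for part in spec.split("+")]
    if any(not part for part in parts):
        raise ValueError(f"daemon.hotkey: malformed value {spec!r}")
    *modifier_names, key = parts
    modifiers = set()
    for name in modifier_names:
        if name not in MODIFIER_ALIASES:
            raise ValueError(f"daemon.hotkey: unknown modifier {name!r}")
        modifiers.add(MODIFIER_ALIASES[name])
    if key in MODIFIER_ALIASES:
        raise ValueError("daemon.hotkey: needs one non-modifier key")
    return frozenset(modifiers), key
```

Because aliases are normalized before comparison, `Cmd+m` and `Super+m` parse to the same binding.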
## Managed versus expert mode
- `Aman-managed` mode is the canonical supported UX: Aman handles model
lifecycle and safe defaults for you.
- `Expert mode` is opt-in and exposes a custom Whisper model path for advanced
users.
- Editor model/provider configuration is intentionally not exposed in config.
- Custom Whisper paths are only active with
`models.allow_custom_models=true`.
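
The gating rule for custom Whisper paths can be sketched as below. The helper name is hypothetical, and returning `None` to mean "use the Aman-managed model" is an assumption made for the example.

```python
from typing import Optional


def effective_whisper_path(models: dict) -> Optional[str]:
    """Return the custom Whisper path only when expert mode allows it."""
    path = str(models.get("whisper_model_path", "")).strip()
    if models.get("allow_custom_models", False) and path:
        return path
    # Managed mode, or no path set: the custom path is ignored.
    return None
```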

Compatibility note:
- `ux.show_notifications` remains in the config schema for compatibility, but
it is not part of the current supported first-run X11 surface and is not
exposed in the settings window.
## Cleanup and model lifecycle
AI cleanup is always enabled and uses the locked local
`Qwen2.5-1.5B-Instruct-Q4_K_M.gguf` model downloaded to
`~/.cache/aman/models/` during daemon initialization.
- Prompts use semantic XML tags for both system and user messages.
- Cleanup runs in two local passes:
  - pass 1 drafts cleaned text and labels ambiguity decisions
    (correction/literal/spelling/filler)
  - pass 2 audits those decisions conservatively and emits final
    `cleaned_text`
- Aman stays in dictation mode: it does not execute editing instructions
embedded in transcript text.
- Before Aman reports `ready`, the local editor runs a tiny warmup completion
so the first real transcription is faster.
- If warmup fails and `advanced.strict_startup=true`, startup fails fast.
- With `advanced.strict_startup=false`, Aman logs a warning and continues.
- Model downloads use a network timeout and SHA256 verification before
activation.
- Cached models are checksum-verified on startup; mismatches trigger a forced
redownload.
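
The checksum step above can be illustrated with the standard library. This is a sketch, not Aman's implementation: the function names are hypothetical and the expected digest is supplied by the caller.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_model_is_valid(path: Path, expected_sha256: str) -> bool:
    """True when the cached file exists and matches the expected digest."""
    return path.is_file() and sha256_of(path) == expected_sha256.lower()
```

A caller would treat a `False` result as the signal to discard the cached file and download it again.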
## Verbose logging and vocabulary
- `-v/--verbose` enables DEBUG logs, including recognized/processed transcript
text and `llama::` logs.
- Without `-v`, logs stay at INFO level.

Vocabulary correction:
- `vocabulary.replacements` is deterministic correction (`from -> to`).
- `vocabulary.terms` is a preferred spelling list used as hinting context.
- Wildcards are intentionally rejected (`*`, `?`, `[`, `]`, `{`, `}`) to avoid
ambiguous rules.
- Rules are deduplicated case-insensitively; conflicting replacements are
rejected.
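
The replacement rules above might be validated along these lines. `validate_replacements` is a hypothetical name, and comparing the `to` side case-sensitively when detecting conflicts is an assumption.

```python
WILDCARDS = set("*?[]{}")


def validate_replacements(rules: list[dict[str, str]]) -> dict[str, str]:
    """Return {lowercased 'from': 'to'}, rejecting wildcards and conflicts."""
    seen: dict[str, str] = {}
    for index, rule in enumerate(rules):
        source, target = rule["from"].strip(), rule["to"].strip()
        if WILDCARDS & set(source) or WILDCARDS & set(target):
            raise ValueError(
                f"vocabulary.replacements[{index}]: wildcards are not allowed"
            )
        key = source.lower()  # deduplicate case-insensitively
        if key in seen and seen[key] != target:
            raise ValueError(
                f"vocabulary.replacements[{index}]: conflicting rule for {source!r}"
            )
        seen[key] = target
    return seen
```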

STT hinting:
- Vocabulary is passed to Whisper as compact `hotwords` only when that argument
is supported by the installed `faster-whisper` runtime.
- Aman enables `word_timestamps` when supported and runs a conservative
alignment heuristic pass before the editor stage.
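
One way to pass optional arguments "only when supported" is to inspect the callable's signature. The document does not say how Aman detects support, so treat this as an assumed technique; `supported_kwargs` is a hypothetical helper.

```python
import inspect
from typing import Any, Callable


def supported_kwargs(func: Callable[..., Any], candidates: dict[str, Any]) -> dict[str, Any]:
    """Keep only the keyword arguments that func names in its signature."""
    params = inspect.signature(func).parameters
    return {name: value for name, value in candidates.items() if name in params}


# Stand-ins for an older and a newer transcribe() signature.
def old_transcribe(audio, language=None):
    return "old"


def new_transcribe(audio, language=None, hotwords=None, word_timestamps=False):
    return "new"


optional = {"hotwords": "Systemd Kubernetes", "word_timestamps": True}
kwargs_for_old = supported_kwargs(old_transcribe, optional)
kwargs_for_new = supported_kwargs(new_transcribe, optional)
```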

Fact guard:
- Aman runs a deterministic fact-preservation verifier after editor output.
- If facts are changed or invented and `safety.strict=false`, Aman falls back
to the safer aligned draft.
- If facts are changed or invented and `safety.strict=true`, processing fails
and output is not injected.
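
A simplified picture of the fact guard is sketched below. It only compares numbers and URLs found by a regular expression; names and IDs are out of scope for the example, and `FactGuardError`, `extract_facts`, and `apply_fact_guard` are hypothetical names.

```python
import re

# Simplified fact pattern: URLs first, then numbers such as 42 or 3.14.
FACT_PATTERN = re.compile(r"https?://\S+|\d+(?:[.,]\d+)*")


class FactGuardError(Exception):
    """Raised in strict mode when the editor changed or invented facts."""


def extract_facts(text: str) -> set[str]:
    """Collect the number and URL tokens that must survive editing."""
    return set(FACT_PATTERN.findall(text))


def apply_fact_guard(aligned_draft: str, edited: str, strict: bool) -> str:
    """Return the text to inject, or raise when strict mode rejects it."""
    if extract_facts(edited) == extract_facts(aligned_draft):
        return edited
    if strict:
        raise FactGuardError("facts changed or invented; output not injected")
    return aligned_draft  # non-strict: fall back to the safer draft
```

With `strict=False`, an edit that turns "ticket 42" into "ticket 43" is discarded in favor of the aligned draft; with `strict=True` the same edit raises and nothing is injected.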