Land milestone 4 first-run docs and media

Make the X11 user path visible on first contact instead of burying it under
config and maintainer detail. Rewrite the README around the supported
quickstart, expected tray and dictation result, install validation,
troubleshooting, and linked follow-on docs.

Split deep config and developer material into separate docs, add checked-in
screenshots plus a short WebM walkthrough, and add a generator so the media
assets stay reproducible.

Also fix the CLI discovery gap by letting `aman --help` show the top-level
command surface while keeping implicit foreground `run` behavior, and align the
settings, help, and about copy with the supported service-plus-diagnostics
model.

Validation:
- `PYTHONPATH=src python3 -m unittest tests.test_aman_cli tests.test_config_ui`
- `PYTHONPATH=src python3 -m unittest discover -s tests -p 'test_*.py'`
- `python3 -m py_compile src/*.py tests/*.py scripts/generate_docs_media.py`
- `PYTHONPATH=src python3 -m aman --help`

Milestone 4 stays open in the roadmap because
`docs/x11-ga/first-run-review-notes.md` still needs a real non-implementer
walkthrough.
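
Illustration (not part of this commit): the `aman --help` change described
above could be approached roughly as follows. This is a minimal sketch using
only the standard `argparse` module. `COMMANDS`, `build_parser`, and `parse`
are hypothetical names, the command list is a subset of the commands named in
these docs, and the real CLI code in `src/` may be structured differently.

```python
import argparse

# Hypothetical subset of commands; the real CLI may expose more.
COMMANDS = ("run", "doctor", "self-check", "bench", "version", "init")


def build_parser():
    """Build a top-level parser whose help lists every subcommand."""
    parser = argparse.ArgumentParser(prog="aman")
    sub = parser.add_subparsers(dest="command")
    for name in COMMANDS:
        cmd = sub.add_parser(name, help=f"{name} command")
        if name == "run":
            # Options assumed for illustration only.
            cmd.add_argument("--config")
            cmd.add_argument("-v", "--verbose", action="store_true")
    return parser


def parse(argv):
    """Show top-level help on request; otherwise default to `run`."""
    parser = build_parser()
    if argv and argv[0] in ("-h", "--help"):
        return parser.parse_args(argv)  # prints top-level help and exits
    if not argv or argv[0] not in COMMANDS:
        argv = ["run", *argv]  # implicit foreground run
    return parser.parse_args(argv)
```

With this shape, `parse([])` and `parse(["--config", "x.json"])` both resolve
to `run`, while `parse(["--help"])` prints the full command list instead of
the `run` help.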
2026-03-12 18:30:34 -03:00 · 2026-03-12 18:30:34 -03:00 · 359b5fbaf4
commit 359b5fbaf4
parent ed1b59240b
16 changed files with 788 additions and 411 deletions
--- a/docs/config-reference.md
+++ b/docs/config-reference.md
@@ -0,0 +1,154 @@
+# Config Reference
+
+Use this document when you need the full Aman config shape and the advanced
+behavior notes that are intentionally kept out of the first-run README path.
+
+## Example config
+
+```json
+{
+  "config_version": 1,
+  "daemon": { "hotkey": "Cmd+m" },
+  "recording": { "input": "0" },
+  "stt": {
+    "provider": "local_whisper",
+    "model": "base",
+    "device": "cpu",
+    "language": "auto"
+  },
+  "models": {
+    "allow_custom_models": false,
+    "whisper_model_path": ""
+  },
+  "injection": {
+    "backend": "clipboard",
+    "remove_transcription_from_clipboard": false
+  },
+  "safety": {
+    "enabled": true,
+    "strict": false
+  },
+  "ux": {
+    "profile": "default",
+    "show_notifications": true
+  },
+  "advanced": {
+    "strict_startup": true
+  },
+  "vocabulary": {
+    "replacements": [
+      { "from": "Martha", "to": "Marta" },
+      { "from": "docker", "to": "Docker" }
+    ],
+    "terms": ["Systemd", "Kubernetes"]
+  }
+}
+```
+
+`config_version` is required and currently must be `1`. Legacy unversioned
+configs are migrated automatically on load.
+
+## Recording and validation
+
+- `recording.input` can be a device index (preferred) or a substring of the
+  device name.
+- If `recording.input` is explicitly set and cannot be resolved, startup fails
+  instead of falling back to a default device.
+- Config validation is strict: unknown fields are rejected with a startup
+  error.
+- Validation errors include the exact field and an example fix snippet.
+
+## Profiles and runtime behavior
+
+- `ux.profile=default`: baseline cleanup behavior.
+- `ux.profile=fast`: lower-latency AI generation settings.
+- `ux.profile=polished`: same cleanup depth as default.
+- `safety.enabled=true`: enables fact-preservation checks
+  (names/numbers/IDs/URLs).
+- `safety.strict=false`: fall back to the safer aligned draft when fact checks
+  fail.
+- `safety.strict=true`: reject output when fact checks fail.
+- `advanced.strict_startup=true`: keep fail-fast startup validation behavior.
+
+Transcription language:
+
+- `stt.language=auto` enables Whisper auto-detection.
+- You can pin language with Whisper codes such as `en`, `es`, `pt`, `ja`, or
+  `zh`, or common names such as `English` / `Spanish`.
+- If a pinned language hint is rejected by the runtime, Aman logs a warning and
+  retries with auto-detect.
+
+Hotkey notes:
+
+- Use one key plus optional modifiers, for example `Cmd+m`, `Super+m`, or
+  `Ctrl+space`.
+- `Super` and `Cmd` are equivalent aliases for the same modifier.
+
+## Managed versus expert mode
+
+- `Aman-managed` mode is the canonical supported UX: Aman handles model
+  lifecycle and safe defaults for you.
+- `Expert mode` is opt-in and exposes a custom Whisper model path for advanced
+  users.
+- Editor model/provider configuration is intentionally not exposed in config.
+- Custom Whisper paths are only active with
+  `models.allow_custom_models=true`.
+
+Compatibility note:
+
+- `ux.show_notifications` remains in the config schema for compatibility, but
+  it is not part of the current supported first-run X11 surface and is not
+  exposed in the settings window.
+
+## Cleanup and model lifecycle
+
+AI cleanup is always enabled and uses the locked local
+`Qwen2.5-1.5B-Instruct-Q4_K_M.gguf` model downloaded to
+`~/.cache/aman/models/` during daemon initialization.
+
+- Prompts use semantic XML tags for both system and user messages.
+- Cleanup runs in two local passes:
+  - pass 1 drafts cleaned text and labels ambiguity decisions
+    (correction/literal/spelling/filler)
+  - pass 2 audits those decisions conservatively and emits final
+    `cleaned_text`
+- Aman stays in dictation mode: it does not execute editing instructions
+  embedded in transcript text.
+- Before Aman reports `ready`, the local editor runs a tiny warmup completion
+  so the first real transcription is faster.
+- If warmup fails and `advanced.strict_startup=true`, startup fails fast.
+- With `advanced.strict_startup=false`, Aman logs a warning and continues.
+- Model downloads use a network timeout and SHA256 verification before
+  activation.
+- Cached models are checksum-verified on startup; mismatches trigger a forced
+  redownload.
+
+## Verbose logging and vocabulary
+
+- `-v/--verbose` enables DEBUG logs, including recognized/processed transcript
+  text and `llama::` logs.
+- Without `-v`, logs stay at INFO level.
+
+Vocabulary correction:
+
+- `vocabulary.replacements` is deterministic correction (`from -> to`).
+- `vocabulary.terms` is a preferred spelling list used as hinting context.
+- Wildcards are intentionally rejected (`*`, `?`, `[`, `]`, `{`, `}`) to avoid
+  ambiguous rules.
+- Rules are deduplicated case-insensitively; conflicting replacements are
+  rejected.
+
+STT hinting:
+
+- Vocabulary is passed to Whisper as compact `hotwords` only when that argument
+  is supported by the installed `faster-whisper` runtime.
+- Aman enables `word_timestamps` when supported and runs a conservative
+  alignment heuristic pass before the editor stage.
+
+Fact guard:
+
+- Aman runs a deterministic fact-preservation verifier after editor output.
+- If facts are changed or invented and `safety.strict=false`, Aman falls back
+  to the safer aligned draft.
+- If facts are changed or invented and `safety.strict=true`, processing fails
+  and output is not injected.
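
Note on `docs/config-reference.md` (illustrative, not part of this commit):
the vocabulary rules in the section above (wildcards rejected,
case-insensitive deduplication, conflicting replacements rejected) can be
sketched as below. `validate_replacements` and `WILDCARD_CHARS` are
hypothetical names, and two details are assumptions rather than documented
behavior: the wildcard check covers only the `from` side, and a conflict means
the same `from` (ignoring case) mapped to a different `to`.

```python
# Hypothetical sketch; names and error messages are not taken from Aman.
WILDCARD_CHARS = frozenset("*?[]{}")


def validate_replacements(rules):
    """Return unique (from, to) pairs, or raise ValueError on a bad rule."""
    seen = {}  # casefolded "from" -> (from, to) as first written
    for rule in rules:
        source, target = rule["from"].strip(), rule["to"].strip()
        if not source or not target:
            raise ValueError("replacement needs non-empty 'from' and 'to'")
        if WILDCARD_CHARS.intersection(source):
            raise ValueError(f"wildcards are not allowed: {source!r}")
        key = source.casefold()
        # Assumption: same source with a different target is a conflict.
        if key in seen and seen[key][1] != target:
            raise ValueError(f"conflicting replacements for {source!r}")
        seen.setdefault(key, (source, target))  # keep the first spelling
    return list(seen.values())
```

Keeping the first spelling of a duplicated rule makes the result independent
of how many times the same rule is repeated in the config.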
--- a/docs/developer-workflows.md
+++ b/docs/developer-workflows.md
@@ -0,0 +1,94 @@
+# Developer And Maintainer Workflows
+
+This document keeps build, packaging, development, and benchmarking material
+out of the first-run README path.
+
+## Build and packaging
+
+```bash
+make build
+make package
+make package-portable
+make package-deb
+make package-arch
+make runtime-check
+make release-check
+```
+
+- `make package-portable` builds `dist/aman-x11-linux-<version>.tar.gz` plus
+  its `.sha256` file.
+- `make package-deb` installs Python dependencies while creating the package.
+- For offline Debian packaging, set `AMAN_WHEELHOUSE_DIR` to a directory
+  containing the required wheels.
+
+## Developer setup
+
+`uv` workflow:
+
+```bash
+uv sync --extra x11
+uv run aman run --config ~/.config/aman/config.json
+```
+
+`pip` workflow:
+
+```bash
+make install-local
+aman run --config ~/.config/aman/config.json
+```
+
+## Support and control commands
+
+```bash
+make run
+make run config.example.json
+make doctor
+make self-check
+make runtime-check
+make eval-models
+make sync-default-model
+make check-default-model
+make check
+```
+
+CLI examples:
+
+```bash
+aman doctor --config ~/.config/aman/config.json --json
+aman self-check --config ~/.config/aman/config.json --json
+aman run --config ~/.config/aman/config.json
+aman bench --text "example transcript" --repeat 5 --warmup 1
+aman build-heuristic-dataset --input benchmarks/heuristics_dataset.raw.jsonl --output benchmarks/heuristics_dataset.jsonl --json
+aman eval-models --dataset benchmarks/cleanup_dataset.jsonl --matrix benchmarks/model_matrix.small_first.json --heuristic-dataset benchmarks/heuristics_dataset.jsonl --heuristic-weight 0.25 --json
+aman sync-default-model --check --report benchmarks/results/latest.json --artifacts benchmarks/model_artifacts.json --constants src/constants.py
+aman version
+aman init --config ~/.config/aman/config.json --force
+```
+
+## Benchmarking
+
+```bash
+aman bench --text "draft a short email to Marta confirming lunch" --repeat 10 --warmup 2
+aman bench --text-file ./bench-input.txt --repeat 20 --json
+```
+
+`bench` does not capture audio and never injects text into desktop apps. It runs
+the processing path from input transcript text through
+alignment/editor/fact-guard/vocabulary cleanup and prints timing summaries.
+
+## Model evaluation
+
+```bash
+aman build-heuristic-dataset --input benchmarks/heuristics_dataset.raw.jsonl --output benchmarks/heuristics_dataset.jsonl
+aman eval-models --dataset benchmarks/cleanup_dataset.jsonl --matrix benchmarks/model_matrix.small_first.json --heuristic-dataset benchmarks/heuristics_dataset.jsonl --heuristic-weight 0.25 --output benchmarks/results/latest.json
+aman sync-default-model --report benchmarks/results/latest.json --artifacts benchmarks/model_artifacts.json --constants src/constants.py
+```
+
+- `eval-models` runs a structured model/parameter sweep over a JSONL dataset
+  and outputs latency plus quality metrics.
+- When `--heuristic-dataset` is provided, the report also includes
+  alignment-heuristic quality metrics.
+- `sync-default-model` promotes the report winner to the managed default model
+  constants and can be run in `--check` mode for CI and release gates.
+
+Dataset and artifact details live in [`benchmarks/README.md`](../benchmarks/README.md).
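
Note on `docs/developer-workflows.md` (illustrative, not part of this commit):
the `--repeat` and `--warmup` options of `aman bench` described above suggest
a timing loop of roughly the following shape. This is a generic sketch, not
Aman's implementation. The `bench` function, its `process` callable, and the
summary keys are hypothetical.

```python
import statistics
import time


def bench(process, text, repeat=5, warmup=1):
    """Time process(text); warmup runs are executed but not measured."""
    if repeat < 1:
        raise ValueError("repeat must be at least 1")
    for _ in range(warmup):
        process(text)  # discard warmup timings
    samples_ms = []
    for _ in range(repeat):
        start = time.perf_counter()
        process(text)
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "runs": repeat,
        "min_ms": min(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "mean_ms": statistics.fmean(samples_ms),
        "max_ms": max(samples_ms),
    }
```

Calling `bench(str.upper, "example transcript", repeat=5, warmup=1)` invokes
the callable six times and summarizes only the last five.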
--- a/docs/media/first-run-demo.webm
+++ b/docs/media/first-run-demo.webm
--- a/docs/media/settings-window.png
+++ b/docs/media/settings-window.png
--- a/docs/media/tray-menu.png
+++ b/docs/media/tray-menu.png
--- a/docs/portable-install.md
+++ b/docs/portable-install.md
@@ -2,6 +2,9 @@

 This is the canonical end-user install path for Aman on X11.

+For the shortest first-run path, screenshots, and the expected tray/dictation
+result, start with the quickstart in [`README.md`](../README.md).
+
 ## Supported environment

 - X11 desktop session
--- a/docs/release-checklist.md
+++ b/docs/release-checklist.md
@@ -39,7 +39,12 @@ GA signoff bar. The GA signoff sections are required for `v1.0.0` and later.
   - `make runtime-check` passes.
   - [`docs/runtime-recovery.md`](./runtime-recovery.md) matches the shipped diagnostic IDs and next-step wording.
   - [`docs/x11-ga/runtime-validation-report.md`](./x11-ga/runtime-validation-report.md) contains current automated evidence and release-specific manual validation entries.
-12. GA validation signoff (`v1.0.0` and later):
+12. GA first-run UX signoff (`v1.0.0` and later):
+   - `README.md` leads with the supported first-run path and expected visible result.
+   - `docs/media/settings-window.png`, `docs/media/tray-menu.png`, and `docs/media/first-run-demo.webm` are current and linked from the README.
+   - [`docs/x11-ga/first-run-review-notes.md`](./x11-ga/first-run-review-notes.md) contains a non-implementer walkthrough and the questions it surfaced.
+   - `aman --help` exposes the main command surface directly.
+13. GA validation signoff (`v1.0.0` and later):
   - Validation evidence exists for Debian/Ubuntu, Arch, Fedora, and openSUSE.
   - The portable installer, upgrade path, and uninstall path are validated.
   - End-user docs and release notes match the shipped artifact set.
--- a/docs/runtime-recovery.md
+++ b/docs/runtime-recovery.md
@@ -2,6 +2,23 @@

 Use this guide when Aman is installed but not behaving correctly.

+## First-run troubleshooting
+
+- Settings window did not appear:
+  run `aman run --config ~/.config/aman/config.json` once in the foreground so
+  you can complete first-run setup.
+- No tray icon after saving settings:
+  run `aman self-check --config ~/.config/aman/config.json` and confirm the
+  user service is enabled and active.
+- Hotkey does not start recording:
+  run `aman doctor --config ~/.config/aman/config.json`, then choose a
+  different hotkey in Settings if `hotkey.parse` is not `ok`.
+- Microphone test failed:
+  re-open Settings, choose another input device, then rerun `aman doctor`.
+- Text was transcribed but not injected:
+  run `aman doctor`, then rerun `aman run --config ~/.config/aman/config.json --verbose`
+  to inspect the output backend in the foreground.
+
 ## Command roles

 - `aman doctor --config ~/.config/aman/config.json` is the fast, read-only preflight for config, X11 session, audio runtime, input device resolution, hotkey availability, injection backend selection, and service prerequisites.
--- a/docs/x11-ga/04-first-run-ux-and-support-docs.md
+++ b/docs/x11-ga/04-first-run-ux-and-support-docs.md
@@ -22,7 +22,7 @@ Even if install and runtime reliability are strong, Aman will not feel GA until
  - first launch
  - choosing a microphone
  - triggering the first dictation
-  - expected tray or notification behavior
+  - expected tray behavior
  - expected injected text result
 - Add a "validate your install" flow using `aman doctor` and `aman self-check`.
 - Add screenshots for the settings window and tray menu.
@@ -63,6 +63,6 @@ Even if install and runtime reliability are strong, Aman will not feel GA until
 ## Evidence required to close

 - Updated README and linked support docs.
-- Screenshots and demo artifact checked into the release or docs surface.
+- Screenshots and demo artifact checked into the docs surface.
 - A reviewer walk-through from someone who did not implement the docs rewrite.
 - A short list of first-run questions found during review and how the docs resolved them.
--- a/docs/x11-ga/README.md
+++ b/docs/x11-ga/README.md
@@ -10,7 +10,7 @@ The current gaps are:
 - The X11 support contract and service-versus-foreground split are now documented, but the public release surface still needs the remaining trust and support work from milestones 4 and 5.
 - Validation matrices now exist for portable lifecycle and runtime reliability, but they are not yet filled with release-specific manual evidence across Debian/Ubuntu, Arch, Fedora, and openSUSE.
 - Incomplete trust surface. The project still needs a real license file, real maintainer/contact metadata, real project URLs, published release artifacts, and public checksums.
-- Incomplete first-run story. The product describes a settings window and tray workflow, but there is no short happy path, no expected-result walkthrough, and no visual proof that the experience is real.
+- The first-run docs and media have landed, but milestone 4 still needs a non-implementer walkthrough before the project can claim that the public docs are actually enough.
 - Diagnostics are now the canonical recovery path, but milestone 3 still needs release-specific X11 evidence for restart, offline-start, tray diagnostics, and recovery scenarios.
 - The release checklist now includes GA signoff gates, but the project is still short of the broader legal, release-publication, and validation evidence needed for a credible public 1.0 release.

@@ -100,7 +100,12 @@ Any future docs, tray copy, and release notes should point users to this same se
  [`runtime-validation-report.md`](./runtime-validation-report.md) are filled
  with real X11 validation evidence.
 - [ ] [Milestone 4: First-Run UX and Support Docs](./04-first-run-ux-and-support-docs.md)
-  Turn the product from "documented by the author" into "understandable by a new user."
+  Implementation landed on 2026-03-12: the README is now end-user-first,
+  first-run assets live under `docs/media/`, deep config and maintainer content
+  moved into linked docs, and `aman --help` exposes the top-level commands
+  directly. Leave this milestone open until
+  [`first-run-review-notes.md`](./first-run-review-notes.md) contains a real
+  non-implementer walkthrough.
 - [ ] [Milestone 5: GA Candidate Validation and Release](./05-ga-candidate-validation-and-release.md)
  Close the remaining trust, legal, release, and validation work for a public 1.0 launch.

--- a/docs/x11-ga/first-run-review-notes.md
+++ b/docs/x11-ga/first-run-review-notes.md
@@ -0,0 +1,24 @@
+# First-Run Review Notes
+
+Use this file to capture the non-implementer walkthrough required to close
+milestone 4.
+
+## Review template
+
+- Reviewer:
+- Date:
+- Environment:
+- Entry point used:
+- Did the reviewer use only the public docs? yes / no
+
+## First-run questions or confusions
+
+- Question:
+  - Where it appeared:
+  - How the docs or product resolved it:
+
+## Remaining gaps
+
+- Gap:
+  - Severity:
+  - Suggested follow-up: