Thales Maciel 359b5fbaf4 Land milestone 4 first-run docs and media

Make the X11 user path visible on first contact instead of burying it under config and maintainer detail.

Rewrite the README around the supported quickstart, expected tray and dictation result, install validation, troubleshooting, and linked follow-on docs. Split deep config and developer material into separate docs, add checked-in screenshots plus a short WebM walkthrough, and add a generator so the media assets stay reproducible.

Also fix the CLI discovery gap by letting `aman --help` show the top-level command surface while keeping implicit foreground `run` behavior, and align the settings, help, and about copy with the supported service-plus-diagnostics model.

Validation: `PYTHONPATH=src python3 -m unittest tests.test_aman_cli tests.test_config_ui`; `PYTHONPATH=src python3 -m unittest discover -s tests -p 'test_*.py'`; `python3 -m py_compile src/*.py tests/*.py scripts/generate_docs_media.py`; `PYTHONPATH=src python3 -m aman --help`.

Milestone 4 stays open in the roadmap because `docs/x11-ga/first-run-review-notes.md` still needs a real non-implementer walkthrough.

2026-03-12 18:30:34 -03:00

3.1 KiB

Raw Blame History

Developer And Maintainer Workflows

This document keeps build, packaging, development, and benchmarking material out of the first-run README path.

Build and packaging

make build
make package
make package-portable
make package-deb
make package-arch
make runtime-check
make release-check

make package-portable builds dist/aman-x11-linux-<version>.tar.gz plus its .sha256 file.
make package-deb installs Python dependencies while creating the package.
For offline Debian packaging, set AMAN_WHEELHOUSE_DIR to a directory containing the required wheels.

Developer setup

uv workflow:

uv sync --extra x11
uv run aman run --config ~/.config/aman/config.json

pip workflow:

make install-local
aman run --config ~/.config/aman/config.json

Support and control commands

make run
make run config.example.json
make doctor
make self-check
make runtime-check
make eval-models
make sync-default-model
make check-default-model
make check

CLI examples:

aman doctor --config ~/.config/aman/config.json --json
aman self-check --config ~/.config/aman/config.json --json
aman run --config ~/.config/aman/config.json
aman bench --text "example transcript" --repeat 5 --warmup 1
aman build-heuristic-dataset --input benchmarks/heuristics_dataset.raw.jsonl --output benchmarks/heuristics_dataset.jsonl --json
aman eval-models --dataset benchmarks/cleanup_dataset.jsonl --matrix benchmarks/model_matrix.small_first.json --heuristic-dataset benchmarks/heuristics_dataset.jsonl --heuristic-weight 0.25 --json
aman sync-default-model --check --report benchmarks/results/latest.json --artifacts benchmarks/model_artifacts.json --constants src/constants.py
aman version
aman init --config ~/.config/aman/config.json --force

Benchmarking

aman bench --text "draft a short email to Marta confirming lunch" --repeat 10 --warmup 2
aman bench --text-file ./bench-input.txt --repeat 20 --json

bench does not capture audio and never injects text to desktop apps. It runs the processing path from input transcript text through alignment/editor/fact-guard/vocabulary cleanup and prints timing summaries.

Model evaluation

aman build-heuristic-dataset --input benchmarks/heuristics_dataset.raw.jsonl --output benchmarks/heuristics_dataset.jsonl
aman eval-models --dataset benchmarks/cleanup_dataset.jsonl --matrix benchmarks/model_matrix.small_first.json --heuristic-dataset benchmarks/heuristics_dataset.jsonl --heuristic-weight 0.25 --output benchmarks/results/latest.json
aman sync-default-model --report benchmarks/results/latest.json --artifacts benchmarks/model_artifacts.json --constants src/constants.py

eval-models runs a structured model/parameter sweep over a JSONL dataset and outputs latency plus quality metrics.
When --heuristic-dataset is provided, the report also includes alignment-heuristic quality metrics.
sync-default-model promotes the report winner to the managed default model constants and can be run in --check mode for CI and release gates.

Dataset and artifact details live in benchmarks/README.md.

3.1 KiB Raw Blame History