Land milestone 4 first-run docs and media
Make the X11 user path visible on first contact instead of burying it under config and maintainer detail. Rewrite the README around the supported quickstart, expected tray and dictation result, install validation, troubleshooting, and linked follow-on docs.

Split deep config and developer material into separate docs, add checked-in screenshots plus a short WebM walkthrough, and add a generator so the media assets stay reproducible.

Also fix the CLI discovery gap by letting `aman --help` show the top-level command surface while keeping implicit foreground `run` behavior, and align the settings, help, and about copy with the supported service-plus-diagnostics model.

Validation:
- `PYTHONPATH=src python3 -m unittest tests.test_aman_cli tests.test_config_ui`
- `PYTHONPATH=src python3 -m unittest discover -s tests -p 'test_*.py'`
- `python3 -m py_compile src/*.py tests/*.py scripts/generate_docs_media.py`
- `PYTHONPATH=src python3 -m aman --help`

Milestone 4 stays open in the roadmap because `docs/x11-ga/first-run-review-notes.md` still needs a real non-implementer walkthrough.
parent
ed1b59240b
commit
359b5fbaf4
16 changed files with 788 additions and 411 deletions
484
README.md
|
|
@ -1,31 +1,11 @@
|
|||
# aman
|
||||
> Local amanuensis
|
||||
> Local amanuensis for X11 desktop dictation
|
||||
|
||||
Python X11 STT daemon that records audio, runs Whisper, applies local AI cleanup, and injects text.
|
||||
Aman is a local X11 dictation daemon for Linux desktops. The supported path is:
|
||||
install the portable bundle, save the first-run settings window once, then use
|
||||
a hotkey to dictate into the focused app.
|
||||
|
||||
## Target User
|
||||
|
||||
The canonical Aman user is a desktop professional who wants dictation and
|
||||
rewriting features without learning Python tooling.
|
||||
|
||||
- End-user path: portable X11 release bundle for mainstream distros.
|
||||
- Alternate package channels: Debian/Ubuntu `.deb` and Arch packaging inputs.
|
||||
- Developer path: Python/uv workflows.
|
||||
|
||||
Persona details and distribution policy are documented in
|
||||
[`docs/persona-and-distribution.md`](docs/persona-and-distribution.md).
|
||||
|
||||
## Release Channels
|
||||
|
||||
Aman is not GA yet for X11 users across distros. The maintained release
|
||||
channels are:
|
||||
|
||||
- Portable X11 bundle: current canonical end-user channel.
|
||||
- Debian/Ubuntu `.deb`: secondary packaged channel.
|
||||
- Arch `PKGBUILD` plus source tarball: secondary maintainer and power-user channel.
|
||||
- Python wheel and sdist: current developer and integrator channel.
|
||||
|
||||
## GA Support Matrix
|
||||
## Supported Path
|
||||
|
||||
| Surface | Contract |
|
||||
| --- | --- |
|
||||
|
|
@ -37,103 +17,12 @@ channels are:
|
|||
| Representative GA validation families | Debian/Ubuntu, Arch, Fedora, openSUSE |
|
||||
| Portable installer prerequisite | System CPython `3.10`, `3.11`, or `3.12` |
|
||||
|
||||
## Install (Portable Bundle)
|
||||
Distribution policy and user persona details live in
|
||||
[`docs/persona-and-distribution.md`](docs/persona-and-distribution.md).
|
||||
|
||||
Download `aman-x11-linux-<version>.tar.gz` and
|
||||
`aman-x11-linux-<version>.tar.gz.sha256`, install the runtime dependencies for
|
||||
your distro, then install the bundle:
|
||||
## 60-Second Quickstart
|
||||
|
||||
```bash
|
||||
sha256sum -c aman-x11-linux-<version>.tar.gz.sha256
|
||||
tar -xzf aman-x11-linux-<version>.tar.gz
|
||||
cd aman-x11-linux-<version>
|
||||
./install.sh
|
||||
```
|
||||
|
||||
The installer writes the user service, updates `~/.local/bin/aman`, and runs
|
||||
`systemctl --user enable --now aman` automatically.
|
||||
On first service start, Aman opens the graphical settings window if
|
||||
`~/.config/aman/config.json` does not exist yet.
|
||||
|
||||
Upgrade by extracting the newer bundle and running its `install.sh` again.
|
||||
Config and cache are preserved by default.
|
||||
|
||||
Uninstall with:
|
||||
|
||||
```bash
|
||||
~/.local/share/aman/current/uninstall.sh
|
||||
```
|
||||
|
||||
Add `--purge` if you also want to remove `~/.config/aman/` and
|
||||
`~/.cache/aman/`.
|
||||
|
||||
Detailed install, upgrade, uninstall, and conflict guidance lives in
|
||||
[`docs/portable-install.md`](docs/portable-install.md).
|
||||
|
||||
## Secondary Channels
|
||||
|
||||
### Debian/Ubuntu (`.deb`)
|
||||
|
||||
Download a release artifact and install it:
|
||||
|
||||
```bash
|
||||
sudo apt install ./aman_<version>_<arch>.deb
|
||||
systemctl --user daemon-reload
|
||||
systemctl --user enable --now aman
|
||||
```
|
||||
|
||||
### Arch Linux
|
||||
|
||||
Use the generated packaging inputs (`PKGBUILD` + source tarball) in `dist/arch/`
|
||||
or your own packaging pipeline.
|
||||
|
||||
## Daily-Use And Support Modes
|
||||
|
||||
- Supported daily-use path: install Aman, then run it as a `systemd --user`
|
||||
service.
|
||||
- Supported manual path: use `aman run` in the foreground while setting up,
|
||||
debugging, or collecting support logs.
|
||||
|
||||
## Recovery Sequence
|
||||
|
||||
When Aman does not behave as expected, use this order:
|
||||
|
||||
1. Run `aman doctor --config ~/.config/aman/config.json`.
|
||||
2. Run `aman self-check --config ~/.config/aman/config.json`.
|
||||
3. Inspect `journalctl --user -u aman -f`.
|
||||
4. Re-run Aman in the foreground with `aman run --config ~/.config/aman/config.json --verbose`.
|
||||
|
||||
See [`docs/runtime-recovery.md`](docs/runtime-recovery.md) for the failure IDs,
|
||||
example output, and the common recovery branches behind this sequence.
|
||||
|
||||
## Diagnostics
|
||||
|
||||
- `aman doctor` is the fast, read-only preflight for config, X11 session,
|
||||
audio runtime, input resolution, hotkey availability, injection backend
|
||||
selection, and service prerequisites.
|
||||
- `aman self-check` is the deeper, still read-only installed-system readiness
|
||||
check. It includes every `doctor` check plus managed model cache, cache
|
||||
writability, service unit/state, and startup readiness.
|
||||
- The tray `Run Diagnostics` action runs the same deeper `self-check` path and
|
||||
logs any non-`ok` results.
|
||||
- Exit code `0` means every check finished as `ok` or `warn`. Exit code `2`
|
||||
means at least one check finished as `fail`.
|
||||
|
||||
Example output:
|
||||
|
||||
```text
|
||||
[OK] config.load: loaded config from /home/user/.config/aman/config.json
|
||||
[WARN] model.cache: managed editor model is not cached at /home/user/.cache/aman/models/Qwen2.5-1.5B-Instruct-Q4_K_M.gguf | next_step: start Aman once on a networked connection so it can download the managed editor model, then rerun `aman self-check --config /home/user/.config/aman/config.json`
|
||||
[FAIL] service.state: user service is installed but failed to start | next_step: inspect `journalctl --user -u aman -f` to see why aman.service is failing
|
||||
overall: fail
|
||||
```
|
||||
|
||||
## Runtime Dependencies
|
||||
|
||||
- X11
|
||||
- PortAudio runtime (`libportaudio2` or distro equivalent)
|
||||
- GTK3 and AppIndicator runtime (`gtk3`, `libayatana-appindicator3`)
|
||||
- Python GTK and X11 bindings (`python3-gi`/`python-gobject`, `python-xlib`)
|
||||
First, install the runtime dependencies for your distro:
|
||||
|
||||
<details>
|
||||
<summary>Ubuntu/Debian</summary>
|
||||
|
|
@ -171,292 +60,105 @@ sudo zypper install -y portaudio gtk3 libayatana-appindicator3-1 python3-gobject
|
|||
|
||||
</details>
|
||||
|
||||
## Quickstart (Portable Bundle)
|
||||
Then install Aman and run the first dictation:
|
||||
|
||||
For supported daily use on the portable bundle:
|
||||
|
||||
1. Install the runtime dependencies for your distro.
|
||||
2. Download and extract the portable release bundle.
|
||||
3. Run `./install.sh` from the extracted bundle.
|
||||
4. Save the first-run settings window.
|
||||
5. Validate the install:
|
||||
1. Verify and extract the portable bundle.
|
||||
2. Run `./install.sh`.
|
||||
3. When `Aman Settings (Required)` opens, choose your microphone and keep
|
||||
`Clipboard paste (recommended)` unless you have a reason to change it.
|
||||
4. Click `Apply`.
|
||||
5. Put your cursor in any text field.
|
||||
6. Press the hotkey once, say `hello from Aman`, then press the hotkey again.
|
||||
|
||||
```bash
|
||||
sha256sum -c aman-x11-linux-<version>.tar.gz.sha256
|
||||
tar -xzf aman-x11-linux-<version>.tar.gz
|
||||
cd aman-x11-linux-<version>
|
||||
./install.sh
|
||||
```
|
||||
|
||||
## What Success Looks Like
|
||||
|
||||
- On first launch, Aman opens the `Aman Settings (Required)` window.
|
||||
- After you save settings, the tray returns to `Idle`.
|
||||
- During dictation, the tray cycles `Idle -> Recording -> STT -> AI Processing -> Idle`.
|
||||
- The focused text field receives text similar to `Hello from Aman.`
|
||||
|
||||
## Visual Proof
|
||||
|
||||

|
||||
|
||||

|
||||
|
||||
[Watch the first-run walkthrough (WebM)](docs/media/first-run-demo.webm)
|
||||
|
||||
## Validate Your Install
|
||||
|
||||
Run the supported checks in this order:
|
||||
|
||||
```bash
|
||||
aman doctor --config ~/.config/aman/config.json
|
||||
aman self-check --config ~/.config/aman/config.json
|
||||
```
|
||||
|
||||
If you need the manual foreground path for setup or support:
|
||||
- `aman doctor` is the fast, read-only preflight for config, X11 session,
|
||||
audio runtime, input resolution, hotkey availability, injection backend
|
||||
selection, and service prerequisites.
|
||||
- `aman self-check` is the deeper, still read-only installed-system readiness
|
||||
check. It includes every `doctor` check plus managed model cache, cache
|
||||
writability, service unit/state, and startup readiness.
|
||||
- Exit code `0` means every check finished as `ok` or `warn`. Exit code `2`
|
||||
means at least one check finished as `fail`.
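
A support script can rely on the exit-code contract described above. The following is a minimal sketch, not Aman's own code: `exit_code_for` is a hypothetical helper, the lowercase status strings `ok`, `warn`, and `fail` come from the text, and rejecting any other status is an assumption.

```python
from typing import Iterable

# Statuses named in the text; anything else is treated as a caller error (assumption).
KNOWN_STATUSES = {"ok", "warn", "fail"}


def exit_code_for(statuses: Iterable[str]) -> int:
    """Return 0 when every check is `ok` or `warn`, and 2 when any check is `fail`."""
    failed = False
    for status in statuses:
        if status not in KNOWN_STATUSES:
            raise ValueError(f"unknown check status: {status!r}")
        if status == "fail":
            failed = True
    return 2 if failed else 0
```

A wrapper that only needs a go/no-go signal can therefore test for exit code `0` and treat warnings as non-blocking.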
|
||||
|
||||
```bash
|
||||
aman run --config ~/.config/aman/config.json
|
||||
```
|
||||
## Troubleshooting
|
||||
|
||||
On first launch, Aman opens a graphical settings window automatically.
|
||||
It includes sections for:
|
||||
- Settings window did not appear:
|
||||
run `aman run --config ~/.config/aman/config.json` once in the foreground.
|
||||
- No tray icon after saving settings:
|
||||
run `aman self-check --config ~/.config/aman/config.json`.
|
||||
- Hotkey does not start recording:
|
||||
run `aman doctor --config ~/.config/aman/config.json` and pick a different
|
||||
hotkey in Settings if needed.
|
||||
- Microphone test fails or no audio is captured:
|
||||
re-open Settings, choose another input device, then rerun `aman doctor`.
|
||||
- Text was recorded but not injected:
|
||||
run `aman doctor`, then `aman run --config ~/.config/aman/config.json --verbose`.
|
||||
|
||||
- microphone input
|
||||
- hotkey
|
||||
- output backend
|
||||
- writing profile
|
||||
- output safety policy
|
||||
- runtime strategy (managed vs custom Whisper path)
|
||||
- help/about actions
|
||||
Use [`docs/runtime-recovery.md`](docs/runtime-recovery.md) for the full failure
|
||||
map and escalation flow.
|
||||
|
||||
## Config
|
||||
## Install, Upgrade, and Uninstall
|
||||
|
||||
Create `~/.config/aman/config.json` (or let `aman` create it automatically on first start if missing):
|
||||
The canonical end-user guide lives in
|
||||
[`docs/portable-install.md`](docs/portable-install.md).
|
||||
|
||||
```json
|
||||
{
|
||||
"config_version": 1,
|
||||
"daemon": { "hotkey": "Cmd+m" },
|
||||
"recording": { "input": "0" },
|
||||
"stt": {
|
||||
"provider": "local_whisper",
|
||||
"model": "base",
|
||||
"device": "cpu",
|
||||
"language": "auto"
|
||||
},
|
||||
"models": {
|
||||
"allow_custom_models": false,
|
||||
"whisper_model_path": ""
|
||||
},
|
||||
"injection": {
|
||||
"backend": "clipboard",
|
||||
"remove_transcription_from_clipboard": false
|
||||
},
|
||||
"safety": {
|
||||
"enabled": true,
|
||||
"strict": false
|
||||
},
|
||||
"ux": {
|
||||
"profile": "default",
|
||||
"show_notifications": true
|
||||
},
|
||||
"advanced": {
|
||||
"strict_startup": true
|
||||
},
|
||||
"vocabulary": {
|
||||
"replacements": [
|
||||
{ "from": "Martha", "to": "Marta" },
|
||||
{ "from": "docker", "to": "Docker" }
|
||||
],
|
||||
"terms": ["Systemd", "Kubernetes"]
|
||||
}
|
||||
}
|
||||
```
|
||||
- Fresh install, upgrade, uninstall, and purge behavior are documented there.
|
||||
- The same guide covers distro-package conflicts and portable-installer
|
||||
recovery steps.
|
||||
|
||||
`config_version` is required and currently must be `1`. Legacy unversioned
|
||||
configs are migrated automatically on load.
|
||||
## Daily Use and Support
|
||||
|
||||
Recording input can be a device index (preferred) or a substring of the device
|
||||
name.
|
||||
If `recording.input` is explicitly set and cannot be resolved, startup fails
|
||||
instead of falling back to a default device.
|
||||
- Supported daily-use path: let the `systemd --user` service keep Aman running.
|
||||
- Supported manual path: use `aman run` in the foreground for setup, support,
|
||||
or debugging.
|
||||
- Tray menu actions are: `Settings...`, `Help`, `About`, `Pause Aman` /
|
||||
`Resume Aman`, `Reload Config`, `Run Diagnostics`, `Open Config Path`, and
|
||||
`Quit`.
|
||||
- If required settings are not saved, Aman enters a `Settings Required` tray
|
||||
state and does not capture audio.
|
||||
|
||||
Config validation is strict: unknown fields are rejected with a startup error.
|
||||
Validation errors include the exact field and an example fix snippet.
|
||||
## Secondary Channels
|
||||
|
||||
Profile options:
|
||||
- Portable X11 bundle: current canonical end-user channel.
|
||||
- Debian/Ubuntu `.deb`: secondary packaged channel.
|
||||
- Arch `PKGBUILD` plus source tarball: secondary maintainer and power-user
|
||||
channel.
|
||||
- Python wheel and sdist: developer and integrator channel.
|
||||
|
||||
- `ux.profile=default`: baseline cleanup behavior.
|
||||
- `ux.profile=fast`: lower-latency AI generation settings.
|
||||
- `ux.profile=polished`: same cleanup depth as default.
|
||||
- `safety.enabled=true`: enables fact-preservation checks (names/numbers/IDs/URLs).
|
||||
- `safety.strict=false`: fallback to safer draft when fact checks fail.
|
||||
- `safety.strict=true`: reject output when fact checks fail.
|
||||
- `advanced.strict_startup=true`: keep fail-fast startup validation behavior.
|
||||
## More Docs
|
||||
|
||||
Transcription language:
|
||||
|
||||
- `stt.language=auto` (default) enables Whisper auto-detection.
|
||||
- You can pin language with Whisper codes (for example `en`, `es`, `pt`, `ja`, `zh`) or common names like `English`/`Spanish`.
|
||||
- If a pinned language hint is rejected by the runtime, Aman logs a warning and retries with auto-detect.
|
||||
|
||||
Hotkey notes:
|
||||
|
||||
- Use one key plus optional modifiers (for example `Cmd+m`, `Super+m`, `Ctrl+space`).
|
||||
- `Super` and `Cmd` are equivalent aliases for the same modifier.
|
||||
|
||||
AI cleanup is always enabled and uses the locked local Qwen2.5-1.5B GGUF model
|
||||
downloaded to `~/.cache/aman/models/` during daemon initialization.
|
||||
Prompts are structured with semantic XML tags for both system and user messages
|
||||
to improve instruction adherence and output consistency.
|
||||
Cleanup runs in two local passes:
|
||||
- pass 1 drafts cleaned text and labels ambiguity decisions (correction/literal/spelling/filler)
|
||||
- pass 2 audits those decisions conservatively and emits final `cleaned_text`
|
||||
This keeps Aman in dictation mode: it does not execute editing instructions embedded in transcript text.
|
||||
Before Aman reports `ready`, local llama runs a tiny warmup completion so the
|
||||
first real transcription is faster.
|
||||
If warmup fails and `advanced.strict_startup=true`, startup fails fast.
|
||||
With `advanced.strict_startup=false`, Aman logs a warning and continues.
|
||||
Model downloads use a network timeout and SHA256 verification before activation.
|
||||
Cached models are checksum-verified on startup; mismatches trigger a forced
|
||||
redownload.
|
||||
|
||||
Provider policy:
|
||||
|
||||
- `Aman-managed` mode (recommended) is the canonical supported UX:
|
||||
Aman handles model lifecycle and safe defaults for you.
|
||||
- `Expert mode` is opt-in and exposes a custom Whisper model path for advanced users.
|
||||
- Editor model/provider configuration is intentionally not exposed in config.
|
||||
- Custom Whisper paths are only active with `models.allow_custom_models=true`.
|
||||
|
||||
Use `-v/--verbose` to enable DEBUG logs, including recognized/processed
|
||||
transcript text and llama.cpp logs (`llama::` prefix). Without `-v`, logs are
|
||||
INFO level.
|
||||
|
||||
Vocabulary correction:
|
||||
|
||||
- `vocabulary.replacements` is deterministic correction (`from -> to`).
|
||||
- `vocabulary.terms` is a preferred spelling list used as hinting context.
|
||||
- Wildcards are intentionally rejected (`*`, `?`, `[`, `]`, `{`, `}`) to avoid ambiguous rules.
|
||||
- Rules are deduplicated case-insensitively; conflicting replacements are rejected.
|
||||
|
||||
STT hinting:
|
||||
|
||||
- Vocabulary is passed to Whisper as compact `hotwords` only when that argument
|
||||
is supported by the installed `faster-whisper` runtime.
|
||||
- Aman enables `word_timestamps` when supported and runs a conservative
|
||||
alignment heuristic pass (self-correction/restart detection) before the editor
|
||||
stage.
|
||||
|
||||
Fact guard:
|
||||
|
||||
- Aman runs a deterministic fact-preservation verifier after editor output.
|
||||
- If facts are changed/invented and `safety.strict=false`, Aman falls back to the safer aligned draft.
|
||||
- If facts are changed/invented and `safety.strict=true`, processing fails and output is not injected.
|
||||
|
||||
## systemd user service
|
||||
|
||||
```bash
|
||||
make install-service
|
||||
```
|
||||
|
||||
Service notes:
|
||||
|
||||
- The supported daily-use path is the user service.
|
||||
- The portable installer writes and enables the user unit automatically.
|
||||
- The local developer unit launched by `make install-service` still resolves
|
||||
`aman` from `PATH`.
|
||||
- Package installs should provide the `aman` command automatically.
|
||||
- Use `aman run --config ~/.config/aman/config.json` in the foreground for
|
||||
setup, support, or debugging.
|
||||
- Start recovery with `aman doctor`, then `aman self-check`, before inspecting
|
||||
`systemctl --user status aman` and `journalctl --user -u aman -f`.
|
||||
- See [`docs/runtime-recovery.md`](docs/runtime-recovery.md) for the expected
|
||||
diagnostic IDs and next steps.
|
||||
|
||||
## Usage
|
||||
|
||||
- Press the hotkey once to start recording.
|
||||
- Press it again to stop and run STT.
|
||||
- Press `Esc` while recording to cancel without processing.
|
||||
- `Esc` is only captured during active recording.
|
||||
- Recording start is aborted if the cancel listener cannot be armed.
|
||||
- Transcript contents are logged only when `-v/--verbose` is used.
|
||||
- Tray menu includes: `Settings...`, `Help`, `About`, `Pause/Resume Aman`, `Reload Config`, `Run Diagnostics`, `Open Config Path`, and `Quit`.
|
||||
- If required settings are not saved, Aman enters a `Settings Required` tray mode and does not capture audio.
|
||||
|
||||
Wayland note:
|
||||
|
||||
- Running under Wayland currently exits with a message explaining that it is not supported yet.
|
||||
|
||||
Injection backends:
|
||||
|
||||
- `clipboard`: copy to clipboard and inject via Ctrl+Shift+V (GTK clipboard + XTest)
|
||||
- `injection`: type the text with simulated keypresses (XTest)
|
||||
- `injection.remove_transcription_from_clipboard`: when `true` and backend is `clipboard`, restores/clears the clipboard after paste so the transcript is not kept there
|
||||
|
||||
Editor stage:
|
||||
|
||||
- Canonical local llama.cpp editor model (managed by Aman).
|
||||
- Runtime flow is explicit: `ASR -> Alignment Heuristics -> Editor -> Fact Guard -> Vocabulary -> Injection`.
|
||||
|
||||
Build and packaging (maintainers):
|
||||
|
||||
```bash
|
||||
make build
|
||||
make package
|
||||
make package-portable
|
||||
make package-deb
|
||||
make package-arch
|
||||
make runtime-check
|
||||
make release-check
|
||||
```
|
||||
|
||||
`make package-portable` builds `dist/aman-x11-linux-<version>.tar.gz` plus its
|
||||
`.sha256` file.
|
||||
|
||||
`make package-deb` installs Python dependencies while creating the package.
|
||||
For offline packaging, set `AMAN_WHEELHOUSE_DIR` to a directory containing the
|
||||
required wheels.
|
||||
|
||||
Benchmarking (STT bypass, always dry):
|
||||
|
||||
```bash
|
||||
aman bench --text "draft a short email to Marta confirming lunch" --repeat 10 --warmup 2
|
||||
aman bench --text-file ./bench-input.txt --repeat 20 --json
|
||||
```
|
||||
|
||||
`bench` does not capture audio and never injects text to desktop apps. It runs
|
||||
the processing path from input transcript text through alignment/editor/fact-guard/vocabulary cleanup and
|
||||
prints timing summaries.
|
||||
|
||||
Model evaluation lab (dataset + matrix sweep):
|
||||
|
||||
```bash
|
||||
aman build-heuristic-dataset --input benchmarks/heuristics_dataset.raw.jsonl --output benchmarks/heuristics_dataset.jsonl
|
||||
aman eval-models --dataset benchmarks/cleanup_dataset.jsonl --matrix benchmarks/model_matrix.small_first.json --heuristic-dataset benchmarks/heuristics_dataset.jsonl --heuristic-weight 0.25 --output benchmarks/results/latest.json
|
||||
aman sync-default-model --report benchmarks/results/latest.json --artifacts benchmarks/model_artifacts.json --constants src/constants.py
|
||||
```
|
||||
|
||||
`eval-models` runs a structured model/parameter sweep over a JSONL dataset and
|
||||
outputs latency + quality metrics (including hybrid score, pass-1/pass-2 latency breakdown,
|
||||
and correction safety metrics for `I mean` and spelling-disambiguation cases).
|
||||
When `--heuristic-dataset` is provided, the report also includes alignment-heuristic
|
||||
quality metrics (exact match, token-F1, rule precision/recall, per-tag breakdown).
|
||||
`sync-default-model` promotes the report winner to the managed default model constants
|
||||
using the artifact registry and can be run in `--check` mode for CI/release gates.
|
||||
|
||||
Control:
|
||||
|
||||
```bash
|
||||
make run
|
||||
make run config.example.json
|
||||
make doctor
|
||||
make self-check
|
||||
make runtime-check
|
||||
make eval-models
|
||||
make sync-default-model
|
||||
make check-default-model
|
||||
make check
|
||||
```
|
||||
|
||||
Developer setup (optional, `uv` workflow):
|
||||
|
||||
```bash
|
||||
uv sync --extra x11
|
||||
uv run aman run --config ~/.config/aman/config.json
|
||||
```
|
||||
|
||||
Developer setup (optional, `pip` workflow):
|
||||
|
||||
```bash
|
||||
make install-local
|
||||
aman run --config ~/.config/aman/config.json
|
||||
```
|
||||
|
||||
CLI (support and developer workflows):
|
||||
|
||||
```bash
|
||||
aman doctor --config ~/.config/aman/config.json --json
|
||||
aman self-check --config ~/.config/aman/config.json --json
|
||||
aman run --config ~/.config/aman/config.json
|
||||
aman bench --text "example transcript" --repeat 5 --warmup 1
|
||||
aman build-heuristic-dataset --input benchmarks/heuristics_dataset.raw.jsonl --output benchmarks/heuristics_dataset.jsonl --json
|
||||
aman eval-models --dataset benchmarks/cleanup_dataset.jsonl --matrix benchmarks/model_matrix.small_first.json --heuristic-dataset benchmarks/heuristics_dataset.jsonl --heuristic-weight 0.25 --json
|
||||
aman sync-default-model --check --report benchmarks/results/latest.json --artifacts benchmarks/model_artifacts.json --constants src/constants.py
|
||||
aman version
|
||||
aman init --config ~/.config/aman/config.json --force
|
||||
```
|
||||
- Install, upgrade, uninstall: [docs/portable-install.md](docs/portable-install.md)
|
||||
- Runtime recovery and diagnostics: [docs/runtime-recovery.md](docs/runtime-recovery.md)
|
||||
- Config reference and advanced behavior: [docs/config-reference.md](docs/config-reference.md)
|
||||
- Developer, packaging, and benchmark workflows: [docs/developer-workflows.md](docs/developer-workflows.md)
|
||||
- Persona and distribution policy: [docs/persona-and-distribution.md](docs/persona-and-distribution.md)
|
||||
|
|
|
|||
154
docs/config-reference.md
Normal file
|
|
@ -0,0 +1,154 @@
|
|||
# Config Reference
|
||||
|
||||
Use this document when you need the full Aman config shape and the advanced
|
||||
behavior notes that are intentionally kept out of the first-run README path.
|
||||
|
||||
## Example config
|
||||
|
||||
```json
|
||||
{
|
||||
"config_version": 1,
|
||||
"daemon": { "hotkey": "Cmd+m" },
|
||||
"recording": { "input": "0" },
|
||||
"stt": {
|
||||
"provider": "local_whisper",
|
||||
"model": "base",
|
||||
"device": "cpu",
|
||||
"language": "auto"
|
||||
},
|
||||
"models": {
|
||||
"allow_custom_models": false,
|
||||
"whisper_model_path": ""
|
||||
},
|
||||
"injection": {
|
||||
"backend": "clipboard",
|
||||
"remove_transcription_from_clipboard": false
|
||||
},
|
||||
"safety": {
|
||||
"enabled": true,
|
||||
"strict": false
|
||||
},
|
||||
"ux": {
|
||||
"profile": "default",
|
||||
"show_notifications": true
|
||||
},
|
||||
"advanced": {
|
||||
"strict_startup": true
|
||||
},
|
||||
"vocabulary": {
|
||||
"replacements": [
|
||||
{ "from": "Martha", "to": "Marta" },
|
||||
{ "from": "docker", "to": "Docker" }
|
||||
],
|
||||
"terms": ["Systemd", "Kubernetes"]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
`config_version` is required and currently must be `1`. Legacy unversioned
|
||||
configs are migrated automatically on load.
|
||||
|
||||
## Recording and validation
|
||||
|
||||
- `recording.input` can be a device index (preferred) or a substring of the
|
||||
device name.
|
||||
- If `recording.input` is explicitly set and cannot be resolved, startup fails
|
||||
instead of falling back to a default device.
|
||||
- Config validation is strict: unknown fields are rejected with a startup
|
||||
error.
|
||||
- Validation errors include the exact field and an example fix snippet.
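
The resolution rule for `recording.input` can be sketched as follows. This is an illustration under stated assumptions, not Aman's implementation: `resolve_input`, `InputResolutionError`, and the `device_names` list are hypothetical, and the case-insensitive, first-match substring behavior is assumed rather than documented.

```python
from typing import Optional, Sequence


class InputResolutionError(ValueError):
    """Raised when an explicitly configured input cannot be resolved."""


def resolve_input(spec: Optional[str], device_names: Sequence[str]) -> Optional[int]:
    """Return a device index for `spec`, or None to mean "use the default device".

    Hypothetical helper: `spec` mirrors `recording.input`, and `device_names`
    stands in for whatever the audio runtime enumerates.
    """
    if spec is None or spec.strip() == "":
        return None  # nothing configured, so the default device is acceptable
    text = spec.strip()
    if text.isdigit():
        index = int(text)
        if index < len(device_names):
            return index
        raise InputResolutionError(f"recording.input: no device at index {index}")
    # Assumption: substring matching is case-insensitive and the first match wins.
    needle = text.lower()
    for index, name in enumerate(device_names):
        if needle in name.lower():
            return index
    # An explicit value that matches nothing fails instead of falling back.
    raise InputResolutionError(f"recording.input: no device name contains {text!r}")
```

The key design point is the last line: an explicit but unresolvable value raises, so a typo never silently records from the wrong microphone.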
|
||||
|
||||
## Profiles and runtime behavior
|
||||
|
||||
- `ux.profile=default`: baseline cleanup behavior.
|
||||
- `ux.profile=fast`: lower-latency AI generation settings.
|
||||
- `ux.profile=polished`: same cleanup depth as default.
|
||||
- `safety.enabled=true`: enables fact-preservation checks
|
||||
(names/numbers/IDs/URLs).
|
||||
- `safety.strict=false`: fall back to the safer aligned draft when fact checks
|
||||
fail.
|
||||
- `safety.strict=true`: reject output when fact checks fail.
|
||||
- `advanced.strict_startup=true`: keep fail-fast startup validation behavior.
|
||||
|
||||
Transcription language:
|
||||
|
||||
- `stt.language=auto` enables Whisper auto-detection.
|
||||
- You can pin language with Whisper codes such as `en`, `es`, `pt`, `ja`, or
|
||||
`zh`, or common names such as `English` / `Spanish`.
|
||||
- If a pinned language hint is rejected by the runtime, Aman logs a warning and
|
||||
retries with auto-detect.
|
||||
|
||||
Hotkey notes:
|
||||
|
||||
- Use one key plus optional modifiers, for example `Cmd+m`, `Super+m`, or
|
||||
`Ctrl+space`.
|
||||
- `Super` and `Cmd` are equivalent aliases for the same modifier.
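
The hotkey shape described in these notes (one key plus optional modifiers, with `Cmd` and `Super` as aliases) can be sketched like this. The function name `parse_hotkey`, the modifier table beyond the `Cmd`/`Super` alias, and the case-insensitive handling are assumptions made for illustration, not Aman's parser.

```python
from typing import FrozenSet, Tuple

# Only the Cmd/Super alias comes from the notes above; the other entries and
# the case-insensitive lookup are illustrative assumptions.
_MODIFIER_ALIASES = {
    "cmd": "super",
    "super": "super",
    "ctrl": "ctrl",
    "alt": "alt",
    "shift": "shift",
}


def parse_hotkey(spec: str) -> Tuple[FrozenSet[str], str]:
    """Split a spec such as `Cmd+m` into (canonical modifiers, key)."""
    parts = [part.strip().lower() for part in spec.split("+")]
    if any(not part for part in parts):
        raise ValueError(f"hotkey {spec!r} has an empty component")
    modifiers = set()
    keys = []
    for part in parts:
        if part in _MODIFIER_ALIASES:
            modifiers.add(_MODIFIER_ALIASES[part])
        else:
            keys.append(part)
    # Exactly one non-modifier key is required.
    if len(keys) != 1:
        raise ValueError(f"hotkey {spec!r} must contain exactly one non-modifier key")
    return frozenset(modifiers), keys[0]
```

Because both aliases map to the same canonical name, `Cmd+m` and `Super+m` parse to the same value in this sketch.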
|
||||
|
||||
## Managed versus expert mode
|
||||
|
||||
- `Aman-managed` mode is the canonical supported UX: Aman handles model
|
||||
lifecycle and safe defaults for you.
|
||||
- `Expert mode` is opt-in and exposes a custom Whisper model path for advanced
|
||||
users.
|
||||
- Editor model/provider configuration is intentionally not exposed in config.
|
||||
- Custom Whisper paths are only active with
|
||||
`models.allow_custom_models=true`.
|
||||
|
||||
Compatibility note:
|
||||
|
||||
- `ux.show_notifications` remains in the config schema for compatibility, but
|
||||
it is not part of the current supported first-run X11 surface and is not
|
||||
exposed in the settings window.
|
||||
|
||||
## Cleanup and model lifecycle
|
||||
|
||||
AI cleanup is always enabled and uses the locked local
|
||||
`Qwen2.5-1.5B-Instruct-Q4_K_M.gguf` model downloaded to
|
||||
`~/.cache/aman/models/` during daemon initialization.
|
||||
|
||||
- Prompts use semantic XML tags for both system and user messages.
|
||||
- Cleanup runs in two local passes:
|
||||
- pass 1 drafts cleaned text and labels ambiguity decisions
|
||||
(correction/literal/spelling/filler)
|
||||
- pass 2 audits those decisions conservatively and emits final
|
||||
`cleaned_text`
|
||||
- Aman stays in dictation mode: it does not execute editing instructions
|
||||
embedded in transcript text.
|
||||
- Before Aman reports `ready`, the local editor runs a tiny warmup completion
|
||||
so the first real transcription is faster.
|
||||
- If warmup fails and `advanced.strict_startup=true`, startup fails fast.
|
||||
- With `advanced.strict_startup=false`, Aman logs a warning and continues.
|
||||
- Model downloads use a network timeout and SHA256 verification before
|
||||
activation.
|
||||
- Cached models are checksum-verified on startup; mismatches trigger a forced
|
||||
redownload.
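
The checksum step can be illustrated with the standard library. This is a minimal sketch, assuming the expected digest is available as a hex string; `sha256_of` and `cached_model_is_valid` are hypothetical names, and the redownload itself is left out.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a large model is never loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_model_is_valid(path: Path, expected_sha256: str) -> bool:
    """Return True only when the cached file exists and matches the expected digest."""
    if not path.is_file():
        return False
    return sha256_of(path) == expected_sha256.lower()
```

A caller would treat a `False` result as the trigger for the forced redownload described above.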
|
||||
|
||||
## Verbose logging and vocabulary
|
||||
|
||||
- `-v/--verbose` enables DEBUG logs, including recognized/processed transcript
|
||||
text and `llama::` logs.
|
||||
- Without `-v`, logs stay at INFO level.
|
||||
|
||||
Vocabulary correction:
|
||||
|
||||
- `vocabulary.replacements` is deterministic correction (`from -> to`).
|
||||
- `vocabulary.terms` is a preferred spelling list used as hinting context.
|
||||
- Wildcards are intentionally rejected (`*`, `?`, `[`, `]`, `{`, `}`) to avoid
|
||||
ambiguous rules.
|
||||
- Rules are deduplicated case-insensitively; conflicting replacements are
|
||||
rejected.
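
The replacement rules above (no wildcards, case-insensitive deduplication, conflicts rejected) can be sketched as a small normalizer. `normalize_replacements` is a hypothetical helper, and treating two rules as conflicting when their `to` values differ exactly is an assumption about how conflicts are defined.

```python
from typing import Dict, Iterable, List, Tuple

# Characters listed as rejected wildcards in the notes above.
WILDCARD_CHARS = set("*?[]{}")


def normalize_replacements(rules: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Validate (from, to) pairs and drop case-insensitive duplicates."""
    seen: Dict[str, str] = {}
    result: List[Tuple[str, str]] = []
    for source, target in rules:
        if WILDCARD_CHARS & set(source) or WILDCARD_CHARS & set(target):
            raise ValueError(f"wildcards are not allowed: {source!r} -> {target!r}")
        key = source.casefold()
        if key in seen:
            # Assumption: same `from` with a different `to` counts as a conflict.
            if seen[key] != target:
                raise ValueError(f"conflicting replacements for {source!r}")
            continue  # exact duplicate, keep the first occurrence
        seen[key] = target
        result.append((source, target))
    return result
```

Rejecting conflicts instead of letting the last rule win keeps the correction deterministic regardless of rule order.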
|
||||
|
||||
STT hinting:
|
||||
|
||||
- Vocabulary is passed to Whisper as compact `hotwords` only when that argument
|
||||
is supported by the installed `faster-whisper` runtime.
|
||||
- Aman enables `word_timestamps` when supported and runs a conservative
|
||||
alignment heuristic pass before the editor stage.
|
||||
|
||||
Fact guard:
|
||||
|
||||
- Aman runs a deterministic fact-preservation verifier after editor output.
|
||||
- If facts are changed or invented and `safety.strict=false`, Aman falls back
|
||||
to the safer aligned draft.
|
||||
- If facts are changed or invented and `safety.strict=true`, processing fails
|
||||
and output is not injected.
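
The strict and non-strict branches can be sketched as below. This is a simplified illustration, not Aman's verifier: `extract_facts`, `guard_output`, and `FactGuardError` are hypothetical, and the regular expression only covers numbers and URLs, whereas the real check is described as also covering names and IDs.

```python
import re

# Assumption: a "fact" here is a URL or a number; the real verifier covers more.
_FACT_PATTERN = re.compile(r"https?://\S+|\d+(?:[.,]\d+)*")


class FactGuardError(RuntimeError):
    """Raised in strict mode when the edited text changes or invents facts."""


def extract_facts(text: str) -> set:
    """Collect the fact-like tokens found in `text`."""
    return set(_FACT_PATTERN.findall(text))


def guard_output(aligned_draft: str, edited: str, *, strict: bool) -> str:
    """Return the text that is safe to inject, or raise in strict mode."""
    if extract_facts(aligned_draft) == extract_facts(edited):
        return edited  # no fact was dropped, changed, or invented
    if strict:
        raise FactGuardError("editor output changed or invented facts")
    return aligned_draft  # non-strict: fall back to the safer aligned draft
```

Comparing the two fact sets for equality catches both directions: a fact missing from the edited text and a fact that appears only in the edited text.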
|
||||
94
docs/developer-workflows.md
Normal file
|
|
@ -0,0 +1,94 @@
|
|||
# Developer And Maintainer Workflows
|
||||
|
||||
This document keeps build, packaging, development, and benchmarking material
|
||||
out of the first-run README path.
|
||||
|
||||
## Build and packaging
|
||||
|
||||
```bash
|
||||
make build
|
||||
make package
|
||||
make package-portable
|
||||
make package-deb
|
||||
make package-arch
|
||||
make runtime-check
|
||||
make release-check
|
||||
```
|
||||
|
||||
- `make package-portable` builds `dist/aman-x11-linux-<version>.tar.gz` plus
|
||||
its `.sha256` file.
|
||||
- `make package-deb` installs Python dependencies while creating the package.
|
||||
- For offline Debian packaging, set `AMAN_WHEELHOUSE_DIR` to a directory
|
||||
containing the required wheels.
|
||||
|
||||
## Developer setup
|
||||
|
||||
`uv` workflow:
|
||||
|
||||
```bash
|
||||
uv sync --extra x11
|
||||
uv run aman run --config ~/.config/aman/config.json
|
||||
```
|
||||
|
||||
`pip` workflow:
|
||||
|
||||
```bash
|
||||
make install-local
|
||||
aman run --config ~/.config/aman/config.json
|
||||
```
|
||||
|
||||
## Support and control commands
|
||||
|
||||
```bash
|
||||
make run
|
||||
make run config.example.json
|
||||
make doctor
|
||||
make self-check
|
||||
make runtime-check
|
||||
make eval-models
|
||||
make sync-default-model
|
||||
make check-default-model
|
||||
make check
|
||||
```
|
||||
|
||||
CLI examples:
|
||||
|
||||
```bash
|
||||
aman doctor --config ~/.config/aman/config.json --json
|
||||
aman self-check --config ~/.config/aman/config.json --json
|
||||
aman run --config ~/.config/aman/config.json
|
||||
aman bench --text "example transcript" --repeat 5 --warmup 1
|
||||
aman build-heuristic-dataset --input benchmarks/heuristics_dataset.raw.jsonl --output benchmarks/heuristics_dataset.jsonl --json
|
||||
aman eval-models --dataset benchmarks/cleanup_dataset.jsonl --matrix benchmarks/model_matrix.small_first.json --heuristic-dataset benchmarks/heuristics_dataset.jsonl --heuristic-weight 0.25 --json
|
||||
aman sync-default-model --check --report benchmarks/results/latest.json --artifacts benchmarks/model_artifacts.json --constants src/constants.py
|
||||
aman version
|
||||
aman init --config ~/.config/aman/config.json --force
|
||||
```
|
||||
|
||||
## Benchmarking
|
||||
|
||||
```bash
|
||||
aman bench --text "draft a short email to Marta confirming lunch" --repeat 10 --warmup 2
|
||||
aman bench --text-file ./bench-input.txt --repeat 20 --json
|
||||
```
|
||||
|
||||
`bench` does not capture audio and never injects text to desktop apps. It runs
|
||||
the processing path from input transcript text through
|
||||
alignment/editor/fact-guard/vocabulary cleanup and prints timing summaries.
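The `--repeat` and `--warmup` behaviour can be pictured as a small timing harness. This is only a sketch of the general technique: the function name `bench`, the `process` callable, and the summary keys are assumptions, not the fields Aman prints.

```python
import statistics
import time


def bench(process, text, *, repeat=5, warmup=1):
    """Time `process(text)` over `repeat` runs after `warmup` untimed runs."""
    for _ in range(warmup):
        process(text)  # warm caches and lazy initialisation; not measured
    samples_ms = []
    for _ in range(repeat):
        start = time.perf_counter()
        process(text)
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "runs": repeat,
        "min_ms": min(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "max_ms": max(samples_ms),
    }
```

Untimed warmup runs keep one-off costs such as model loading out of the reported numbers.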
|
||||
|
||||
## Model evaluation
|
||||
|
||||
```bash
|
||||
aman build-heuristic-dataset --input benchmarks/heuristics_dataset.raw.jsonl --output benchmarks/heuristics_dataset.jsonl
|
||||
aman eval-models --dataset benchmarks/cleanup_dataset.jsonl --matrix benchmarks/model_matrix.small_first.json --heuristic-dataset benchmarks/heuristics_dataset.jsonl --heuristic-weight 0.25 --output benchmarks/results/latest.json
|
||||
aman sync-default-model --report benchmarks/results/latest.json --artifacts benchmarks/model_artifacts.json --constants src/constants.py
|
||||
```
|
||||
|
||||
- `eval-models` runs a structured model/parameter sweep over a JSONL dataset
|
||||
and outputs latency plus quality metrics.
|
||||
- When `--heuristic-dataset` is provided, the report also includes
|
||||
alignment-heuristic quality metrics.
|
||||
- `sync-default-model` promotes the report winner to the managed default model
|
||||
constants and can be run in `--check` mode for CI and release gates.
|
||||
|
||||
Dataset and artifact details live in [`benchmarks/README.md`](../benchmarks/README.md).
|
||||
BIN
docs/media/first-run-demo.webm
Normal file
Binary file not shown.
BIN
docs/media/settings-window.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 69 KiB |
BIN
docs/media/tray-menu.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 30 KiB |
|
|
@@ -2,6 +2,9 @@
|
|||
|
||||
This is the canonical end-user install path for Aman on X11.
|
||||
|
||||
For the shortest first-run path, screenshots, and the expected tray/dictation
|
||||
result, start with the quickstart in [`README.md`](../README.md).
|
||||
|
||||
## Supported environment
|
||||
|
||||
- X11 desktop session
|
||||
|
|
|
|||
|
|
@@ -39,7 +39,12 @@ GA signoff bar. The GA signoff sections are required for `v1.0.0` and later.
 - `make runtime-check` passes.
 - [`docs/runtime-recovery.md`](./runtime-recovery.md) matches the shipped diagnostic IDs and next-step wording.
 - [`docs/x11-ga/runtime-validation-report.md`](./x11-ga/runtime-validation-report.md) contains current automated evidence and release-specific manual validation entries.
-12. GA validation signoff (`v1.0.0` and later):
+12. GA first-run UX signoff (`v1.0.0` and later):
+- `README.md` leads with the supported first-run path and expected visible result.
+- `docs/media/settings-window.png`, `docs/media/tray-menu.png`, and `docs/media/first-run-demo.webm` are current and linked from the README.
+- [`docs/x11-ga/first-run-review-notes.md`](./x11-ga/first-run-review-notes.md) contains a non-implementer walkthrough and the questions it surfaced.
+- `aman --help` exposes the main command surface directly.
+13. GA validation signoff (`v1.0.0` and later):
 - Validation evidence exists for Debian/Ubuntu, Arch, Fedora, and openSUSE.
 - The portable installer, upgrade path, and uninstall path are validated.
 - End-user docs and release notes match the shipped artifact set.
|
||||
|
|
|
|||
|
|
@@ -2,6 +2,23 @@
|
|||
|
||||
Use this guide when Aman is installed but not behaving correctly.
|
||||
|
||||
## First-run troubleshooting
|
||||
|
||||
- Settings window did not appear:
|
||||
run `aman run --config ~/.config/aman/config.json` once in the foreground so
|
||||
you can complete first-run setup.
|
||||
- No tray icon after saving settings:
|
||||
run `aman self-check --config ~/.config/aman/config.json` and confirm the
|
||||
user service is enabled and active.
|
||||
- Hotkey does not start recording:
|
||||
run `aman doctor --config ~/.config/aman/config.json`, then choose a
|
||||
different hotkey in Settings if `hotkey.parse` is not `ok`.
|
||||
- Microphone test failed:
|
||||
re-open Settings, choose another input device, then rerun `aman doctor`.
|
||||
- Text was transcribed but not injected:
|
||||
run `aman doctor`, then rerun `aman run --config ~/.config/aman/config.json --verbose`
|
||||
to inspect the output backend in the foreground.
|
||||
|
||||
## Command roles
|
||||
|
||||
- `aman doctor --config ~/.config/aman/config.json` is the fast, read-only preflight for config, X11 session, audio runtime, input device resolution, hotkey availability, injection backend selection, and service prerequisites.
|
||||
|
|
|
|||
|
|
@@ -22,7 +22,7 @@ Even if install and runtime reliability are strong, Aman will not feel GA until
 - first launch
 - choosing a microphone
 - triggering the first dictation
-- expected tray or notification behavior
+- expected tray behavior
 - expected injected text result
 - Add a "validate your install" flow using `aman doctor` and `aman self-check`.
 - Add screenshots for the settings window and tray menu.
|
||||
|
|
@@ -63,6 +63,6 @@ Even if install and runtime reliability are strong, Aman will not feel GA until
 ## Evidence required to close

 - Updated README and linked support docs.
-- Screenshots and demo artifact checked into the release or docs surface.
+- Screenshots and demo artifact checked into the docs surface.
 - A reviewer walk-through from someone who did not implement the docs rewrite.
 - A short list of first-run questions found during review and how the docs resolved them.
|
||||
|
|
|
|||
|
|
@@ -10,7 +10,7 @@ The current gaps are:
 - The X11 support contract and service-versus-foreground split are now documented, but the public release surface still needs the remaining trust and support work from milestones 4 and 5.
 - Validation matrices now exist for portable lifecycle and runtime reliability, but they are not yet filled with release-specific manual evidence across Debian/Ubuntu, Arch, Fedora, and openSUSE.
 - Incomplete trust surface. The project still needs a real license file, real maintainer/contact metadata, real project URLs, published release artifacts, and public checksums.
-- Incomplete first-run story. The product describes a settings window and tray workflow, but there is no short happy path, no expected-result walkthrough, and no visual proof that the experience is real.
+- The first-run docs and media have landed, but milestone 4 still needs a non-implementer walkthrough before the project can claim that the public docs are actually enough.
 - Diagnostics are now the canonical recovery path, but milestone 3 still needs release-specific X11 evidence for restart, offline-start, tray diagnostics, and recovery scenarios.
 - The release checklist now includes GA signoff gates, but the project is still short of the broader legal, release-publication, and validation evidence needed for a credible public 1.0 release.
|
||||
|
||||
|
|
@@ -100,7 +100,12 @@ Any future docs, tray copy, and release notes should point users to this same se
|
|||
[`runtime-validation-report.md`](./runtime-validation-report.md) are filled
|
||||
with real X11 validation evidence.
|
||||
- [ ] [Milestone 4: First-Run UX and Support Docs](./04-first-run-ux-and-support-docs.md)
|
||||
Turn the product from "documented by the author" into "understandable by a new user."
|
||||
Implementation landed on 2026-03-12: the README is now end-user-first,
|
||||
first-run assets live under `docs/media/`, deep config and maintainer content
|
||||
moved into linked docs, and `aman --help` exposes the top-level commands
|
||||
directly. Leave this milestone open until
|
||||
[`first-run-review-notes.md`](./first-run-review-notes.md) contains a real
|
||||
non-implementer walkthrough.
|
||||
- [ ] [Milestone 5: GA Candidate Validation and Release](./05-ga-candidate-validation-and-release.md)
|
||||
Close the remaining trust, legal, release, and validation work for a public 1.0 launch.
|
||||
|
||||
|
|
|
|||
24
docs/x11-ga/first-run-review-notes.md
Normal file
|
|
@@ -0,0 +1,24 @@
|
|||
# First-Run Review Notes
|
||||
|
||||
Use this file to capture the non-implementer walkthrough required to close
|
||||
milestone 4.
|
||||
|
||||
## Review template
|
||||
|
||||
- Reviewer:
|
||||
- Date:
|
||||
- Environment:
|
||||
- Entry point used:
|
||||
- Did the reviewer use only the public docs? yes / no
|
||||
|
||||
## First-run questions or confusions
|
||||
|
||||
- Question:
|
||||
- Where it appeared:
|
||||
- How the docs or product resolved it:
|
||||
|
||||
## Remaining gaps
|
||||
|
||||
- Gap:
|
||||
- Severity:
|
||||
- Suggested follow-up:
|
||||
338
scripts/generate_docs_media.py
Normal file
|
|
@@ -0,0 +1,338 @@
#!/usr/bin/env python3
from __future__ import annotations

import subprocess
import tempfile
from pathlib import Path

from PIL import Image, ImageDraw, ImageFont


ROOT = Path(__file__).resolve().parents[1]
MEDIA_DIR = ROOT / "docs" / "media"
FONT_REGULAR = "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf"
FONT_BOLD = "/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf"


def font(size: int, *, bold: bool = False) -> ImageFont.ImageFont:
    candidate = FONT_BOLD if bold else FONT_REGULAR
    try:
        return ImageFont.truetype(candidate, size=size)
    except OSError:
        return ImageFont.load_default()


def draw_round_rect(draw: ImageDraw.ImageDraw, box, radius: int, *, fill, outline=None, width=1):
    draw.rounded_rectangle(box, radius=radius, fill=fill, outline=outline, width=width)
|
||||
|
||||
|
||||
def draw_background(size: tuple[int, int], *, light=False) -> Image.Image:
    w, h = size
    image = Image.new("RGBA", size, "#0d111b" if not light else "#e5e8ef")
    draw = ImageDraw.Draw(image)
    for y in range(h):
        mix = y / max(1, h - 1)
        if light:
            color = (
                int(229 + (240 - 229) * mix),
                int(232 + (241 - 232) * mix),
                int(239 + (246 - 239) * mix),
                255,
            )
        else:
            color = (
                int(13 + (30 - 13) * mix),
                int(17 + (49 - 17) * mix),
                int(27 + (79 - 27) * mix),
                255,
            )
        draw.line((0, y, w, y), fill=color)
    draw.ellipse((60, 70, 360, 370), fill=(43, 108, 176, 90))
    draw.ellipse((w - 360, h - 340, w - 40, h - 20), fill=(14, 116, 144, 70))
    draw.ellipse((w - 260, 40, w - 80, 220), fill=(244, 114, 182, 50))
    return image


def paste_center(base: Image.Image, overlay: Image.Image, top: int) -> tuple[int, int]:
    x = (base.width - overlay.width) // 2
    base.alpha_composite(overlay, (x, top))
    return (x, top)


def draw_text_block(
    draw: ImageDraw.ImageDraw,
    origin: tuple[int, int],
    lines: list[str],
    *,
    fill,
    title=None,
    title_fill=None,
    line_gap=12,
    body_font=None,
    title_font=None,
):
    x, y = origin
    title_font = title_font or font(26, bold=True)
    body_font = body_font or font(22)
    if title:
        draw.text((x, y), title, font=title_font, fill=title_fill or fill)
        y += title_font.size + 10
    for line in lines:
        draw.text((x, y), line, font=body_font, fill=fill)
        y += body_font.size + line_gap
|
||||
|
||||
|
||||
def build_settings_window() -> Image.Image:
    base = draw_background((1440, 900))
    window = Image.new("RGBA", (1180, 760), (248, 250, 252, 255))
    draw = ImageDraw.Draw(window)
    draw_round_rect(draw, (0, 0, 1179, 759), 26, fill="#f8fafc", outline="#cbd5e1", width=2)
    draw_round_rect(draw, (0, 0, 1179, 74), 26, fill="#182130")
    draw.rectangle((0, 40, 1179, 74), fill="#182130")
    draw.text((32, 22), "Aman Settings (Required)", font=font(28, bold=True), fill="#f8fafc")
    draw.text((970, 24), "Cancel", font=font(20), fill="#cbd5e1")
    draw_round_rect(draw, (1055, 14, 1146, 58), 16, fill="#0f766e")
    draw.text((1080, 24), "Apply", font=font(20, bold=True), fill="#f8fafc")

    draw_round_rect(draw, (26, 94, 1154, 160), 18, fill="#fff7d6", outline="#facc15")
    draw_text_block(
        draw,
        (48, 112),
        ["Aman needs saved settings before it can start recording from the tray."],
        fill="#4d3a00",
    )

    draw_round_rect(draw, (26, 188, 268, 734), 20, fill="#eef2f7", outline="#d7dee9")
    sections = ["General", "Audio", "Runtime & Models", "Help", "About"]
    y = 224
    for index, label in enumerate(sections):
        active = index == 0
        fill = "#dbeafe" if active else "#eef2f7"
        outline = "#93c5fd" if active else "#eef2f7"
        draw_round_rect(draw, (46, y, 248, y + 58), 16, fill=fill, outline=outline)
        draw.text((68, y + 16), label, font=font(22, bold=active), fill="#0f172a")
        y += 76

    draw_round_rect(draw, (300, 188, 1154, 734), 20, fill="#ffffff", outline="#d7dee9")
    draw_text_block(draw, (332, 220), [], title="General", fill="#0f172a", title_font=font(30, bold=True))

    labels = [
        ("Trigger hotkey", "Super+m"),
        ("Text injection", "Clipboard paste (recommended)"),
        ("Transcription language", "Auto detect"),
        ("Profile", "Default"),
    ]
    y = 286
    for label, value in labels:
        draw.text((332, y), label, font=font(22, bold=True), fill="#0f172a")
        draw_round_rect(draw, (572, y - 8, 1098, y + 38), 14, fill="#f8fafc", outline="#cbd5e1")
        draw.text((596, y + 4), value, font=font(20), fill="#334155")
        y += 92

    draw_round_rect(draw, (332, 480, 1098, 612), 18, fill="#f0fdf4", outline="#86efac")
    draw_text_block(
        draw,
        (360, 512),
        [
            "Supported first-run path:",
            "1. Pick the microphone you want to use.",
            "2. Keep the recommended clipboard backend.",
            "3. Click Apply and wait for the tray to return to Idle.",
        ],
        fill="#166534",
        body_font=font(20),
    )

    draw_round_rect(draw, (332, 638, 1098, 702), 18, fill="#e0f2fe", outline="#7dd3fc")
    draw.text(
        (360, 660),
        "After setup, put your cursor in a text field and say: hello from Aman",
        font=font(20, bold=True),
        fill="#155e75",
    )

    background = base.copy()
    paste_center(background, window, 70)
    return background.convert("RGB")
|
||||
|
||||
|
||||
def build_tray_menu() -> Image.Image:
    base = draw_background((1280, 900), light=True)
    draw = ImageDraw.Draw(base)
    draw_round_rect(draw, (0, 0, 1279, 54), 0, fill="#111827")
    draw.text((42, 16), "X11 Session", font=font(20, bold=True), fill="#e5e7eb")
    draw_round_rect(draw, (1038, 10, 1180, 42), 14, fill="#1f2937", outline="#374151")
    draw.text((1068, 17), "Idle", font=font(18, bold=True), fill="#e5e7eb")

    menu = Image.new("RGBA", (420, 520), (255, 255, 255, 255))
    menu_draw = ImageDraw.Draw(menu)
    draw_round_rect(menu_draw, (0, 0, 419, 519), 22, fill="#ffffff", outline="#cbd5e1", width=2)
    items = [
        "Settings...",
        "Help",
        "About",
        "Pause Aman",
        "Reload Config",
        "Run Diagnostics",
        "Open Config Path",
        "Quit",
    ]
    y = 26
    for label in items:
        highlighted = label == "Run Diagnostics"
        if highlighted:
            draw_round_rect(menu_draw, (16, y - 6, 404, y + 40), 14, fill="#dbeafe")
        menu_draw.text((34, y), label, font=font(22, bold=highlighted), fill="#0f172a")
        y += 58
        if label in {"About", "Run Diagnostics"}:
            menu_draw.line((24, y - 10, 396, y - 10), fill="#e2e8f0", width=2)

    paste_center(base, menu, 118)
    return base.convert("RGB")


def build_terminal_scene() -> Image.Image:
    image = Image.new("RGB", (1280, 720), "#0b1220")
    draw = ImageDraw.Draw(image)
    draw_round_rect(draw, (100, 80, 1180, 640), 24, fill="#0f172a", outline="#334155", width=2)
    draw_round_rect(draw, (100, 80, 1180, 132), 24, fill="#111827")
    draw.rectangle((100, 112, 1180, 132), fill="#111827")
    draw.text((136, 97), "Terminal", font=font(26, bold=True), fill="#e2e8f0")
    draw.text((168, 192), "$ sha256sum -c aman-x11-linux-0.1.0.tar.gz.sha256", font=font(22), fill="#86efac")
    draw.text((168, 244), "aman-x11-linux-0.1.0.tar.gz: OK", font=font(22), fill="#cbd5e1")
    draw.text((168, 310), "$ tar -xzf aman-x11-linux-0.1.0.tar.gz", font=font(22), fill="#86efac")
    draw.text((168, 362), "$ cd aman-x11-linux-0.1.0", font=font(22), fill="#86efac")
    draw.text((168, 414), "$ ./install.sh", font=font(22), fill="#86efac")
    draw.text((168, 482), "Installed aman.service and started the user service.", font=font(22), fill="#cbd5e1")
    draw.text((168, 534), "Waiting for first-run settings...", font=font(22), fill="#7dd3fc")
    draw.text((128, 30), "1. Install the portable bundle", font=font(34, bold=True), fill="#f8fafc")
    return image


def build_editor_scene(*, badge: str | None = None, text: str = "", subtitle: str) -> Image.Image:
    image = draw_background((1280, 720), light=True).convert("RGB")
    draw = ImageDraw.Draw(image)
    draw_round_rect(draw, (84, 64, 1196, 642), 26, fill="#ffffff", outline="#cbd5e1", width=2)
    draw_round_rect(draw, (84, 64, 1196, 122), 26, fill="#f8fafc")
    draw.rectangle((84, 94, 1196, 122), fill="#f8fafc")
    draw.text((122, 84), "Focused editor", font=font(24, bold=True), fill="#0f172a")
    draw.text((122, 158), subtitle, font=font(26, bold=True), fill="#0f172a")
    draw_round_rect(draw, (996, 80, 1144, 116), 16, fill="#111827")
    draw.text((1042, 89), "Idle", font=font(18, bold=True), fill="#e5e7eb")

    if badge:
        fill = {"Recording": "#dc2626", "STT": "#2563eb", "AI Processing": "#0f766e"}[badge]
        draw_round_rect(draw, (122, 214, 370, 262), 18, fill=fill)
        draw.text((150, 225), badge, font=font(24, bold=True), fill="#f8fafc")

    draw_round_rect(draw, (122, 308, 1158, 572), 22, fill="#f8fafc", outline="#d7dee9")
    if text:
        draw.multiline_text((156, 350), text, font=font(34), fill="#0f172a", spacing=18)
    else:
        draw.text((156, 366), "Cursor ready for dictation...", font=font(32), fill="#64748b")
    return image
|
||||
|
||||
|
||||
def build_demo_webm(settings_png: Path, tray_png: Path, output: Path) -> None:
    scenes = [
        ("01-install.png", build_terminal_scene(), 3.0),
        ("02-settings.png", Image.open(settings_png).resize((1280, 800)).crop((0, 40, 1280, 760)), 4.0),
        ("03-tray.png", Image.open(tray_png).resize((1280, 900)).crop((0, 90, 1280, 810)), 3.0),
        (
            "04-editor-ready.png",
            build_editor_scene(
                subtitle="2. Press the hotkey and say: hello from Aman",
                text="",
            ),
            3.0,
        ),
        (
            "05-recording.png",
            build_editor_scene(
                badge="Recording",
                subtitle="Tray and status now show recording",
                text="",
            ),
            1.5,
        ),
        (
            "06-stt.png",
            build_editor_scene(
                badge="STT",
                subtitle="Aman transcribes the audio locally",
                text="",
            ),
            1.5,
        ),
        (
            "07-processing.png",
            build_editor_scene(
                badge="AI Processing",
                subtitle="Cleanup and injection finish automatically",
                text="",
            ),
            1.5,
        ),
        (
            "08-result.png",
            build_editor_scene(
                subtitle="3. The text lands in the focused app",
                text="Hello from Aman.",
            ),
            4.0,
        ),
    ]

    with tempfile.TemporaryDirectory() as td:
        temp_dir = Path(td)
        concat = temp_dir / "scenes.txt"
        concat_lines: list[str] = []
        for name, image, duration in scenes:
            frame_path = temp_dir / name
            image.convert("RGB").save(frame_path, format="PNG")
            concat_lines.append(f"file '{frame_path.as_posix()}'")
            concat_lines.append(f"duration {duration}")
        concat_lines.append(f"file '{(temp_dir / scenes[-1][0]).as_posix()}'")
        concat.write_text("\n".join(concat_lines) + "\n", encoding="utf-8")
        subprocess.run(
            [
                "ffmpeg",
                "-y",
                "-f",
                "concat",
                "-safe",
                "0",
                "-i",
                str(concat),
                "-vf",
                "fps=24,format=yuv420p",
                "-c:v",
                "libvpx-vp9",
                "-b:v",
                "0",
                "-crf",
                "34",
                str(output),
            ],
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )


def main() -> None:
    MEDIA_DIR.mkdir(parents=True, exist_ok=True)
    settings_png = MEDIA_DIR / "settings-window.png"
    tray_png = MEDIA_DIR / "tray-menu.png"
    demo_webm = MEDIA_DIR / "first-run-demo.webm"

    build_settings_window().save(settings_png, format="PNG")
    build_tray_menu().save(tray_png, format="PNG")
    build_demo_webm(settings_png, tray_png, demo_webm)
    print(f"wrote {settings_png}")
    print(f"wrote {tray_png}")
    print(f"wrote {demo_webm}")


if __name__ == "__main__":
    main()
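The scene list handed to ffmpeg in `build_demo_webm` follows the concat-demuxer text format: a `file` line and a `duration` line per frame, with the last frame listed once more without a duration. Repeating the final entry is the commonly cited workaround for the last frame's hold time otherwise being cut short. A standalone sketch of just that list-building step, using the hypothetical helper name `concat_demuxer_script`, could look like this:

```python
from __future__ import annotations

from pathlib import Path


def concat_demuxer_script(frames: list[tuple[Path, float]]) -> str:
    """Build the text of an ffmpeg concat-demuxer list for still frames."""
    if not frames:
        raise ValueError("at least one frame is required")
    lines: list[str] = []
    for path, seconds in frames:
        lines.append(f"file '{path.as_posix()}'")
        lines.append(f"duration {seconds}")
    # List the final frame again, without a duration, so its hold time is kept.
    lines.append(f"file '{frames[-1][0].as_posix()}'")
    return "\n".join(lines) + "\n"
```

Keeping the list construction separate from the `subprocess.run` call would make it testable without ffmpeg installed.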
|
||||
14
src/aman.py
|
|
@@ -997,7 +997,17 @@ def _sync_default_model_command(args: argparse.Namespace) -> int:


 def _build_parser() -> argparse.ArgumentParser:
-    parser = argparse.ArgumentParser()
+    parser = argparse.ArgumentParser(
+        description=(
+            "Aman is an X11 dictation daemon for Linux desktops. "
+            "Use `run` for foreground setup/support, `doctor` for fast preflight checks, "
+            "and `self-check` for deeper installed-system readiness."
+        ),
+        epilog=(
+            "Supported daily use is the systemd --user service. "
+            "For recovery: doctor -> self-check -> journalctl -> aman run --verbose."
+        ),
+    )
     subparsers = parser.add_subparsers(dest="command")

     run_parser = subparsers.add_parser(
|
||||
|
|
@@ -1129,6 +1139,8 @@ def _parse_cli_args(argv: list[str]) -> argparse.Namespace:
         "version",
         "init",
     }
+    if normalized_argv and normalized_argv[0] in {"-h", "--help"}:
+        return parser.parse_args(normalized_argv)
     if not normalized_argv or normalized_argv[0] not in known_commands:
         normalized_argv = ["run", *normalized_argv]
     return parser.parse_args(normalized_argv)
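The combination of an implicit default subcommand and a top-level `--help` can be reproduced in a few lines of `argparse`. The sketch below is a simplified stand-in, not the code in `src/aman.py`: the `demo` program name, the reduced command set, and the function names are assumptions for illustration.

```python
import argparse

KNOWN_COMMANDS = {"run", "doctor", "self-check"}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="demo", description="Demo CLI with an implicit `run`.")
    subparsers = parser.add_subparsers(dest="command")
    run_parser = subparsers.add_parser("run")
    run_parser.add_argument("--dry-run", action="store_true")
    subparsers.add_parser("doctor")
    subparsers.add_parser("self-check")
    return parser


def parse_cli_args(argv: list) -> argparse.Namespace:
    parser = build_parser()
    normalized = list(argv)
    # Top-level help must reach the top-level parser, not the implicit `run`.
    if normalized and normalized[0] in {"-h", "--help"}:
        return parser.parse_args(normalized)
    # Anything that does not start with a known command is treated as `run ...`.
    if not normalized or normalized[0] not in KNOWN_COMMANDS:
        normalized = ["run", *normalized]
    return parser.parse_args(normalized)
```

Without the early help check, `--help` would be rewritten to `run --help` and only the `run` options would be shown, which is the discovery gap the change closes.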
|
||||
|
|
|
|||
|
|
@@ -86,7 +86,7 @@ class ConfigWindow:
         banner.set_show_close_button(False)
         banner.set_message_type(Gtk.MessageType.WARNING)
         banner_label = Gtk.Label(
-            label="Aman needs saved settings before it can start recording."
+            label="Aman needs saved settings before it can start recording from the tray."
         )
         banner_label.set_xalign(0.0)
         banner_label.set_line_wrap(True)
|
||||
|
|
@@ -219,9 +219,6 @@ class ConfigWindow:
         grid.attach(profile_label, 0, 5, 1, 1)
         grid.attach(self._profile_combo, 1, 5, 1, 1)

-        self._show_notifications_check = Gtk.CheckButton(label="Enable tray notifications")
-        self._show_notifications_check.set_hexpand(True)
-        grid.attach(self._show_notifications_check, 1, 6, 1, 1)
         return grid

     def _build_audio_page(self) -> Gtk.Widget:
|
||||
|
|
@@ -382,13 +379,17 @@ class ConfigWindow:
                 "- Press your hotkey to start recording.\n"
                 "- Press the hotkey again to stop and process.\n"
                 "- Press Esc while recording to cancel.\n\n"
-                "Model/runtime tips:\n"
+                "Supported path:\n"
+                "- Daily use runs through the tray and user service.\n"
                 "- Aman-managed mode (recommended) handles model lifecycle for you.\n"
-                "- Expert mode lets you set custom Whisper model paths.\n\n"
+                "- Expert mode keeps custom Whisper paths available for advanced users.\n\n"
+                "Recovery:\n"
+                "- Use Run Diagnostics from the tray for a deeper self-check.\n"
+                "- If that is not enough, run aman doctor, then aman self-check.\n"
+                "- Next escalations are journalctl --user -u aman and aman run --verbose.\n\n"
                 "Safety tips:\n"
                 "- Keep fact guard enabled to prevent accidental name/number changes.\n"
-                "- Strict safety blocks output on fact violations.\n\n"
-                "Use the tray menu for pause/resume, config reload, and diagnostics."
+                "- Strict safety blocks output on fact violations."
             )
         )
         help_text.set_xalign(0.0)
|
||||
|
|
@@ -412,7 +413,7 @@ class ConfigWindow:
         title.set_xalign(0.0)
         box.pack_start(title, False, False, 0)

-        subtitle = Gtk.Label(label="Local amanuensis for desktop dictation and rewriting.")
+        subtitle = Gtk.Label(label="Local amanuensis for X11 desktop dictation and rewriting.")
         subtitle.set_xalign(0.0)
         subtitle.set_line_wrap(True)
         box.pack_start(subtitle, False, False, 0)
|
||||
|
|
@@ -445,7 +446,6 @@ class ConfigWindow:
         if profile not in {"default", "fast", "polished"}:
             profile = "default"
         self._profile_combo.set_active_id(profile)
-        self._show_notifications_check.set_active(bool(self._config.ux.show_notifications))
         self._strict_startup_check.set_active(bool(self._config.advanced.strict_startup))
         self._safety_enabled_check.set_active(bool(self._config.safety.enabled))
         self._safety_strict_check.set_active(bool(self._config.safety.strict))
|
||||
|
|
@@ -570,7 +570,6 @@ class ConfigWindow:
         cfg.injection.remove_transcription_from_clipboard = self._remove_clipboard_check.get_active()
         cfg.stt.language = self._language_combo.get_active_id() or "auto"
         cfg.ux.profile = self._profile_combo.get_active_id() or "default"
-        cfg.ux.show_notifications = self._show_notifications_check.get_active()
         cfg.advanced.strict_startup = self._strict_startup_check.get_active()
         cfg.safety.enabled = self._safety_enabled_check.get_active()
         cfg.safety.strict = self._safety_strict_check.get_active() and cfg.safety.enabled
|
||||
|
|
@@ -623,8 +622,10 @@ def show_help_dialog() -> None:
     dialog.set_title("Aman Help")
     dialog.format_secondary_text(
         "Press your hotkey to record, press it again to process, and press Esc while recording to "
-        "cancel. Keep fact guard enabled to prevent accidental fact changes. Aman-managed mode is "
-        "the canonical supported path; expert mode exposes custom Whisper model paths for advanced users."
+        "cancel. Daily use runs through the tray and user service. Use Run Diagnostics or "
+        "the doctor -> self-check -> journalctl -> aman run --verbose flow when something breaks. "
+        "Aman-managed mode is the canonical supported path; expert mode exposes custom Whisper model paths "
+        "for advanced users."
     )
     dialog.run()
     dialog.destroy()
|
||||
|
|
@@ -642,7 +643,7 @@ def _present_about_dialog(parent) -> None:
     about = Gtk.AboutDialog(transient_for=parent, modal=True)
     about.set_program_name("Aman")
     about.set_version("pre-release")
-    about.set_comments("Local amanuensis for desktop dictation and rewriting.")
+    about.set_comments("Local amanuensis for X11 desktop dictation and rewriting.")
     about.set_license("MIT")
     about.set_wrap_license(True)
     about.run()
|
||||
|
|
|
|||
|
|
@@ -115,6 +115,28 @@ class _FakeBenchEditorStage:


 class AmanCliTests(unittest.TestCase):
+    def test_parse_cli_args_help_flag_uses_top_level_parser(self):
+        out = io.StringIO()
+
+        with patch("sys.stdout", out), self.assertRaises(SystemExit) as exc:
+            aman._parse_cli_args(["--help"])
+
+        self.assertEqual(exc.exception.code, 0)
+        rendered = out.getvalue()
+        self.assertIn("run", rendered)
+        self.assertIn("doctor", rendered)
+        self.assertIn("self-check", rendered)
+        self.assertIn("systemd --user service", rendered)
+
+    def test_parse_cli_args_short_help_flag_uses_top_level_parser(self):
+        out = io.StringIO()
+
+        with patch("sys.stdout", out), self.assertRaises(SystemExit) as exc:
+            aman._parse_cli_args(["-h"])
+
+        self.assertEqual(exc.exception.code, 0)
+        self.assertIn("self-check", out.getvalue())
+
     def test_parse_cli_args_defaults_to_run_command(self):
         args = aman._parse_cli_args(["--dry-run"])
|
||||
|
||||
|
|
|
|||