Add Vosk keystroke eval tooling and findings
parent 8c1f7c1e13
commit 510d280b74
15 changed files with 2219 additions and 0 deletions
47 README.md
@@ -294,6 +294,51 @@ aman bench --text-file ./bench-input.txt --repeat 20 --json
the processing path from input transcript text through alignment/editor/fact-guard/vocabulary cleanup and
prints timing summaries.

Internal Vosk exploration (fixed-phrase dataset collection):

```bash
aman collect-fixed-phrases \
  --phrases-file exploration/vosk/fixed_phrases/phrases.txt \
  --out-dir exploration/vosk/fixed_phrases \
  --samples-per-phrase 10
```

This internal command prompts each allowed phrase and records labeled WAV
samples with manual start/stop (Enter to start, Enter to stop). It does not run
Vosk decoding and does not execute desktop commands. Output includes:
- `exploration/vosk/fixed_phrases/samples/`
- `exploration/vosk/fixed_phrases/manifest.jsonl`
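
After a collection session it can be useful to confirm that every phrase reached the requested sample count. The following is a minimal sketch, not part of `aman`: it assumes `manifest.jsonl` holds one JSON object per line and that the phrase label lives under a `phrase` key. That field name is hypothetical, since the manifest schema is not shown here, so adjust it to the real one.

```python
import json
from collections import Counter
from pathlib import Path


def count_samples_per_phrase(manifest_path, phrase_key="phrase"):
    """Count manifest rows per phrase label.

    Assumes one JSON object per line; `phrase_key` is a hypothetical
    field name, not confirmed by the README.
    """
    counts = Counter()
    with Path(manifest_path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines in the manifest
            row = json.loads(line)
            counts[row[phrase_key]] += 1
    return counts


def phrases_below_target(counts, expected_phrases, target):
    """Return the phrases that have fewer than `target` recorded samples."""
    return sorted(p for p in expected_phrases if counts.get(p, 0) < target)
```

Calling `phrases_below_target(counts, phrases, 10)` with the phrases from `phrases.txt` would list what still needs recording for a `--samples-per-phrase 10` run.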

Internal Vosk exploration (keystroke dictation: literal vs NATO):

```bash
# collect literal-key dataset
aman collect-fixed-phrases \
  --phrases-file exploration/vosk/keystrokes/literal/phrases.txt \
  --out-dir exploration/vosk/keystrokes/literal \
  --samples-per-phrase 10

# collect NATO-key dataset
aman collect-fixed-phrases \
  --phrases-file exploration/vosk/keystrokes/nato/phrases.txt \
  --out-dir exploration/vosk/keystrokes/nato \
  --samples-per-phrase 10

# evaluate both grammars across available Vosk models
aman eval-vosk-keystrokes \
  --literal-manifest exploration/vosk/keystrokes/literal/manifest.jsonl \
  --nato-manifest exploration/vosk/keystrokes/nato/manifest.jsonl \
  --intents exploration/vosk/keystrokes/intents.json \
  --output-dir exploration/vosk/keystrokes/eval_runs \
  --models-file exploration/vosk/keystrokes/models.example.json
```
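
To make the literal-vs-NATO comparison concrete, here is a rough sketch of how two such grammars could map a decoded word back to a key. It is an illustration only: the real phrase lists and the `intents.json` schema are not shown in this README, the table and function names are hypothetical, and the word spellings would have to match the vocabulary of the Vosk model in use. The JSON list with an `[unk]` entry follows the grammar string that Vosk's `KaldiRecognizer` accepts as its optional third argument.

```python
import json
import string

# Illustrative spellings; a real grammar must use words the model knows.
NATO_WORDS = [
    "alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf",
    "hotel", "india", "juliett", "kilo", "lima", "mike", "november",
    "oscar", "papa", "quebec", "romeo", "sierra", "tango", "uniform",
    "victor", "whiskey", "xray", "yankee", "zulu",
]

# Hypothetical lookup tables: spoken token -> key to emit.
NATO_TO_LETTER = dict(zip(NATO_WORDS, string.ascii_lowercase))
LITERAL_TO_LETTER = {letter: letter for letter in string.ascii_lowercase}


def build_grammar(table):
    """Serialize the allowed tokens plus "[unk]" as a JSON list string."""
    return json.dumps(sorted(table) + ["[unk]"])


def decode_letter(hypothesis, table):
    """Map a decoded hypothesis to a letter, or None when it is unknown."""
    return table.get(hypothesis.strip().lower())
```

With these tables, `decode_letter("bravo", NATO_TO_LETTER)` yields `"b"`, while an `[unk]` hypothesis yields `None` and would count toward the unknown rate.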

`eval-vosk-keystrokes` writes a structured report (`summary.json`) with:
- intent accuracy and unknown-rate by grammar
- per-intent/per-letter confusion tables
- latency (avg/p50/p95), RTF, and model-load time
- strict grammar compliance checks (out-of-grammar hypotheses hard-fail the model run)
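
The metrics listed above could be derived roughly as follows. This is a sketch under stated assumptions rather than the code behind `eval-vosk-keystrokes`: the row keys (`expected`, `predicted`, `latency_ms`, `audio_ms`) and the output key names are hypothetical, percentiles use the nearest-rank method (the tool may interpolate instead), and RTF is taken as total decode time divided by total audio duration.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile for pct in (0, 100]."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def check_grammar_compliance(hypothesis, allowed_phrases):
    """Raise if a hypothesis is outside the grammar (hard-fail behavior)."""
    if hypothesis not in allowed_phrases:
        raise ValueError(f"out-of-grammar hypothesis: {hypothesis!r}")


def summarize(rows):
    """Aggregate per-sample rows; `predicted` is None for an unknown result."""
    total = len(rows)
    latencies = [row["latency_ms"] for row in rows]
    correct = sum(1 for row in rows if row["predicted"] == row["expected"])
    unknown = sum(1 for row in rows if row["predicted"] is None)
    return {
        "intent_accuracy": correct / total,
        "unknown_rate": unknown / total,
        "latency_ms": {
            "avg": mean(latencies),
            "p50": percentile(latencies, 50),
            "p95": percentile(latencies, 95),
        },
        # Real-time factor: processing time relative to audio duration.
        "rtf": sum(latencies) / sum(row["audio_ms"] for row in rows),
    }
```

An RTF below 1.0 means decoding finished faster than the audio's own duration.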

Model evaluation lab (dataset + matrix sweep):

```bash
@@ -344,6 +389,8 @@ aman run --config ~/.config/aman/config.json
aman doctor --config ~/.config/aman/config.json --json
aman self-check --config ~/.config/aman/config.json --json
aman bench --text "example transcript" --repeat 5 --warmup 1
aman collect-fixed-phrases --phrases-file exploration/vosk/fixed_phrases/phrases.txt --out-dir exploration/vosk/fixed_phrases --samples-per-phrase 10
aman eval-vosk-keystrokes --literal-manifest exploration/vosk/keystrokes/literal/manifest.jsonl --nato-manifest exploration/vosk/keystrokes/nato/manifest.jsonl --intents exploration/vosk/keystrokes/intents.json --output-dir exploration/vosk/keystrokes/eval_runs --json
aman build-heuristic-dataset --input benchmarks/heuristics_dataset.raw.jsonl --output benchmarks/heuristics_dataset.jsonl --json
aman eval-models --dataset benchmarks/cleanup_dataset.jsonl --matrix benchmarks/model_matrix.small_first.json --heuristic-dataset benchmarks/heuristics_dataset.jsonl --heuristic-weight 0.25 --json
aman sync-default-model --check --report benchmarks/results/latest.json --artifacts benchmarks/model_artifacts.json --constants src/constants.py