Add vocabulary correction pipeline and example config

This commit is contained in:
Thales Maciel 2026-02-25 10:03:32 -03:00
parent f9224621fa
commit c3503fbbde
9 changed files with 865 additions and 23 deletions

@@ -1,6 +1,6 @@
# lel
Python X11 STT daemon that records audio, runs Whisper, and injects text. It can optionally run local AI post-processing before injection.
Python X11 STT daemon that records audio, runs Whisper, applies local AI cleanup, and injects text.
## Requirements
@@ -92,21 +92,50 @@ Create `~/.config/lel/config.json`:
  "stt": { "model": "base", "device": "cpu" },
  "injection": { "backend": "clipboard" },
  "ai": { "enabled": true },
  "logging": { "log_transcript": false }
  "logging": { "log_transcript": false },
  "vocabulary": {
    "replacements": [
      { "from": "Martha", "to": "Marta" },
      { "from": "docker", "to": "Docker" }
    ],
    "terms": ["Systemd", "Kubernetes"],
    "max_rules": 500,
    "max_terms": 500
  },
  "domain_inference": { "enabled": true, "mode": "auto" }
}
```
Recording input can be a device index (preferred) or a substring of the device
name.
`ai.enabled` controls local cleanup. When enabled, the LLM model is downloaded
on first use to `~/.cache/lel/models/` and uses the locked Llama-3.2-3B GGUF
model.
`ai.enabled` is accepted for compatibility but currently has no runtime effect.
AI cleanup is always enabled and uses the locked local Llama-3.2-3B GGUF model
downloaded to `~/.cache/lel/models/` on first use.
`logging.log_transcript` controls whether recognized/processed text is written
to logs. This is disabled by default. `-v/--verbose` also enables transcript
logging and llama.cpp logs; llama logs are prefixed with `llama::`.
Vocabulary correction:
- `vocabulary.replacements` defines deterministic corrections (`from -> to`).
- `vocabulary.terms` is a list of preferred spellings used as hinting context.
- Wildcards are intentionally rejected (`*`, `?`, `[`, `]`, `{`, `}`) to avoid ambiguous rules.
- Rules are deduplicated case-insensitively; conflicting replacements are rejected.
- Limits are bounded by `max_rules` and `max_terms`.
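As a rough illustration of how such rules could be validated and applied (a minimal sketch; the function names, whole-word matching, and error handling here are hypothetical, not the project's actual implementation):

```python
import re

WILDCARDS = set("*?[]{}")

def validate_rules(replacements, max_rules=500):
    """Reject wildcard characters and conflicting case-insensitive duplicates."""
    seen = {}
    for rule in replacements[:max_rules]:
        src, dst = rule["from"], rule["to"]
        if WILDCARDS & set(src + dst):
            raise ValueError(f"wildcards not allowed: {src!r} -> {dst!r}")
        key = src.lower()
        if key in seen and seen[key] != dst:
            raise ValueError(f"conflicting replacement for {src!r}")
        seen[key] = dst
    return seen

def apply_rules(text, rules):
    """Whole-word, case-insensitive replacement."""
    for src, dst in rules.items():
        text = re.sub(rf"\b{re.escape(src)}\b", dst, text, flags=re.IGNORECASE)
    return text

rules = validate_rules([
    {"from": "Martha", "to": "Marta"},
    {"from": "docker", "to": "Docker"},
])
print(apply_rules("Martha deployed docker today", rules))
# -> Marta deployed Docker today
```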
Domain inference:
- `domain_inference.mode` currently supports only `auto`.
- Domain context is advisory only and is used to improve cleanup prompts.
- When confidence is low, inference falls back to the `general` context.
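The low-confidence fallback can be sketched like this (the scoring source and the `0.6` threshold are assumptions for illustration, not values from the project):

```python
def choose_domain(scores, threshold=0.6):
    """Pick the highest-scoring domain; fall back to 'general' when confidence is low."""
    domain, confidence = max(scores.items(), key=lambda kv: kv[1])
    return domain if confidence >= threshold else "general"

print(choose_domain({"devops": 0.8, "medical": 0.1}))   # -> devops
print(choose_domain({"devops": 0.35, "medical": 0.2}))  # -> general
```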
STT hinting:
- Vocabulary is passed to Whisper as `hotwords`/`initial_prompt` only when those
arguments are supported by the installed `faster-whisper` runtime.
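One way to implement that runtime feature check is to filter keyword arguments against the callee's signature; a minimal sketch (the `supported_kwargs` helper and the stand-in `transcribe` function are hypothetical, standing in for the installed `faster-whisper` API):

```python
import inspect

def supported_kwargs(func, candidate_kwargs):
    """Keep only the kwargs that func's signature actually accepts."""
    params = inspect.signature(func).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(candidate_kwargs)  # func takes **kwargs: pass everything
    return {k: v for k, v in candidate_kwargs.items() if k in params}

# Stand-in for an older runtime whose transcribe() lacks `hotwords`.
def transcribe(audio, initial_prompt=None):
    return f"prompted with: {initial_prompt}"

hints = {"initial_prompt": "Kubernetes Systemd", "hotwords": "Marta"}
print(transcribe("clip.wav", **supported_kwargs(transcribe, hints)))
# -> prompted with: Kubernetes Systemd
```

Unsupported hint arguments are silently dropped rather than raising a `TypeError` on older runtimes.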
## systemd user service
```bash