Use in-process Llama cleanup

Thales Maciel 2026-02-24 12:46:11 -03:00
parent 548be49112
commit a83a843e1a
GPG key ID: 33112E6833C34679
7 changed files with 235 additions and 116 deletions


@@ -7,8 +7,9 @@ Python X11 STT daemon that records audio, runs Whisper, logs the transcript, and
 - X11 (not Wayland)
 - `sounddevice` (PortAudio)
 - `faster-whisper`
+- `llama-cpp-python`
 - Tray icon deps: `gtk3`, `libayatana-appindicator3`
-- Python deps: `pillow`, `python-xlib`, `faster-whisper`, `PyGObject`, `sounddevice`
+- Python deps: `pillow`, `python-xlib`, `faster-whisper`, `llama-cpp-python`, `PyGObject`, `sounddevice`
 System packages (example names): `portaudio`/`libportaudio2`.
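For reference, the Python dependencies listed in the hunk above can be installed into a virtual environment with pip. This is a sketch, assuming the package names match their PyPI distributions; `PyGObject` typically also requires the GTK/GObject development headers from the system package manager:

```shell
# Create an isolated environment for the daemon (PyPI names assumed)
python3 -m venv .venv
. .venv/bin/activate
pip install pillow python-xlib faster-whisper llama-cpp-python PyGObject sounddevice
```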
@@ -35,19 +36,18 @@ Create `~/.config/lel/config.json`:
   "daemon": { "hotkey": "Cmd+m" },
   "recording": { "input": "0" },
   "stt": { "model": "base", "device": "cpu" },
-  "injection": { "backend": "clipboard" },
-  "ai_cleanup": {
-    "model": "llama3.2:3b",
-    "base_url": "http://localhost:11434",
-    "api_key": ""
-  }
+  "injection": { "backend": "clipboard" }
 }
 ```
 Recording input can be a device index (preferred) or a substring of the device
 name.
+The LLM model is downloaded on first startup to `~/.cache/lel/models/` and uses
+the locked Llama-3.2-3B GGUF model.
+Pass `-v/--verbose` to see verbose logs, including llama.cpp loader logs; these
+messages are prefixed with `llama::`.
 ## systemd user service
 ```bash