Switch to sounddevice recording
parent afdf088d17
commit b6c0fc0793
9 changed files with 250 additions and 468 deletions
16
AGENTS.md
@@ -2,15 +2,15 @@
 ## Project Structure & Module Organization
 
-- `lel.sh` is the primary entrypoint; it records audio, runs `whisper`, and prints the transcript.
-- `env/` is a local Python virtual environment (optional) used to install runtime dependencies.
-- There are no separate source, test, or asset directories at this time.
+- `src/leld.py` is the primary entrypoint (X11 transcription daemon).
+- `src/recorder.py` handles audio capture using PortAudio via `sounddevice`.
+- `src/stt.py` wraps faster-whisper for transcription.
 
 ## Build, Test, and Development Commands
 
-- `./lel.sh` streams transcription from the microphone until you press Enter.
-- Example with overrides: `WHISPER_MODEL=small WHISPER_LANG=pt WHISPER_DEVICE=cuda ./lel.sh`.
-- Dependencies expected on PATH: `ffmpeg` and `whisper` (the OpenAI Whisper CLI).
+- Install deps: `uv sync`.
+- Run daemon: `uv run python3 src/leld.py --config ~/.config/lel/config.json`.
+- Open settings: `uv run python3 src/leld.py --settings --config ~/.config/lel/config.json`.
 
 ## Coding Style & Naming Conventions
 
 
@@ -30,7 +30,5 @@
 ## Configuration Tips
 
-- Audio input is controlled via `WHISPER_FFMPEG_IN` (default `pulse:default`), e.g., `alsa:default`.
-- Streaming is on by default; set `WHISPER_STREAM=0` to transcribe after recording.
-- Segment duration for streaming is `WHISPER_SEGMENT_SEC` (default `5`).
+- Audio input is controlled via `WHISPER_FFMPEG_IN` (device index or name).
 - Model, language, device, and extra args can be set with `WHISPER_MODEL`, `WHISPER_LANG`, `WHISPER_DEVICE`, and `WHISPER_EXTRA_ARGS`.
 
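The configuration tips in this diff say the audio input is a device index or name and that model, language, device, and extra args come from `WHISPER_*` variables. The following is a minimal sketch of how such settings might be read and handed to `sounddevice`. It is not the code from `src/recorder.py`: the names `parse_input_device`, `WhisperSettings`, `load_settings`, and `record_seconds` are hypothetical, the defaults `base` and `cpu` are assumptions, and splitting `WHISPER_EXTRA_ARGS` shell-style is a guess at the intended format.

```python
import os
import shlex
from dataclasses import dataclass
from typing import List, Mapping, Optional, Union

# A device is an integer index, a name string, or None for the system default.
DeviceSpec = Optional[Union[int, str]]


def parse_input_device(spec: Optional[str]) -> DeviceSpec:
    """Turn a 'device index or name' string into an int, a str, or None."""
    if spec is None or not spec.strip():
        return None  # unset or empty: let PortAudio pick the default input
    spec = spec.strip()
    return int(spec) if spec.isdecimal() else spec


@dataclass(frozen=True)
class WhisperSettings:  # hypothetical container, not taken from the repository
    model: str
    language: Optional[str]
    device: str
    input_device: DeviceSpec
    extra_args: List[str]


def load_settings(env: Mapping[str, str] = os.environ) -> WhisperSettings:
    """Read the WHISPER_* variables named in the configuration tips."""
    return WhisperSettings(
        model=env.get("WHISPER_MODEL", "base"),    # assumed default
        language=env.get("WHISPER_LANG") or None,  # None: no language forced
        device=env.get("WHISPER_DEVICE", "cpu"),   # assumed default
        input_device=parse_input_device(env.get("WHISPER_FFMPEG_IN")),
        extra_args=shlex.split(env.get("WHISPER_EXTRA_ARGS", "")),  # assumed shell-style
    )


def record_seconds(seconds: float, settings: WhisperSettings, samplerate: int = 16000):
    """Record mono float32 audio for a fixed duration (needs a working input device)."""
    import sounddevice as sd  # imported lazily so the helpers above work without PortAudio

    frames = int(seconds * samplerate)
    audio = sd.rec(frames, samplerate=samplerate, channels=1,
                   dtype="float32", device=settings.input_device)
    sd.wait()  # block until the recording has finished
    return audio[:, 0]  # drop the channel axis to get a 1-D array
```

For example, `load_settings({"WHISPER_FFMPEG_IN": "3"})` yields an integer device index, while a value such as `"USB Mic"` is passed through as a name for `sounddevice` to match. Keeping the parsing separate from the recording call lets the settings logic be checked on a machine without audio hardware.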