| Age | Commit message (Collapse) | Author |
|
Integrated AssemblyAI as the third transcription backend alongside OpenAI
API and local-whisper, now set as the default due to superior speaker
diarization capabilities (up to 50 speakers).
New Features:
- AssemblyAI backend with automatic speaker labeling
- Backend switching UI via C-; T b (completing-read interface)
- Universal speech model supporting 99 languages
- API key management through auth-source/authinfo.gpg
Implementation:
- Created scripts/assemblyai-transcribe (upload → poll → format workflow)
- Updated transcription-config.el with multi-backend support
- Added cj/--get-assemblyai-api-key for secure credential retrieval
- Refactored process environment handling from if to pcase
- Added cj/transcription-switch-backend interactive command
Testing:
- Created test-transcription-config--transcription-script-path.el
- 5 unit tests covering all 3 backends (100% passing)
- Followed quality-engineer.org guidelines (test pure functions only)
- Investigated 18 test failures: documented cleanup in todo.org
Files Modified:
- modules/transcription-config.el - Multi-backend support and UI
- scripts/assemblyai-transcribe - NEW: AssemblyAI integration script
- tests/test-transcription-config--transcription-script-path.el - NEW
- todo.org - Added test cleanup task (Method 3, priority C)
- docs/NOTES.org - Comprehensive session notes added
Successfully tested with 33KB and 4.1MB audio files (3s and 9s processing).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
|
Changed transcription submenu keybindings:
- C-; t t → C-; t a (transcribe audio)
- C-; t b → C-; t v (view transcriptions)
- C-; t k → unchanged (kill transcription)
More intuitive mnemonics: a=audio, v=view, k=kill
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
|
Implemented full transcription system with local Whisper and OpenAI API
support. Includes comprehensive test suite (60 tests) and reorganized
keybindings for better discoverability.
Features:
- Async transcription (non-blocking workflow)
- Desktop notifications (started/complete/error)
- Output: audio.txt (transcript) + audio.log (process logs)
- Modeline integration showing active transcription count
- Dired integration (press T on audio files)
- Process management and tracking
Scripts:
- install-whisper.sh: Install Whisper via AUR or pip
- uninstall-whisper.sh: Clean removal with cache cleanup
- local-whisper: Offline transcription using installed Whisper
- oai-transcribe: Cloud transcription via OpenAI API
Tests (60 passing):
- Audio file detection (16 tests)
- Path generation logic (11 tests)
- Log cleanup behavior (5 tests)
- Duration formatting (9 tests)
- Active counter & modeline (11 tests)
- Integration workflows (8 tests)
Keybindings:
- Reorganized gcal to C-; g submenu (s/t/r/c)
- Added C-; t transcription submenu (t/b/k)
- Dired: T to transcribe file at point
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
|