summaryrefslogtreecommitdiff
path: root/modules/transcription-config.el
AgeCommit message (Collapse)Author
2025-11-14fix(recording): Fix phone call audio capture with amix filterCraig Jennings
Phone calls were not capturing the remote person's voice due to severe volume loss (44 dB) when using the amerge+pan FFmpeg filter combination. Changes: - Replace amerge+pan with amix filter (provides 44 dB volume improvement) - Increase default system volume from 0.5 to 2.0 for better capture levels - Add diagnostic tool to show active audio playback (C-; r w) - Add integration test with real voice recording - Fix batch mode compatibility for test execution The amix filter properly mixes microphone and system monitor inputs without the massive volume loss that amerge+pan caused. Verified with automated integration test showing perfect transcription of test audio. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-12perf: Merge performance branch - org-agenda cache, tests, and inbox zeroCraig Jennings
This squash merge combines 4 commits from the performance branch: ## Performance Improvements - **org-agenda cache**: Cache org-agenda file list to reduce rebuild time - Eliminates redundant file system scans on each agenda view - Added tests for cache invalidation and updates - **org-refile cache**: Optimize org-refile target building (15-20s → instant) - Cache eliminates bottleneck when capturing tasks ## Test Suite Improvements - Fixed all 18 failing tests → 0 failures (107 test files passing) - Deleted 9 orphaned test files (filesystem lib, dwim-shell-security, org-gcal-mock) - Fixed missing dependencies (cj/custom-keymap, user-constants) - Fixed duplicate test definitions and wrong variable names - Adjusted benchmark timing thresholds for environment variance - Added comprehensive tests for org-agenda cache functionality ## Documentation & Organization - **todo.org recovery**: Restored 1,176 lines lost in truncation - Recovered Methods 4, 5, 6 + Resolved + Inbox sections - Removed 3 duplicate TODO entries - **Inbox zero**: Triaged 12 inbox items → 0 items - Completed: 3 tasks marked DONE (tests, transcription) - Relocated: 4 tasks to appropriate V2MOM Methods - Deleted: 4 duplicates/vague tasks - Merged: 1 task as subtask ## Files Changed - 58 files changed, 29,316 insertions(+), 2,104 deletions(-) - Tests: All 107 test files passing - Codebase: Cleaner, better organized, fully tested 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-06feat: Add AssemblyAI transcription backend with speaker diarizationCraig Jennings
Integrated AssemblyAI as the third transcription backend alongside OpenAI API and local-whisper, now set as the default due to superior speaker diarization capabilities (up to 50 speakers). New Features: - AssemblyAI backend with automatic speaker labeling - Backend switching UI via C-; T b (completing-read interface) - Universal speech model supporting 99 languages - API key management through auth-source/authinfo.gpg Implementation: - Created scripts/assemblyai-transcribe (upload → poll → format workflow) - Updated transcription-config.el with multi-backend support - Added cj/--get-assemblyai-api-key for secure credential retrieval - Refactored process environment handling from if to pcase - Added cj/transcription-switch-backend interactive command Testing: - Created test-transcription-config--transcription-script-path.el - 5 unit tests covering all 3 backends (100% passing) - Followed quality-engineer.org guidelines (test pure functions only) - Investigated 18 test failures: documented cleanup in todo.org Files Modified: - modules/transcription-config.el - Multi-backend support and UI - scripts/assemblyai-transcribe - NEW: AssemblyAI integration script - tests/test-transcription-config--transcription-script-path.el - NEW - todo.org - Added test cleanup task (Method 3, priority C) - docs/NOTES.org - Comprehensive session notes added Successfully tested with 33KB and 4.1MB audio files (3s and 9s processing). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-04fix: Update transcription keybindings for clarityCraig Jennings
Changed transcription submenu keybindings: - C-; t t → C-; t a (transcribe audio) - C-; t b → C-; t v (view transcriptions) - C-; t k → unchanged (kill transcription) More intuitive mnemonics: a=audio, v=view, k=kill 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-04feat: Add complete async audio transcription workflowCraig Jennings
Implemented full transcription system with local Whisper and OpenAI API support. Includes comprehensive test suite (60 tests) and reorganized keybindings for better discoverability. Features: - Async transcription (non-blocking workflow) - Desktop notifications (started/complete/error) - Output: audio.txt (transcript) + audio.log (process logs) - Modeline integration showing active transcription count - Dired integration (press T on audio files) - Process management and tracking Scripts: - install-whisper.sh: Install Whisper via AUR or pip - uninstall-whisper.sh: Clean removal with cache cleanup - local-whisper: Offline transcription using installed Whisper - oai-transcribe: Cloud transcription via OpenAI API Tests (60 passing): - Audio file detection (16 tests) - Path generation logic (11 tests) - Log cleanup behavior (5 tests) - Duration formatting (9 tests) - Active counter & modeline (11 tests) - Integration workflows (8 tests) Keybindings: - Reorganized gcal to C-; g submenu (s/t/r/c) - Added C-; t transcription submenu (t/b/k) - Dired: T to transcribe file at point 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>