path: root/modules/transcription-config.el
Commit message | Author | Age | Files | Lines
* refactor(load-graph): make hidden module dependencies explicit | Craig Jennings | 12 days | 1 | -3/+0
    Phase 2 of the load-graph project. I fixed the seven hidden dependencies the classification surfaced, so each module declares what it uses instead of relying on init order.

    - system-defaults now requires host-environment and user-constants at runtime. They were eval-when-compile only, but env-bsd-p and user-home-dir are read at load, so the compiled module couldn't load standalone.
    - custom-buffer-file, dev-fkeys, calendar-sync, and video-audio-recording require keybindings and drop their (when (boundp 'cj/custom-keymap) ...) shims. The shim silently dropped the C-; binding when the module loaded before keybindings. The explicit require makes the dependency real.
    - flycheck-config and mail-config require keybindings for their cj/custom-keymap bindings (a use-package :map and a direct keymap-set).
    - Removed a dead eval-when-compile (defvar cj/custom-keymap) in transcription-config; nothing there used the variable.

    No init.el load-order change. keybindings and the foundation modules already load before these, so the requires are no-ops at startup and only fix standalone and test loading.

    I verified each fix with a fresh emacs --batch (require 'X), then swept all modules standalone: every one loads or fails only with a clear missing-package message. Full make test, make validate-modules, and an init smoke all pass. Module headers and the inventory's hidden-dependency section are updated to mark the seven resolved.
* docs(load-graph): classify remaining domain and optional modules (ref: load-graph-classify-end) | Craig Jennings | 12 days | 1 | -0/+9
    Final classification batch: the last 19 modules: linear-config, local-repository, lorem-optimum, mail-config, markdown-config, music-config, pdf-config, quick-video-capture, reconcile-open-repos, restclient-config, slack-config, system-commands, telega-config, tramp-config, transcription-config, video-audio-recording, vterm-config, weather-config, wrap-up. I annotated each header, added a Batch 9 table to the inventory, and extended the validation allowlist.

    101 of 102 modules are now classified; only elfeed-config remains, deferred on its test fix.

    Two more hidden dependencies turned up. video-audio-recording uses the boundp shim for its C-; r binding, and mail-config registers C-; e directly without requiring keybindings, so it errors standalone rather than degrading. Both recorded for Phase 2.
* refactor(auth): consolidate the auth-source secret lookup into one helper | Craig Jennings | 2026-05-22 | 1 | -8/+4
    The auth-source-search + funcall-the-secret block was copied four times: calendar-sync--calendar-url, cj/auth-source-secret (ai-config), cj/--auth-source-password (transcription), and cj/slack--get-credential. Each searched authinfo, pulled :secret, and called it when the netrc backend returned a function.

    I pulled that into cj/auth-source-secret-value in system-lib (a leaf, so calendar-sync doesn't have to depend on ai-config and drag in the gptel stack). It takes an optional user and returns the secret or nil. The four callers now delegate to it: ai-config layers its required-secret error on top, and the others keep their nil-on-miss behavior.

    With the direct auth-source-search calls gone, I dropped the now-unused (require 'auth-source) from transcription, slack, and calendar-sync. The helper's autoload covers it.

    The transcription tests that exercise the delegated path stay green, and the primitive and the error wrapper get their own tests.
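A minimal sketch of the kind of lookup this commit consolidates, not the code from the commit itself. The name `my/auth-source-secret-value` and its argument order are assumptions for illustration; only `auth-source-search` and its `:host`, `:user`, and `:max` keys are real library API.

```elisp
(require 'auth-source)

(defun my/auth-source-secret-value (host &optional user)
  "Return the secret stored for HOST (and optional USER), or nil on a miss.
Hypothetical stand-in for the helper described in the commit message."
  (let* ((spec (append (list :host host :max 1)
                       (when user (list :user user))))
         (entry (car (apply #'auth-source-search spec)))
         (secret (plist-get entry :secret)))
    ;; The netrc backend hands back a function that yields the secret;
    ;; other backends may return the string directly.
    (if (functionp secret)
        (funcall secret)
      secret)))
```

A caller that needs a hard failure could wrap this and signal its own error when the result is nil, which mirrors the "required-secret error on top" layering the message describes.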
* feat(transcription): extend dired T to transcribe videos via ffmpeg, with tests | Craig Jennings | 2026-05-14 | 1 | -17/+101
    Pressing `T' in dired/dirvish on an audio file already transcribed it; on a video file it bounced with "Not an audio file". Real recordings ship as .mp4 / .mkv at least as often as raw .m4a, so the one-key flow ended at the wrong place.

    Pipeline now:
    - audio path -> direct into `cj/--start-transcription-process' (unchanged).
    - video path -> async ffmpeg extracts the audio track to a temp .mp3 under `temporary-file-directory' (libmp3lame, VBR q:a 4, ~165kbps -- right size for speech, accepted by every backend), then transcribes that file with the temp marked for cleanup after the transcription sentinel fires.

    Surface changes:
    - `cj/video-file-extensions' added to user-constants.el (mp4, mkv, mov, webm, avi, m4v, wmv, flv, mpg, mpeg, 3gp, ogv).
    - New predicates `cj/--video-file-p' / `cj/--media-file-p'.
    - New `cj/--extract-audio-from-video' (async ffmpeg with success callback; surfaces `cj/--notify' on failure; user-errors if ffmpeg isn't on PATH).
    - `cj/--start-transcription-process' gains optional `cleanup-file'. Sentinel deletes it after the existing logic runs. Backwards compatible -- the audio flow doesn't pass it.
    - `cj/transcribe-audio' renamed to `cj/transcribe-media' (dispatcher on audio vs video). `cj/transcribe-audio-at-point' renamed to `cj/transcribe-media-at-point'. Both old names kept as `defalias' so M-x history and any external references still work.
    - `T' in dired-mode-map + dirvish-mode-map points at `cj/transcribe-media-at-point'.
    - Module commentary USAGE block updated.

    15 new ERT tests in `tests/test-transcription-video.el' cover the predicates (happy/boundary/error), ffmpeg invocation (correct args + missing-ffmpeg path), the dispatcher (audio direct, video via extraction, non-media rejected), the aliases, and the T binding. One existing test in `test-transcription-status-and-commands.el' updated to stub the new delegate name.

    Verified locally that ffmpeg is on PATH with libmp3lame, and that the exact arg list my code uses produces a valid MP3 from a synthetic test video.
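The video path described above can be sketched roughly as follows. This is an illustration under assumptions, not the commit's code: the function names `my/ffmpeg-extract-audio-args` and `my/extract-audio-async` are hypothetical, and the exact argument list the author uses is not shown in the message, so the flags here are simply one plausible way to get libmp3lame at VBR quality 4.

```elisp
;;; -*- lexical-binding: t; -*-

(defun my/ffmpeg-extract-audio-args (video-file output-file)
  "Return ffmpeg arguments that extract VIDEO-FILE's audio into OUTPUT-FILE.
Assumed flag set: overwrite output, drop video, encode MP3 at VBR q:a 4."
  (list "-y"                  ; overwrite the pre-created temp file
        "-i" video-file
        "-vn"                 ; ignore the video stream
        "-c:a" "libmp3lame"
        "-q:a" "4"
        output-file))

(defun my/extract-audio-async (video-file on-success)
  "Extract audio from VIDEO-FILE asynchronously.
Call ON-SUCCESS with the temp .mp3 path when ffmpeg exits cleanly."
  (let ((ffmpeg (or (executable-find "ffmpeg")
                    (user-error "ffmpeg not found on PATH")))
        (out (make-temp-file "transcribe-" nil ".mp3")))
    (make-process
     :name "extract-audio"
     :command (cons ffmpeg (my/ffmpeg-extract-audio-args video-file out))
     :sentinel
     (lambda (proc _event)
       ;; Act only once the process has finished.
       (when (memq (process-status proc) '(exit signal))
         (if (and (eq (process-status proc) 'exit)
                  (zerop (process-exit-status proc)))
             (funcall on-success out)
           ;; On failure, remove the temp file and report.
           (delete-file out)
           (message "ffmpeg failed for %s" video-file)))))))
```

Keeping the argument list in its own pure function makes the invocation testable without running ffmpeg, which is the same split the "correct args + missing-ffmpeg path" tests in the message suggest.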
* refactor: clear transcription C-; T menu, move telega launcher to C-; T | Craig Jennings | 2026-05-14 | 1 | -22/+5
    The transcription menu wasn't earning its top-level keymap slot -- the commands (transcribe-audio, switch-backend, view-transcriptions, kill-transcription) are run rarely enough that `M-x' is fine. Drop the `cj/transcribe-map' keymap, its `(keymap-set cj/custom-keymap "T" ...)' binding, and the which-key labels. Commands stay callable by name.

    That frees `C-; T' for telega, where the mnemonic actually fits. Move the launcher from `C-; G' to `C-; T'. Update the which-key label, the module commentary, and the keymap-binding test assertion.

    The dashboard `g' single-letter binding stays put -- `t' there is vterm, so dashboard letters and the global `C-;' prefix don't share a key space anyway.
* refactor(transcription): extract running-transcriptions and format-entry | Craig Jennings | 2026-04-19 | 1 | -20/+25
    Two cleanups round out the transcription-config refactor:

    - cj/--running-transcriptions: the 'status = running' filter used by cleanup and count helpers is now one function. Existing counter tests cover both callers.
    - cj/--format-transcription-entry: the 13-line dolist body inside cj/transcriptions-buffer becomes a testable pure function. 6 tests cover status-face mapping, basename-only rendering, duration format, trailing newline.
* refactor(transcription): extract four sentinel side-effect helpers | Craig Jennings | 2026-04-19 | 1 | -40/+44
    Break cj/--transcription-sentinel's seven inline side-effects into named helpers:

    - cj/--write-transcript-on-success: writes process output to .txt on success
    - cj/--append-to-log: appends event marker + process output to log
    - cj/--update-transcription-status: marks tracking-list entry complete/error
    - cj/--notify-completion: sends success or critical notification

    Also: switch the tautological (cj/--should-keep-log t) to use the local success-p (equivalent but matches the function signature), and rename the unused audio-file sentinel arg to _audio-file.

    Sentinel shrinks from 48 lines with 7 inline blocks to 14 lines of straight-line helper calls. 10 tests cover the extracted helpers.
* refactor(transcription): extract init-log-file and track-transcription | Craig Jennings | 2026-04-19 | 1 | -14/+16
    Pull two more helpers out of cj/--start-transcription-process:

    - cj/--init-log-file: writes the initial log header with timestamp, backend, audio file, script path
    - cj/--track-transcription: pushes a running-status entry and refreshes the modeline

    Start-process shrinks from 58 lines with 4 levels of nesting to ~25 lines mostly at depth 1-2. 10 tests cover the extracted helpers.
* refactor(transcription): extract cj/--build-process-environment | Craig Jennings | 2026-04-19 | 1 | -9/+15
    Pull the per-backend env-var assembly out of cj/--start-transcription-process into a standalone pure function. 9 tests cover: the three backends, parent-env preservation, non-mutation, missing-key user-error, unknown-backend error.
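To make the "pure function" idea concrete, here is a small sketch of an environment builder with the properties the tests list (parent-env preservation, non-mutation, user-error on a missing key). The name `my/build-process-environment` and its signature are assumptions; the real function's arguments are not shown in the message.

```elisp
(defun my/build-process-environment (env-var api-key &optional parent-env)
  "Return a fresh environment list with ENV-VAR=API-KEY prepended.
ENV-VAR nil means the backend needs no key.  PARENT-ENV defaults to
`process-environment' and is never modified.  Hypothetical signature."
  (let ((base (copy-sequence (or parent-env process-environment))))
    (cond
     ;; Backends without a key just inherit the parent environment.
     ((null env-var) base)
     ;; A keyed backend without a key is a user-facing error.
     ((or (null api-key) (equal api-key ""))
      (user-error "No API key available for %s" env-var))
     (t (cons (format "%s=%s" env-var api-key) base)))))
```

The result can be let-bound to `process-environment` around the process start, so the key reaches only the child process and never leaks into the global environment.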
* refactor(transcription): consolidate backends into descriptor alist | Craig Jennings | 2026-04-19 | 1 | -39/+29
    Introduce cj/--transcription-backends alist mapping each backend to (:script :auth-host :env-var). Replace:

    - two near-identical cj/--get-{openai,assemblyai}-api-key functions with a single parameterized cj/--auth-source-password helper
    - the pcase in cj/--transcription-script-path with an alist lookup
    - the pcase block in cj/--start-transcription-process that assembled the API-key env var with an alist-driven assembly

    Adding a new backend is now a single line in the alist. The existing tests plus retargeted API-key tests (now 10, covering the parameterized helper and the descriptor data) verify no behavior change.
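A descriptor alist of the shape described above might look like the sketch below. The script names come from the earlier commits in this log; the backend symbols, auth hosts, environment variable names, and the `my/` identifiers are assumptions for illustration only.

```elisp
(defvar my/transcription-backends
  '((openai        :script "oai-transcribe"
                   :auth-host "api.openai.com"        ; assumed host
                   :env-var "OPENAI_API_KEY")         ; assumed variable name
    (assemblyai    :script "assemblyai-transcribe"
                   :auth-host "api.assemblyai.com"    ; assumed host
                   :env-var "ASSEMBLYAI_API_KEY")     ; assumed variable name
    (local-whisper :script "local-whisper"))          ; no key needed
  "Alist mapping a backend symbol to a plist describing it.")

(defun my/backend-prop (backend prop)
  "Return property PROP of BACKEND, signalling an error if BACKEND is unknown."
  (let ((entry (assq backend my/transcription-backends)))
    (unless entry
      (error "Unknown transcription backend: %S" backend))
    (plist-get (cdr entry) prop)))
```

With this layout, script-path resolution and key lookup both reduce to `(my/backend-prop backend :script)` style calls, and supporting another backend means adding one entry rather than another `pcase` branch.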
* fix(transcription): add T keybinding to dirvish-mode-map | Craig Jennings | 2026-01-29 | 1 | -1/+3
    Dirvish uses its own keymap rather than inheriting from dired-mode-map, so the transcription keybinding needs to be explicitly added.
* fix(recording): Fix phone call audio capture with amix filter | Craig Jennings | 2025-11-14 | 1 | -1/+4
    Phone calls were not capturing the remote person's voice due to severe volume loss (44 dB) when using the amerge+pan FFmpeg filter combination.

    Changes:
    - Replace amerge+pan with amix filter (provides 44 dB volume improvement)
    - Increase default system volume from 0.5 to 2.0 for better capture levels
    - Add diagnostic tool to show active audio playback (C-; r w)
    - Add integration test with real voice recording
    - Fix batch mode compatibility for test execution

    The amix filter properly mixes microphone and system monitor inputs without the massive volume loss that amerge+pan caused. Verified with automated integration test showing perfect transcription of test audio.
* perf: Merge performance branch - org-agenda cache, tests, and inbox zero | Craig Jennings | 2025-11-12 | 1 | -0/+4
    This squash merge combines 4 commits from the performance branch:

    ## Performance Improvements
    - **org-agenda cache**: Cache org-agenda file list to reduce rebuild time
      - Eliminates redundant file system scans on each agenda view
      - Added tests for cache invalidation and updates
    - **org-refile cache**: Optimize org-refile target building (15-20s → instant)
      - Cache eliminates bottleneck when capturing tasks

    ## Test Suite Improvements
    - Fixed all 18 failing tests → 0 failures (107 test files passing)
    - Deleted 9 orphaned test files (filesystem lib, dwim-shell-security, org-gcal-mock)
    - Fixed missing dependencies (cj/custom-keymap, user-constants)
    - Fixed duplicate test definitions and wrong variable names
    - Adjusted benchmark timing thresholds for environment variance
    - Added comprehensive tests for org-agenda cache functionality

    ## Documentation & Organization
    - **todo.org recovery**: Restored 1,176 lines lost in truncation
      - Recovered Methods 4, 5, 6 + Resolved + Inbox sections
      - Removed 3 duplicate TODO entries
    - **Inbox zero**: Triaged 12 inbox items → 0 items
      - Completed: 3 tasks marked DONE (tests, transcription)
      - Relocated: 4 tasks to appropriate V2MOM Methods
      - Deleted: 4 duplicates/vague tasks
      - Merged: 1 task as subtask

    ## Files Changed
    - 58 files changed, 29,316 insertions(+), 2,104 deletions(-)
    - Tests: All 107 test files passing
    - Codebase: Cleaner, better organized, fully tested
* feat: Add AssemblyAI transcription backend with speaker diarization | Craig Jennings | 2025-11-06 | 1 | -16/+80
    Integrated AssemblyAI as the third transcription backend alongside OpenAI API and local-whisper, now set as the default due to superior speaker diarization capabilities (up to 50 speakers).

    New Features:
    - AssemblyAI backend with automatic speaker labeling
    - Backend switching UI via C-; T b (completing-read interface)
    - Universal speech model supporting 99 languages
    - API key management through auth-source/authinfo.gpg

    Implementation:
    - Created scripts/assemblyai-transcribe (upload → poll → format workflow)
    - Updated transcription-config.el with multi-backend support
    - Added cj/--get-assemblyai-api-key for secure credential retrieval
    - Refactored process environment handling from if to pcase
    - Added cj/transcription-switch-backend interactive command

    Testing:
    - Created test-transcription-config--transcription-script-path.el
    - 5 unit tests covering all 3 backends (100% passing)
    - Followed quality-engineer.org guidelines (test pure functions only)
    - Investigated 18 test failures: documented cleanup in todo.org

    Files Modified:
    - modules/transcription-config.el - Multi-backend support and UI
    - scripts/assemblyai-transcribe - NEW: AssemblyAI integration script
    - tests/test-transcription-config--transcription-script-path.el - NEW
    - todo.org - Added test cleanup task (Method 3, priority C)
    - docs/NOTES.org - Comprehensive session notes added

    Successfully tested with 33KB and 4.1MB audio files (3s and 9s processing).
* fix: Update transcription keybindings for clarity | Craig Jennings | 2025-11-04 | 1 | -4/+4
    Changed transcription submenu keybindings:
    - C-; t t → C-; t a (transcribe audio)
    - C-; t b → C-; t v (view transcriptions)
    - C-; t k → unchanged (kill transcription)

    More intuitive mnemonics: a=audio, v=view, k=kill
* feat: Add complete async audio transcription workflow | Craig Jennings | 2025-11-04 | 1 | -0/+326
    Implemented full transcription system with local Whisper and OpenAI API support. Includes comprehensive test suite (60 tests) and reorganized keybindings for better discoverability.

    Features:
    - Async transcription (non-blocking workflow)
    - Desktop notifications (started/complete/error)
    - Output: audio.txt (transcript) + audio.log (process logs)
    - Modeline integration showing active transcription count
    - Dired integration (press T on audio files)
    - Process management and tracking

    Scripts:
    - install-whisper.sh: Install Whisper via AUR or pip
    - uninstall-whisper.sh: Clean removal with cache cleanup
    - local-whisper: Offline transcription using installed Whisper
    - oai-transcribe: Cloud transcription via OpenAI API

    Tests (60 passing):
    - Audio file detection (16 tests)
    - Path generation logic (11 tests)
    - Log cleanup behavior (5 tests)
    - Duration formatting (9 tests)
    - Active counter & modeline (11 tests)
    - Integration workflows (8 tests)

    Keybindings:
    - Reorganized gcal to C-; g submenu (s/t/r/c)
    - Added C-; t transcription submenu (t/b/k)
    - Dired: T to transcribe file at point