author Craig Jennings <c@cjennings.net> 2026-02-12 12:06:33 -0600
committer Craig Jennings <c@cjennings.net> 2026-02-12 12:06:33 -0600
commit 5df64c76d386fd2de863d21a2b1269d53e1a39f9
tree 7ed3e3a4cccbafe109ff8c87ec8b1e3cfb120141
parent 24a681c0696fbdad9c32073ffd24cf7218296ed2
fix: archzfs key prompt hang, test false positive, add local distribution
- Change archzfs SigLevel to Never (pacstrap -K empty keyring caused interactive GPG prompt blocking unattended installs)
- Fix pgrep matching avahi-daemon's [archangel.local] in full-test.sh
- Bump install timeout to 30min for DKMS builds
- Add ~/downloads/isos and archsetup inbox to build-release distribution
- Sync templates
Diffstat (limited to 'docs/workflows/process-meeting-transcript.org')
-rw-r--r-- docs/workflows/process-meeting-transcript.org 301
1 files changed, 301 insertions, 0 deletions
diff --git a/docs/workflows/process-meeting-transcript.org b/docs/workflows/process-meeting-transcript.org
new file mode 100644
index 0000000..647e55f
--- /dev/null
+++ b/docs/workflows/process-meeting-transcript.org
@@ -0,0 +1,301 @@
+#+TITLE: Process Meeting Transcript Workflow
+#+AUTHOR: Craig Jennings & Claude
+#+DATE: 2026-02-03
+
+* Overview
+
+This workflow covers the handling of meeting recordings from start to finish: finding recordings, extracting audio, transcribing via AssemblyAI, identifying speakers, correcting errors, and archiving files.
+
+* When to Use This Workflow
+
+Trigger this workflow when:
+- Craig says "process the transcript" or "process the recording" or similar
+- New recording files (.mkv) appear in ~/sync/recordings/ after meetings
+- Craig wants to process meeting recordings into labeled transcripts
+
+* Prerequisites
+
+- Recording file(s) exist in ~/sync/recordings/ (*.mkv)
+- Calendar files available at ~/.emacs.d/data/*cal.org for meeting titles
+- AssemblyAI transcription script at ~/.emacs.d/scripts/assemblyai-transcribe
+- AssemblyAI API key stored in ~/.authinfo.gpg (machine api.assemblyai.com)
+- ffmpeg available for audio extraction
+
+* The Workflow
+
+** Step 1: Identify Engagement and Write Session Context
+
+Before starting transcript processing:
+
+1. *Identify which engagement this meeting belongs to:*
+ - DeepSat (default for current work)
+ - Vineti (historical)
+ - Salesforce (historical)
+ - If unclear, ask Craig
+
+2. *Set destination paths based on engagement:*
+ - Assets: ~{engagement}/assets/~ (e.g., ~deepsat/assets/~)
+ - Meetings: ~{engagement}/meetings/~ (e.g., ~deepsat/meetings/~)
+ - Knowledge: ~{engagement}/knowledge.org~ for reference
+
+3. Update docs/session-context.org with current status:
+ - Note that we're about to process a meeting transcript
+ - Get meeting name by checking ~/.emacs.d/data/*cal.org (match date/time to the timestamp in the recording's filename)
+ - If meeting not found in calendar, ask Craig for the meeting title
+
+4. Ask Craig if he wants to compact the conversation context:
+ - Transcript processing can use significant context
+ - Compacting now preserves the session context file for recovery
+
+** Step 2: Find Recording Files
+
+Find and match recording files with calendar events:
+
+1. **List recordings:** Find all .mkv files in ~/sync/recordings/
+ #+begin_src bash
+ ls -la ~/sync/recordings/*.mkv
+ #+end_src
+
+2. **Extract timestamps:** Parse date/time from each filename (format: YYYY-MM-DD_HH-MM-SS.mkv)
+
+3. **Match with calendar:** Check ~/.emacs.d/data/*cal.org for meetings at those times
+ #+begin_src bash
+ grep -A2 "YYYY-MM-DD" ~/.emacs.d/data/*cal.org
+ #+end_src
+
+4. **Present selection table to Craig:**
+ | Filename                | Meeting / Date-Time             |
+ |-------------------------+---------------------------------|
+ | 2026-02-03_10-00-00.mkv | DeepSat Standup (from calendar) |
+ | 2026-02-03_14-30-00.mkv | 2026-02-03 14:30 (no match)     |
+
+5. **Craig selects files:** One, several, or all files to process
+
+6. **Queue for processing:** Selected files ordered oldest → newest for serial processing
+
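The timestamp parsing and oldest-first ordering described above can be sketched in bash as follows. The function names =recording_timestamp= and =queue_recordings= are hypothetical helpers, not part of the existing scripts, and the sketch assumes every filename follows the YYYY-MM-DD_HH-MM-SS.mkv pattern exactly.

```bash
# Print "YYYY-MM-DD HH:MM" for a recording named YYYY-MM-DD_HH-MM-SS.mkv.
recording_timestamp() {
    local name date time hh mm
    name=$(basename "$1" .mkv)   # 2026-02-03_10-00-00
    date=${name%%_*}             # 2026-02-03
    time=${name#*_}              # 10-00-00
    hh=${time%%-*}               # 10
    mm=${time#*-}                # 00-00
    mm=${mm%%-*}                 # 00
    printf '%s %s:%s\n' "$date" "$hh" "$mm"
}

# List the .mkv files in a directory, oldest first. The filename format
# sorts chronologically as plain text, so a lexical sort is enough.
queue_recordings() {
    local f
    for f in "$1"/*.mkv; do
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done | sort
}
```

For example, =recording_timestamp ~/sync/recordings/2026-02-03_14-30-00.mkv= prints =2026-02-03 14:30=, which is the form to look for in the calendar files.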
+** Step 3: Extract Audio
+
+For each selected recording file, extract audio for transcription:
+
+#+begin_src bash
+ffmpeg -i ~/sync/recordings/FILENAME.mkv -vn -ac 1 -c:a aac -b:a 96k /tmp/FILENAME.m4a
+#+end_src
+
+Settings:
+- =-vn= : no video (audio only)
+- =-ac 1= : mono channel (sufficient for speech, smaller file)
+- =-c:a aac= : AAC codec
+- =-b:a 96k= : 96kbps bitrate (sufficient for speech transcription)
+
+Output: /tmp/FILENAME.m4a (temporary, deleted after transcription)
+
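When several recordings are queued, the extraction step could be wrapped in a small function. This is a sketch only: =audio_path_for= and =extract_audio= are hypothetical names, and the extra flags (=-nostdin=, =-y=, =-loglevel error=) are additions for unattended use rather than part of the command above.

```bash
# Derive the temporary audio path for a recording: /tmp/<name>.m4a
audio_path_for() {
    printf '/tmp/%s.m4a\n' "$(basename "$1" .mkv)"
}

# Extract mono 96 kbps AAC audio and print the output path on success.
extract_audio() {
    local src out
    src=$1
    out=$(audio_path_for "$src")
    # -nostdin stops ffmpeg from reading the terminal when run in a loop;
    # -y overwrites a stale file left behind by an earlier failed run.
    ffmpeg -nostdin -y -loglevel error -i "$src" \
        -vn -ac 1 -c:a aac -b:a 96k "$out" &&
        printf '%s\n' "$out"
}
```

Usage: =audio=$(extract_audio ~/sync/recordings/FILENAME.mkv)= leaves the path of the new .m4a file in =$audio= for the transcription step.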
+** Step 4: Transcribe with AssemblyAI
+
+1. **Run transcription:**
+ #+begin_src bash
+ ~/.emacs.d/scripts/assemblyai-transcribe /tmp/FILENAME.m4a > ~/sync/recordings/FILENAME.txt
+ #+end_src
+
+2. **Clean up:** Delete intermediate .m4a file after successful transcription
+ #+begin_src bash
+ rm /tmp/FILENAME.m4a
+ #+end_src
+
+3. **Output format:** The script produces speaker-diarized output:
+ #+begin_example
+ Speaker A: First speaker's text here.
+ Speaker B: Second speaker's response.
+ Speaker A: First speaker continues.
+ #+end_example
+
+4. Continue with Step 5, then speaker identification in Steps 6 and 7.
+
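Because the transcript is written with a shell redirect, a failed run can leave an empty or partial .txt file behind. One way to guard against that, and to delete the .m4a only after a successful transcription, is sketched below. =transcribe_safely= is a hypothetical helper; it assumes only that the transcription script exits non-zero on failure and writes the transcript to stdout.

```bash
# transcribe_safely TRANSCRIBER AUDIO OUTPUT
# Runs TRANSCRIBER on AUDIO and writes OUTPUT only if the run succeeds
# and produces non-empty text. AUDIO is deleted only in that case.
transcribe_safely() {
    local transcriber audio out tmp
    transcriber=$1
    audio=$2
    out=$3
    tmp="$out.partial"
    if "$transcriber" "$audio" > "$tmp" && [ -s "$tmp" ]; then
        mv "$tmp" "$out" && rm -f "$audio"
    else
        rm -f "$tmp"
        echo "transcription failed; keeping $audio" >&2
        return 1
    fi
}
```

Usage: =transcribe_safely ~/.emacs.d/scripts/assemblyai-transcribe /tmp/FILENAME.m4a ~/sync/recordings/FILENAME.txt=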
+** Step 5: Locate Files
+
+Confirm the transcript and recording files are ready:
+
+1. **Verify transcript exists:**
+ #+begin_src bash
+ ls -la ~/sync/recordings/FILENAME.txt
+ #+end_src
+
+2. **Verify recording exists:**
+ #+begin_src bash
+ ls -la ~/sync/recordings/FILENAME.mkv
+ #+end_src
+
+3. **Get meeting title:** If not already known from Step 2, check calendar
+ - Calendar location: ~/.emacs.d/data/*cal.org
+ - Match the meeting time to the transcript timestamp
+
+** Step 6: Read and Analyze Transcript
+
+1. Read the full transcript file
+
+2. Identify speakers by analyzing context clues:
+ - Names mentioned in conversation ("Thanks, Ryan")
+ - Role references ("as the developer", "on the IT side")
+ - Project-specific knowledge (who works on what)
+ - Previous meeting context (known attendees)
+ - Speaking order patterns
+
+3. Build a speaker identification table:
+ | Speaker | Person | Evidence |
+ |---------+--------+----------|
+ | A | Name | Clues... |
+
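Before reading for context clues, it can help to see how many speaker labels the transcript contains and what each one says first. The helpers below are a sketch with hypothetical names; they assume each turn starts at the beginning of a line with =Speaker X:=, as in the example output in Step 4.

```bash
# Count turns per diarization label in a raw transcript, busiest first.
speaker_turns() {
    grep -o '^Speaker [A-Z][A-Z]*' "$1" | sort | uniq -c | sort -rn
}

# Print the first few turns for each label (default 3). Opening turns
# often contain greetings or names that help identify the speaker.
speaker_samples() {
    local label
    for label in $(grep -o '^Speaker [A-Z][A-Z]*' "$1" | sort -u | cut -d' ' -f2); do
        echo "== Speaker $label =="
        grep "^Speaker $label:" "$1" | head -n "${2:-3}"
    done
}
```

The output is only a starting point for the evidence column; the identifications still need Craig's confirmation in Step 7.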
+** Step 7: Confirm Speaker Identifications
+
+Present the speaker identification table to Craig for confirmation:
+- List each speaker label and proposed name
+- Include the evidence/reasoning
+- Ask about any uncertain identifications
+- Note any new people to add to notes.org contacts
+
+** Step 8: Create Labeled Transcript
+
+1. Replace all speaker labels with actual names
+
+2. Correct transcription errors:
+ - Common mishearings (names, technical terms, company names)
+ - Known substitutions from this project:
+ - "Vanetti" → "Vineti"
+ - "Fresh" → "Vrezh"
+ - "Clean4" / "clone" → "CLIN 4"
+ - "Vascan" → "Vazgan"
+ - "Hike" / "Ike" → "Hayk"
+ - "High Tech" → "HyeTech"
+ - "Java software" → "JAMA software"
+ - "JSON" (person) → "Jason"
+ - "their S" / "ress" → "Nerses"
+ - "sir Keith" → "Sarkis"
+ - "Fastgas" → "MagicalLabs"
+ - "Sitelix" → "Cytellix"
+ - Technical terms specific to DeepSat (GovCloud, AFRL, SOUTHCOM, etc.)
+
+3. Save to engagement assets folder:
+ - Location: ~{engagement}/assets/~ (e.g., ~deepsat/assets/~)
+ - Filename: YYYY-MM-DD-meeting-name.txt
+ - Example: deepsat/assets/2026-02-03-standup-ipm-grooming.txt
+
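The label replacement and the unambiguous corrections can be applied in one sed pass, sketched below. =label_transcript= is a hypothetical helper, and the speaker names in it are placeholders: substitute the mapping confirmed in Step 7. Corrections whose "heard as" form is also an ordinary word ("Fresh", "clone", "Hike", "Ike", "JSON") are deliberately left out, on the assumption that they need a manual, case-by-case review.

```bash
# label_transcript RAW
# Writes the labeled, partly corrected transcript to stdout.
label_transcript() {
    # The first two expressions are a placeholder speaker mapping;
    # the rest are corrections that are safe to apply globally.
    sed -e 's/^Speaker A:/Craig:/' \
        -e 's/^Speaker B:/Ryan:/' \
        -e 's/Vanetti/Vineti/g' \
        -e 's/Vascan/Vazgan/g' \
        -e 's/High Tech/HyeTech/g' \
        -e 's/Java software/JAMA software/g' \
        -e 's/Clean4/CLIN 4/g' \
        "$1"
}
```

Usage: =label_transcript ~/sync/recordings/FILENAME.txt > deepsat/assets/2026-02-03-standup-ipm-grooming.txt=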
+** Step 9: Copy Recording to Meetings Folder
+
+1. Ensure the engagement's meetings folder exists and that the pattern ~*/meetings/*.mkv~ is in .gitignore
+
+2. Copy the .mkv file with descriptive name:
+ #+begin_src bash
+ cp ~/sync/recordings/YYYY-MM-DD_HH-MM-SS.mkv {engagement}/meetings/YYYY-MM-DD_HH-MM-meeting-name.mkv
+ #+end_src
+ Example: ~deepsat/meetings/2026-02-03_11-02-standup-ipm-grooming.mkv~
+
+3. Verify the copy succeeded: the destination file exists and matches the source (same size or checksum)
+
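The copy, rename, and verification can be combined in one function, sketched here. =archive_recording= is a hypothetical helper; it assumes the source name follows YYYY-MM-DD_HH-MM-SS.mkv and that the meeting name is already in lowercase-hyphenated form.

```bash
# archive_recording SRC DEST_DIR SLUG
# Copies YYYY-MM-DD_HH-MM-SS.mkv to DEST_DIR/YYYY-MM-DD_HH-MM-SLUG.mkv,
# confirms the copy is byte-identical, and prints the destination path.
archive_recording() {
    local src dest_dir slug stamp dest
    src=$1
    dest_dir=$2
    slug=$3
    stamp=$(basename "$src" .mkv)   # YYYY-MM-DD_HH-MM-SS
    stamp=${stamp%-*}               # YYYY-MM-DD_HH-MM (seconds dropped)
    dest="$dest_dir/$stamp-$slug.mkv"
    mkdir -p "$dest_dir" &&
        cp "$src" "$dest" &&
        cmp -s "$src" "$dest" &&
        printf '%s\n' "$dest"
}
```

Usage: =archive_recording ~/sync/recordings/2026-02-03_11-02-45.mkv deepsat/meetings standup-ipm-grooming= (the seconds value in this filename is illustrative). The function returns non-zero if any stage fails, so the source should be kept in that case.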
+** Step 10: Update Session Context with Meeting Summary
+
+Add a meeting summary section to docs/session-context.org including:
+
+1. **Attendees** - List all participants
+
+2. **Key Decisions** - Important choices made
+
+3. **Action Items** - Tasks assigned, especially for Craig
+
+4. **New Information** - Things learned that should be noted
+
+5. **New Contacts** - People to add to notes.org
+
+** Step 11: Write Session Context File
+
+Update docs/session-context.org with:
+- Files created this session (transcript, recording)
+- Summary of what was processed
+- Next steps (file to assets, update notes.org, etc.)
+
+*** Context Management (for multiple files)
+
+When processing multiple recordings in a queue:
+
+1. **After completing each file's workflow**, update docs/session-context.org with:
+ - Files processed so far
+ - Current position in queue
+ - Summary of meeting just processed
+
+2. **Ask Craig if compact is needed** before starting next file:
+ - Transcript processing uses significant context
+ - Compacting preserves session context for recovery
+
+3. **If autocompact occurs**, reread session-context.org to:
+ - Resume at correct position in queue
+ - Avoid reprocessing already-completed files
+
+** Step 12: Clean Up Source Files
+
+After successful completion of all previous steps, delete the source files from ~/sync/recordings/:
+
+1. **Delete the original recording:**
+ #+begin_src bash
+ rm ~/sync/recordings/FILENAME.mkv
+ #+end_src
+
+2. **Delete the raw transcript** (if generated):
+ #+begin_src bash
+ rm ~/sync/recordings/FILENAME.txt
+ #+end_src
+
+This step happens last to ensure all files are safely copied/processed before deletion. If anything goes wrong earlier in the workflow, the source files remain intact for retry.
+
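To make the "delete last" rule mechanical, the removal can be gated on the outputs being in place. The sketch below uses a hypothetical =cleanup_sources= helper; it assumes the raw transcript sits next to the recording with the same basename and a .txt extension.

```bash
# cleanup_sources SRC_MKV ARCHIVED_MKV LABELED_TXT
# Removes the source recording and its raw transcript only when the
# archived copy matches the source and the labeled transcript is non-empty.
cleanup_sources() {
    local src archived labeled
    src=$1
    archived=$2
    labeled=$3
    if cmp -s "$src" "$archived" && [ -s "$labeled" ]; then
        # rm -f also covers the case where no raw transcript was generated.
        rm -f "$src" "${src%.mkv}.txt"
    else
        echo "outputs not verified; keeping $src" >&2
        return 1
    fi
}
```

If either check fails, nothing is deleted and the workflow can be retried from the step that went wrong.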
+* Output Files
+
+| File               | Location                                                | Purpose                            |
+|--------------------+---------------------------------------------------------+------------------------------------|
+| Labeled transcript | {engagement}/assets/YYYY-MM-DD-meeting-name.txt         | Corrected transcript for reference |
+| Meeting recording  | {engagement}/meetings/YYYY-MM-DD_HH-MM-meeting-name.mkv | Video for review (gitignored)      |
+| Session context    | docs/session-context.org                                | Crash recovery, meeting summary    |
+| Knowledge base     | {engagement}/knowledge.org                              | Team, infrastructure, corrections  |
+
+* Common Transcription Errors
+
+Keep this list updated as new patterns emerge:
+
+| Heard As      | Correct       | Context                                        |
+|---------------+---------------+------------------------------------------------|
+| Vanetti | Vineti | Company where Craig, Nerses, Eric, Ryan worked |
+| Fresh | Vrezh | Developer name |
+| Clean4, clone | CLIN 4 | Contract milestone |
+| Vascan | Vazgan | MagicalLabs AI team member |
+| Hike, Ike | Hayk | CTO name |
+| High Tech | HyeTech | Armenian tech community org |
+| Java software | JAMA software | Requirements traceability tool |
+| JSON (person) | Jason | DevSecOps or advisor |
+| their S, ress | Nerses | CEO name |
+| sir Keith | Sarkis | BD/investor relations |
+| Fastgas | MagicalLabs | Armenian AI contractor |
+| Sitelix | Cytellix | CMMC security/compliance partner |
+
+* Tips
+
+1. **Read the whole transcript first** - Context from later in the meeting often helps identify speakers from earlier
+
+2. **Use the calendar** - Meeting names help set expectations for who attended
+
+3. **Check engagement knowledge.org** - Team roster and transcription corrections specific to this engagement
+
+4. **Ask about unknowns** - If a new person appears, ask Craig for context
+
+5. **Note new learnings** - Update engagement knowledge.org with new contacts, corrections, or context after processing
+
+* Validation Checklist
+
+- [ ] Engagement identified and destination paths set
+- [ ] Session context written before starting
+- [ ] Recording files listed and matched with calendar
+- [ ] Craig selected files to process
+- [ ] Audio extracted to .m4a (mono, 96k AAC)
+- [ ] AssemblyAI transcription completed
+- [ ] Intermediate .m4a file deleted
+- [ ] Transcript file verified
+- [ ] All speakers identified
+- [ ] Speaker identifications confirmed with Craig
+- [ ] Transcript corrected and saved to {engagement}/assets/
+- [ ] Recording copied to {engagement}/meetings/ with proper name
+- [ ] Session context updated with meeting summary
+- [ ] New contacts/info flagged for {engagement}/knowledge.org update
+- [ ] (If multiple files) Queue position tracked in session context
+- [ ] Source files deleted from ~/sync/recordings/