From 459d426a23f6a96b66c60f202b577d67547f34e8 Mon Sep 17 00:00:00 2001 From: Craig Jennings Date: Fri, 22 May 2026 18:25:29 -0500 Subject: chore(ai): archive session record, sweep completed tasks, queue follow-ups Archived the session record. Moved six completed tasks from Open Work to Resolved: the 2026-05-04 audit-pass parent, the two commits.md overlay tasks, the make-remove feature, the mcp/ install-pipeline doc, and the wrap-it-up GitHub-host quick fix. Queued the one lint judgment and the task-review staleness note in the inbox for next-session processing. --- .ai/session-context.org | 69 - ...2-18-23-audit-sweep-and-bundle-sync-tooling.org | 85 + inbox/lint-followups.org | 5 + todo.org | 2009 ++++++++++---------- 4 files changed, 1092 insertions(+), 1076 deletions(-) delete mode 100644 .ai/session-context.org create mode 100644 .ai/sessions/2026-05-22-18-23-audit-sweep-and-bundle-sync-tooling.org diff --git a/.ai/session-context.org b/.ai/session-context.org deleted file mode 100644 index 2cb293b..0000000 --- a/.ai/session-context.org +++ /dev/null @@ -1,69 +0,0 @@ -#+TITLE: Session — language-bundle startup self-sync -#+DATE: 2026-05-22 - -* Summary - -** Active Goal - -Build =scripts/sync-language-bundle.sh=: a per-project language-bundle freshness check wired into startup Phase A. Detect the installed bundle by fingerprint, auto-fix rulesets-owned files (=.claude/rules/*.md=, =.claude/hooks/*=, =githooks/*=), surface drift in project-customizable files (=settings.json=, =CLAUDE.md=) without writing. - -** Decisions - -- Per-project self-sync is a *script called from startup with an absolute rulesets path*, not a make target. A make target on the boot path adds a Makefile-parse + target-interface layer for no benefit; calling the script directly is the same accepted dependency the =.ai/= rsync already has. 
-- Bundle detection is *fingerprint-based, no marker file*: a project "has" language == if any of =languages//claude/rules/*.md= (the distinguishing rules, e.g. =elisp.md=, =python-testing.md=) exist in the project's =.claude/rules/=. Naturally scopes to opted-in projects. -- Auto-fix scope = rulesets-owned files only (=.claude/rules/*.md=, =.claude/hooks/*=, =githooks/*=). Surface-only = =settings.json=, =CLAUDE.md= (project may customize). User chose to fold hooks into auto-fix. -- Exit codes: 0 = no bundle / clean / auto-fixed; 3 = manual action recommended (settings/CLAUDE drift); 1 = usage error. Quiet when clean (like =task-review-staleness.sh=). - -** Data Collected / Findings - -- =install-lang.sh= COPIES (cp), does not symlink — so there are no per-project language symlinks, only content drift. The only real symlinks are machine-global =~/.claude/= (handled by =make doctor=/=install=). -- =diff-lang.sh= already walks the installer's file set (generic rules, language =.claude/= tree, githooks); reused its comparison shape. -- bats tests live in =scripts/tests/*.bats=; =make test= runs them. Style mirrored from =audit.bats= (temp HOME/project, real script, real =languages/= source). - -** Files Modified - -- =scripts/sync-language-bundle.sh= (new) — the self-sync script. -- =scripts/tests/sync-language-bundle.bats= (new) — 11 tests, all green. -- =claude-templates/.ai/workflows/startup.org= + =.ai/workflows/startup.org= mirror — Phase A step 12 + Phase C surfacing bullet. - -Shipped: =1ceed70 feat(startup): sync language bundles per project on session launch=, pushed to =origin/main=. - -Two ripe-fruit follow-ups (committed locally, not yet pushed): -- =a785f54 docs(workflows): document GitHub-family assumption in wrap-it-up Step 3.5= (item #1, =:quick:= [#A]). -- =a4389e8 docs(skills): keep review-code praise honest and unforced= (item #2 + Craig's adjacent "don't explain praise" edit, bundled per Craig's choice). 
-- =1477642 chore(todo): close GH-assumption and review-code strengths tasks=. - -Mid-session rule change: Craig edited =claude-rules/commits.md= (still uncommitted) to *decouple voice patterns from the approval gate* — publish artifacts always run =/voice personal= (39 patterns) regardless of =.ai/= tracking; the =.ai/=-tracking check now decides only whether the gate fires. Applied to item #2 onward. The feature commit + item #1 predate it (used =/voice= general); item #1 is effectively compliant anyway (already first-person + contractions, no semicolons). - -All pushed to origin/main (=1ceed70..6c91a4e=), tree clean except the live session-context. =6c91a4e docs(commits): decouple voice patterns from the approval gate= committed Craig's rule edit; it's symlinked into =~/.claude/rules/= so it's already live. - -** Audit-pass cluster (2026-05-04) — in progress, area-by-area - -Approach: area-by-area, my choice of order, check-in between batches, freshness-check each item against current reality (humanizer→voice, skills→commands all happened since the audit). Inventory confirmed nearly every referenced artifact still exists (as skill, command, or rule file); only =humanizer= refs are genuinely stale. - -Done (all pushed through =1825226=): -- *Code review* (3) — three-strengths, CI-trust scoping, CLAUDE.md citation modes. -- *PR responses* (4) — respond-to-review commit-language + thread-resolution; two respond-to-cj-comments items moot (path already gone, humanizer/emacsclient superseded by /voice + VERIFY). -- *Browser testing* (1 of 3) — headed/headless decision tables in both playwright skills (2e9d5b0). DEFERRED: #1 networkidle/locator refactor (touches helper code, no tests) and #3 emoji sweep (~30 occurrences, 7 files) — both spread-heavy, held for focused passes given the concurrent-edit warning. -- *Debugging/RCA* (3) — debug env/recent-change capture, root-cause-trace boundary-only defense, five-whys evidence+counterfactual per link (3916dc4). 
- -** Next Steps - -- Continue the cluster, next area my choice (~29 items left across ~12 areas; Browser testing, Security, Global rules, Hooks, Languages, Architecture, C4, etc.). -- The 3 real bundle-bearing projects (chime, gloss = elisp; work = python) self-heal language-bundle drift at their own next startup via step 12 — Craig opted to let them. - -* Session Log - -** Setup + investigation - -Session opened in rulesets. Startup ran clean (40 ok doctor, 0 .ai/ drift after a fleet =make audit APPLY=1= synced 27 projects). Earlier this session: reconciled all 26 AI projects (pushed emacs-wttrin's pending commit, committed google-contacts.el's .gitignore). - -Design discussion converged on a per-project language-bundle self-sync for startup. Confirmed via reading =install-lang.sh= / =diff-lang.sh= that bundles are copied (not symlinked) and detectable by fingerprint rule files. User approved option-2 behavior with hooks folded into auto-fix. - -Now implementing TDD: bats test first, then the script, then wire into startup Phase A (canonical =claude-templates/.ai/workflows/startup.org= + rsync mirror). - -** Implementation + smoke test - -Wrote the bats test (10 cases) red, then the script green. One scaffold bug: the test helper didn't seed CLAUDE.md, so the elisp bundle (which ships one) tripped the surface check. Fixed the helper — all green. - -Smoke-tested against a *copy* of chime's real =.claude/= (no writes to the real project): it correctly auto-fixed 6 stale generic rules + a drifted =validate-el.sh=, but flagged =CLAUDE.md (missing)= → exit 3 on every run. That's wrong: CLAUDE.md is seed-only in install-lang (never overwritten without FORCE), so a missing/differing one isn't actionable via =make install-=, and =diff-lang.sh= already skips it. Dropped CLAUDE.md from the surface set entirely (settings.json only), added a regression test asserting an absent CLAUDE.md is not drift. Re-smoke: chime copy converges silently on re-run, exit 0. 
11 tests green, shellcheck clean, =make lint= clean, full =make test= green (302 pytest + ERT + all bats). diff --git a/.ai/sessions/2026-05-22-18-23-audit-sweep-and-bundle-sync-tooling.org b/.ai/sessions/2026-05-22-18-23-audit-sweep-and-bundle-sync-tooling.org new file mode 100644 index 0000000..e3a060a --- /dev/null +++ b/.ai/sessions/2026-05-22-18-23-audit-sweep-and-bundle-sync-tooling.org @@ -0,0 +1,85 @@ +#+TITLE: Session — fleet reconcile, bundle-sync tooling, and the 2026-05-04 audit sweep +#+DATE: 2026-05-22 + +* Summary + +** Active Goal + +A long, multi-strand session. (1) Reconciled all 26 AI projects against their remotes and synced =.ai/= templates fleet-wide. (2) Built the per-project bundle-sync tooling: =sync-language-bundle.sh= (startup freshness for language bundles, later generalized to team overlays), =make install-team=, and =make remove=. (3) Completed the entire 2026-05-04 audit cluster — all 55 items dispositioned, including splitting DeepSat's publishing rules out of the global =commits.md= into an installable team overlay and deploying it to =~/projects/work=. (4) Documented the =mcp/= install pipeline. (5) Committed Craig's in-flight WIP (emacs live-reload rule, task-audit workflow, voice + review-code refinements). The original entry point was the language-bundle self-sync (item below); everything else grew from "what's next / any ripe fruit" between completed pieces. + +** Decisions + +- Per-project self-sync is a *script called from startup with an absolute rulesets path*, not a make target. A make target on the boot path adds a Makefile-parse + target-interface layer for no benefit; calling the script directly is the same accepted dependency the =.ai/= rsync already has. +- Bundle detection is *fingerprint-based, no marker file*: a project "has" language == if any of =languages//claude/rules/*.md= (the distinguishing rules, e.g. =elisp.md=, =python-testing.md=) exist in the project's =.claude/rules/=. 
Naturally scopes to opted-in projects. +- Auto-fix scope = rulesets-owned files only (=.claude/rules/*.md=, =.claude/hooks/*=, =githooks/*=). Surface-only = =settings.json=, =CLAUDE.md= (project may customize). User chose to fold hooks into auto-fix. +- Exit codes: 0 = no bundle / clean / auto-fixed; 3 = manual action recommended (settings/CLAUDE drift); 1 = usage error. Quiet when clean (like =task-review-staleness.sh=). + +** Data Collected / Findings + +- =install-lang.sh= COPIES (cp), does not symlink — so there are no per-project language symlinks, only content drift. The only real symlinks are machine-global =~/.claude/= (handled by =make doctor=/=install=). +- =diff-lang.sh= already walks the installer's file set (generic rules, language =.claude/= tree, githooks); reused its comparison shape. +- bats tests live in =scripts/tests/*.bats=; =make test= runs them. Style mirrored from =audit.bats= (temp HOME/project, real script, real =languages/= source). + +** Files Modified + +- =scripts/sync-language-bundle.sh= (new) — the self-sync script. +- =scripts/tests/sync-language-bundle.bats= (new) — 11 tests, all green. +- =claude-templates/.ai/workflows/startup.org= + =.ai/workflows/startup.org= mirror — Phase A step 12 + Phase C surfacing bullet. + +Shipped: =1ceed70 feat(startup): sync language bundles per project on session launch=, pushed to =origin/main=. + +Two ripe-fruit follow-ups (committed locally, not yet pushed): +- =a785f54 docs(workflows): document GitHub-family assumption in wrap-it-up Step 3.5= (item #1, =:quick:= [#A]). +- =a4389e8 docs(skills): keep review-code praise honest and unforced= (item #2 + Craig's adjacent "don't explain praise" edit, bundled per Craig's choice). +- =1477642 chore(todo): close GH-assumption and review-code strengths tasks=. 
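The fingerprint detection and exit-code contract recorded under Decisions can be sketched as below. This is a minimal illustration in Python, not the shipped bash script, whose internals are not shown in this record. The names =has_bundle=, =exit_code_for=, and the =EXIT_*= constants are hypothetical, as are the two directory arguments.

```python
from pathlib import Path

# Exit codes as recorded in the Decisions list.
EXIT_OK = 0        # no bundle, clean, or auto-fixed
EXIT_USAGE = 1     # usage error
EXIT_MANUAL = 3    # manual action recommended


def has_bundle(bundle_rules: Path, project_rules: Path) -> bool:
    """Return True if any rule file the bundle ships also exists in the project.

    Hypothetical helper: bundle_rules stands in for a language's rules
    directory in rulesets, project_rules for the project's .claude/rules/.
    Only file names are compared, so stale content still counts as installed.
    """
    return any((project_rules / rule.name).is_file()
               for rule in bundle_rules.glob("*.md"))


def exit_code_for(surface_drift: bool) -> int:
    """Map a finished sync run to an exit code (usage errors exit earlier)."""
    return EXIT_MANUAL if surface_drift else EXIT_OK
```

Matching on file names alone is what lets the check scope itself to opted-in projects without a marker file: a project that never installed the bundle has none of its distinguishing rule files.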
+ +Mid-session rule change: Craig edited =claude-rules/commits.md= (still uncommitted) to *decouple voice patterns from the approval gate* — publish artifacts always run =/voice personal= (39 patterns) regardless of =.ai/= tracking; the =.ai/=-tracking check now decides only whether the gate fires. Applied to item #2 onward. The feature commit + item #1 predate it (used =/voice= general); item #1 is effectively compliant anyway (already first-person + contractions, no semicolons). + +All pushed to origin/main (=1ceed70..6c91a4e=), tree clean except the live session-context. =6c91a4e docs(commits): decouple voice patterns from the approval gate= committed Craig's rule edit; it's symlinked into =~/.claude/rules/= so it's already live. + +** Audit-pass cluster (2026-05-04) — in progress, area-by-area + +Approach: area-by-area, my choice of order, check-in between batches, freshness-check each item against current reality (humanizer→voice, skills→commands all happened since the audit). Inventory confirmed nearly every referenced artifact still exists (as skill, command, or rule file); only =humanizer= refs are genuinely stale. + +Done (all pushed through =1825226=): +- *Code review* (3) — three-strengths, CI-trust scoping, CLAUDE.md citation modes. +- *PR responses* (4) — respond-to-review commit-language + thread-resolution; two respond-to-cj-comments items moot (path already gone, humanizer/emacsclient superseded by /voice + VERIFY). +- *Browser testing* (1 of 3) — headed/headless decision tables in both playwright skills (2e9d5b0). DEFERRED: #1 networkidle/locator refactor (touches helper code, no tests) and #3 emoji sweep (~30 occurrences, 7 files) — both spread-heavy, held for focused passes given the concurrent-edit warning. +- *Debugging/RCA* (3) — debug env/recent-change capture, root-cause-trace boundary-only defense, five-whys evidence+counterfactual per link (3916dc4). 
+ +*** Most of the cluster cleared — 48 items, all pushed through efcc8e5 + +Continued area-by-area with parallel subagents (batches of 3-4, every diff reviewed) for independent command/skill/rule files. Also done + pushed since the line above: Tests/TDD (2, 1 moot), Frontend (2), Security (2), Pairwise (2), V2MOM (3), Prompt-eng (2), Codify (1), Branch-workflow (4), C4 (2), Architecture (8), Languages (4), Global-rules verification/testing/subagents (4), Hooks README (1). + +** Audit cluster essentially complete — 53 of 55 items done + pushed (through 81280b7) + +Code items finished after the prose batches: Hooks #2/#3 (built a pytest harness under hooks/tests + read_referenced_file helper + shlex rm parsing; 54 tests, wired into make test, /review-code Approve), Browser #1 networkidle/locator refactor + #3 emoji sweep (node --check + py_compile clean), and the earlier-missed brainstorm item (timebox + fresh-sources + Assumptions section). + +** Audit cluster fully closed — both commits.md items shipped + deployed + +- *#1012 (split)* — shipped 3cb467e: DeepSat publishing steps moved out of global commits.md into =teams/deepsat/claude/rules/publishing.md=; commits.md uses seams (startup-extras pattern); added =install-team= (targeted copy, never globally symlinked) + generalized =sync-language-bundle.sh= to keep team overlays fresh (process_bundle function, team syncs only its own rule; 3 new bats). Both /review-code Approve. +- *#1019 (/voice fallback)* — shipped ca6a213: Single-skill gate now handles /voice being unavailable (walk patterns inline, flag, don't block). +- *Cross-project deploy (done from here)* — =make install-team TEAM=deepsat PROJECT=~/projects/work= + a sync refresh: work now has the split commits.md + publishing.md overlay (both match canonical; .claude/ gitignored there so nothing to commit). Handoff note dropped at =~/projects/work/inbox/2026-05-22-1708-from-rulesets-deepsat-publishing-overlay-installed.org=. 
+ +** Next Steps + +- *memory-sync [#A]* (=DOING=) — held by Craig: dotfiles are being split out of archsetup into their own repo, so the stow target is in flux. Revisit once that settles; the VERIFY now records the new dotfiles repo as the target, not =archsetup/dotfiles/common/=. +- Open carryover: =create-documentation= skill, =/update-skills= skill (both [#C]). +- Bundle-bearing projects (chime, gloss, work) self-heal language + team-overlay drift at their next startup via the step-12 sync. +- The 2026-05-04 audit cluster is fully closed (parent archived). =teams/deepsat/= overlay is deployed to =~/projects/work= and kept fresh by startup sync. + +* Session Log + +** Setup + investigation + +Session opened in rulesets. Startup ran clean (40 ok doctor, 0 .ai/ drift after a fleet =make audit APPLY=1= synced 27 projects). Earlier this session: reconciled all 26 AI projects (pushed emacs-wttrin's pending commit, committed google-contacts.el's .gitignore). + +Design discussion converged on a per-project language-bundle self-sync for startup. Confirmed via reading =install-lang.sh= / =diff-lang.sh= that bundles are copied (not symlinked) and detectable by fingerprint rule files. User approved option-2 behavior with hooks folded into auto-fix. + +Now implementing TDD: bats test first, then the script, then wire into startup Phase A (canonical =claude-templates/.ai/workflows/startup.org= + rsync mirror). + +** Implementation + smoke test + +Wrote the bats test (10 cases) red, then the script green. One scaffold bug: the test helper didn't seed CLAUDE.md, so the elisp bundle (which ships one) tripped the surface check. Fixed the helper — all green. + +Smoke-tested against a *copy* of chime's real =.claude/= (no writes to the real project): it correctly auto-fixed 6 stale generic rules + a drifted =validate-el.sh=, but flagged =CLAUDE.md (missing)= → exit 3 on every run. 
That's wrong: CLAUDE.md is seed-only in install-lang (never overwritten without FORCE), so a missing/differing one isn't actionable via =make install-=, and =diff-lang.sh= already skips it. Dropped CLAUDE.md from the surface set entirely (settings.json only), added a regression test asserting an absent CLAUDE.md is not drift. Re-smoke: chime copy converges silently on re-run, exit 0. 11 tests green, shellcheck clean, =make lint= clean, full =make test= green (302 pytest + ERT + all bats). diff --git a/inbox/lint-followups.org b/inbox/lint-followups.org index 42b0eb4..ae8591e 100644 --- a/inbox/lint-followups.org +++ b/inbox/lint-followups.org @@ -8,3 +8,8 @@ ** TODO line 2070 — misplaced-heading — Possibly misplaced heading line * 2026-05-21 Thu — Task-review health: 12 top-level [#A]/[#B]/[#C] tasks unreviewed for >30 days (daily review may have slipped) + +* 2026-05-22 lint-org follow-ups — todo.org +** TODO line 1454 — misplaced-heading — Possibly misplaced heading line + +* 2026-05-22 Fri — Task-review health: 10 top-level [#A]/[#B]/[#C] tasks unreviewed for >30 days (daily review may have slipped) diff --git a/todo.org b/todo.org index 347194a..f3c4b48 100644 --- a/todo.org +++ b/todo.org @@ -7,25 +7,6 @@ Project-scoped (not the global =~/sync/org/roam/inbox.org= list). * Rulesets Open Work -** DONE [#A] Split team-specific publishing rules out of commits.md :commits: -CLOSED: [2026-05-22 Fri] -Shipped 3cb467e. Moved the DeepSat publishing steps (Linear ticket-state, the Slack notification protocol + channel ID, the GHE host, the team merge norm, the Linear ticket-body structure) out of the global =claude-rules/commits.md= into =teams/deepsat/claude/rules/publishing.md=. The global file keeps the universal skeleton and uses seams ("run the project's publishing overlay here if present") like startup-extras. 
Added =install-team= (targeted per-project copy, keyed on PROJECT, never globally symlinked) and generalized =sync-language-bundle.sh= to keep team overlays fresh at startup (3 new bats; make test green). - -Remaining deploy step (cross-project, surfaced to Craig): install the overlay into the DeepSat work project — =make install-team TEAM=deepsat PROJECT== — so it actually loads there. - -** DONE [#A] Define a /voice-unavailable fallback in the commits.md publish flow :commits: -CLOSED: [2026-05-22 Fri] -Added an "If =/voice= is unavailable" paragraph to the Single-skill gate in =commits.md=: walk the same patterns inline (the flow already names which matter), state the skill was unavailable and the pass was applied by hand ("/voice unavailable — patterns walked inline"), and flag the missing skill for install. The gate is the pattern walk, not the tooling. The original "=humanizer= unavailable" framing was moot (humanizer → /voice). - -** DONE [#A] wrap-it-up Step 3.5 assumes GitHub-family remote :chore:quick: -CLOSED: [2026-05-22 Fri] -:PROPERTIES: -:LAST_REVIEWED: 2026-05-20 -:END: -Documented the assumption inline at =wrap-it-up.org= Step 3.5 (chose the lightweight path over a provider-agnostic rewrite): the =gh= lookup expects a GitHub-family host, holds today via DeepSat on GHE, flagged for update if a future Linear project lands on GitLab/Gitea/Bitbucket. -Triggered by: 2026-05-16 wrap-it-up github.com cleanup (audit of the same file). - -Step 3.5 (Linear ticket-state hygiene) at =wrap-it-up.org:207= says "the project's GitHub remote — use =gh pr list ...=". Currently fine in practice: the step is Linear-gated, and the only Linear-using project is DeepSat (on =deepsat.ghe.com=, a GitHub-family host where =gh= works). Would break if a future Linear-using project lived on a non-GitHub host (gitlab, gitea, bitbucket). 
Either drop the GitHub-family assumption (provider-agnostic lookup, harder) or document the assumption explicitly so future projects know the step needs an update if they don't fit. ** DOING [#A] Check that memories are sync'd across machines via git.m :PROPERTIES: :LAST_REVIEWED: 2026-05-20 @@ -697,1365 +678,1379 @@ The skill should reject: public/library/API docs: =llms.txt= or markdown export is valuable, but normal human navigation remains primary. -** DONE [#C] Review pass: tighten skills and rulesets after 2026-05-04 audit -CLOSED: [2026-05-22 Fri] +** TODO [#C] Build =/update-skills= skill for keeping forks in sync with upstream :PROPERTIES: :LAST_REVIEWED: 2026-05-20 :END: -All 55 grouped-index items dispositioned (2026-05-22): ~49 edited across skills, commands, rule files, hooks, and the two playwright skills; several came out moot post-audit (humanizer→voice, skills→commands, typescript ruleset added); the two commits.md items shipped as the team-overlay split + /voice fallback. Freshness-checked each item against current reality before editing. - -Source notes used in this pass: -- C4 official docs: C4 is notation-independent; System Context and Container - diagrams are enough for most teams; every diagram needs title, key/legend, - explicit element types, and audience-appropriate abstraction. - [[https://c4model.com/diagrams][C4 diagrams]], - [[https://c4model.com/diagrams/notation][C4 notation]], - [[https://c4model.com/abstractions/component][C4 component]] -- arc42 docs: quality requirements need measurable scenarios; section 10 - should reference top quality goals and capture lesser quality requirements - with specific measures. 
[[https://docs.arc42.org/section-10/][arc42 section 10]], - [[https://quality.arc42.org/articles/specify-quality-requirements][specifying quality requirements]] -- ADR references: ADRs capture one justified architecturally significant - decision and its rationale; Nygard's original guidance emphasizes short, - numbered, repository-stored records and superseding rather than rewriting old - decisions. [[https://adr.github.io/][adr.github.io]], - [[https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions][Nygard ADR article]] -- Playwright docs: prefer user-visible locators and web assertions; locators - auto-wait and retry; =networkidle= is discouraged for testing readiness. - [[https://playwright.dev/docs/best-practices][Playwright best practices]], - [[https://playwright.dev/docs/locators][Playwright locators]], - [[https://playwright.dev/docs/next/api/class-page][Playwright page API]] -- OWASP references: Top 10 2021 includes Broken Access Control, - Cryptographic Failures, Injection, Insecure Design, Security - Misconfiguration, Vulnerable and Outdated Components, Identification and - Authentication Failures, Software and Data Integrity Failures, Security - Logging and Monitoring Failures, and SSRF; WSTG adds a broader testing map - across configuration, identity, authn/z, sessions, input validation, error - handling, cryptography, business logic, client-side, and API testing. - [[https://owasp.org/Top10/2021/][OWASP Top 10 2021]], - [[https://owasp.org/www-project-web-security-testing-guide/latest/4-Web_Application_Security_Testing/][OWASP WSTG]] -- V2MOM references: Salesforce calls the last M "Measures" and emphasizes a - simple alignment document with prioritized Methods, explicit Obstacles, and - measurable outcomes. 
[[https://trailhead.salesforce.com/content/learn/modules/selfmotivation/get-focused-with-your-personal-v2mom][Salesforce Trailhead personal V2MOM]], - [[https://www.salesforce.com/blog/?p=12][Salesforce V2MOM alignment]] -- Prompt research: the cited Meincke paper is titled "Call Me A Jerk: - Persuading AI to Comply with Objectionable Requests"; its scope is - persuasion increasing compliance with objectionable requests, not a general - proof that persuasion framing improves prompt quality. - [[https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5357179][SSRN paper]] -- Combinatorial testing references: NIST supports t-way combinatorial testing - and notes pairwise is one covering strength, with higher-strength arrays - useful for failures requiring more interacting factors. - [[https://www.nist.gov/publications/practical-combinatorial-testing-beyond-pairwise][NIST beyond pairwise]], - [[https://www.nist.gov/publications/combinatorial-software-testing][NIST combinatorial testing]] -*** Grouped index (for batching by area) +The rulesets repo has a growing set of forks (=arch-decide= from +wshobson/agents, =playwright-js= from lackeyjb/playwright-skill, =playwright-py= +from anthropics/skills/webapp-testing). Over time, upstream releases fixes, +new templates, or scope expansions that we'd want to pull in without losing +our local modifications. A skill should handle this deliberately rather than +by manual re-cloning. -Each item below is a one-line summary of a sub-TODO further down. Tick the box when the matching sub-TODO is moved to =DONE=. Items are grouped by area so they can be batched (e.g., "do all Playwright items in one session"). +*** 2026-05-16 Sat @ 01:14:40 -0500 Specification +#+begin_src cj: comment +write the specification here. 
+#+end_src -**** Browser testing -- [X] [#A] =playwright-js=: locator/assertion-first guidance (replace raw CSS, =networkidle=) -- [X] [#B] =playwright-js= + =playwright-py=: reconcile headless/visible defaults -- [X] [#B] =playwright-js= + =playwright-py=: remove emoji console markers from examples +*** 2026-05-16 Sat @ 01:14:20 -0500 original goals and decisions +**** Design decisions (agreed) -**** Frontend / UI -- [X] [#B] =frontend-design=: WCAG 2.2 alignment, accessibility non-optional -- [X] [#B] =frontend-design=: harmonize aesthetic guidance with anti-pattern rules +- *Upstream tracking:* per-fork manifest =.skill-upstream= (YAML or JSON): + - =url= (GitHub URL) + - =ref= (branch or tag) + - =subpath= (path inside the upstream repo when it's a monorepo) + - =last_synced_commit= (updated on successful sync) +- *Local modifications:* 3-way merge. Requires a pristine baseline snapshot of + the upstream-at-time-of-fork. Store under =.skill-upstream/baseline/= or + similar; committed to the rulesets repo so the merge base is reproducible. +- *Apply changes:* skill edits files directly with per-file confirmation. +- *Conflict policy:* per-hunk prompt inside the skill. When a 3-way merge + produces a conflict, the skill walks each conflicting hunk and asks Craig: + keep-local / take-upstream / both / skip. Editor-independent; works on + machines where Emacs isn't available. Fallback when baseline is missing + or corrupt (can't run 3-way merge): write =.local=, =.upstream=, + =.baseline= files side-by-side and surface as manual review. 
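The 3-way merge decision above depends on comparing each file against the stored baseline. A minimal sketch of that comparison follows, assuming file contents are already loaded as bytes and that a missing file is passed as =None=. The function name =classify= is hypothetical; the four labels are the ones the V1 scope lists for file classification.

```python
from typing import Optional


def classify(baseline: Optional[bytes],
             local: Optional[bytes],
             upstream: Optional[bytes]) -> str:
    """Bucket one file by which side diverged from the baseline.

    None means the file is absent on that side, so additions and deletions
    count as changes. Hypothetical helper, not the skill's actual code.
    """
    local_changed = local != baseline
    upstream_changed = upstream != baseline
    if not local_changed and not upstream_changed:
        return "unchanged"
    if upstream_changed and not local_changed:
        return "upstream-only"
    if local_changed and not upstream_changed:
        return "local-only"
    # Both sides moved. Identical edits also land here, and a 3-way merge
    # of them would come out clean.
    return "both-changed"
```

Only the =both-changed= bucket needs the merge step. The other three can be resolved by keeping or copying a file whole, which keeps the per-hunk prompt loop limited to genuine overlaps.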
-**** Security -- [X] [#A] =security-check=: OWASP 2021 + WSTG coverage -- [X] [#B] =security-check=: tooling and offline/network caveats +**** V1 Scope -**** Combinatorial testing -- [X] [#B] =pairwise-tests=: t-way escalation guidance beyond pairwise -- [X] [#B] =pairwise-tests=: clarify negative value syntax + generator availability +- [ ] Skill at =~/code/rulesets/update-skills/= +- [ ] Discovery: scan sibling skill dirs for =.skill-upstream= manifests +- [ ] Helper script (bash or python) to: + - Clone each upstream at =ref= shallowly into =/tmp/= + - Compare current skill state vs latest upstream vs stored baseline + - Classify each file: =unchanged= / =upstream-only= / =local-only= / =both-changed= + - For =both-changed=: run =git merge-file --stdout =; + if clean, write result directly; if conflicts, parse the conflict-marker + output and feed each hunk into the per-hunk prompt loop +- [ ] Per-hunk prompt loop: + - Show base / local / upstream side-by-side for each conflicting hunk + - Ask: keep-local / take-upstream / both (concatenate) / skip (leave marker) + - Assemble resolved hunks into the final file content +- [ ] Per-fork summary output with file-level classification table +- [ ] Per-file confirmation flow (yes / no / show-diff) BEFORE per-hunk loop +- [ ] On successful sync: update =last_synced_commit= in the manifest +- [ ] =--dry-run= to preview without writing -**** V2MOM -- [X] [#A] =create-v2mom=: rename Metrics → Measures (Salesforce alignment) -- [X] [#B] =create-v2mom=: prevent task migration from turning V2MOM into a backlog -- [X] [#B] =create-v2mom=: mitigation/owner fields for Obstacles +**** V2+ (deferred) -**** Prompt engineering -- [X] [#A] =prompt-engineering=: correct/narrow Meincke citation -- [X] [#B] =prompt-engineering=: eval-harness requirement for production prompts +- [ ] Track upstream *releases* (tags) not just branches, so skill can propose + "upgrade from v1.2 to v1.3" with release notes pulled in +- [ ] Generate patch 
files as an alternative apply method (for users who prefer + =git apply= / =patch= over in-place edits) +- [ ] Non-interactive mode (=--non-interactive= / CI): skip conflict resolution, + emit side-by-side files for later manual review +- [ ] Auto-run on a schedule via Claude Code background agent +- [ ] Summary of aggregate upstream activity across all forks (which forks have + upstream changes waiting, which don't) +- [ ] Optional editor integration: on machines with Emacs, offer + =M-x smerge-ediff= as an alternate path for users who prefer ediff over + per-hunk prompts -**** Codify -- [X] [#B] =codify=: stale-entry review + privacy checks before writing project =CLAUDE.md= +**** Initial forks to enumerate (for manifest bootstrap) -**** Code review -- [X] [#A] =review-code=: resolve local-verification vs CI boundary -- [X] [#B] =review-code=: =CLAUDE.md= citation scope for public artifacts -- [X] [#B] =review-code=: relax three-strengths rule for tiny/failing diffs +- [ ] =arch-decide= → =wshobson/agents= :: =plugins/documentation-generation/skills/architecture-decision-records= :: MIT +- [ ] =playwright-js= → =lackeyjb/playwright-skill= :: =skills/playwright-skill= :: MIT +- [ ] =playwright-py= → =anthropics/skills= :: =skills/webapp-testing= :: Apache-2.0 -**** PR / review responses -- [X] [#A] =respond-to-review=: remove review-process language from commit messages -- [X] [#B] =respond-to-review=: use unresolved threads + resolution state -- [X] [#B] =respond-to-cj-comments=: drop personal absolute paths from public-writing (moot — already clean) -- [X] [#B] =respond-to-cj-comments=: fallback when =humanizer= or =emacsclient= unavailable (moot — superseded by /voice + VERIFY pattern) +**** Open questions -**** Branch workflow -- [X] [#A] =finish-branch=: fix base-branch detection -- [X] [#B] =finish-branch=: worktree-aware pull/merge safety -- [X] [#B] =start-work=: tool-availability + ceremony-scaling rules -- [X] [#B] =start-work=: claim-before-justify 
rollback risk +- [ ] What happens when upstream *renames* a file we fork? Skill would see + "file gone from upstream, still present locally" — drop, keep, or prompt? +- [ ] What happens when upstream splits into multiple forks (e.g., a plugin + reshuffles its structure)? Probably out of scope for v1; manual migration. +- [ ] Rate-limit / offline mode: if GitHub is unreachable, should skill fail + or degrade gracefully? Likely degrade; print warning per fork. -**** Tests / TDD -- [X] [#B] =add-tests=: fix missing =typescript-testing.md= reference or add ruleset (moot — ruleset now exists) -- [X] [#B] =add-tests=: explicit exceptions to "all three categories per function" +** TODO [#B] Build /research-writer — clean-room synthesis for research-backed long-form +SCHEDULED: <2026-05-15 Fri> -**** Debugging / RCA -- [X] [#B] =debug=: capture environment + recent-change context before hypotheses -- [X] [#B] =root-cause-trace=: constrain defense-in-depth to trust boundaries -- [X] [#B] =five-whys=: require evidence + counterfactual validation per why +Gap in current rulesets: between =brainstorm= (idea refinement → design doc) +and =arch-document= (arc42 technical docs), there's no skill for +research-backed long-form prose — blog posts, essays, white papers, +proposals with data backing, article-length content with citations. -**** Brainstorming -- [X] [#B] =brainstorm=: timebox + research/source rules for high-stakes designs +Craig writes documents across many contexts (defense-contractor work, +personal, technical, proposals). The gap is real. 
-**** Architecture -- [X] [#B] =arch-decide=: timeless examples, drop unverifiable claims -- [X] [#B] =arch-decide=: standardize statuses + immutability language -- [X] [#B] =arch-design=: threat modeling + privacy/compliance as first-class inputs -- [X] [#B] =arch-design=: separate paradigms from tactical patterns -- [X] [#B] =arch-document=: arc42/Q42 quality scenarios -- [X] [#B] =arch-document=: staleness + ownership metadata for generated docs -- [X] [#B] =arch-evaluate=: confidence levels for framework-agnostic findings -- [X] [#B] =arch-evaluate=: report skipped tool checks explicitly +*Evaluated 2026-04-19:* ComposioHQ/awesome-claude-skills has a +=content-research-writer= skill (540 lines, 14 KB) that attempts this. *Not +adopting:* +- Parent repo has no LICENSE file — reuse legally ambiguous +- Bloated: 540 lines of prose-scaffolding with no tooling +- No citation-style enforcement (APA/Chicago/IEEE/MLA) +- No source-quality heuristics (primary vs secondary, peer-review, recency) +- Fictional example citations in the skill itself (models the hallucination + failure mode a citation-focused skill should prevent) +- No citation-verification step +- Overlaps with =humanizer= at polish with no composition guidance -**** C4 modeling -- [X] [#A] =c4-analyze= + =c4-diagram=: notation/output fallback (not draw.io-only) -- [X] [#B] =c4-analyze= + =c4-diagram=: clarify abstraction boundaries +*Patterns worth lifting clean-room (from their better parts):* +- Folder convention =~/writing//= with =outline.md=, + =research.md=, versioned drafts, =sources/= +- Section-by-section feedback loop (outline validated → per-section + research validated → per-section draft validated) +- Hook alternatives pattern (generate three hook variants with rationale) -**** Global rules -- [X] [#B] =commits.md=: split DeepSat/Linear/Slack-specific from global rules → promoted to a top-level task (deferred for Craig) -- [X] [#A] =commits.md= + publish flows: =humanizer=-unavailable fallback 
→ promoted to a top-level task (deferred; humanizer premise moot) -- [X] [#B] =verification.md=: explicit "unable to verify" reporting standard -- [X] [#B] =testing.md=: property-based + mutation testing as escalation paths -- [X] [#B] =testing.md=: soften absolute TDD with explicit spike protocol -- [X] [#B] =subagents.md=: capability/availability + cost checks +*Additions for the clean-room version (v1):* +- Citation-style selection (APA / Chicago / MLA / IEEE / custom) with + style-specific examples and a pick-one step up front +- Source-quality heuristics: primary > secondary; peer-reviewed; recency + thresholds by domain; publisher reputation; funding transparency +- Citation-verification discipline: fetch real sources, never fabricate, + mark unverifiable claims with =[citation needed]= rather than inventing +- Composition hand-off to =/humanizer= at the polish stage +- Classification awareness: if the working directory or context signals + defense / regulated territory, flag any sentence that might touch CUI + or classified material before emission -**** Languages -- [X] [#A] =python-testing.md=: revisit in-memory SQLite guidance -- [X] [#B] =python-testing.md=: separate "never mock ORM" from unit-test boundaries -- [X] [#B] =elisp.md=: drop tool-specific advice -- [X] [#B] =elisp-testing.md=: batch-mode + native-comp caveats +*Target:* ~150-200 lines, clean-room per blanket policy. -**** Hooks -- [X] [#A] =hooks/README.md=: include =destructive-bash-confirm.py= in install/settings snippets -- [X] [#A] =hooks/git-commit-confirm.py= + =gh-pr-create-confirm.py=: inspect message/body files referenced by =-F= / =--body-file= -- [X] [#B] =hooks/destructive-bash-confirm.py=: shell-aware command parsing (not regex) +*When to build:* wait for a real research-writing task to validate the +design against actual document patterns. Building preemptively risks +tuning for my guess at Craig's workflow rather than his real one. 
+Triggers that would prompt "let's build it now": +- Starting a white paper / proposal that needs citation discipline +- Writing a technical blog post with external references +- A pattern of hitting the same research-writing friction 3+ times -*** 2026-05-22 Fri @ 15:47:10 -0500 Made playwright guidance locator/assertion-first, dropped networkidle-as-readiness +Upstream reference (do not vendor): ComposioHQ/awesome-claude-skills +=content-research-writer/SKILL.md=. -Rewrote the readiness guidance in both =playwright-js/SKILL.md= and =playwright-py/SKILL.md=: reconnaissance now waits for a visible app landmark via a web assertion or locator (=expect(...).toBeVisible()= / =get_by_role(...).wait_for()=), not =networkidle= (which Playwright discourages). Updated the login/form examples to =getByLabel=/=getByRole= + web assertions, the API_REFERENCE.md waiting section, and =lib/helpers.js= defaults (=waitForPageReady= now defaults to =load= and prefers a caller-supplied landmark; =authenticate= races the success indicator over a =load= navigation). node --check passes. +** TODO [#C] Try Skill Seekers on a real DeepSat docs-briefing need +SCHEDULED: <2026-05-15 Fri> -*** 2026-05-22 Fri @ 14:23:02 -0500 Added headed/headless decision tables to both playwright skills +=Skill Seekers= ([[https://github.com/yusufkaraaslan/Skill_Seekers]]) is a Python +CLI + MCP server that ingests 18 source types (docs sites, PDFs, GitHub +repos, YouTube videos, Confluence, Notion, OpenAPI specs, etc.) and +exports to 20+ AI targets including Claude skills. MIT licensed, 12.9k +stars, active as of 2026-04-12. -Added matching purpose-based decision tables to =playwright-js/SKILL.md= (was "always visible") and =playwright-py/SKILL.md= Best Practices (was "always headless"). Each names its own default and points at the other skill, so the difference is deliberate, not a habit-flip: headed for interactive debugging, headless for CI/pytest. Also softened the absolutist "Always launch... 
headless" comment in the py example. +*Evaluated: 2026-04-19 — not adopted for rulesets.* Generates +*reference-style* skills (encyclopedic dumps of scraped source material), +not *operational* skills (opinionated how-we-do-things content). Doesn't +fit the rulesets curation pattern. -*** 2026-05-22 Fri @ 15:47:10 -0500 Removed emoji console markers from the playwright skills +*Next-trigger experiment (this TODO):* the next time a DeepSat task needs +Claude briefed deeply on a specific library, API, or docs site — try: +#+begin_src bash +pip install skill-seekers +skill-seekers create --target claude +#+end_src +Measure output quality vs hand-curated briefing. If usable, consider +installing as a persistent tool. If output is bloated / under-structured, +discard and stick with hand briefing. -Replaced every emoji status marker with a plain ASCII prefix across =playwright-js/= (run.js, lib/helpers.js, SKILL.md) and =playwright-py/= (SKILL.md, examples/*.py): 📦/⚡/📄/📥/🎭/🚀/📋/✅/❌/🔍/📸/✓/✗ → =[setup]=/=[run]=/=[ok]=/=[error]=/=[fail]= etc. Post-change emoji grep is clean (excluding node_modules); node --check and py_compile pass. +*Candidate first experiments (pick one from an actual need, don't invent):* +- A Django ORM reference skill scoped to the version DeepSat pins +- An OpenAPI-to-skill conversion for a partner-vendor API +- A React hooks reference skill for the frontend team's current patterns +- A specific AWS service's docs (e.g. GovCloud-flavored) -*** 2026-05-22 Fri @ 14:35:16 -0500 Made accessibility a non-optional WCAG 2.2 gate in frontend-design +*Patterns worth borrowing into rulesets even without adopting the tool:* +- Enhancement-via-agent pipeline (scrape raw → LLM pass → structured + SKILL.md). Applicable if we ever build internal-docs-to-skill tooling. +- Multi-target export abstraction (one knowledge extraction → many output + formats). Clean design for any future multi-AI-tool workflow. 
-Added an "Accessibility Gate (required before handoff)" section to =frontend-design/SKILL.md= covering keyboard operation, focus visibility, focus-not-obscured (2.2), target size (2.2), contrast, reduced motion, labels, and semantic structure — a baseline for all frontend work, not just interactive components. Rewrote the Build/Review phases to build accessibly as you go and clear the gate before handoff, and bumped =references/accessibility.md= from WCAG 2.1 to 2.2 with backing detail for the new criteria. +*Concerns to verify on actual use:* +- =LICENSE= has an unfilled =[Your Name/Username]= placeholder (MIT is + unambiguous, but sloppy for a 12k-star project) +- Default branch is =development=, not =main= — pin with care +- Heavy commercialization signals (website at skillseekersweb.com, + Trendshift promo, branded badges) — license might shift later; watch +- Companion =skill-seekers-configs= community repo has only 8 stars + despite main's 12.9k — ecosystem thinner than headline adoption -*** 2026-05-22 Fri @ 14:35:16 -0500 Added a "creative but bounded" section to frontend-design +** TODO [#C] Revisit =c4-*= rename if a second notation skill ships -Added a subsection under Frontend Aesthetics framing the bold/maximalist directions as tools, not obligations: domain fit, readability first, responsive stability, and no decorative effect that degrades the workflow. Reconciles rather than contradicts the maximalist encouragement (maximalism stays on the table as deliberate usable density), and ties the readability bullet to the new accessibility gate. +Current naming keeps =c4-analyze= and =c4-diagram= as-is (framework prefix +encodes the notation; "C4" is a discoverable brand). Suite membership is +surfaced via the description footer, not the name. 
-*** 2026-05-22 Fri @ 14:35:16 -0500 Updated security-check to OWASP Top 10 2021 + WSTG mapping +If a second notation-specific skill ever lands (=uml-*=, =erd-*=, =arc42-*=), +the compound pattern =arch-analyze-= / =arch-diagram-= +starts paying off: alphabetical clustering under 'a' amortizes across three+ +skills, and the hierarchy becomes regular. At that point, rename all +notation skills together in one pass. -Replaced the older six-category list in =.claude/commands/security-check.md= with the full Top 10 2021 set, each finding mapped to a 2021 category or WSTG area. Added the four missing categories (Insecure Design, Software and Data Integrity Failures, Security Logging and Monitoring Failures, SSRF) plus explicit checks for object/function-level authorization, SSRF on URL-fetch paths, update/plugin/dependency integrity, and logging/monitoring gaps. +Trigger: adding skill #2 in the notation family. Don't pre-rename. -*** 2026-05-22 Fri @ 14:35:16 -0500 Added scanner tooling + network caveats to security-check +Candidate future notation skills (not yet in scope — noted for when a +real need arrives, not pre-emptively): -Added an optional configured-scanners step (=gitleaks=/=trufflehog= secrets, =semgrep= source patterns, OSV scanner, lockfile-diff review) that supplements the manual scans, plus a network caveat: dependency audits that can't run (offline, tool absent, DB unreachable) must report "not run" naming the tool and reason, never read as a pass. Carried that into the no-issues summary. +- *UML* (Unified Modeling Language): OO design notation, 14 diagram types + in practice dominated by class / sequence / state / component. Common + in DoD / safety-critical / enterprise-architecture contexts. Tooling: + PlantUML (text-to-diagram), Mermaid UML, draw.io. Would likely split + into =uml-class=, =uml-sequence=, =uml-state= rather than one monolith + — different audiences, different inputs. 
+- *ERD* (Entity-Relationship Diagram): database schema modeling — + entities, attributes, cardinality. Crow's Foot notation dominates + practice; Chen is academic; IDEF1X is DoD-standard. Tooling: + dbdiagram.io, Mermaid ERD, PlantUML, ERAlchemy (code-to-ERD for SQL). + Natural fit as =erd-analyze= (extract from schema/migrations) and + =erd-diagram= (generate from prose/model definitions). +- *arc42*: already partially covered by =arch-document= (which emits + arc42-structured docs). A standalone =arc42-*= skill would be + redundant unless the arc42-specific visualizations need separation. -*** 2026-05-22 Fri @ 14:35:16 -0500 Added t-way escalation guidance to pairwise-tests +Each answers a different question: -Added an "Escalating Beyond Pairwise (t-way)" subsection: start with pairwise across the whole space, then escalate specific high-risk clusters to 3-way+ when history, safety, security, or domain coupling says a fault needs more than two interacting factors. Lists escalation triggers and shows the sub-model order syntax (={ A, B, C } @ 3=) vs a blanket =/o:3= bump, stressing targeted not uniform escalation. Cites NIST combinatorial-testing work. +- C4 → "What systems exist and how do they talk, at what zoom?" +- UML class/sequence → "What does the code look like / what happens when X runs?" +- ERD → "What's the database shape?" +- arc42 → "What's the full architecture document?" -*** 2026-05-22 Fri @ 14:35:16 -0500 Clarified PICT ~ syntax + honest generator-availability path in pairwise-tests +Deferred pending an actual need that's blocked on not having one of these. -Added a "~ prefix" explanation (PICT marker tagging a value as negative/invalid, not an arithmetic operator; PICT pairs negatives with valid values once and strips the marker before the SUT) and a stop-at-the-model rule: if neither the =pict= binary nor =pypict= is present, produce the model and stop rather than hand-writing a table and passing it off as PICT output. 
+*** DoD-specific notations (DeepSat context) -*** 2026-05-22 Fri @ 14:43:17 -0500 Renamed Metrics → Measures throughout create-v2mom +Defense-contractor work uses a narrower, different notation set than +commercial software. Document the trigger conditions and starting point +so a future decision to build doesn't have to re-derive the landscape. -Full rename across =.claude/commands/create-v2mom.md= (acronym expansions, Phase 7 heading, the "Measures must be measurable" principle, exit criteria, review questions, red flags, examples) to match Salesforce's official term. Kept the "vanity metrics" idiom intact — it's the anti-pattern term, not a section reference. +**** SysML (Systems Modeling Language) -*** 2026-05-22 Fri @ 14:43:17 -0500 Split strategy from execution in create-v2mom task migration +UML 2 profile, dominant in DoD systems engineering. Six diagrams account +for ~all practical use: -Rewrote Phase 8 (and tightened Phase 5.5): tasks stay in the backlog grouped by method, and each method gains a one-line link to where its tasks live, instead of transplanting the task tree into the V2MOM. Strategy (V2MOM) and execution (backlog) are now explicitly separate sources of truth, keeping the V2MOM concise. +- *Block Definition Diagram (BDD)* — structural; like UML class but for + system blocks (components, subsystems, hardware). +- *Internal Block Diagram (IBD)* — parts within a block and how they + connect (flow ports, interfaces). +- *Requirement diagram* — unique to SysML; traces requirements to + satisfying blocks. Essential in regulated environments. +- *Activity diagram* — behavioral flow. +- *State machine* — same shape as UML. +- *Sequence diagram* — same shape as UML. -*** 2026-05-22 Fri @ 14:43:17 -0500 Made create-v2mom obstacles operational (mitigation/owner/cadence) +SysML v1.x is in the field; v2 is emerging but not yet adopted at scale +(as of 2026-04). Tooling dominated by Cameo Systems Modeler / MagicDraw +and Enterprise Architect. 
Text-based option: PlantUML + =plantuml-sysml= +(git-friendly, growing niche). -Phase 6 now captures, per obstacle: name, manifestation, stakes, mitigation, owner, and review cadence — with a worked example per domain (health/finance/software), a "good obstacle" characteristic, a Phase 9 review question, and a red flag for candid-but-not-operational obstacles. An obstacle without a countermove is now flagged as an observation, not a plan. +*Candidate skills*: =sysml-bdd=, =sysml-ibd=, =sysml-requirement=, +=sysml-sequence=. Three or more in this cluster triggers the +=arch-*-= rename discussion from the parent entry. -*** 2026-05-22 Fri @ 14:43:17 -0500 Corrected and narrowed the Meincke citation in prompt-engineering +**** DoDAF / UAF (architecture frameworks) -Fixed the title to "Call Me A Jerk: Persuading AI to Comply with Objectionable Requests" (SSRN abstract_id=5357179) in all three spots (frontmatter, Seven Principles intro, References). Reframed the ~33%→72% result as what it is — a prompt-safety caution that persuasion raises compliance with objectionable requests — explicitly not evidence that persuasion framing improves engineering prompt quality. Kept the seven principles as a tone vocabulary. +Not notations themselves — frameworks that specify *which* viewpoints a +program must deliver. Viewpoints are rendered using UML/SysML diagrams. -*** 2026-05-22 Fri @ 14:43:17 -0500 Added an eval-harness requirement to prompt-engineering critique mode +- *DoDAF (DoD Architecture Framework)* — legacy but still + contract-required on many programs. +- *UAF (Unified Architecture Framework)* — DoDAF/MODAF successor, + SysML-based. Gaining adoption on newer contracts. -Added critique step 7 + a checklist line: for fragile or reusable/production prompts, write 3-5 adversarial/edge inputs, run both the old and new prompt against each, and record the behavioral delta. A throwaway prompt can ship on the rewrite alone; a discipline/reused/production one can't. 
Without examples, "the rewrite is better" is an assertion, not a result. +Common required viewpoints (formal CDRL deliverables or PDR/CDR +review packages): -*** 2026-05-22 Fri @ 14:43:17 -0500 Added mandatory stale-entry + privacy pre-write checks to codify +- *OV-1* — High-Level Operational Concept Graphic. The "cartoon" showing + the system in operational context with icons, arrows, surrounding + actors/environment. *Universally asked for — informal or formal.* + Starting point for any DoD diagram skill. +- *OV-2* — Operational resource flows (nodes and flows). +- *OV-5a/b* — Operational activities. +- *SV-1* — Systems interfaces. Maps closely to C4 Container. +- *SV-2* — Systems resource flows. +- *SV-4* — Systems functionality. +- *SV-10b* — Systems state transitions. -Added a "Mandatory pre-write checks" block at the top of Phase 3 (Write) in =.claude/commands/codify.md=: a stale-entry scan (update/remove no-longer-true entries in place, don't append contradictions around them) and a privacy/leak check carrying both questions verbatim — "safe if the project were public?" and "belongs in private memory instead?" — routing private content to auto-memory. Gates, not background guidance. +*Informal ask ("send me an architecture diagram") → OV-1 + SV-1 satisfies +90% of the time.* Formal CDRL asks specify the viewpoint set contractually. -*** 2026-05-22 Fri @ 14:06:41 -0500 Scoped review-code's CI-trust rule to reviewing, not shipping +*C4 gap*: C4 is rare in DoD. C4 System Context ≈ OV-1 in intent but not +in visual convention. C4 Container ≈ SV-1. Expect a mapping step or +reviewer pushback if delivering C4-shaped artifacts to a DoD audience. -Expanded the False-Positive Filter bullet in =review-code/SKILL.md=: "trust CI, don't run builds" applies to reading a diff, not producing one. A pre-commit/pre-push flow still owes the local verification =verification.md= requires (run the suite or state "not run because..."). 
Closes the apparent contradiction with =verification.md= / =finish-branch=. +*Candidate skills*: =dodaf-ov1=, =dodaf-sv1= first (highest-value); +=uaf-viewpoint= if newer contracts require UAF. -*** 2026-05-22 Fri @ 14:06:41 -0500 Added private-vs-public CLAUDE.md citation modes to review-code +**** IDEF1X (data modeling) -Expanded the Content scope section in =review-code/SKILL.md= with two modes: a private/internal review cites =CLAUDE.md= directly; a public/team review translates the rule into the engineering reason it encodes and doesn't name the rules file (a teammate can act on the reason, not on a file they can't reach). Same principle =commits.md= states for personal tooling in public artifacts. +FIPS 184 — federal standard for data modeling. Used in classified DoD +data systems, intelligence databases, and anywhere the government +specifies the data model. Same shape language as Crow's Foot but with +different adornments and notation conventions. -*** 2026-05-22 Fri @ 13:48:14 -0500 Relaxed review-code "three strengths" to up-to-three-or-none +*Rule of thumb*: classified DoD data work → IDEF1X; unclassified +contractor work → Crow's Foot unless the contract specifies otherwise. -Changed all three "three minimum" spots in =review-code/SKILL.md= (Strengths section, Critical Rules DO list, Anti-Patterns) to "up to three specific; say none found on a tiny or weak diff." Reframed the old "No Strengths section" anti-pattern as "Skipping strengths out of laziness" so a substantive diff still demands them while a weak one can honestly report nothing notable. Landed alongside Craig's adjacent edit telling reviewers not to explain why a strength is good (sycophantic padding). +*Candidate skills*: =idef1x-diagram= / =idef1x-analyze= (parallel to a +future =erd-diagram= / =erd-analyze= pair). 
-*** 2026-05-22 Fri @ 14:12:24 -0500 Removed review-process language from respond-to-review commit guidance +**** Tooling baseline -Replaced the =fix: Address review — [description]= example (and the matching description-line phrasing) in =.claude/commands/respond-to-review.md= with "name the actual fix (=fix: validate export filename=), not the review that prompted it." Killed the non-ASCII dash and the process-in-commit pattern that conflicted with =commits.md=. +- *Cameo Systems Modeler / MagicDraw* (Dassault) — commercial SysML + dominant in DoD programs. +- *Enterprise Architect (Sparx)* — widely used for UML + SysML + DoDAF. +- *Rhapsody (IBM)* — SysML with code generation; strong in avionics / + embedded (FACE, ARINC). +- *Papyrus (Eclipse)* — open source SysML; free but clunkier. +- *PlantUML + plantuml-sysml* — text-based, version-controllable. Fits a + git-centric workflow better than any GUI tool. -*** 2026-05-22 Fri @ 14:12:24 -0500 Made respond-to-review fetch unresolved threads + resolve after verification +**** Highest-value starting point -Rewrote section 1 (Gather) in =.claude/commands/respond-to-review.md= to pull =reviewThreads= via =gh api graphql= with =isResolved=, skipping already-resolved threads so settled feedback isn't re-processed; top-level conversation comments still come from REST. Added a section-4 step: reply and resolve a thread only after the fix is verified, never before. +If DeepSat contracts regularly require architecture deliverables, the +highest-ROI first skill is =dodaf-ov1= (or whatever naming convention +the rename discussion lands on). OV-1 is the universal currency in +briefings, proposals, and reviews; it's the one artifact that shows up +in every program regardless of contract specifics. 
-*** 2026-05-22 Fri @ 14:12:24 -0500 Verified respond-to-cj-comments no longer embeds an absolute path (moot) +Trigger for building: an actual DoD deliverable that's blocked on not +having a skill to generate or check OV-1-shaped artifacts. Don't build +speculatively — defense-specific notations are narrow enough that each +skill should be driven by a concrete contract need, not aspiration. -Already resolved by a prior migration: =grep= for =/home/= and =/Users/= in =.claude/commands/respond-to-cj-comments.md= returns nothing. The public-writing section refers to the rules by name, not by local path. No edit needed. +** TODO [#C] Add =make uninstall-mcp= + =mcp/install.py --check= for symmetry -*** 2026-05-22 Fri @ 14:12:24 -0500 Closed respond-to-cj-comments humanizer/emacsclient fallback (largely moot) +Currently the MCP install pipeline only flows one direction. No way to remove rulesets-managed MCP servers in one command. No way to ask "what's the drift between =servers.json= and =claude mcp list=" without eyeballing. -Overtaken by two later changes: =/humanizer= was replaced by =/voice personal= (no =/humanizer= invocation remains), and the mandatory =emacsclient= summary-open was replaced by the in-place VERIFY-task pattern (workflow line ~262, Craig's 2026-05-12 standing instruction). Only a stale descriptive phrase remained — tidied "humanizer's signs of AI writing" to "the signs of AI writing." The original fresh-environment-fallback concern no longer applies as written. +*** =make uninstall-mcp= -*** 2026-05-22 Fri @ 14:51:37 -0500 Fixed finish-branch base-branch detection +Iterate over =servers.json=, run =claude mcp remove -s user= for each. Ignore "not registered" errors. Idempotent. -Rewrote Phase 2: resolve the base *branch name* in priority order (open PR's =baseRefName=, then =git symbolic-ref --short refs/remotes/origin/HEAD= stripped, then ask), and compute the merge-base *SHA* separately only where a commit range is needed. 
Made the branch-name-vs-merge-base distinction explicit, since the old command returned a SHA where a branch name was needed. +*** =mcp/install.py --check= -*** 2026-05-22 Fri @ 14:51:37 -0500 Made finish-branch merge safer + worktree-aware +Dry-run mode. Decrypt secrets, but instead of registering, print the drift report: -Added pre-flight checks to Option 1 (Merge Locally): dirty-tree refusal with no auto-stash, protected-branch awareness, upstream-gated =git pull --ff-only=, and merge-commit-vs-rebase as a team-policy choice instead of a hardcoded =--no-ff=. Replaced the fragile =git worktree list | grep = detection with a =git rev-parse --git-dir= vs =--git-common-dir= comparison plus =git worktree list --porcelain= for the path. +- Servers in =servers.json= not in =claude mcp list= → =MISSING= +- Servers in =claude mcp list= not in =servers.json= → =EXTRA= +- Servers in both → =ok= -*** 2026-05-22 Fri @ 14:51:37 -0500 Added tool-availability + ceremony-scale paths to start-work +Useful for diagnosing connection failures and for the eventual =make doctor= integration. -Added a "Tool availability" section (graceful degradation when Linear MCP / =gh= / =/voice= / Playwright are missing — do what's available, surface what isn't, don't block) and a "Ceremony scale" section (trivial / small / standard tiers so a two-line fix skips ticket+branch+gates unless asked). The =humanizer= reference in the original item is moot — the file already uses =/voice= throughout. +** TODO [#C] Update =README.org= with MCP install pipeline section -*** 2026-05-22 Fri @ 14:51:37 -0500 Resolved start-work claim-before-justify rollback risk +=README.org= covers global install, per-project language bundles, and design principles, but doesn't mention =make install-mcp= or the =mcp/= directory. Add a short section after "Per-project language bundles" describing the user-scope MCP install pattern (decrypt → expand → register) and pointing at the eventual =mcp/README.org=. 
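The drift report described for =mcp/install.py --check= comes down to a three-way comparison of two name lists. A minimal shell sketch, assuming the declared and registered server names have already been extracted into plain files with one name per line (how to pull them out of =servers.json= and =claude mcp list= is not specified in the task, and =drift_report= is a hypothetical helper name):

```shell
# Sketch only: classify server names as MISSING, EXTRA, or ok.
# Assumptions (not from the task text): $1 is a file of names declared in
# servers.json, $2 is a file of names currently registered, one per line,
# and names contain no tab characters.
drift_report() {
    declared_file=$1
    registered_file=$2
    tmp_dir=$(mktemp -d) || return 1

    # comm requires both inputs sorted with the same collation.
    LC_ALL=C sort -u "$declared_file" > "$tmp_dir/declared"
    LC_ALL=C sort -u "$registered_file" > "$tmp_dir/registered"

    # comm prints three tab-indented columns:
    #   column 1 = only in declared   -> MISSING (declared, not registered)
    #   column 2 = only in registered -> EXTRA   (registered, not declared)
    #   column 3 = in both            -> ok
    LC_ALL=C comm "$tmp_dir/declared" "$tmp_dir/registered" |
        awk -F '\t' '
            $1 != "" { printf "MISSING %s\n", $1; next }
            $2 != "" { printf "EXTRA %s\n", $2; next }
            $3 != "" { printf "ok %s\n", $3 }
        '

    rm -rf "$tmp_dir"
}
```

Called as =drift_report declared.txt registered.txt=, it prints one line per server in sorted name order. Keeping the comparison separate from the extraction step means the same function could back a later =make doctor= check without knowing either input format.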
-Split the claim by tracker type: personal todo.org claims defer to after the Justify gate (a killed task needs no rollback), while team trackers (Linear/GitHub) still claim first to signal intent but record prior state (status, assignee, label) so the Phase 2 rollback restores exactly it. Updated the per-tracker rollback steps and the matching anti-pattern. +** TODO [#C] Token-rotation helper for =@a-bonus/google-docs-mcp= OAuth refresh -*** 2026-05-22 Fri @ 14:28:41 -0500 Verified add-tests typescript-testing.md reference resolves (moot) +When a Google refresh token gets revoked (re-grant scopes, removed Connected App, account password reset), recovery is currently manual: run =npx -y @a-bonus/google-docs-mcp= with the right env, follow the URL in a browser, kill the process, base64-encode the new =token.json=, decrypt =secrets.env.gpg=, replace the var, re-encrypt. A small =mcp/refresh-google-docs-token.sh = would chain that into one command. -Resolved since the audit: =languages/typescript/claude/rules/typescript-testing.md= now exists, and =add-tests/SKILL.md:68= references it by bare filename, the same way it references =python-testing.md= (both get copied into a project's =.claude/rules/=). The "missing file" premise no longer holds. No edit needed. +*** Sketch -*** 2026-05-22 Fri @ 14:28:41 -0500 Added a category-exception protocol to add-tests +#+begin_src bash +# usage: mcp/refresh-google-docs-token.sh personal +profile="$1" +gpg -d ... 
| grep -v "GOOGLE_DOCS_${profile^^}_TOKEN_B64" > /tmp/secrets.env.tmp +GOOGLE_MCP_PROFILE="$profile" npx -y @a-bonus/google-docs-mcp & +xdg-open +# wait for ~/.config/google-docs-mcp/$profile/token.json to land +kill %1 +echo "GOOGLE_DOCS_${profile^^}_TOKEN_B64=$(base64 -w0 ~/.config/google-docs-mcp/$profile/token.json)" >> /tmp/secrets.env.tmp +gpg -c --cipher-algo AES256 -o mcp/secrets.env.gpg.new /tmp/secrets.env.tmp +mv mcp/secrets.env.gpg.new mcp/secrets.env.gpg +rm /tmp/secrets.env.tmp +#+end_src -Added an exception note to step 7 (proposal) in =add-tests/SKILL.md=: pure adapters, generated code, tiny pass-through wrappers, and framework glue may skip a category that would only re-test the framework, but the skip must be stated and justified in the plan and the behavior covered at integration/E2E level — never a silent omission. Step 12 (write) now points back to "honor documented category exceptions." +The flow tonight worked but took a handful of manual steps. One script collapses it. -*** 2026-05-22 Fri @ 14:25:37 -0500 Added environment + recent-change capture to debug Phase 1 +** TODO [#C] Decide on category-3 rule copies in the deepsat tree -Added a fourth Phase-1 step in =debug/SKILL.md=: record versions, feature-flag/config state, dataset/fixture, seed/clock, concurrency, and recent commits/config-infra changes. Noted that intermittent bugs usually live in environment/state transitions (and "what changed recently" is often the fastest route), while a deterministic local bug only needs a one-liner. Updated the phase's closing recap to include the context. 
+While symlinking personal-project =.claude/rules/= mirrors to the rulesets canonical on 2026-05-07, two locations didn't fit the "personal mirror → symlink" pattern and were left untouched pending judgment: -- =~/projects/work/deepsat/code/coding-rulesets/claude-rules/{testing,verification}.md= — looks like a vendored team-shared copy. +- =~/projects/work/deepsat/code/orchestration_dashboard_mvp/.claude/rules/{testing,verification}.md= — could be project-specific overrides. -Rewrote step b in =root-cause-trace/SKILL.md=: instead of "add a check at each layer that could have caught it," add one only at a layer that owns a boundary or invariant — ingress/trust, persistence, invariant-owning service, final render. Added the explicit rule that a pass-through function owning neither shouldn't get a duplicate null check (validation spam). Recast the three example layers as the boundary types. +For each: read the file, diff against the rulesets canonical, decide whether it's an intentional divergence (leave alone), stale (sync content), or a candidate to canonicalize (replace with symlink and accept the cross-repo dependency). The orchestration_dashboard_mvp pair is the project where Vrezh's PR review surfaced this whole thread, so any decision there has team-visibility implications. -*** 2026-05-22 Fri @ 14:25:37 -0500 Required evidence + counterfactual per why in five-whys -Expanded step 2 in =five-whys/SKILL.md=: each link now owes an evidence field (a log/commit/metric/config you can point to) and a counterfactual check (remove this cause — does the symptom above plausibly not happen?). Framed the counterfactual as the main guard against monocausal storytelling, and updated the worked example to show both fields. 
+The four canonical rules (=commits=, =testing=, =verification=, =subagents=) are now symlinked across the five personal-project mirrors as of 2026-05-07. But several language-specific rule files exist in multiple project mirrors and may be duplicated or drifted: -*** 2026-05-22 Fri @ 15:51:59 -0500 Added timebox + fresh-sources rules to brainstorm +- =python-testing.md= in =~/projects/work/.claude/rules/= +- =typescript-testing.md= in =~/projects/work/deepsat/code/.claude/rules/= +- =elisp-testing.md= and =elisp.md= in =~/.emacs.d/=, =~/code/gloss/=, =~/code/chime/= -Phase 1 gained a "Timebox the dialogue" rule (aim for the one-sentence restatement in ~5-8 questions, then move on and park the rest as open questions). Phase 2 gained "Ground high-stakes claims in fresh sources" (check load-bearing claims about markets/regulations/tools/vendors/APIs against a current source; mark unverified ones as assumptions). The design-doc skeleton gained an "## Assumptions" section that distinguishes researched facts (with source) from assumptions (to confirm before building). +The Elisp pair is the most suspicious — three repos using essentially the same rules. Audit: diff these across the projects, check for drift, then decide whether to canonicalize them under =~/code/rulesets/claude-rules/languages//= and symlink, or leave them as project-local. -*** 2026-05-22 Fri @ 14:59:32 -0500 Made arch-decide examples timeless + required citations +** TODO [#C] Consolidate =claude-templates/Makefile= after fold :chore: -Dated the MongoDB multi-document-transaction example (scoped to 2024-01) with a backing reference, and added a "Cite, don't assert" Do: every concrete technical claim about a tool/version/platform carries a link, doc, version, or "checked YYYY-MM" date, or gets a domain-neutral placeholder — so unsourced "X can't do Y" doesn't rot into stale fact. 
- -*** 2026-05-22 Fri @ 14:59:32 -0500 Standardized arch-decide ADR statuses + immutability rule +Sibling follow-up from the fold child (2026-05-15). After the subtree merge, =rulesets/claude-templates/Makefile= still has its standalone =install= / =uninstall= / =list= / =test-scripts= targets. The =install= target's =bin/ai= logic is now duplicated in =rulesets/Makefile=. Both work; the redundancy is harmless but worth cleaning up. -Declared a canonical five-status set (Proposed, Accepted, Rejected, Deprecated, Superseded) with an explicit "no synonyms" line, and spelled out the immutability rule in the Don'ts: an accepted ADR's body is frozen, only status/link metadata changes, a changed decision gets a new superseding ADR and the old one stays as the historical record. +Options: +- *Delete* =claude-templates/Makefile= entirely — forces all install through rulesets root. Cleaner. +- *Strip down* to just =test-scripts= — the one piece not redundant with =rulesets/Makefile=. +- *Leave it* — slight redundancy, no functional harm. -*** 2026-05-22 Fri @ 14:59:32 -0500 Added Trust/Data/Compliance phase to arch-design +Triggered by: 2026-05-15 fold session's refactor audit (commit =2d645fc=). -Added a new Phase 4 (Trust, Data, and Compliance) before the paradigm shortlist: trust boundaries, data classification, abuse/misuse cases, privacy constraints, compliance evidence, and operational ownership — surfaced early so the architecture is drawn around them, not retrofitted by a downstream =security-check=. Threaded into the workflow list, brief template (new §6), review checklist, and anti-patterns. +** TODO [#C] Refactor =daily-prep.org= to delegate to =triage-intake.org= for the triage section -*** 2026-05-22 Fri @ 14:59:32 -0500 Split paradigms from tactical patterns in arch-design +=daily-prep.org= still does its own inline triage (Gmail × 3 accounts, Slack, Linear, GHE PRs, calendars) as part of the full prep flow. 
Now that =triage-intake.org= exists as a standalone scan over the same source set, daily-prep could call it and consume its synthesis instead of duplicating the source-scan logic — DRYs up a 57k-line workflow and keeps both flows in sync when sources change. -Split Phase 5's single mixed table into Step 1 (pick one paradigm: monolith/microservices/layered/event-driven/serverless/pipeline/space-based) and Step 2 (compose tactical patterns: DDD, hexagonal, CQRS, event sourcing — several or none, often per-module), with composition examples and an anti-pattern against treating DDD/CQRS as alternatives to a paradigm. Recommendation + brief now name a paradigm plus composed patterns. +Scope: +- Identify the sections in =daily-prep.org= that do the inline triage (the email / Slack / Linear / PR / calendar fan-out, plus the "Sources checked: ..." footer at the top of each generated prep doc). +- Replace those sections with "run =triage-intake.org=" and adapt the downstream sections (Heads-up, Day's Priorities, Carry-forwards) to read triage-intake's synthesis output rather than the inline scan results. +- Verify the generated prep doc still has the same shape (Heads-up + Day's Priorities + Carry-forwards + Sources checked). -*** 2026-05-22 Fri @ 14:59:32 -0500 Expanded arch-document quality scenarios to the Q42 six-part template +Origin: came up while authoring =triage-intake.org= on 2026-05-11. -Replaced §10's thin "Under [condition]..." template with the arc42/Q42 six-part structure (source, stimulus, environment, artifact, response, response measure), each glossed, with the cart-checkout example rewritten across all six parts. A one-line prose form stays acceptable once all six parts are recoverable. 
+* Rulesets Resolved +** DONE [#C] Fix =cj-scan= false positives on cj fences nested inside other =#+begin_*= blocks :bug: +CLOSED: [2026-05-15 Fri] -*** 2026-05-22 Fri @ 14:59:32 -0500 Added staleness/ownership metadata to arch-document output +=cj-scan.py= was matching =#+begin_src cj:= / =#+end_src= line-by-line +without awareness of enclosing block scopes. A cj fence embedded inside a +=#+begin_example= block (typically when documenting what the == +(for any == other than =cj:= via the more-specific cj-open regex, which +is checked first), it enters a wrapper state where every line is treated as +content until the matching =#+end_= closer fires. Inside a wrapper, cj +fence patterns and legacy inline =cj:= lines are both suppressed. -*** 2026-05-22 Fri @ 14:59:32 -0500 Added confidence levels to arch-evaluate findings +Tests: added =TestCjScanNestedFencesIgnored= (6 tests) to +=claude-templates/.ai/scripts/tests/test_cj_scan.py= covering nesting inside +=#+begin_example=, =#+begin_src =, and =#+begin_quote=, plus +regression guards that a wrapper closes cleanly (a subsequent real cj fence +is still detected) and that an unclosed wrapper doesn't silently swallow +later content into false-positive cj blocks. -Added a "Confidence and Provenance" subsection: every framework-agnostic finding carries High/Medium/Low + how it was determined, with a required "Not fully checked because..." note when scale, runtime imports, reflection, or dynamic dispatch cap certainty. Updated the example findings and review checklist; a finding with no note now asserts a full read. +Full =make test-scripts= equivalent (=python3 -m pytest=): 302 passed, 1 +skipped, 0 failures. 
-*** 2026-05-22 Fri @ 14:59:32 -0500 Made arch-evaluate report skipped tool checks explicitly +** DONE [#A] Add =make doctor= — verify ~/.claude/ matches repo + settings.json :feature: -Replaced "skip silently" with explicit reporting: for each detected language whose tool isn't configured or can't run, emit an Info "tool not configured / not run" finding (with an example) so the audit shows what was and wasn't verified. A check that didn't run no longer reads as a pass. Updated workflow step 4 and the review checklist. +A drift detector that scans =~/.claude/= and reports anything inconsistent with what the repo expects. Single-command answer to "is my machine consistent with rulesets?" -*** 2026-05-22 Fri @ 14:51:37 -0500 Added notation/output fallback to c4-analyze + c4-diagram +*** Why this matters -Both commands now treat C4 as notation-independent: a "Choosing a notation" section (draw.io XML, Structurizr DSL, Mermaid with native C4 types, PlantUML/C4-PlantUML) and a headless fallback that emits a text notation (Mermaid or Structurizr DSL) and skips PNG-export/desktop-open when =drawio= or a GUI is absent, rather than failing. draw.io is now one option, not the only one. +A 2026-05-06 sweep found =~/.claude/hooks/= didn't exist on this machine even though =settings.json= referenced =~/.claude/hooks/precompact-priorities.sh= as a PreCompact hook. Compaction would have silently failed to invoke the hook. The fix was =make install-hooks=, but the breakage was invisible until I happened to grep for it. =make doctor= run regularly (or even as part of session start) would catch this kind of drift in seconds instead of after the fact. 
-*** 2026-05-22 Fri @ 14:51:37 -0500 Clarified C4 abstraction boundaries in c4-analyze + c4-diagram +*** Checks -Added an "Abstraction boundaries" section to both: a Container is a separately deployable/runnable unit (not synonymous with a Docker container — a SPA or managed DB counts), a Component lives inside one Container and isn't separately deployable. Added a 4e "Verify single abstraction level" check that walks every element and relationship to confirm it stays at the diagram's level, notation-independent. +- Every entry in =settings.json= ="hooks"= block points at a file that exists. +- Every entry in =enabledPlugins= has a matching install under =~/.claude/plugins/data/=. +- Every skill in =$(SKILLS)= has a working symlink at =~/.claude/skills/=. +- Every rule in =$(RULES)= has a working symlink at =~/.claude/rules/=. +- Every default hook has a symlink at =~/.claude/hooks/= (warn-only — opt-out is legitimate). +- =settings.json= and =.mcp.json= symlinks resolve to the rulesets versions. +- =mcp/install.py= state matches =claude mcp list= (every server in =servers.json= is registered). +- No dangling symlinks anywhere under =~/.claude/=. -*** 2026-05-22 Fri @ 15:10:35 -0500 Added "When You Cannot Verify" standard to verification.md +*** Output -Added a section requiring, when a verification command can't run, a four-part report: command attempted, why it couldn't run, risk left unverified, and the smallest next command for the user. States the principle that a check that didn't run is never reported as a pass — "unable to verify" is a required honest outcome, not silence. Placed after Red Flags. +One line per check: =ok= / =WARN= / =FAIL=. Final summary: =N ok, M warnings, K failures=. Exit non-zero on any failure so it can ride a pre-flight check. 
-*** 2026-05-22 Fri @ 15:10:35 -0500 Added property-based + mutation testing escalation to testing.md +** DONE [#A] Build =voice= skill — combine =humanizer= with universal + personal style passes :feature: -Added an "Escalation Beyond Category and Pairwise" section: property-based testing for invariants over a broad input domain (round-trips, idempotence, ordering — Hypothesis/fast-check/proptest) and mutation testing for when high line coverage hides thin assertions (mutmut/cosmic-ray/Stryker). Both framed as escalation paths to reach for on a gap, not gates on every unit. +Combine =humanizer= with universal good-writing passes (Strunk & White, Orwell, Plain English) and the personal-style passes from =commits.md=. Two modes — =general= for arbitrary writing, =personal= for commits/PRs/comments — share a foundation and diverge on register. -*** 2026-05-22 Fri @ 15:10:35 -0500 Added a disciplined spike protocol to testing.md +Built and shipped 2026-05-07: =voice/SKILL.md= with 39 numbered patterns walked sequentially. Patterns 1-25 carried over from humanizer, 26-31 are universal good-writing additions, 32-39 are personal-only. Migrated three callers (=commits.md=, =respond-to-cj-comments.md=, =start-work.md=). Removed the standalone =humanizer= skill since voice supersedes it. -Formalized the existing "I need to spike first" excuse-table row into a "Spike Exception (Disciplined)" subsection under TDD Discipline: TDD stays the default, but a spike is sanctioned when all three hold — timeboxed, spike code not committed, and the first failing test written before productionizing the discovered approach. Built on the existing row rather than contradicting it. 
+*** Why this matters -*** 2026-05-22 Fri @ 15:10:35 -0500 Added pre-dispatch availability + cost checks to subagents.md +Three transformations want to run together for personal-mode artifacts (commits, PR titles + bodies, PR comments) but lived in three places: =humanizer= as a skill, S&W-style universal rules nowhere (applied ad-hoc), and the personal-style passes as prose steps in =commits.md= that got re-applied by hand each time. Costs: (1) the "I forgot pass (e)" failure mode — skipping a pass without flagging is a defect but happens in practice. (2) No single-call invocation of the full transform. (3) General-mode writing (research notes, philosophy, history) got only humanizer with no universal-prose pass at all. Combining brings them under one skill with one invocation. -Added a "Pre-Dispatch Checks" section with two gates: Availability (no Agent capability → do the work in the main thread under the same scope/constraints/output discipline the contract would enforce) and Cost (when writing the full contract costs more than the task, do it inline). Cross-references the existing "Don't Subagent At All" section and "Subagenting trivial work" anti-pattern rather than duplicating. +*** Design -*** 2026-05-22 Fri @ 15:06:04 -0500 Revised python-testing SQLite guidance toward production-like DBs +Two modes: -Replaced "prefer in-memory SQLite for speed" with: run ORM/query tests against a production-like DB (same engine as prod, often containerized), since SQLite diverges from Postgres/MySQL on query semantics, constraints, transactions, JSON, time zones, and indexes (a test can pass on SQLite and fail in prod). SQLite stays only for pure unit tests with no DB-semantics dependency. +- *general* (default) — for arbitrary writing not bound for commit/PR/comment publishing (research notes, philosophy/history essays, emails, README prose). 
Runs: + - humanizer (current behavior — strip AI-generated-writing fingerprints) + - tier-1 universal passes (canonical good-writing rules) + - the 2 personal-style passes that have no register conflict (jargon-fragment rewrite, noun-ified verbs) -*** 2026-05-22 Fri @ 15:06:04 -0500 Clarified python-testing ORM-mocking boundary +- *personal* — for commits, PR titles + bodies, PR comments. Runs general PLUS: + - 8 personal-only passes (first-person rewrite, semicolons, contractions, sentence-split, felt-experience, sentence fragments, terse cut, public-artifact scope check) -Changed the "never mock" bullet from "ORM queries" to "ORM internals (querysets, sessions, model internals)" and added a paragraph: domain services use real model methods/validation, but a thin orchestration unit can inject a fake at a deliberate data-access port (a repository/interface the code owns). That's still mocking at a boundary, not at ORM internals. +The 8 personal-only passes are explicitly *not* in general mode. They conflict with academic / literary / philosophical register. Forcing first-person on a Foucault essay or stripping felt-experience from a journal entry would damage the writing. -*** 2026-05-22 Fri @ 15:06:04 -0500 Made elisp.md editing advice tool-agnostic +*** Tier 1 universals (v1) -Rephrased the "prefer Write over repeated Edits" bullet around intent: land nontrivial Elisp as one cohesive change rather than dribbling it in over tiny partial edits (which accumulate paren mismatches), and run paren-balance + byte-compile checks immediately after, whatever editing mechanism the environment uses. +From Strunk & White, Orwell's "Politics and the English Language", Plain English Campaign, and Garner's Modern English Usage. Each is a detection-pattern + rewrite-rule pair, mechanical enough to apply consistently across runs. 
-*** 2026-05-22 Fri @ 15:06:04 -0500 Added batch-mode + native-comp caveats to elisp-testing.md +- *Omit needless words* — curated phrase list (=the fact that= → =that=/=because=, =in order to= → =to=, =at this point in time= → =now=, =due to the fact that= → =because=, =for the purpose of= → =to=, =in spite of= → =despite=, etc.) +- *Long word → short word* — Plain English wordlist (~150 entries: =utilize=→=use=, =commence=→=start=, =terminate=→=end=, =facilitate=→=help=, =demonstrate=→=show=, =sufficient=→=enough=, =prior to=→=before=, =subsequent to=→=after=, =in the event that=→=if=, =a great deal of=→=much=) +- *Active over passive voice* — detect "to be + past-participle" patterns. Suggestion-only in v1 (auto-rewrite is risky in technical contexts where passive is appropriate); graduate to auto-rewrite for unambiguous cases in v2. +- *Comma splices* — detect independent clauses joined only by comma; rewrite to period or semicolon-then-period. +- *Cliché flag* — small curated list (=at the end of the day=, =moving forward=, =going forward=, =at this juncture=, =circle back=, =low-hanging fruit=, =deep dive=, =leverage= as verb). -Added three sections: Batch-Mode Reproducibility (=emacs --batch= as source of truth, no interactive-session state, no blocking prompts, deterministic), Isolating Emacs State (temp =user-emacs-directory=, explicit load-path, declared deps only, with an unwind-protect sandbox example), and Byte-Compile/Native-Comp Warnings (=byte-compile-error-on-warn=, native-comp gated on =native-comp-available-p= and kept opt-in/version-aware). 
+*** Tier 2 universals (v2) -*** 2026-05-22 Fri @ 15:16:22 -0500 Synced hooks/README install snippets with the destructive hook (opt-in) +- *Positive over negative form* (S&W) — =not unlike= → =like=, =do not fail to= → =remember to=, =did not pay any attention= → =ignored= +- *Garner-style word-pair corrections* — comprise/compose, less/fewer, that/which (restrictive vs nonrestrictive), affect/effect, principal/principle +- *Parallelism in lists* — detect mismatched grammar in bullet items +- *Tense consistency* — flag mid-paragraph tense shifts +- *Acronym definition on first use* — detect uppercase tokens used before being expanded -Brought the README's manual-install and settings-JSON snippets in line with the canonical =hooks/settings-snippet.json= (which already wires all three) and the Makefile's opt-in design: added the destructive-bash-confirm.py symlink as an opt-in step, added its settings entry, and reworded the note to say all three are no-op-safe but the destructive gate is opt-in (=make install-hooks= excludes it by default — link manually before relying on the snippet entry). +*** Tier 3 (v3, may not land) -*** 2026-05-22 Fri @ 15:35:06 -0500 Hooks now scan file-backed commit/PR messages +- *Concrete-over-abstract* preference +- *Emphatic word at sentence end* (S&W rule 18) +- *Vary sentence length / rhythm* +- *Reading-grade-level scoring* (Hemingway-style) -Added =read_referenced_file()= to =_common.py= (safe local read: missing/oversize/non-UTF-8 → None) and wired it in: =git-commit-confirm.py= =extract_commit_message= now handles =-F=/=--file=/=--file==== (reads + scans the file, falls through to UNPARSEABLE → asks if unreadable), and =gh-pr-create-confirm.py= reads =--body-file= content instead of a placeholder. Attribution scanning now sees the real committed/posted text. Built a pytest harness (=hooks/tests/=, importlib-by-path loader for the hyphen-named hooks) and wired =hooks/tests= into =make test=. 54 hook tests pass; full suite green. 
+*** Personal-style pass placement -*** 2026-05-22 Fri @ 15:35:06 -0500 Rewrote destructive-bash rm parsing on shlex +| # | Pass | Mode | Why | +|---|------|------|-----| +| 1 | First-person voice rewrite | personal only | Forces "I" voice; wrong for academic prose where third-person and "we" are conventional | +| 2 | Jargon-fragment → complete sentence | both | Universal clarity, no genre conflict | +| 3 | Semicolon → period/comma | personal only | Semicolons are conventional in long-form / academic prose | +| 4 | Contractions ("it's", "don't") | personal only | Academic and formal writing typically avoids contractions | +| 5 | Sentence split on conjunctions | personal only | Foucault, Hegel, Adorno deliberately use long compound sentences | +| 6 | Felt-experience narration ("I'll feel this every time") | personal only | Personal essays *use* felt-experience as content | +| 7 | Noun-ified verbs ("the ask", "a learn", "the spend") | both | Targets corporate-speak with curated wordlist; doesn't catch philosophical nominalizations like "the becoming" | +| 8 | Sentence fragments → complete (in prose) | personal only | Fragments are valid stylistic devices in literary prose | +| 9 | Terse cut (rhetorical padding: "worth noting", "it's important to understand") | personal only | Tier 1 omit-needless-words covers the worst offenders universally; aggressive cut conflicts with academic register | +| 10 | Public-artifact scope check (local paths, private repos, personal tooling) | personal only — *flag-only*, no auto-rewrite | Operational/safety check, not stylistic; auto-masking risks silently editing meaningful text | -=detect_rm_rf= now tokenizes with =shlex.split= instead of a whitespace split, so quoted/spaced paths and combined/separate/reordered flags (=-rf=, =-r -f=, =-fr=, =--recursive=/=--force=) all parse. 
Fails toward asking — returns a sentinel that still fires the modal — on unbalanced quotes or when a forced recursive rm coexists with a compound/pipeline/substitution/redirect construct. Documented the supported/unsupported shell constructs in the docstrings, and extended the dangerous-path banner to =$HOME=-prefixed and wildcard targets. Covered by 25 new tests. (Pre-existing, out-of-scope: path-prefixed =rm= like =/bin/rm= still isn't matched.) +*** Inclusive-language pass — explicitly excluded -** TODO [#C] Build =/update-skills= skill for keeping forks in sync with upstream -:PROPERTIES: -:LAST_REVIEWED: 2026-05-20 -:END: +Considered and rejected. Conflicts with planned writing on philosophy/history topics (Foucault on sexuality and gender, history of slavery in New Orleans). Wordlist substitutions would override deliberate vocabulary choices in those genres. -The rulesets repo has a growing set of forks (=arch-decide= from -wshobson/agents, =playwright-js= from lackeyjb/playwright-skill, =playwright-py= -from anthropics/skills/webapp-testing). Over time, upstream releases fixes, -new templates, or scope expansions that we'd want to pull in without losing -our local modifications. A skill should handle this deliberately rather than -by manual re-cloning. +*** V1 scope -*** 2026-05-16 Sat @ 01:14:40 -0500 Specification -#+begin_src cj: comment -write the specification here. 
-#+end_src +- [ ] Skill at =~/code/rulesets/voice/= with =SKILL.md= +- [ ] Frontmatter with positive triggers (commit, PR, comment, "humanize", "voice pass") and negative triggers (code, structured data, plain bullet lists) +- [X] Mode invocation: default = =general= when invoked bare; =personal= invoked explicitly by publish-context callers +- [X] humanizer content migrated from =humanizer/= → =voice/= +- [X] Tier 1 universal passes implemented (5 patterns: #26-30, plus #31 noun-ified verbs as a universal personal addition) +- [X] 2 personal passes that run in both modes (#30 jargon-fragment, #31 noun-ified verbs) +- [X] 8 personal passes that run in personal mode only (#32 first-person, #33 semicolons, #34 contractions, #35 sentence-split, #36 felt-experience, #37 fragments, #38 terse cut, #39 scope check) +- [X] Each pass = detection-pattern + rewrite-rule pair (#39 is detection + flag-only) +- [X] Total v1 pattern count: 31 in general mode (humanizer's 25 + 4 tier-1 + 2 universal personal); +8 personal-only = 39 in personal mode +- [X] Update =commits.md= to invoke =/voice personal= instead of "run =humanizer= and apply five passes manually" +- [X] Remove the existing =humanizer/= skill (no callers outside this repo, all migrated) +- [X] =make doctor= still passes +- [X] =make lint= clean -*** 2026-05-16 Sat @ 01:14:20 -0500 original goals and decisions -**** Design decisions (agreed) +*** v2 (deferred) -- *Upstream tracking:* per-fork manifest =.skill-upstream= (YAML or JSON): - - =url= (GitHub URL) - - =ref= (branch or tag) - - =subpath= (path inside the upstream repo when it's a monorepo) - - =last_synced_commit= (updated on successful sync) -- *Local modifications:* 3-way merge. Requires a pristine baseline snapshot of - the upstream-at-time-of-fork. Store under =.skill-upstream/baseline/= or - similar; committed to the rulesets repo so the merge base is reproducible. -- *Apply changes:* skill edits files directly with per-file confirmation. 
-- *Conflict policy:* per-hunk prompt inside the skill. When a 3-way merge - produces a conflict, the skill walks each conflicting hunk and asks Craig: - keep-local / take-upstream / both / skip. Editor-independent; works on - machines where Emacs isn't available. Fallback when baseline is missing - or corrupt (can't run 3-way merge): write =.local=, =.upstream=, - =.baseline= files side-by-side and surface as manual review. +- [ ] Tier 2 universals (positive form, word-pair corrections, parallelism, tense consistency, acronym definition) +- [ ] Per-pass severity flags for Tier 1 active-voice (suggestion-only when actor is implicit; auto-rewrite when actor is named) +- [ ] Reporting mode: list which passes fired and which were no-ops -**** V1 Scope +*** v3 (aspirational, may not land) -- [ ] Skill at =~/code/rulesets/update-skills/= -- [ ] Discovery: scan sibling skill dirs for =.skill-upstream= manifests -- [ ] Helper script (bash or python) to: - - Clone each upstream at =ref= shallowly into =/tmp/= - - Compare current skill state vs latest upstream vs stored baseline - - Classify each file: =unchanged= / =upstream-only= / =local-only= / =both-changed= - - For =both-changed=: run =git merge-file --stdout =; - if clean, write result directly; if conflicts, parse the conflict-marker - output and feed each hunk into the per-hunk prompt loop -- [ ] Per-hunk prompt loop: - - Show base / local / upstream side-by-side for each conflicting hunk - - Ask: keep-local / take-upstream / both (concatenate) / skip (leave marker) - - Assemble resolved hunks into the final file content -- [ ] Per-fork summary output with file-level classification table -- [ ] Per-file confirmation flow (yes / no / show-diff) BEFORE per-hunk loop -- [ ] On successful sync: update =last_synced_commit= in the manifest -- [ ] =--dry-run= to preview without writing +- [ ] Tier 3 (concrete-over-abstract, emphatic-word position, sentence-length variation, reading-grade scoring) +- [ ] Progressive 
disclosure split: =voice/SKILL.md= orchestrator + =voice/passes/.md= per pass with worked examples -**** V2+ (deferred) +*** Migration (resolved) -- [ ] Track upstream *releases* (tags) not just branches, so skill can propose - "upgrade from v1.2 to v1.3" with release notes pulled in -- [ ] Generate patch files as an alternative apply method (for users who prefer - =git apply= / =patch= over in-place edits) -- [ ] Non-interactive mode (=--non-interactive= / CI): skip conflict resolution, - emit side-by-side files for later manual review -- [ ] Auto-run on a schedule via Claude Code background agent -- [ ] Summary of aggregate upstream activity across all forks (which forks have - upstream changes waiting, which don't) -- [ ] Optional editor integration: on machines with Emacs, offer - =M-x smerge-ediff= as an alternate path for users who prefer ediff over - per-hunk prompts +Decision: deleted =humanizer/= entirely. Three callers (=commits.md=, =respond-to-cj-comments.md=, =start-work.md=) all updated to invoke =/voice= directly. No alias needed since nothing outside the repo invoked humanizer. -**** Initial forks to enumerate (for manifest bootstrap) +*** Naming alternatives considered -- [ ] =arch-decide= → =wshobson/agents= :: =plugins/documentation-generation/skills/architecture-decision-records= :: MIT -- [ ] =playwright-js= → =lackeyjb/playwright-skill= :: =skills/playwright-skill= :: MIT -- [ ] =playwright-py= → =anthropics/skills= :: =skills/webapp-testing= :: Apache-2.0 +- =voice= — chosen. Captures both modes; broad enough. +- =polish= — descriptive of multi-pass nature; less prescriptive about whose voice. +- =house-style= — signals "this is the house style"; appropriate for personal repo. +- =commit-voice= — too narrow (passes apply to research notes, emails, etc. in general mode). +- =humanize= (extending current) — undersells the universal + personal additions. 
-**** Open questions +*** Open questions before implementation -- [ ] What happens when upstream *renames* a file we fork? Skill would see - "file gone from upstream, still present locally" — drop, keep, or prompt? -- [ ] What happens when upstream splits into multiple forks (e.g., a plugin - reshuffles its structure)? Probably out of scope for v1; manual migration. -- [ ] Rate-limit / offline mode: if GitHub is unreachable, should skill fail - or degrade gracefully? Likely degrade; print warning per fork. +Resolved during implementation: +- Default mode when =/voice= is invoked bare: =general=. Personal-context callers (=commits.md= publish flow, =respond-to-cj-comments.md=) invoke =/voice personal= explicitly. Avoids accidentally first-person-ifying research notes. +- Reporting: skill prints "Summary of changes" listing which patterns fired (audit value). +- Public-artifact scope check (#39): flag-only, user resolves manually. Blocking would frustrate on legitimate path mentions. +- Tier 1 active-voice detection: suggestion-only in v1. Auto-rewrite for unambiguous cases deferred to v2. -** TODO [#B] Build /research-writer — clean-room synthesis for research-backed long-form -SCHEDULED: <2026-05-15 Fri> +** DONE [#B] Add =--archive-done= mode to =.ai/scripts/todo-cleanup.el= :feature: -Gap in current rulesets: between =brainstorm= (idea refinement → design doc) -and =arch-document= (arc42 technical docs), there's no skill for -research-backed long-form prose — blog posts, essays, white papers, -proposals with data backing, article-length content with citations. +Opt-in mode that moves every level-2 subtree whose TODO state is DONE or CANCELLED out of the "Open Work" section and into the "Resolved" section of the same org file, subtree intact. -Craig writes documents across many contexts (defense-contractor work, -personal, technical, proposals). The gap is real. 
- -*Evaluated 2026-04-19:* ComposioHQ/awesome-claude-skills has a -=content-research-writer= skill (540 lines, 14 KB) that attempts this. *Not -adopting:* -- Parent repo has no LICENSE file — reuse legally ambiguous -- Bloated: 540 lines of prose-scaffolding with no tooling -- No citation-style enforcement (APA/Chicago/IEEE/MLA) -- No source-quality heuristics (primary vs secondary, peer-review, recency) -- Fictional example citations in the skill itself (models the hallucination - failure mode a citation-focused skill should prevent) -- No citation-verification step -- Overlaps with =humanizer= at polish with no composition guidance - -*Patterns worth lifting clean-room (from their better parts):* -- Folder convention =~/writing//= with =outline.md=, - =research.md=, versioned drafts, =sources/= -- Section-by-section feedback loop (outline validated → per-section - research validated → per-section draft validated) -- Hook alternatives pattern (generate three hook variants with rationale) - -*Additions for the clean-room version (v1):* -- Citation-style selection (APA / Chicago / MLA / IEEE / custom) with - style-specific examples and a pick-one step up front -- Source-quality heuristics: primary > secondary; peer-reviewed; recency - thresholds by domain; publisher reputation; funding transparency -- Citation-verification discipline: fetch real sources, never fabricate, - mark unverifiable claims with =[citation needed]= rather than inventing -- Composition hand-off to =/humanizer= at the polish stage -- Classification awareness: if the working directory or context signals - defense / regulated territory, flag any sentence that might touch CUI - or classified material before emission +- *Section matching.* Key on a top-level heading containing "Open Work" and one containing "Resolved" — that pairing is the only naming consistent across projects (=Work Open Work= / =Work Resolved= here; bare =Open Work= / =Resolved= elsewhere). 
Require exactly one match for each; otherwise skip with a clear message, no crash. +- *Modes.* =--check= previews and writes nothing, same as the existing hygiene pass. Idempotent. Not run by default in the wrap-up flow — archiving is consequential, so it stays opt-in: =emacs --batch -q -l todo-cleanup.el --archive-done FILE=. +- *Edge cases.* Source or target section missing; subtree at EOF; nested DONE subtree under an open parent stays put (only level-2 entries move); nothing to move → clean no-op. +- *Tests.* TDD with ERT — the project's first elisp tests. Fixtures (synthetic) under =.ai/scripts/tests/=; run via =make test= (rulesets) or =make test-scripts= (claude-templates), which run pytest + every =tests/test-*.el= ERT suite. Cases: one DONE level-2 moves; multiple; CANCELLED also moves; structural (no-state) headings don't move; nested DONE under an open parent stays; level-2 DONE with open level-3 children moves intact; subtree at EOF; missing source/target section; ambiguous "Resolved"; lowercase headings; nothing-to-do; idempotency; =--check= preview + its idempotency; realistic-sample integration. -*Target:* ~150-200 lines, clean-room per blanket policy. +Origin: came up while scrubbing a project's todo.org on 2026-05-11 — moving a big completed PROJECT subtree (plus a few smaller ones) into the Resolved section by hand was the cue to build a reusable tool. -*When to build:* wait for a real research-writing task to validate the -design against actual document patterns. Building preemptively risks -tuning for my guess at Craig's workflow rather than his real one. 
-Triggers that would prompt "let's build it now": -- Starting a white paper / proposal that needs citation discipline -- Writing a technical blog post with external references -- A pattern of hitting the same research-writing friction 3+ times +Built and shipped 2026-05-11: =--archive-done= added to =.ai/scripts/todo-cleanup.el= test-first; 13-test ERT suite (=tests/test-todo-cleanup.el=) + realistic synthetic fixture (=tests/fixtures/todo-sample.org=), wired into =make test= / =make test-scripts= alongside pytest. The CLI dispatch moved into =tc-main= behind a guard so the suite can =require= the file without firing it. Section matching is case-insensitive and tolerates the = Open Work= / = Resolved= naming variants. Opt-in only — not wired into the wrap-up flow. Source of truth is =~/projects/claude-templates/=; rsync'd into this repo. -Upstream reference (do not vendor): ComposioHQ/awesome-claude-skills -=content-research-writer/SKILL.md=. +** DONE [#B] Encode follow-up filing rules into =/start-work= +CLOSED: [2026-05-15 Fri] -** TODO [#C] Try Skill Seekers on a real DeepSat docs-briefing need -SCHEDULED: <2026-05-15 Fri> +Phase 4 step 5 of =/start-work= ("refactor audit") says any candidate that isn't fix-now must land in one of three buckets: fold-into-related-commit, separate =refactor:= commit, or "file a ticket or todo.org entry." The third disposition doesn't say *where* — which leaves the orchestrator picking a location ad-hoc. Result: follow-ups buried under children of an epic parent get orphaned when the parent closes, or follow-ups for standalone tasks scatter across the file with no convention. -=Skill Seekers= ([[https://github.com/yusufkaraaslan/Skill_Seekers]]) is a Python -CLI + MCP server that ingests 18 source types (docs sites, PDFs, GitHub -repos, YouTube videos, Confluence, Notion, OpenAPI specs, etc.) and -exports to 20+ AI targets including Claude skills. MIT licensed, 12.9k -stars, active as of 2026-04-12. 
+Proposed placement rule (already memorized for this project as =feedback_followups_as_siblings.md=, generalizing): -*Evaluated: 2026-04-19 — not adopted for rulesets.* Generates -*reference-style* skills (encyclopedic dumps of scraped source material), -not *operational* skills (opinionated how-we-do-things content). Doesn't -fit the rulesets curation pattern. +- *Epic-style parent task* (level-2 with multiple level-3 children) → follow-ups file as level-2 *siblings* of the parent. Stays visible after parent closure. +- *Standalone task* (level-2 with no children, or a level-3 inside another structure) → follow-up files as a new level-2 top-level entry in the same =* Open Work= section. Don't nest under the originating task. -*Next-trigger experiment (this TODO):* the next time a DeepSat task needs -Claude briefed deeply on a specific library, API, or docs site — try: -#+begin_src bash -pip install skill-seekers -skill-seekers create --target claude -#+end_src -Measure output quality vs hand-curated briefing. If usable, consider -installing as a persistent tool. If output is bloated / under-structured, -discard and stick with hand briefing. +Both cases: include a "Triggered by: " line so a future reader sees what surfaced it. -*Candidate first experiments (pick one from an actual need, don't invent):* -- A Django ORM reference skill scoped to the version DeepSat pins -- An OpenAPI-to-skill conversion for a partner-vendor API -- A React hooks reference skill for the frontend team's current patterns -- A specific AWS service's docs (e.g. GovCloud-flavored) +Update =.claude/commands/start-work.md= Phase 4 step 5's "Disposition for each candidate" section to spell this out. Update any cross-references in =commits.md= or other files that touch the discipline. -*Patterns worth borrowing into rulesets even without adopting the tool:* -- Enhancement-via-agent pipeline (scrape raw → LLM pass → structured - SKILL.md). 
Applicable if we ever build internal-docs-to-skill tooling. -- Multi-target export abstraction (one knowledge extraction → many output - formats). Clean design for any future multi-AI-tool workflow. +Triggered by: 2026-05-15 fold-epic session — Craig flagged the gap mid-flight after I'd surfaced a follow-up but hadn't filed it. +** DONE [#A] Consolidate =.ai/= template infrastructure (fold + audit + install-ai + ratio) :feature: +CLOSED: [2026-05-15 Fri] -*Concerns to verify on actual use:* -- =LICENSE= has an unfilled =[Your Name/Username]= placeholder (MIT is - unambiguous, but sloppy for a 12k-star project) -- Default branch is =development=, not =main= — pin with care -- Heavy commercialization signals (website at skillseekersweb.com, - Trendshift promo, branded badges) — license might shift later; watch -- Companion =skill-seekers-configs= community repo has only 8 stars - despite main's 12.9k — ecosystem thinner than headline adoption +End-state: one repo (=rulesets=) is the single source of truth for =.ai/= template content. =make audit= verifies and applies drift across every =.ai/=-using project on the machine. =make install-ai= bootstraps new projects. Same setup propagated to ratio so both machines run the same way. -** TODO [#C] Revisit =c4-*= rename if a second notation skill ships +Today (2026-05-15) the canonical-source rule got violated again: rulesets commit =372fb76= added a wrap-up subsection to =rulesets= without going through =claude-templates= first, and the next session's startup rsync was about to silently undo it. Two-repo coordination is the root cause; fold solves it. -Current naming keeps =c4-analyze= and =c4-diagram= as-is (framework prefix -encodes the notation; "C4" is a discoverable brand). Suite membership is -surfaced via the description footer, not the name. +Build order: fold first (others depend on the new canonical path), then audit + install-ai in parallel, then test, then propagate to ratio. 
-If a second notation-specific skill ever lands (=uml-*=, =erd-*=, =arc42-*=), -the compound pattern =arch-analyze-= / =arch-diagram-= -starts paying off: alphabetical clustering under 'a' amortizes across three+ -skills, and the hierarchy becomes regular. At that point, rename all -notation skills together in one pass. +*** DONE [#A] Fold =claude-templates= into rulesets +CLOSED: [2026-05-15 Fri] -Trigger: adding skill #2 in the notation family. Don't pre-rename. +Two repos, one source of truth. =~/projects/claude-templates/= is the canonical =.ai/= template that gets rsync'd into every project at session start. Keeping it standalone means a second =git pull= in startup Phase A.0, a second remote to push to at wrap-up, and a split history any time a change touches both. Folding it into =rulesets/claude-templates/= gives one repo to clone on a fresh machine and one place to edit templates. -Candidate future notation skills (not yet in scope — noted for when a -real need arrives, not pre-emptively): +**** Open design choices -- *UML* (Unified Modeling Language): OO design notation, 14 diagram types - in practice dominated by class / sequence / state / component. Common - in DoD / safety-critical / enterprise-architecture contexts. Tooling: - PlantUML (text-to-diagram), Mermaid UML, draw.io. Would likely split - into =uml-class=, =uml-sequence=, =uml-state= rather than one monolith - — different audiences, different inputs. -- *ERD* (Entity-Relationship Diagram): database schema modeling — - entities, attributes, cardinality. Crow's Foot notation dominates - practice; Chen is academic; IDEF1X is DoD-standard. Tooling: - dbdiagram.io, Mermaid ERD, PlantUML, ERAlchemy (code-to-ERD for SQL). - Natural fit as =erd-analyze= (extract from schema/migrations) and - =erd-diagram= (generate from prose/model definitions). -- *arc42*: already partially covered by =arch-document= (which emits - arc42-structured docs). 
A standalone =arc42-*= skill would be - redundant unless the arc42-specific visualizations need separation. +- *History.* =git subtree add --prefix=claude-templates ~/projects/claude-templates main= preserves the 84-commit history under the new prefix. Plain content copy (=cp -a= + =git add=) is simpler but loses history. Either is fine since the standalone repo stays archived on =cjennings.net=. +- *Layout.* =rulesets/claude-templates/= mirrors the old repo name and sits next to =claude-rules/= cleanly. Alternative: absorb =.ai/= directly under a different name (=rulesets/.ai-template/= or similar). First option is clearer. +- *bin/ai.* The standalone Makefile symlinks =$HOME/.local/bin/ai → bin/ai=. After the move, fold that into rulesets' Makefile as another install target. -Each answers a different question: +**** Mechanical steps -- C4 → "What systems exist and how do they talk, at what zoom?" -- UML class/sequence → "What does the code look like / what happens when X runs?" -- ERD → "What's the database shape?" -- arc42 → "What's the full architecture document?" +1. Subtree-merge or copy =~/projects/claude-templates/= into =rulesets/claude-templates/=. +2. Update 3 references in rulesets: + - =.ai/protocols.org= line 163 — pointer in the "Let's run/do the X workflow" section. + - =.ai/workflows/cross-agent-comms.org= line 8 — promotion-target path. + - =.ai/workflows/startup.org= lines 22, 96-98 — Phase A.0 pull + Phase A rsync sources. +3. Update Phase A.0 of =startup.org= to pull rulesets instead of claude-templates. Inside rulesets sessions, the existing project-repo pull already covers it. Outside rulesets (every other project's session), Phase A.0 needs an explicit =git pull= on =~/code/rulesets/= before the rsync — otherwise the templates will be stale. +4. Replace =~/projects/claude-templates/= with a symlink to =~/code/rulesets/claude-templates/= for transition continuity. +5. 
After every active project has had one session start (and rsync'd the new =startup.org=), drop the symlink and archive =cjennings.net:git/claude-templates.git=. -Deferred pending an actual need that's blocked on not having one of these. +**** Bootstrap gap -*** DoD-specific notations (DeepSat context) +Every project on the machine has a =.ai/workflows/startup.org= that rsyncs from =~/projects/claude-templates/=. Until each project's startup.org gets refreshed (which happens via the rsync itself), the old path needs to keep resolving. The symlink at step 4 is the bridge: old paths resolve into the new location, the rsync delivers the updated startup.org, next session uses the new path directly. -Defense-contractor work uses a narrower, different notation set than -commercial software. Document the trigger conditions and starting point -so a future decision to build doesn't have to re-derive the landscape. +*** DONE [#A] Add =make audit= — drift detector across all =.ai/=-using projects +CLOSED: [2026-05-15 Fri] -**** SysML (Systems Modeling Language) +Companion to =make doctor= (single-machine scope, checks =~/.claude/=). =audit= is cross-project scope: walks every directory on the machine that has a =.ai/=, diffs the synced template files against the canonical source, and reports drift. =--apply= flag rsyncs the drift into the project's working tree (no auto-commit). Catches stale projects without forcing a session start in each one. -UML 2 profile, dominant in DoD systems engineering. Six diagrams account -for ~all practical use: +**** Open design choices -- *Block Definition Diagram (BDD)* — structural; like UML class but for - system blocks (components, subsystems, hardware). -- *Internal Block Diagram (IBD)* — parts within a block and how they - connect (flow ports, interfaces). -- *Requirement diagram* — unique to SysML; traces requirements to - satisfying blocks. Essential in regulated environments. -- *Activity diagram* — behavioral flow. 
-- *State machine* — same shape as UML. -- *Sequence diagram* — same shape as UML. +- *Scope.* Template-sync drift is the useful flavor: for each project, diff =.ai/protocols.org=, =.ai/workflows/=, =.ai/scripts/= against the canonical source. +- *Source path.* Post-fold: =~/code/rulesets/claude-templates/.ai/=. Build =audit= against the new path from day one. +- *Project discovery.* Walk =~/code/=, =~/projects/=, =~/.emacs.d/= up to depth 3 for any directory containing =.ai/=. Skip the canonical source itself. +- *Default mode is report-only.* =--apply= triggers rsync; =--force= overrides the dirty-skip safety. -SysML v1.x is in the field; v2 is emerging but not yet adopted at scale -(as of 2026-04). Tooling dominated by Cameo Systems Modeler / MagicDraw -and Enterprise Architect. Text-based option: PlantUML + =plantuml-sysml= -(git-friendly, growing niche). +**** Per-project flow (designed 2026-05-15) -*Candidate skills*: =sysml-bdd=, =sysml-ibd=, =sysml-requirement=, -=sysml-sequence=. Three or more in this cluster triggers the -=arch-*-= rename discussion from the parent entry. +For each discovered project, in order: -**** DoDAF / UAF (architecture frameworks) +1. Verify =.ai/= exists (path probe). If missing → =FAIL=, skip, continue loop. +2. Detect git tracking via =git check-ignore .ai/= → =tracked= or =gitignored=. +3. Verify no uncommitted =.ai/= changes (=git status --porcelain .ai/=). Dirty → =WARN=, skip rsync unless =--force=. +4. Verify content matches canonical via three =rsync -a --dry-run --itemize-changes= calls (=protocols.org=, =workflows/=, =scripts/=). Zero items = clean. +5. Action (=--apply= only, drift detected): three =rsync -a [--delete]= calls. +6. Verify rsync converged (re-run the dry-runs; zero now). +7. Verify working-tree state after rsync (tracked projects). Report deltas. Do not auto-commit. +8. Verify no unpushed =.ai/= commits (=git log @{u}..HEAD -- .ai/=). Informational only. 
-Not notations themselves — frameworks that specify *which* viewpoints a -program must deliver. Viewpoints are rendered using UML/SysML diagrams. +**** Output format (mirrors =doctor=) -- *DoDAF (DoD Architecture Framework)* — legacy but still - contract-required on many programs. -- *UAF (Unified Architecture Framework)* — DoDAF/MODAF successor, - SysML-based. Gaining adoption on newer contracts. +#+begin_example +Claude-templates source: + ok rulesets/claude-templates is current (origin/main) -Common required viewpoints (formal CDRL deliverables or PDR/CDR -review packages): +Per-project .ai/ drift: + ok ~/projects/work + applied ~/projects/homelab 3 files changed + skipped ~/code/winvm uncommitted .ai/ (use --force) + ok ~/projects/clipper -- *OV-1* — High-Level Operational Concept Graphic. The "cartoon" showing - the system in operational context with icons, arrows, surrounding - actors/environment. *Universally asked for — informal or formal.* - Starting point for any DoD diagram skill. -- *OV-2* — Operational resource flows (nodes and flows). -- *OV-5a/b* — Operational activities. -- *SV-1* — Systems interfaces. Maps closely to C4 Container. -- *SV-2* — Systems resource flows. -- *SV-4* — Systems functionality. -- *SV-10b* — Systems state transitions. +Summary: 18 ok, 3 applied, 1 skipped, 0 failed +#+end_example -*Informal ask ("send me an architecture diagram") → OV-1 + SV-1 satisfies -90% of the time.* Formal CDRL asks specify the viewpoint set contractually. +Exit code: =0= if all clean, no skips, no failures. =1= otherwise. -*C4 gap*: C4 is rare in DoD. C4 System Context ≈ OV-1 in intent but not -in visual convention. C4 Container ≈ SV-1. Expect a mapping step or -reviewer pushback if delivering C4-shaped artifacts to a DoD audience. +**** Why not extend =make doctor= instead -*Candidate skills*: =dodaf-ov1=, =dodaf-sv1= first (highest-value); -=uaf-viewpoint= if newer contracts require UAF. 
+=doctor= has a clean meaning today: "is this machine's =~/.claude/= consistent with rulesets?" Mixing in cross-project =.ai/= drift muddies the exit code. Keep them separate. =audit= can optionally invoke =doctor= as its last check since both ask "did the symlinks keep up with the source?". A future =make all-checks= can wrap both. -**** IDEF1X (data modeling) +*** DONE [#A] Add =make install-ai PROJECT== — bootstrap =.ai/= in a fresh project +CLOSED: [2026-05-15 Fri] -FIPS 184 — federal standard for data modeling. Used in classified DoD -data systems, intelligence databases, and anywhere the government -specifies the data model. Same shape language as Crow's Foot but with -different adornments and notation conventions. +Separate target from =audit= because operating on projects that lack =.ai/= is a distinct action. The absence might be intentional, so =audit= skips them. Bootstrap is explicit opt-in. -*Rule of thumb*: classified DoD data work → IDEF1X; unclassified -contractor work → Crow's Foot unless the contract specifies otherwise. +**** Flow -*Candidate skills*: =idef1x-diagram= / =idef1x-analyze= (parallel to a -future =erd-diagram= / =erd-analyze= pair). +1. Refuse if =.ai/= already exists in =PROJECT=. Message: "already installed; use =make audit --apply= to update." +2. Verify =PROJECT= is a git checkout (warn if not — works without git, loses some lifecycle benefits). +3. Create =PROJECT/.ai/= directory. +4. Rsync canonical content: =protocols.org=, =workflows/=, =scripts/= (same three rsyncs as =audit=). +5. Seed =PROJECT/.ai/notes.org= from a canonical template with project-name placeholder. +6. Create empty =PROJECT/.ai/sessions/= (with =.gitkeep= for tracked projects). +7. Track or gitignore =.ai/=? Default: ask. Flag: =--track= / =--gitignore=. +8. Print next-steps banner: =make install-lang LANG= PROJECT==; open Claude Code in the project. 
-**** Tooling baseline +**** Symmetry with existing install targets -- *Cameo Systems Modeler / MagicDraw* (Dassault) — commercial SysML - dominant in DoD programs. -- *Enterprise Architect (Sparx)* — widely used for UML + SysML + DoDAF. -- *Rhapsody (IBM)* — SysML with code generation; strong in avionics / - embedded (FACE, ARINC). -- *Papyrus (Eclipse)* — open source SysML; free but clunkier. -- *PlantUML + plantuml-sysml* — text-based, version-controllable. Fits a - git-centric workflow better than any GUI tool. +#+begin_example +make install-lang LANG=python PROJECT=/path # language bundle (existing) +make install-ai PROJECT=/path # .ai/ template (new) +make install-lang # no args → fzf-pick +make install-ai # no args → fzf-pick from + # ~/projects/* + ~/code/* dirs + # without an existing .ai/ +#+end_example -**** Highest-value starting point +*** DONE [#A] Test plan for audit + install-ai before propagating to ratio +CLOSED: [2026-05-15 Fri] -If DeepSat contracts regularly require architecture deliverables, the -highest-ROI first skill is =dodaf-ov1= (or whatever naming convention -the rename discussion lands on). OV-1 is the universal currency in -briefings, proposals, and reviews; it's the one artifact that shows up -in every program regardless of contract specifics. +Test against the current state of this machine before pushing changes to ratio. -Trigger for building: an actual DoD deliverable that's blocked on not -having a skill to generate or check OV-1-shaped artifacts. Don't build -speculatively — defense-specific notations are narrow enough that each -skill should be driven by a concrete contract need, not aspiration. +**** =make audit= tests -** DONE [#B] Add =make remove= for interactive ruleset removal via fzf -CLOSED: [2026-05-22 Fri] -Shipped: =scripts/remove.sh= (three modes — =--list=, =--remove-selected= reading stdin, and the default fzf-multi interactive flow) + =make remove= target + =scripts/tests/remove.bats= (5 cases). 
Lists only symlinks resolving into the repo (foreign links left alone); rm's picked links while leaving repo sources untouched; reports-and-continues on a missing target; quiet no-op on empty selection. shellcheck clean, make test green. Dropped the stale =bridge= entry per the note below. +1. Dry-run report only (no =--apply=). Should show: claude-templates current; per-project drift; correct =ok=/=drift= classifications; summary line and exit code match. +2. After the fold lands, every project should be reported as drift (their =startup.org= still points at the old path). Run =--apply= → rsync converges. Re-run audit → all =ok=. +3. Manually edit one =.ai/workflows/foo.org= in a tracked project. Re-run audit → should report =skipped: uncommitted .ai/=. Run =--apply --force= → rsync clobbers the edit. Verify the edit is gone. +4. Manually delete one =.ai/= dir. Re-run audit → =FAIL: .ai/ missing=. Loop continues. +5. Idempotency: =--apply= twice in a row converges to all =ok= on the second pass. -Add a Makefile target that lists every currently-installed ruleset entry -and lets me pick one or more to remove via fzf. Granular alternative to -=make uninstall= (removes everything) and =make uninstall-hooks= (removes -only hooks). +**** =make install-ai= tests -*** Why this matters +1. Create =/tmp/test-fresh-project= as a git repo. Run =make install-ai PROJECT=/tmp/test-fresh-project=. Verify =.ai/= structure matches canonical, =notes.org= has placeholder, =sessions/= exists. +2. Run =make install-ai PROJECT=/tmp/test-fresh-project= again → should refuse (=.ai/= already exists). +3. Open Claude Code in the new project. Startup workflow runs cleanly (Phase A.0 + Phase A rsync should be a no-op since the install just ran). +4. fzf form: =make install-ai= with no args. Lists candidate dirs (=~/projects/*=, =~/code/*= without =.ai/=). 
-Tearing down a single skill, rule, hook, or config file currently means -either running =make uninstall= and re-installing what I want to keep, -or =rm=ing the symlink directly and remembering the exact path. Both are -friction. An interactive picker lets me filter, multi-select with Tab, -and confirm with Enter — the typical fzf flow. Costs about 3-5 seconds -per teardown instead of 15+ seconds of "what's the exact name?". +**** Pass criteria -*** Design +- =audit= behavior matches the per-project flow spec for every classification path. +- =install-ai= produces a project indistinguishable from one that's been running sessions for a while. +- =make doctor= still passes 36/0/0 after all the work. +- =make test= (pytest + ERT) passes. -The recipe builds a tab-separated list of every currently-installed item, -categorized by type, and pipes it to =fzf --multi=. The user filters, -marks with Tab, and confirms with Enter. The recipe parses the selections -and =rm=s the matching symlinks. +*** DONE [#A] Migrate projects on ratio (second machine) +CLOSED: [2026-05-15 Fri] -#+begin_example - skill debug - rule commits.md - hook destructive-bash-confirm.py - config settings.json - commands commands - bridge claude-rules -#+end_example +After local fold + audit + install-ai are working, propagate to ratio. -Each line is =\t=. The recipe maps == to the right path: +**** Steps -- =skill= → =$(SKILLS_DIR)/= -- =rule= → =$(RULES_DIR)/= -- =hook= → =$(HOOKS_DIR)/= -- =config= → =$(CLAUDE_DIR)/= -- =commands= → =$(CLAUDE_DIR)/commands= -- =bridge= → =$(SKILLS_DIR)/claude-rules= +1. On ratio: =git -C ~/code/rulesets pull= — picks up the folded =claude-templates/= subdir and updated =Makefile= targets. +2. On ratio: archive or =mv= the standalone =~/projects/claude-templates/= aside, replace with symlink to =~/code/rulesets/claude-templates/= (same bridge mechanic as local). +3. On ratio: =make audit= → see drift across ratio's projects. +4. 
On ratio: =make audit --apply= → rsync into each tracked/gitignored project. Surface projects with uncommitted =.ai/= drift for manual handling. +5. On ratio: =make doctor= → catch any =~/.claude/= install drift (likely some, since ratio hasn't seen recent rulesets updates). +6. Verify by opening Claude Code in a few ratio projects. Startup should be a no-op or near-zero rsync. -Source files in =rulesets/= stay untouched. =make install= re-creates the -removed links if needed (the install loop is idempotent). +**** Known unknowns -*** Edge cases +- Ratio may have its own project list overlapping with this machine's but not identical. =audit= discovers projects via the walk, so this is automatic. +- Ratio might have uncommitted =.ai/= work in some projects that this machine doesn't. =audit= surfaces them; handle case-by-case. +- If anything goes wrong, ratio's archived =~/projects/claude-templates/= is the safety net — restore the symlink target and re-run audit. -- Esc instead of Enter → empty selection → clean exit, no removal. -- Filter to nothing then Enter → same as Esc. -- Selected item already gone → =rm= fails visibly, processing continues - on the rest. -- =fzf= not installed → fail fast with a clear error (matches the pattern - used by =install-lang=). +**** Adjacent: cross-machine memory sync -*** Possible extensions +The =[#A] DOING= memory-sync investigation (todo.org:10) is adjacent. Both involve "make my Claude setup portable across machines." Coordinate so the memory-sync stow approach (if approved) doesn't conflict with this fold's symlink mechanics. +** DONE [#B] Document startup pull-ordering rule in protocols.org +CLOSED: [2026-05-15 Fri] -- Parallel =make pick-install= target that lists not-yet-installed items - and installs the chosen ones. Symmetric UX, same fzf flow. -- Confirmation prompt when more than N items selected (defense against - accidental select-all). 
-- =--source= flag that also runs =git rm= against the rulesets source for - the selected item. Probably bad idea — too easy to lose work. -- The =bridge → $(SKILLS_DIR)/claude-rules= entry above is stale — the - bridge symlink got removed in a later commit. Drop that bullet when the - recipe lands. +Phase A.0 of =startup.org= now pulls rulesets ff-only before the project repo +(shipped 2026-05-15 as part of the claude-templates fold — after the subtree +merge, there's no separate claude-templates pull, just rulesets-then-project). +The protocols.org paragraph stating the ordering and "resolve any issues +before proceeding" rule shipped 2026-05-15 in the =** Startup Pull Ordering= +subsection under =IMPORTANT - MUST DO=. +** DONE [#A] Build =/lint-org= skill + wrap-up integration +CLOSED: [2026-05-14 Thu] -** DONE [#B] Document the =mcp/= install pipeline in =mcp/README.org= -CLOSED: [2026-05-22 Fri] -Wrote =mcp/README.org= covering everything in the "what to cover" list: the file layout (tracked vs gitignored), the secrets-bundle shape (plain =${VAR}= secrets + base64-bundled OAuth artifacts, AES256 symmetric =gpg -c=), the install flow (decrypt → materialize keys/token caches at mode 600 → expand → register unregistered, idempotent), the http/sse-vs-stdio transport split, token rotation when a Google refresh token is revoked, and adding a new server. Grounded in a read of the actual =install.py= + =servers.json=. +Spec: [[file:.ai/specs/lint-org-skill-spec.md]] -=mcp/= has =install.py=, =servers.json=, =secrets.env.gpg=, =gcp-oauth.keys.json= (gitignored, regenerated at install). No README. Coming back to this in three months I'll re-discover how the bundle is structured, what =install.py= does, and how to rotate tokens. Saving that re-discovery is the whole point. 
+A two-mode skill (=interactive=, =mechanical-only=) that runs =org-lint=, +auto-fixes safe categories (item-number, missing-language-in-src-block, +misplaced-planning-info, markdown-bold → single-asterisk), and walks judgment +items (broken local-file links, invalid fuzzy links, verbatim-asterisk false +positives, suspicious-language blocks) inline. -*** What to cover +Wrap-up integration: =wrap-it-up.org= invokes +=/lint-org todo.org --mode=mechanical-only= after the existing +=todo-cleanup.el --archive-done= pass. Judgment items defer to a +carry-forward file that the next morning's daily-prep merges in, so +wrap-up never blocks on a judgment call. -- Layout: what each file is, which are tracked vs gitignored. -- Secrets bundle shape: how vars are listed in =secrets.env=, the symmetric-encryption pattern (=gpg -c --cipher-algo AES256=), the base64-bundled OAuth artifacts (=GCP_OAUTH_KEYS_JSON_B64=, =GOOGLE_DOCS_PERSONAL_TOKEN_B64=, =GOOGLE_DOCS_WORK_TOKEN_B64=). -- Install flow: =make install-mcp= → =install.py= decrypts, writes the keys file and Google Docs token caches at mode 600, expands =${VAR}= in =servers.json=, calls =claude mcp add --scope user= for unregistered servers. Idempotent. -- Token rotation: when a refresh token gets revoked, the recovery flow (re-auth on one machine, re-bundle, recommit). -- Adding a new server: edit =servers.json=, add any new =${VAR}= placeholders to the bundle, re-encrypt. -- The OAuth dance for HTTP-transport servers (linear, notion) versus stdio (google-docs-*) — different paths, different gotchas. +Baseline that motivated this: the 2026-05-14 manual pass took =todo.org= +from 55 → 1 lint warnings across two commits (=0d10458= signal, +=9ad5b30= cosmetic). A nightly mechanical sweep keeps the count near +zero forever — each day's drift is small. 
+** DONE [#C] Test harness for =make audit= + =make install-ai= edge cases :test: +CLOSED: [2026-05-15 Fri] -** TODO [#C] Add =make uninstall-mcp= + =mcp/install.py --check= for symmetry +Three edge cases from the fold-epic test plan were not exercised because they're destructive on real projects: -Currently the MCP install pipeline only flows one direction. No way to remove rulesets-managed MCP servers in one command. No way to ask "what's the drift between =servers.json= and =claude mcp list=" without eyeballing. +- =audit --force= clobbers uncommitted =.ai/= work — needs a project with intentionally dirty =.ai/= to verify the override path. +- =audit= reports =FAIL= when =.ai/= is missing — needs a project where the directory was deleted to verify the loop continues past the failure. +- =install-ai= fzf-pick form (no =PROJECT= arg) — needs interactive testing. -*** =make uninstall-mcp= +Build a self-contained test harness under =.ai/scripts/tests/= that spins up =/tmp/audit-test-projects/= with a known matrix of project states (clean, dirty, missing =.ai/=, pristine, etc.), runs the audit + install-ai targets against it, and asserts expected outputs. The harness should clean up after itself. -Iterate over =servers.json=, run =claude mcp remove -s user= for each. Ignore "not registered" errors. Idempotent. +Pattern reference: bats or shell-based assertions (similar to the elisp ERT suites for =todo-cleanup= and =lint-org=, but for shell scripts). -*** =mcp/install.py --check= +Triggered by: 2026-05-15 fold-epic, child 4 test plan; commits =94782ee= (audit) + =d364cf2= (install-ai). +** DONE [#A] wrap it up mentions github, which isn't the remote for many projects. :chore: +CLOSED: [2026-05-16 Sat] +For many of them, git.cjennings.net mirrors to github.com, and github.com isn't the remote. +For many others, git.cjennings.net is the remote with no mirror. 
+Remove or replace the reference to github.com +** DONE [#B] Phase A startup blind to =claude-templates/inbox/= post-fold :bug:fold: +CLOSED: [2026-05-19 Tue] -Dry-run mode. Decrypt secrets, but instead of registering, print the drift report: +Resolved on inspection: the bug is moot in current state. =inbox-send.py='s discovery scans =~/code/*= and =~/projects/*= single-level only, so =claude-templates/= (two levels under =~/code/=) is never a routable target; the 2026-05-15 incident was a one-time manual workaround because =rulesets/inbox/= didn't exist yet, and that root inbox was added in =470085f=. =claude-templates/inbox/= was removed 2026-05-15 and is no longer on disk. -- Servers in =servers.json= not in =claude mcp list= → =MISSING= -- Servers in =claude mcp list= not in =servers.json= → =EXTRA= -- Servers in both → =ok= +Phase A's inbox check at =startup.org:107= runs =\ls -la inbox/= against the project root. Post-fold, the canonical's inbox sits inside the subtree at =claude-templates/inbox/= and never gets scanned. A 2026-05-15 cross-project handoff from a dotemacs session dropped a record there; the next rulesets session (this one) missed it at startup entirely. Picked up only when the working-tree drift surfaced during the publish flow. -Useful for diagnosing connection failures and for the eventual =make doctor= integration. +Fix: extend Phase A's discovery to also scan =claude-templates/inbox/= when the canonical lives in-repo (i.e., when =claude-templates/.ai/= exists alongside =./.ai/=). The Phase B/C inbox-processing flow already handles per-file routing once a file is surfaced; the gap is only in discovery. -** TODO [#C] Update =README.org= with MCP install pipeline section +Adjacent question worth answering at the same time: should cross-project handoffs file into =./inbox/= at the project root (matching what Phase A already scans), or stay in =claude-templates/inbox/= and rely on the discovery fix? 
The =inbox-send= script's target-project logic is the place to settle that. -=README.org= covers global install, per-project language bundles, and design principles, but doesn't mention =make install-mcp= or the =mcp/= directory. Add a short section after "Per-project language bundles" describing the user-scope MCP install pattern (decrypt → expand → register) and pointing at the eventual =mcp/README.org=. +Triggered by: 2026-05-15 evening session, surfaced when committing the test-harness work. +** DONE [#A] Implement task-review daily-habit per spec +CLOSED: [2026-05-20 Wed] +:PROPERTIES: +:LAST_REVIEWED: 2026-05-20 +:END: +Spec: [[file:docs/design/task-review.org]] -** TODO [#C] Token-rotation helper for =@a-bonus/google-docs-mcp= OAuth refresh +Retires =wrap-it-up.org='s date-coverage scan and replaces it with a daily list-hygiene review (N=7 oldest-unreviewed top-level =[#A]= / =[#B]= / =[#C]= tasks per session, ~12-day rotation). Built as a pure Claude workflow — Shape B, no elisp; see the spec's Revision section for why the elisp approach was dropped. -When a Google refresh token gets revoked (re-grant scopes, removed Connected App, account password reset), recovery is currently manual: run =npx -y @a-bonus/google-docs-mcp= with the right env, follow the URL in a browser, kill the process, base64-encode the new =token.json=, decrypt =secrets.env.gpg=, replace the var, re-encrypt. A small =mcp/refresh-google-docs-token.sh = would chain that into one command. +Status: +1. [X] =task-review-staleness.sh= + bats (count + =--list= modes). +2. [X] =wrap-it-up.org= health check (threshold 30). +3. [-] =task-review.el= — dropped (Shape B is a pure workflow, not an Emacs mode). +4. [X] New =task-review.org= workflow + INDEX entry (the existing listing workflow was renamed to =open-tasks.org= to free the name). +5. [X] Startup nudge in template =startup.org= (threshold 7), not the project-only startup-extras layer. +6. 
[X] Smoke test against live =todo.org= — first cycle run 2026-05-20 (7 tasks reviewed: 3 re-grades, 1 cancellation, 1 bump-and-tag). -*** Sketch +Triggered by: 2026-05-16 brainstorm on retiring the date-coverage scan. +** CANCELLED [#B] Build =ov-1= skill for DoDAF OV-1 (High-Level Operational Concept Graphic) +CLOSED: [2026-05-20 Wed] -#+begin_src bash -# usage: mcp/refresh-google-docs-token.sh personal -profile="$1" -gpg -d ... | grep -v "GOOGLE_DOCS_${profile^^}_TOKEN_B64" > /tmp/secrets.env.tmp -GOOGLE_MCP_PROFILE="$profile" npx -y @a-bonus/google-docs-mcp & -xdg-open -# wait for ~/.config/google-docs-mcp/$profile/token.json to land -kill %1 -echo "GOOGLE_DOCS_${profile^^}_TOKEN_B64=$(base64 -w0 ~/.config/google-docs-mcp/$profile/token.json)" >> /tmp/secrets.env.tmp -gpg -c --cipher-algo AES256 -o mcp/secrets.env.gpg.new /tmp/secrets.env.tmp -mv mcp/secrets.env.gpg.new mcp/secrets.env.gpg -rm /tmp/secrets.env.tmp -#+end_src +Cancelled during the 2026-05-20 task review. -The flow tonight worked but took a handful of manual steps. One script collapses it. +Triggered by SOFWeek (May 2026, Tampa) — DeepSat attending; DoD attendees +may ask for architecture diagrams. OV-1 is the universal informal +currency in DoD briefings ("show me the architecture" → OV-1 by default). -** TODO [#C] Decide on category-3 rule copies in the deepsat tree +Priority upgrades to =[#A]= if Craig confirms scenario 2 below (personal +load-bearing need at the event); stays =[#B]= or drops to =[#C]= if +scenario 1 (team already covers it, future asset only). -While symlinking personal-project =.claude/rules/= mirrors to the rulesets canonical on 2026-05-07, two locations didn't fit the "personal mirror → symlink" pattern and were left untouched pending judgment: +*** Prior art (searched 2026-04-19) -- =~/projects/work/deepsat/code/coding-rulesets/claude-rules/{testing,verification}.md= — looks like a vendored team-shared copy. 
-- =~/projects/work/deepsat/code/orchestration_dashboard_mvp/.claude/rules/{testing,verification}.md= — could be project-specific overrides. +No existing Claude Code skill exists for DoDAF / OV-1 / SV-1 / SysML. -For each: read the file, diff against the rulesets canonical, decide whether it's an intentional diverge (leave alone), stale (sync content), or should canonicalize (replace with symlink and accept the cross-repo dependency). The orchestration_dashboard_mvp pair is the project where Vrezh's PR review surfaced this whole thread, so any decision there has team-visibility implications. +- =anthropics/skills= — 17 skills, zero DoDAF/SysML/defense coverage. +- =awesome-claude-code= list — zero hits for DoDAF/OV-1/SysML/UAF. +- =mfsgr/sysml2dodaf= — empty repo (0 stars, no code). Vapor. +- =HowardKao-1130/mini-NEXEN= — broad SE methodology skill that + name-drops DoDAF as a trigger keyword; no artifact generation. 0 stars. +- =gaphor/gaphor= (Apache-2.0, 2.2k stars) — mature UML/SysML GUI + modeler. Not a skill; not a pipeline. Useful reference only. -** TODO [#C] Audit language-specific rule files for cross-project duplication +Nearest prior art to lean on when building: +- DoDAF 2.02 Viewpoints & Models reference (dodcio.defense.gov) — + canonical OV-1 exemplars. Embed 3-5 layouts as skill =references/=. +- Pattern from existing =c4-diagram= skill — same shape (prose → diagram + spec), swap the viewpoint vocabulary to DoDAF. +- PlantUML for SV-1 (when that skill comes later); Mermaid or draw.io + XML for OV-1 lightweight visuals. -The four canonical rules (=commits=, =testing=, =verification=, =subagents=) are now symlinked across the five personal-project mirrors as of 2026-05-07. 
But several language-specific rule files exist in multiple project mirrors and may be duplicated or drifted: +*** Build scope (when triggered) -- =python-testing.md= in =~/projects/work/.claude/rules/= -- =typescript-testing.md= in =~/projects/work/deepsat/code/.claude/rules/= -- =elisp-testing.md= and =elisp.md= in =~/.emacs.d/=, =~/code/gloss/=, =~/code/chime/= +*In scope:* +- Input: prose description of a system + its operational context. +- Output: structured OV-1 *spec* — performers, external actors (other + systems, forces, adversaries), relationships (data/control flows), + narrative captions, classification marking, legend requirements. +- DoDAF 2.02 completeness checklist as a quality gate — verify the + produced spec contains every element a correct OV-1 requires. +- Optional lightweight visual: draw.io XML or Mermaid approximation for + quick review; NOT a finished rendering. -The Elisp pair is the most suspicious — three repos using essentially the same rules. Audit: diff these across the projects, check for drift, then decide whether to canonicalize them under =~/code/rulesets/claude-rules/languages//= and symlink, or leave them as project-local. +*Out of scope:* +- Icon libraries, pictorial assets, finished PowerPoint export. OV-1 + final art belongs to a designer or Craig in Visio/PowerPoint; the + skill's job is the spec and the check, not the slide. +- SV-1, SV-2, UAF, IDEF1X, other viewpoints. Build only when a + concrete need triggers each. -** TODO [#C] Consolidate =claude-templates/Makefile= after fold :chore: +Estimate: 4-6 hours. -Sibling follow-up from the fold child (2026-05-15). After the subtree merge, =rulesets/claude-templates/Makefile= still has its standalone =install= / =uninstall= / =list= / =test-scripts= targets. The =install= target's =bin/ai= logic is now duplicated in =rulesets/Makefile=. Both work; the redundancy is harmless but worth cleaning up. 
+*** Craig's investigation before kickoff -Options: -- *Delete* =claude-templates/Makefile= entirely — forces all install through rulesets root. Cleaner. -- *Strip down* to just =test-scripts= — the one piece not redundant with =rulesets/Makefile=. -- *Leave it* — slight redundancy, no functional harm. +1. Does DeepSat's systems-engineering or marketing team already have an + OV-1 (or the equivalent briefing artifact) for SOFWeek? +2. If yes (scenario 1) — skill is a future asset, not event-load-bearing. + Ship after SOFWeek. Priority drops to =[#C]=. +3. If no, or if the scenario is "Craig may need to produce/iterate an + OV-1 on the fly during the event" (scenario 2) — skill is load-bearing + for the event. Priority upgrades to =[#A]=; build before SOFWeek. +4. Confirm the classification level the skill needs to handle + (unclassified-only? or FOUO markings? affects the classification + block in the spec). +5. Confirm the target rendering format DeepSat uses for OV-1 + deliverables (PowerPoint slide? Cameo? Visio? affects whether the + skill emits draw.io XML vs Mermaid vs pure structured spec). -Triggered by: 2026-05-15 fold session's refactor audit (commit =2d645fc=). +*** Related -** TODO [#C] Refactor =daily-prep.org= to delegate to =triage-intake.org= for the triage section +See also the DoD-specific notations section under the later TODO +(=c4-*= rename revisit) — OV-1 is flagged there as the highest-value +starting point across the DoD notation landscape (SysML, DoDAF/UAF, +IDEF1X). This entry is the execution plan for that starting point. +** DONE [#A] Split team-specific publishing rules out of commits.md :commits: +CLOSED: [2026-05-22 Fri] +Shipped 3cb467e. Moved the DeepSat publishing steps (Linear ticket-state, the Slack notification protocol + channel ID, the GHE host, the team merge norm, the Linear ticket-body structure) out of the global =claude-rules/commits.md= into =teams/deepsat/claude/rules/publishing.md=. 
The global file keeps the universal skeleton and uses seams ("run the project's publishing overlay here if present") like startup-extras. Added =install-team= (targeted per-project copy, keyed on PROJECT, never globally symlinked) and generalized =sync-language-bundle.sh= to keep team overlays fresh at startup (3 new bats; make test green). -=daily-prep.org= still does its own inline triage (Gmail × 3 accounts, Slack, Linear, GHE PRs, calendars) as part of the full prep flow. Now that =triage-intake.org= exists as a standalone scan over the same source set, daily-prep could call it and consume its synthesis instead of duplicating the source-scan logic — DRYs up a 57k-line workflow and keeps both flows in sync when sources change. +Remaining deploy step (cross-project, surfaced to Craig): install the overlay into the DeepSat work project — =make install-team TEAM=deepsat PROJECT== — so it actually loads there. +** DONE [#A] Define a /voice-unavailable fallback in the commits.md publish flow :commits: +CLOSED: [2026-05-22 Fri] +Added an "If =/voice= is unavailable" paragraph to the Single-skill gate in =commits.md=: walk the same patterns inline (the flow already names which matter), state the skill was unavailable and the pass was applied by hand ("/voice unavailable — patterns walked inline"), and flag the missing skill for install. The gate is the pattern walk, not the tooling. The original "=humanizer= unavailable" framing was moot (humanizer → /voice). +** DONE [#A] wrap-it-up Step 3.5 assumes GitHub-family remote :chore:quick: +CLOSED: [2026-05-22 Fri] +:PROPERTIES: +:LAST_REVIEWED: 2026-05-20 +:END: +Documented the assumption inline at =wrap-it-up.org= Step 3.5 (chose the lightweight path over a provider-agnostic rewrite): the =gh= lookup expects a GitHub-family host, holds today via DeepSat on GHE, flagged for update if a future Linear project lands on GitLab/Gitea/Bitbucket. +Triggered by: 2026-05-16 wrap-it-up github.com cleanup (audit of the same file). 
-Scope: -- Identify the sections in =daily-prep.org= that do the inline triage (the email / Slack / Linear / PR / calendar fan-out, plus the "Sources checked: ..." footer at the top of each generated prep doc). -- Replace those sections with "run =triage-intake.org=" and adapt the downstream sections (Heads-up, Day's Priorities, Carry-forwards) to read triage-intake's synthesis output rather than the inline scan results. -- Verify the generated prep doc still has the same shape (Heads-up + Day's Priorities + Carry-forwards + Sources checked). +Step 3.5 (Linear ticket-state hygiene) at =wrap-it-up.org:207= says "the project's GitHub remote — use =gh pr list ...=". Currently fine in practice: the step is Linear-gated, and the only Linear-using project is DeepSat (on =deepsat.ghe.com=, a GitHub-family host where =gh= works). Would break if a future Linear-using project lived on a non-GitHub host (gitlab, gitea, bitbucket). Either drop the GitHub-family assumption (provider-agnostic lookup, harder) or document the assumption explicitly so future projects know the step needs an update if they don't fit. +** DONE [#C] Review pass: tighten skills and rulesets after 2026-05-04 audit +CLOSED: [2026-05-22 Fri] +:PROPERTIES: +:LAST_REVIEWED: 2026-05-20 +:END: +All 55 grouped-index items dispositioned (2026-05-22): ~49 edited across skills, commands, rule files, hooks, and the two playwright skills; several came out moot post-audit (humanizer→voice, skills→commands, typescript ruleset added); the two commits.md items shipped as the team-overlay split + /voice fallback. Freshness-checked each item against current reality before editing. -Origin: came up while authoring =triage-intake.org= on 2026-05-11. +Source notes used in this pass: +- C4 official docs: C4 is notation-independent; System Context and Container + diagrams are enough for most teams; every diagram needs title, key/legend, + explicit element types, and audience-appropriate abstraction. 
+ [[https://c4model.com/diagrams][C4 diagrams]], + [[https://c4model.com/diagrams/notation][C4 notation]], + [[https://c4model.com/abstractions/component][C4 component]] +- arc42 docs: quality requirements need measurable scenarios; section 10 + should reference top quality goals and capture lesser quality requirements + with specific measures. [[https://docs.arc42.org/section-10/][arc42 section 10]], + [[https://quality.arc42.org/articles/specify-quality-requirements][specifying quality requirements]] +- ADR references: ADRs capture one justified architecturally significant + decision and its rationale; Nygard's original guidance emphasizes short, + numbered, repository-stored records and superseding rather than rewriting old + decisions. [[https://adr.github.io/][adr.github.io]], + [[https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions][Nygard ADR article]] +- Playwright docs: prefer user-visible locators and web assertions; locators + auto-wait and retry; =networkidle= is discouraged for testing readiness. + [[https://playwright.dev/docs/best-practices][Playwright best practices]], + [[https://playwright.dev/docs/locators][Playwright locators]], + [[https://playwright.dev/docs/next/api/class-page][Playwright page API]] +- OWASP references: Top 10 2021 includes Broken Access Control, + Cryptographic Failures, Injection, Insecure Design, Security + Misconfiguration, Vulnerable and Outdated Components, Identification and + Authentication Failures, Software and Data Integrity Failures, Security + Logging and Monitoring Failures, and SSRF; WSTG adds a broader testing map + across configuration, identity, authn/z, sessions, input validation, error + handling, cryptography, business logic, client-side, and API testing. 
+ [[https://owasp.org/Top10/2021/][OWASP Top 10 2021]], + [[https://owasp.org/www-project-web-security-testing-guide/latest/4-Web_Application_Security_Testing/][OWASP WSTG]] +- V2MOM references: Salesforce calls the last M "Measures" and emphasizes a + simple alignment document with prioritized Methods, explicit Obstacles, and + measurable outcomes. [[https://trailhead.salesforce.com/content/learn/modules/selfmotivation/get-focused-with-your-personal-v2mom][Salesforce Trailhead personal V2MOM]], + [[https://www.salesforce.com/blog/?p=12][Salesforce V2MOM alignment]] +- Prompt research: the cited Meincke paper is titled "Call Me A Jerk: + Persuading AI to Comply with Objectionable Requests"; its scope is + persuasion increasing compliance with objectionable requests, not a general + proof that persuasion framing improves prompt quality. + [[https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5357179][SSRN paper]] +- Combinatorial testing references: NIST supports t-way combinatorial testing + and notes pairwise is one covering strength, with higher-strength arrays + useful for failures requiring more interacting factors. + [[https://www.nist.gov/publications/practical-combinatorial-testing-beyond-pairwise][NIST beyond pairwise]], + [[https://www.nist.gov/publications/combinatorial-software-testing][NIST combinatorial testing]] -* Rulesets Resolved -** DONE [#C] Fix =cj-scan= false positives on cj fences nested inside other =#+begin_*= blocks :bug: -CLOSED: [2026-05-15 Fri] +*** Grouped index (for batching by area) -=cj-scan.py= was matching =#+begin_src cj:= / =#+end_src= line-by-line -without awareness of enclosing block scopes. A cj fence embedded inside a -=#+begin_example= block (typically when documenting what the == -(for any == other than =cj:= via the more-specific cj-open regex, which -is checked first), it enters a wrapper state where every line is treated as -content until the matching =#+end_= closer fires. 
Inside a wrapper, cj -fence patterns and legacy inline =cj:= lines are both suppressed. +**** Browser testing +- [X] [#A] =playwright-js=: locator/assertion-first guidance (replace raw CSS, =networkidle=) +- [X] [#B] =playwright-js= + =playwright-py=: reconcile headless/visible defaults +- [X] [#B] =playwright-js= + =playwright-py=: remove emoji console markers from examples -Tests: added =TestCjScanNestedFencesIgnored= (6 tests) to -=claude-templates/.ai/scripts/tests/test_cj_scan.py= covering nesting inside -=#+begin_example=, =#+begin_src =, and =#+begin_quote=, plus -regression guards that a wrapper closes cleanly (a subsequent real cj fence -is still detected) and that an unclosed wrapper doesn't silently swallow -later content into false-positive cj blocks. +**** Frontend / UI +- [X] [#B] =frontend-design=: WCAG 2.2 alignment, accessibility non-optional +- [X] [#B] =frontend-design=: harmonize aesthetic guidance with anti-pattern rules -Full =make test-scripts= equivalent (=python3 -m pytest=): 302 passed, 1 -skipped, 0 failures. +**** Security +- [X] [#A] =security-check=: OWASP 2021 + WSTG coverage +- [X] [#B] =security-check=: tooling and offline/network caveats -** DONE [#A] Add =make doctor= — verify ~/.claude/ matches repo + settings.json :feature: +**** Combinatorial testing +- [X] [#B] =pairwise-tests=: t-way escalation guidance beyond pairwise +- [X] [#B] =pairwise-tests=: clarify negative value syntax + generator availability -A drift detector that scans =~/.claude/= and reports anything inconsistent with what the repo expects. Single-command answer to "is my machine consistent with rulesets?" 
+**** V2MOM +- [X] [#A] =create-v2mom=: rename Metrics → Measures (Salesforce alignment) +- [X] [#B] =create-v2mom=: prevent task migration from turning V2MOM into a backlog +- [X] [#B] =create-v2mom=: mitigation/owner fields for Obstacles -*** Why this matters +**** Prompt engineering +- [X] [#A] =prompt-engineering=: correct/narrow Meincke citation +- [X] [#B] =prompt-engineering=: eval-harness requirement for production prompts -A 2026-05-06 sweep found =~/.claude/hooks/= didn't exist on this machine even though =settings.json= referenced =~/.claude/hooks/precompact-priorities.sh= as a PreCompact hook. Compaction would have silently failed to invoke the hook. The fix was =make install-hooks=, but the breakage was invisible until I happened to grep for it. =make doctor= run regularly (or even as part of session start) would catch this kind of drift in seconds instead of after the fact. +**** Codify +- [X] [#B] =codify=: stale-entry review + privacy checks before writing project =CLAUDE.md= -*** Checks +**** Code review +- [X] [#A] =review-code=: resolve local-verification vs CI boundary +- [X] [#B] =review-code=: =CLAUDE.md= citation scope for public artifacts +- [X] [#B] =review-code=: relax three-strengths rule for tiny/failing diffs -- Every entry in =settings.json= ="hooks"= block points at a file that exists. -- Every entry in =enabledPlugins= has a matching install under =~/.claude/plugins/data/=. -- Every skill in =$(SKILLS)= has a working symlink at =~/.claude/skills/=. -- Every rule in =$(RULES)= has a working symlink at =~/.claude/rules/=. -- Every default hook has a symlink at =~/.claude/hooks/= (warn-only — opt-out is legitimate). -- =settings.json= and =.mcp.json= symlinks resolve to the rulesets versions. -- =mcp/install.py= state matches =claude mcp list= (every server in =servers.json= is registered). -- No dangling symlinks anywhere under =~/.claude/=. 
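The dangling-symlink check that closes the =make doctor= list can be sketched in a few lines of portable shell. This is a minimal illustration, not the shipped implementation: the function name =find_dangling_symlinks= and the =ok= / =FAIL= line shapes are assumptions modelled on the output convention the task describes, and the default root of =~/.claude= is taken from the check list.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not from the repo):
# print every symlink under a root whose target does not exist.
# Emits one "FAIL <path>" line per dangling link and returns non-zero
# if any were found, so it can ride a pre-flight check.
find_dangling_symlinks() {
    root="${1:-$HOME/.claude}"
    # -type l selects symlinks. `test -e` follows the link, so it fails
    # when the target is missing. This avoids GNU find's -xtype l.
    dangling=$(find "$root" -type l ! -exec test -e {} \; -print)
    if [ -n "$dangling" ]; then
        printf '%s\n' "$dangling" | sed 's/^/FAIL /'
        return 1
    fi
    echo "ok no dangling symlinks under $root"
    return 0
}
```

Calling =find_dangling_symlinks= with no argument scans =~/.claude=; passing a directory scans that tree instead, which makes the check easy to exercise against a temporary fixture.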
+**** PR / review responses +- [X] [#A] =respond-to-review=: remove review-process language from commit messages +- [X] [#B] =respond-to-review=: use unresolved threads + resolution state +- [X] [#B] =respond-to-cj-comments=: drop personal absolute paths from public-writing (moot — already clean) +- [X] [#B] =respond-to-cj-comments=: fallback when =humanizer= or =emacsclient= unavailable (moot — superseded by /voice + VERIFY pattern) -*** Output +**** Branch workflow +- [X] [#A] =finish-branch=: fix base-branch detection +- [X] [#B] =finish-branch=: worktree-aware pull/merge safety +- [X] [#B] =start-work=: tool-availability + ceremony-scaling rules +- [X] [#B] =start-work=: claim-before-justify rollback risk -One line per check: =ok= / =WARN= / =FAIL=. Final summary: =N ok, M warnings, K failures=. Exit non-zero on any failure so it can ride a pre-flight check. +**** Tests / TDD +- [X] [#B] =add-tests=: fix missing =typescript-testing.md= reference or add ruleset (moot — ruleset now exists) +- [X] [#B] =add-tests=: explicit exceptions to "all three categories per function" -** DONE [#A] Build =voice= skill — combine =humanizer= with universal + personal style passes :feature: +**** Debugging / RCA +- [X] [#B] =debug=: capture environment + recent-change context before hypotheses +- [X] [#B] =root-cause-trace=: constrain defense-in-depth to trust boundaries +- [X] [#B] =five-whys=: require evidence + counterfactual validation per why -Combine =humanizer= with universal good-writing passes (Strunk & White, Orwell, Plain English) and the personal-style passes from =commits.md=. Two modes — =general= for arbitrary writing, =personal= for commits/PRs/comments — share a foundation and diverge on register. 
+**** Brainstorming +- [X] [#B] =brainstorm=: timebox + research/source rules for high-stakes designs + +**** Architecture +- [X] [#B] =arch-decide=: timeless examples, drop unverifiable claims +- [X] [#B] =arch-decide=: standardize statuses + immutability language +- [X] [#B] =arch-design=: threat modeling + privacy/compliance as first-class inputs +- [X] [#B] =arch-design=: separate paradigms from tactical patterns +- [X] [#B] =arch-document=: arc42/Q42 quality scenarios +- [X] [#B] =arch-document=: staleness + ownership metadata for generated docs +- [X] [#B] =arch-evaluate=: confidence levels for framework-agnostic findings +- [X] [#B] =arch-evaluate=: report skipped tool checks explicitly + +**** C4 modeling +- [X] [#A] =c4-analyze= + =c4-diagram=: notation/output fallback (not draw.io-only) +- [X] [#B] =c4-analyze= + =c4-diagram=: clarify abstraction boundaries + +**** Global rules +- [X] [#B] =commits.md=: split DeepSat/Linear/Slack-specific from global rules → promoted to a top-level task (deferred for Craig) +- [X] [#A] =commits.md= + publish flows: =humanizer=-unavailable fallback → promoted to a top-level task (deferred; humanizer premise moot) +- [X] [#B] =verification.md=: explicit "unable to verify" reporting standard +- [X] [#B] =testing.md=: property-based + mutation testing as escalation paths +- [X] [#B] =testing.md=: soften absolute TDD with explicit spike protocol +- [X] [#B] =subagents.md=: capability/availability + cost checks -Built and shipped 2026-05-07: =voice/SKILL.md= with 39 numbered patterns walked sequentially. Patterns 1-25 carried over from humanizer, 26-31 are universal good-writing additions, 32-39 are personal-only. Migrated three callers (=commits.md=, =respond-to-cj-comments.md=, =start-work.md=). Removed the standalone =humanizer= skill since voice supersedes it. 
+**** Languages +- [X] [#A] =python-testing.md=: revisit in-memory SQLite guidance +- [X] [#B] =python-testing.md=: separate "never mock ORM" from unit-test boundaries +- [X] [#B] =elisp.md=: drop tool-specific advice +- [X] [#B] =elisp-testing.md=: batch-mode + native-comp caveats -*** Why this matters +**** Hooks +- [X] [#A] =hooks/README.md=: include =destructive-bash-confirm.py= in install/settings snippets +- [X] [#A] =hooks/git-commit-confirm.py= + =gh-pr-create-confirm.py=: inspect message/body files referenced by =-F= / =--body-file= +- [X] [#B] =hooks/destructive-bash-confirm.py=: shell-aware command parsing (not regex) -Three transformations want to run together for personal-mode artifacts (commits, PR titles + bodies, PR comments) but lived in three places: =humanizer= as a skill, S&W-style universal rules nowhere (applied ad-hoc), and the personal-style passes as prose steps in =commits.md= that got re-applied by hand each time. Costs: (1) the "I forgot pass (e)" failure mode — skipping a pass without flagging is a defect but happens in practice. (2) No single-call invocation of the full transform. (3) General-mode writing (research notes, philosophy, history) got only humanizer with no universal-prose pass at all. Combining brings them under one skill with one invocation. +*** 2026-05-22 Fri @ 15:47:10 -0500 Made playwright guidance locator/assertion-first, dropped networkidle-as-readiness -*** Design +Rewrote the readiness guidance in both =playwright-js/SKILL.md= and =playwright-py/SKILL.md=: reconnaissance now waits for a visible app landmark via a web assertion or locator (=expect(...).toBeVisible()= / =get_by_role(...).wait_for()=), not =networkidle= (which Playwright discourages). 
Updated the login/form examples to =getByLabel=/=getByRole= + web assertions, the API_REFERENCE.md waiting section, and =lib/helpers.js= defaults (=waitForPageReady= now defaults to =load= and prefers a caller-supplied landmark; =authenticate= races the success indicator over a =load= navigation). node --check passes. -Two modes: +*** 2026-05-22 Fri @ 14:23:02 -0500 Added headed/headless decision tables to both playwright skills -- *general* (default) — for arbitrary writing not bound for commit/PR/comment publishing (research notes, philosophy/history essays, emails, README prose). Runs: - - humanizer (current behavior — strip AI-generated-writing fingerprints) - - tier-1 universal passes (canonical good-writing rules) - - the 2 personal-style passes that have no register conflict (jargon-fragment rewrite, noun-ified verbs) +Added matching purpose-based decision tables to =playwright-js/SKILL.md= (was "always visible") and =playwright-py/SKILL.md= Best Practices (was "always headless"). Each names its own default and points at the other skill, so the difference is deliberate, not a habit-flip: headed for interactive debugging, headless for CI/pytest. Also softened the absolutist "Always launch... headless" comment in the py example. -- *personal* — for commits, PR titles + bodies, PR comments. Runs general PLUS: - - 8 personal-only passes (first-person rewrite, semicolons, contractions, sentence-split, felt-experience, sentence fragments, terse cut, public-artifact scope check) +*** 2026-05-22 Fri @ 15:47:10 -0500 Removed emoji console markers from the playwright skills -The 8 personal-only passes are explicitly *not* in general mode. They conflict with academic / literary / philosophical register. Forcing first-person on a Foucault essay or stripping felt-experience from a journal entry would damage the writing. 
+Replaced every emoji status marker with a plain ASCII prefix across =playwright-js/= (run.js, lib/helpers.js, SKILL.md) and =playwright-py/= (SKILL.md, examples/*.py): 📦/⚡/📄/📥/🎭/🚀/📋/✅/❌/🔍/📸/✓/✗ → =[setup]=/=[run]=/=[ok]=/=[error]=/=[fail]= etc. Post-change emoji grep is clean (excluding node_modules); node --check and py_compile pass. -*** Tier 1 universals (v1) +*** 2026-05-22 Fri @ 14:35:16 -0500 Made accessibility a non-optional WCAG 2.2 gate in frontend-design -From Strunk & White, Orwell's "Politics and the English Language", Plain English Campaign, and Garner's Modern English Usage. Each is a detection-pattern + rewrite-rule pair, mechanical enough to apply consistently across runs. +Added an "Accessibility Gate (required before handoff)" section to =frontend-design/SKILL.md= covering keyboard operation, focus visibility, focus-not-obscured (2.2), target size (2.2), contrast, reduced motion, labels, and semantic structure — a baseline for all frontend work, not just interactive components. Rewrote the Build/Review phases to build accessibly as you go and clear the gate before handoff, and bumped =references/accessibility.md= from WCAG 2.1 to 2.2 with backing detail for the new criteria. -- *Omit needless words* — curated phrase list (=the fact that= → =that=/=because=, =in order to= → =to=, =at this point in time= → =now=, =due to the fact that= → =because=, =for the purpose of= → =to=, =in spite of= → =despite=, etc.) -- *Long word → short word* — Plain English wordlist (~150 entries: =utilize=→=use=, =commence=→=start=, =terminate=→=end=, =facilitate=→=help=, =demonstrate=→=show=, =sufficient=→=enough=, =prior to=→=before=, =subsequent to=→=after=, =in the event that=→=if=, =a great deal of=→=much=) -- *Active over passive voice* — detect "to be + past-participle" patterns. Suggestion-only in v1 (auto-rewrite is risky in technical contexts where passive is appropriate); graduate to auto-rewrite for unambiguous cases in v2. 
-- *Comma splices* — detect independent clauses joined only by comma; rewrite to period or semicolon-then-period. -- *Cliché flag* — small curated list (=at the end of the day=, =moving forward=, =going forward=, =at this juncture=, =circle back=, =low-hanging fruit=, =deep dive=, =leverage= as verb). +*** 2026-05-22 Fri @ 14:35:16 -0500 Added a "creative but bounded" section to frontend-design -*** Tier 2 universals (v2) +Added a subsection under Frontend Aesthetics framing the bold/maximalist directions as tools, not obligations: domain fit, readability first, responsive stability, and no decorative effect that degrades the workflow. Reconciles rather than contradicts the maximalist encouragement (maximalism stays on the table as deliberate usable density), and ties the readability bullet to the new accessibility gate. -- *Positive over negative form* (S&W) — =not unlike= → =like=, =do not fail to= → =remember to=, =did not pay any attention= → =ignored= -- *Garner-style word-pair corrections* — comprise/compose, less/fewer, that/which (restrictive vs nonrestrictive), affect/effect, principal/principle -- *Parallelism in lists* — detect mismatched grammar in bullet items -- *Tense consistency* — flag mid-paragraph tense shifts -- *Acronym definition on first use* — detect uppercase tokens used before being expanded +*** 2026-05-22 Fri @ 14:35:16 -0500 Updated security-check to OWASP Top 10 2021 + WSTG mapping -*** Tier 3 (v3, may not land) +Replaced the older six-category list in =.claude/commands/security-check.md= with the full Top 10 2021 set, each finding mapped to a 2021 category or WSTG area. Added the four missing categories (Insecure Design, Software and Data Integrity Failures, Security Logging and Monitoring Failures, SSRF) plus explicit checks for object/function-level authorization, SSRF on URL-fetch paths, update/plugin/dependency integrity, and logging/monitoring gaps. 
-- *Concrete-over-abstract* preference -- *Emphatic word at sentence end* (S&W rule 18) -- *Vary sentence length / rhythm* -- *Reading-grade-level scoring* (Hemingway-style) +*** 2026-05-22 Fri @ 14:35:16 -0500 Added scanner tooling + network caveats to security-check -*** Personal-style pass placement +Added an optional configured-scanners step (=gitleaks=/=trufflehog= secrets, =semgrep= source patterns, OSV scanner, lockfile-diff review) that supplements the manual scans, plus a network caveat: dependency audits that can't run (offline, tool absent, DB unreachable) must report "not run" naming the tool and reason, never read as a pass. Carried that into the no-issues summary. -| # | Pass | Mode | Why | -|---|------|------|-----| -| 1 | First-person voice rewrite | personal only | Forces "I" voice; wrong for academic prose where third-person and "we" are conventional | -| 2 | Jargon-fragment → complete sentence | both | Universal clarity, no genre conflict | -| 3 | Semicolon → period/comma | personal only | Semicolons are conventional in long-form / academic prose | -| 4 | Contractions ("it's", "don't") | personal only | Academic and formal writing typically avoids contractions | -| 5 | Sentence split on conjunctions | personal only | Foucault, Hegel, Adorno deliberately use long compound sentences | -| 6 | Felt-experience narration ("I'll feel this every time") | personal only | Personal essays *use* felt-experience as content | -| 7 | Noun-ified verbs ("the ask", "a learn", "the spend") | both | Targets corporate-speak with curated wordlist; doesn't catch philosophical nominalizations like "the becoming" | -| 8 | Sentence fragments → complete (in prose) | personal only | Fragments are valid stylistic devices in literary prose | -| 9 | Terse cut (rhetorical padding: "worth noting", "it's important to understand") | personal only | Tier 1 omit-needless-words covers the worst offenders universally; aggressive cut conflicts with academic register | -| 10 | 
Public-artifact scope check (local paths, private repos, personal tooling) | personal only — *flag-only*, no auto-rewrite | Operational/safety check, not stylistic; auto-masking risks silently editing meaningful text | +*** 2026-05-22 Fri @ 14:35:16 -0500 Added t-way escalation guidance to pairwise-tests -*** Inclusive-language pass — explicitly excluded +Added an "Escalating Beyond Pairwise (t-way)" subsection: start with pairwise across the whole space, then escalate specific high-risk clusters to 3-way+ when history, safety, security, or domain coupling says a fault needs more than two interacting factors. Lists escalation triggers and shows the sub-model order syntax (={ A, B, C } @ 3=) vs a blanket =/o:3= bump, stressing targeted not uniform escalation. Cites NIST combinatorial-testing work. -Considered and rejected. Conflicts with planned writing on philosophy/history topics (Foucault on sexuality and gender, history of slavery in New Orleans). Wordlist substitutions would override deliberate vocabulary choices in those genres. +*** 2026-05-22 Fri @ 14:35:16 -0500 Clarified PICT ~ syntax + honest generator-availability path in pairwise-tests -*** V1 scope +Added a "~ prefix" explanation (PICT marker tagging a value as negative/invalid, not an arithmetic operator; PICT pairs negatives with valid values once and strips the marker before the SUT) and a stop-at-the-model rule: if neither the =pict= binary nor =pypict= is present, produce the model and stop rather than hand-writing a table and passing it off as PICT output. 
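A minimal sketch of the stop-at-the-model rule described above, assuming a POSIX shell. The helper names (`write_model`, `run_or_stop`) and the parameters in the model are hypothetical and do not come from the skill itself. The model uses the `~` prefix to tag one invalid value and a sub-model line to raise one cluster to order 3.

```shell
# Hypothetical helper: write a small PICT model to the path given as $1.
# "~-1" tags -1 as a negative (invalid) value; it is not arithmetic.
# The "{ ... } @ 3" line escalates only that cluster to 3-way coverage.
write_model() {
  cat > "$1" <<'EOF'
Browser: chrome, firefox, safari
OS: linux, macos, windows
Size: 10, 100, ~-1
{ Browser, OS, Size } @ 3
EOF
}

# Hypothetical helper: run a generator only if one is actually installed.
# If neither is present, report that and stop at the model instead of
# hand-writing a table.
run_or_stop() {
  model=$1
  if command -v pict >/dev/null 2>&1; then
    pict "$model"
  elif python3 -c 'import pypict' >/dev/null 2>&1; then
    echo "pypict is importable; generate cases from $model with it"
  else
    echo "no PICT generator found; stopping at the model: $model"
    return 1
  fi
}
```

A caller would run `write_model model.txt` and then `run_or_stop model.txt`, treating a non-zero return as "model delivered, no generated table".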
-- [ ] Skill at =~/code/rulesets/voice/= with =SKILL.md= -- [ ] Frontmatter with positive triggers (commit, PR, comment, "humanize", "voice pass") and negative triggers (code, structured data, plain bullet lists) -- [X] Mode invocation: default = =general= when invoked bare; =personal= invoked explicitly by publish-context callers -- [X] humanizer content migrated from =humanizer/= → =voice/= -- [X] Tier 1 universal passes implemented (5 patterns: #26-30, plus #31 noun-ified verbs as a universal personal addition) -- [X] 2 personal passes that run in both modes (#30 jargon-fragment, #31 noun-ified verbs) -- [X] 8 personal passes that run in personal mode only (#32 first-person, #33 semicolons, #34 contractions, #35 sentence-split, #36 felt-experience, #37 fragments, #38 terse cut, #39 scope check) -- [X] Each pass = detection-pattern + rewrite-rule pair (#39 is detection + flag-only) -- [X] Total v1 pattern count: 31 in general mode (humanizer's 25 + 4 tier-1 + 2 universal personal); +8 personal-only = 39 in personal mode -- [X] Update =commits.md= to invoke =/voice personal= instead of "run =humanizer= and apply five passes manually" -- [X] Remove the existing =humanizer/= skill (no callers outside this repo, all migrated) -- [X] =make doctor= still passes -- [X] =make lint= clean +*** 2026-05-22 Fri @ 14:43:17 -0500 Renamed Metrics → Measures throughout create-v2mom -*** v2 (deferred) +Full rename across =.claude/commands/create-v2mom.md= (acronym expansions, Phase 7 heading, the "Measures must be measurable" principle, exit criteria, review questions, red flags, examples) to match Salesforce's official term. Kept the "vanity metrics" idiom intact — it's the anti-pattern term, not a section reference. 
-- [ ] Tier 2 universals (positive form, word-pair corrections, parallelism, tense consistency, acronym definition) -- [ ] Per-pass severity flags for Tier 1 active-voice (suggestion-only when actor is implicit; auto-rewrite when actor is named) -- [ ] Reporting mode: list which passes fired and which were no-ops +*** 2026-05-22 Fri @ 14:43:17 -0500 Split strategy from execution in create-v2mom task migration -*** v3 (aspirational, may not land) +Rewrote Phase 8 (and tightened Phase 5.5): tasks stay in the backlog grouped by method, and each method gains a one-line link to where its tasks live, instead of transplanting the task tree into the V2MOM. Strategy (V2MOM) and execution (backlog) are now explicitly separate sources of truth, keeping the V2MOM concise. -- [ ] Tier 3 (concrete-over-abstract, emphatic-word position, sentence-length variation, reading-grade scoring) -- [ ] Progressive disclosure split: =voice/SKILL.md= orchestrator + =voice/passes/.md= per pass with worked examples +*** 2026-05-22 Fri @ 14:43:17 -0500 Made create-v2mom obstacles operational (mitigation/owner/cadence) -*** Migration (resolved) +Phase 6 now captures, per obstacle: name, manifestation, stakes, mitigation, owner, and review cadence — with a worked example per domain (health/finance/software), a "good obstacle" characteristic, a Phase 9 review question, and a red flag for candid-but-not-operational obstacles. An obstacle without a countermove is now flagged as an observation, not a plan. -Decision: deleted =humanizer/= entirely. Three callers (=commits.md=, =respond-to-cj-comments.md=, =start-work.md=) all updated to invoke =/voice= directly. No alias needed since nothing outside the repo invoked humanizer. 
+*** 2026-05-22 Fri @ 14:43:17 -0500 Corrected and narrowed the Meincke citation in prompt-engineering -*** Naming alternatives considered +Fixed the title to "Call Me A Jerk: Persuading AI to Comply with Objectionable Requests" (SSRN abstract_id=5357179) in all three spots (frontmatter, Seven Principles intro, References). Reframed the ~33%→72% result as what it is — a prompt-safety caution that persuasion raises compliance with objectionable requests — explicitly not evidence that persuasion framing improves engineering prompt quality. Kept the seven principles as a tone vocabulary. -- =voice= — chosen. Captures both modes; broad enough. -- =polish= — descriptive of multi-pass nature; less prescriptive about whose voice. -- =house-style= — signals "this is the house style"; appropriate for personal repo. -- =commit-voice= — too narrow (passes apply to research notes, emails, etc. in general mode). -- =humanize= (extending current) — undersells the universal + personal additions. +*** 2026-05-22 Fri @ 14:43:17 -0500 Added an eval-harness requirement to prompt-engineering critique mode -*** Open questions before implementation +Added critique step 7 + a checklist line: for fragile or reusable/production prompts, write 3-5 adversarial/edge inputs, run both the old and new prompt against each, and record the behavioral delta. A throwaway prompt can ship on the rewrite alone; a discipline/reused/production one can't. Without examples, "the rewrite is better" is an assertion, not a result. -Resolved during implementation: -- Default mode when =/voice= is invoked bare: =general=. Personal-context callers (=commits.md= publish flow, =respond-to-cj-comments.md=) invoke =/voice personal= explicitly. Avoids accidentally first-person-ifying research notes. -- Reporting: skill prints "Summary of changes" listing which patterns fired (audit value). -- Public-artifact scope check (#39): flag-only, user resolves manually. Blocking would frustrate on legitimate path mentions. 
-- Tier 1 active-voice detection: suggestion-only in v1. Auto-rewrite for unambiguous cases deferred to v2. +*** 2026-05-22 Fri @ 14:43:17 -0500 Added mandatory stale-entry + privacy pre-write checks to codify -** DONE [#B] Add =--archive-done= mode to =.ai/scripts/todo-cleanup.el= :feature: +Added a "Mandatory pre-write checks" block at the top of Phase 3 (Write) in =.claude/commands/codify.md=: a stale-entry scan (update/remove no-longer-true entries in place, don't append contradictions around them) and a privacy/leak check carrying both questions verbatim — "safe if the project were public?" and "belongs in private memory instead?" — routing private content to auto-memory. Gates, not background guidance. -Opt-in mode that moves every level-2 subtree whose TODO state is DONE or CANCELLED out of the "Open Work" section and into the "Resolved" section of the same org file, subtree intact. +*** 2026-05-22 Fri @ 14:06:41 -0500 Scoped review-code's CI-trust rule to reviewing, not shipping -- *Section matching.* Key on a top-level heading containing "Open Work" and one containing "Resolved" — that pairing is the only naming consistent across projects (=Work Open Work= / =Work Resolved= here; bare =Open Work= / =Resolved= elsewhere). Require exactly one match for each; otherwise skip with a clear message, no crash. -- *Modes.* =--check= previews and writes nothing, same as the existing hygiene pass. Idempotent. Not run by default in the wrap-up flow — archiving is consequential, so it stays opt-in: =emacs --batch -q -l todo-cleanup.el --archive-done FILE=. -- *Edge cases.* Source or target section missing; subtree at EOF; nested DONE subtree under an open parent stays put (only level-2 entries move); nothing to move → clean no-op. -- *Tests.* TDD with ERT — the project's first elisp tests. Fixtures (synthetic) under =.ai/scripts/tests/=; run via =make test= (rulesets) or =make test-scripts= (claude-templates), which run pytest + every =tests/test-*.el= ERT suite. 
Cases: one DONE level-2 moves; multiple; CANCELLED also moves; structural (no-state) headings don't move; nested DONE under an open parent stays; level-2 DONE with open level-3 children moves intact; subtree at EOF; missing source/target section; ambiguous "Resolved"; lowercase headings; nothing-to-do; idempotency; =--check= preview + its idempotency; realistic-sample integration. +Expanded the False-Positive Filter bullet in =review-code/SKILL.md=: "trust CI, don't run builds" applies to reading a diff, not producing one. A pre-commit/pre-push flow still owes the local verification =verification.md= requires (run the suite or state "not run because..."). Closes the apparent contradiction with =verification.md= / =finish-branch=. -Origin: came up while scrubbing a project's todo.org on 2026-05-11 — moving a big completed PROJECT subtree (plus a few smaller ones) into the Resolved section by hand was the cue to build a reusable tool. +*** 2026-05-22 Fri @ 14:06:41 -0500 Added private-vs-public CLAUDE.md citation modes to review-code -Built and shipped 2026-05-11: =--archive-done= added to =.ai/scripts/todo-cleanup.el= test-first; 13-test ERT suite (=tests/test-todo-cleanup.el=) + realistic synthetic fixture (=tests/fixtures/todo-sample.org=), wired into =make test= / =make test-scripts= alongside pytest. The CLI dispatch moved into =tc-main= behind a guard so the suite can =require= the file without firing it. Section matching is case-insensitive and tolerates the = Open Work= / = Resolved= naming variants. Opt-in only — not wired into the wrap-up flow. Source of truth is =~/projects/claude-templates/=; rsync'd into this repo. +Expanded the Content scope section in =review-code/SKILL.md= with two modes: a private/internal review cites =CLAUDE.md= directly; a public/team review translates the rule into the engineering reason it encodes and doesn't name the rules file (a teammate can act on the reason, not on a file they can't reach). 
Same principle =commits.md= states for personal tooling in public artifacts. -** DONE [#B] Encode follow-up filing rules into =/start-work= -CLOSED: [2026-05-15 Fri] +*** 2026-05-22 Fri @ 13:48:14 -0500 Relaxed review-code "three strengths" to up-to-three-or-none -Phase 4 step 5 of =/start-work= ("refactor audit") says any candidate that isn't fix-now must land in one of three buckets: fold-into-related-commit, separate =refactor:= commit, or "file a ticket or todo.org entry." The third disposition doesn't say *where* — which leaves the orchestrator picking a location ad-hoc. Result: follow-ups buried under children of an epic parent get orphaned when the parent closes, or follow-ups for standalone tasks scatter across the file with no convention. +Changed all three "three minimum" spots in =review-code/SKILL.md= (Strengths section, Critical Rules DO list, Anti-Patterns) to "up to three specific; say none found on a tiny or weak diff." Reframed the old "No Strengths section" anti-pattern as "Skipping strengths out of laziness" so a substantive diff still demands them while a weak one can honestly report nothing notable. Landed alongside Craig's adjacent edit telling reviewers not to explain why a strength is good (sycophantic padding). -Proposed placement rule (already memorized for this project as =feedback_followups_as_siblings.md=, generalizing): +*** 2026-05-22 Fri @ 14:12:24 -0500 Removed review-process language from respond-to-review commit guidance -- *Epic-style parent task* (level-2 with multiple level-3 children) → follow-ups file as level-2 *siblings* of the parent. Stays visible after parent closure. -- *Standalone task* (level-2 with no children, or a level-3 inside another structure) → follow-up files as a new level-2 top-level entry in the same =* Open Work= section. Don't nest under the originating task. 
+Replaced the =fix: Address review — [description]= example (and the matching description-line phrasing) in =.claude/commands/respond-to-review.md= with "name the actual fix (=fix: validate export filename=), not the review that prompted it." Killed the non-ASCII dash and the process-in-commit pattern that conflicted with =commits.md=. -Both cases: include a "Triggered by: " line so a future reader sees what surfaced it. +*** 2026-05-22 Fri @ 14:12:24 -0500 Made respond-to-review fetch unresolved threads + resolve after verification -Update =.claude/commands/start-work.md= Phase 4 step 5's "Disposition for each candidate" section to spell this out. Update any cross-references in =commits.md= or other files that touch the discipline. +Rewrote section 1 (Gather) in =.claude/commands/respond-to-review.md= to pull =reviewThreads= via =gh api graphql= with =isResolved=, skipping already-resolved threads so settled feedback isn't re-processed; top-level conversation comments still come from REST. Added a section-4 step: reply and resolve a thread only after the fix is verified, never before. -Triggered by: 2026-05-15 fold-epic session — Craig flagged the gap mid-flight after I'd surfaced a follow-up but hadn't filed it. -** DONE [#A] Consolidate =.ai/= template infrastructure (fold + audit + install-ai + ratio) :feature: -CLOSED: [2026-05-15 Fri] +*** 2026-05-22 Fri @ 14:12:24 -0500 Verified respond-to-cj-comments no longer embeds an absolute path (moot) -End-state: one repo (=rulesets=) is the single source of truth for =.ai/= template content. =make audit= verifies and applies drift across every =.ai/=-using project on the machine. =make install-ai= bootstraps new projects. Same setup propagated to ratio so both machines run the same way. +Already resolved by a prior migration: =grep= for =/home/= and =/Users/= in =.claude/commands/respond-to-cj-comments.md= returns nothing. The public-writing section refers to the rules by name, not by local path. No edit needed. 
-Today (2026-05-15) the canonical-source rule got violated again: rulesets commit =372fb76= added a wrap-up subsection to =rulesets= without going through =claude-templates= first, and the next session's startup rsync was about to silently undo it. Two-repo coordination is the root cause; fold solves it. +*** 2026-05-22 Fri @ 14:12:24 -0500 Closed respond-to-cj-comments humanizer/emacsclient fallback (largely moot) -Build order: fold first (others depend on the new canonical path), then audit + install-ai in parallel, then test, then propagate to ratio. +Overtaken by two later changes: =/humanizer= was replaced by =/voice personal= (no =/humanizer= invocation remains), and the mandatory =emacsclient= summary-open was replaced by the in-place VERIFY-task pattern (workflow line ~262, Craig's 2026-05-12 standing instruction). Only a stale descriptive phrase remained — tidied "humanizer's signs of AI writing" to "the signs of AI writing." The original fresh-environment-fallback concern no longer applies as written. -*** DONE [#A] Fold =claude-templates= into rulesets -CLOSED: [2026-05-15 Fri] +*** 2026-05-22 Fri @ 14:51:37 -0500 Fixed finish-branch base-branch detection -Two repos, one source of truth. =~/projects/claude-templates/= is the canonical =.ai/= template that gets rsync'd into every project at session start. Keeping it standalone means a second =git pull= in startup Phase A.0, a second remote to push to at wrap-up, and a split history any time a change touches both. Folding it into =rulesets/claude-templates/= gives one repo to clone on a fresh machine and one place to edit templates. +Rewrote Phase 2: resolve the base *branch name* in priority order (open PR's =baseRefName=, then =git symbolic-ref --short refs/remotes/origin/HEAD= stripped, then ask), and compute the merge-base *SHA* separately only where a commit range is needed. Made the branch-name-vs-merge-base distinction explicit, since the old command returned a SHA where a branch name was needed. 
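The priority order above could be sketched roughly as follows, assuming a POSIX shell, a remote named `origin`, and an optional `gh` CLI. The function names (`strip_remote`, `resolve_base_branch`, `merge_base_sha`) are hypothetical and are not taken from `finish-branch` itself.

```shell
# Turn a remote-tracking ref such as "origin/main" into a bare branch name.
# Assumes the remote is called "origin".
strip_remote() {
  printf '%s\n' "${1#origin/}"
}

# Resolve the base *branch name*: open PR first, then origin/HEAD.
# Returns non-zero when neither source answers, so the caller can ask.
resolve_base_branch() {
  base=""
  if command -v gh >/dev/null 2>&1; then
    base=$(gh pr view --json baseRefName --jq .baseRefName 2>/dev/null) || base=""
  fi
  if [ -z "$base" ]; then
    ref=$(git symbolic-ref --short refs/remotes/origin/HEAD 2>/dev/null) || ref=""
    if [ -n "$ref" ]; then
      base=$(strip_remote "$ref")
    fi
  fi
  if [ -z "$base" ]; then
    return 1
  fi
  printf '%s\n' "$base"
}

# Compute the merge-base *SHA* separately, only where a commit range is needed.
merge_base_sha() {
  git merge-base HEAD "origin/$1"
}
```

Keeping the two functions apart mirrors the distinction the entry draws: `resolve_base_branch` yields a name such as `main`, while `merge_base_sha main` yields a commit for use in ranges like `git log <sha>..HEAD`.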
-**** Open design choices +*** 2026-05-22 Fri @ 14:51:37 -0500 Made finish-branch merge safer + worktree-aware -- *History.* =git subtree add --prefix=claude-templates ~/projects/claude-templates main= preserves the 84-commit history under the new prefix. Plain content copy (=cp -a= + =git add=) is simpler but loses history. Either is fine since the standalone repo stays archived on =cjennings.net=. -- *Layout.* =rulesets/claude-templates/= mirrors the old repo name and sits next to =claude-rules/= cleanly. Alternative: absorb =.ai/= directly under a different name (=rulesets/.ai-template/= or similar). First option is clearer. -- *bin/ai.* The standalone Makefile symlinks =$HOME/.local/bin/ai → bin/ai=. After the move, fold that into rulesets' Makefile as another install target. +Added pre-flight checks to Option 1 (Merge Locally): dirty-tree refusal with no auto-stash, protected-branch awareness, upstream-gated =git pull --ff-only=, and merge-commit-vs-rebase as a team-policy choice instead of a hardcoded =--no-ff=. Replaced the fragile =git worktree list | grep = detection with a =git rev-parse --git-dir= vs =--git-common-dir= comparison plus =git worktree list --porcelain= for the path. -**** Mechanical steps +*** 2026-05-22 Fri @ 14:51:37 -0500 Added tool-availability + ceremony-scale paths to start-work -1. Subtree-merge or copy =~/projects/claude-templates/= into =rulesets/claude-templates/=. -2. Update 3 references in rulesets: - - =.ai/protocols.org= line 163 — pointer in the "Let's run/do the X workflow" section. - - =.ai/workflows/cross-agent-comms.org= line 8 — promotion-target path. - - =.ai/workflows/startup.org= lines 22, 96-98 — Phase A.0 pull + Phase A rsync sources. -3. Update Phase A.0 of =startup.org= to pull rulesets instead of claude-templates. Inside rulesets sessions, the existing project-repo pull already covers it. 
Outside rulesets (every other project's session), Phase A.0 needs an explicit =git pull= on =~/code/rulesets/= before the rsync — otherwise the templates will be stale. -4. Replace =~/projects/claude-templates/= with a symlink to =~/code/rulesets/claude-templates/= for transition continuity. -5. After every active project has had one session start (and rsync'd the new =startup.org=), drop the symlink and archive =cjennings.net:git/claude-templates.git=. +Added a "Tool availability" section (graceful degradation when Linear MCP / =gh= / =/voice= / Playwright are missing — do what's available, surface what isn't, don't block) and a "Ceremony scale" section (trivial / small / standard tiers so a two-line fix skips ticket+branch+gates unless asked). The =humanizer= reference in the original item is moot — the file already uses =/voice= throughout. -**** Bootstrap gap +*** 2026-05-22 Fri @ 14:51:37 -0500 Resolved start-work claim-before-justify rollback risk -Every project on the machine has a =.ai/workflows/startup.org= that rsyncs from =~/projects/claude-templates/=. Until each project's startup.org gets refreshed (which happens via the rsync itself), the old path needs to keep resolving. The symlink at step 4 is the bridge: old paths resolve into the new location, the rsync delivers the updated startup.org, next session uses the new path directly. +Split the claim by tracker type: personal todo.org claims defer to after the Justify gate (a killed task needs no rollback), while team trackers (Linear/GitHub) still claim first to signal intent but record prior state (status, assignee, label) so the Phase 2 rollback restores exactly it. Updated the per-tracker rollback steps and the matching anti-pattern. 
-*** DONE [#A] Add =make audit= — drift detector across all =.ai/=-using projects -CLOSED: [2026-05-15 Fri] +*** 2026-05-22 Fri @ 14:28:41 -0500 Verified add-tests typescript-testing.md reference resolves (moot) -Companion to =make doctor= (single-machine scope, checks =~/.claude/=). =audit= is cross-project scope: walks every directory on the machine that has a =.ai/=, diffs the synced template files against the canonical source, and reports drift. =--apply= flag rsyncs the drift into the project's working tree (no auto-commit). Catches stale projects without forcing a session start in each one. +Resolved since the audit: =languages/typescript/claude/rules/typescript-testing.md= now exists, and =add-tests/SKILL.md:68= references it by bare filename, the same way it references =python-testing.md= (both get copied into a project's =.claude/rules/=). The "missing file" premise no longer holds. No edit needed. -**** Open design choices +*** 2026-05-22 Fri @ 14:28:41 -0500 Added a category-exception protocol to add-tests -- *Scope.* Template-sync drift is the useful flavor: for each project, diff =.ai/protocols.org=, =.ai/workflows/=, =.ai/scripts/= against the canonical source. -- *Source path.* Post-fold: =~/code/rulesets/claude-templates/.ai/=. Build =audit= against the new path from day one. -- *Project discovery.* Walk =~/code/=, =~/projects/=, =~/.emacs.d/= up to depth 3 for any directory containing =.ai/=. Skip the canonical source itself. -- *Default mode is report-only.* =--apply= triggers rsync; =--force= overrides the dirty-skip safety. +Added an exception note to step 7 (proposal) in =add-tests/SKILL.md=: pure adapters, generated code, tiny pass-through wrappers, and framework glue may skip a category that would only re-test the framework, but the skip must be stated and justified in the plan and the behavior covered at integration/E2E level — never a silent omission. Step 12 (write) now points back to "honor documented category exceptions." 
-**** Per-project flow (designed 2026-05-15) +*** 2026-05-22 Fri @ 14:25:37 -0500 Added environment + recent-change capture to debug Phase 1 -For each discovered project, in order: +Added a fourth Phase-1 step in =debug/SKILL.md=: record versions, feature-flag/config state, dataset/fixture, seed/clock, concurrency, and recent commits/config-infra changes. Noted that intermittent bugs usually live in environment/state transitions (and "what changed recently" is often the fastest route), while a deterministic local bug only needs a one-liner. Updated the phase's closing recap to include the context. -1. Verify =.ai/= exists (path probe). If missing → =FAIL=, skip, continue loop. -2. Detect git tracking via =git check-ignore .ai/= → =tracked= or =gitignored=. -3. Verify no uncommitted =.ai/= changes (=git status --porcelain .ai/=). Dirty → =WARN=, skip rsync unless =--force=. -4. Verify content matches canonical via three =rsync -a --dry-run --itemize-changes= calls (=protocols.org=, =workflows/=, =scripts/=). Zero items = clean. -5. Action (=--apply= only, drift detected): three =rsync -a [--delete]= calls. -6. Verify rsync converged (re-run the dry-runs; zero now). -7. Verify working-tree state after rsync (tracked projects). Report deltas. Do not auto-commit. -8. Verify no unpushed =.ai/= commits (=git log @{u}..HEAD -- .ai/=). Informational only. +*** 2026-05-22 Fri @ 14:25:37 -0500 Constrained root-cause-trace defense-in-depth to boundaries -**** Output format (mirrors =doctor=) +Rewrote step b in =root-cause-trace/SKILL.md=: instead of "add a check at each layer that could have caught it," add one only at a layer that owns a boundary or invariant — ingress/trust, persistence, invariant-owning service, final render. Added the explicit rule that a pass-through function owning neither shouldn't get a duplicate null check (validation spam). Recast the three example layers as the boundary types. 
-#+begin_example -Claude-templates source: - ok rulesets/claude-templates is current (origin/main) +*** 2026-05-22 Fri @ 14:25:37 -0500 Required evidence + counterfactual per why in five-whys -Per-project .ai/ drift: - ok ~/projects/work - applied ~/projects/homelab 3 files changed - skipped ~/code/winvm uncommitted .ai/ (use --force) - ok ~/projects/clipper +Expanded step 2 in =five-whys/SKILL.md=: each link now owes an evidence field (a log/commit/metric/config you can point to) and a counterfactual check (remove this cause — does the symptom above plausibly not happen?). Framed the counterfactual as the main guard against monocausal storytelling, and updated the worked example to show both fields. -Summary: 18 ok, 3 applied, 1 skipped, 0 failed -#+end_example +*** 2026-05-22 Fri @ 15:51:59 -0500 Added timebox + fresh-sources rules to brainstorm -Exit code: =0= if all clean, no skips, no failures. =1= otherwise. +Phase 1 gained a "Timebox the dialogue" rule (aim for the one-sentence restatement in ~5-8 questions, then move on and park the rest as open questions). Phase 2 gained "Ground high-stakes claims in fresh sources" (check load-bearing claims about markets/regulations/tools/vendors/APIs against a current source; mark unverified ones as assumptions). The design-doc skeleton gained an "## Assumptions" section that distinguishes researched facts (with source) from assumptions (to confirm before building). -**** Why not extend =make doctor= instead +*** 2026-05-22 Fri @ 14:59:32 -0500 Made arch-decide examples timeless + required citations -=doctor= has a clean meaning today: "is this machine's =~/.claude/= consistent with rulesets?" Mixing in cross-project =.ai/= drift muddies the exit code. Keep them separate. =audit= can optionally invoke =doctor= as its last check since both ask "did the symlinks keep up with the source?". A future =make all-checks= can wrap both. 
+Dated the MongoDB multi-document-transaction example (scoped to 2024-01) with a backing reference, and added a "Cite, don't assert" Do: every concrete technical claim about a tool/version/platform carries a link, doc, version, or "checked YYYY-MM" date, or gets a domain-neutral placeholder — so unsourced "X can't do Y" doesn't rot into stale fact. -*** DONE [#A] Add =make install-ai PROJECT== — bootstrap =.ai/= in a fresh project -CLOSED: [2026-05-15 Fri] +*** 2026-05-22 Fri @ 14:59:32 -0500 Standardized arch-decide ADR statuses + immutability rule -Separate target from =audit= because operating on projects that lack =.ai/= is a distinct action. The absence might be intentional, so =audit= skips them. Bootstrap is explicit opt-in. +Declared a canonical five-status set (Proposed, Accepted, Rejected, Deprecated, Superseded) with an explicit "no synonyms" line, and spelled out the immutability rule in the Don'ts: an accepted ADR's body is frozen, only status/link metadata changes, a changed decision gets a new superseding ADR and the old one stays as the historical record. -**** Flow +*** 2026-05-22 Fri @ 14:59:32 -0500 Added Trust/Data/Compliance phase to arch-design -1. Refuse if =.ai/= already exists in =PROJECT=. Message: "already installed; use =make audit --apply= to update." -2. Verify =PROJECT= is a git checkout (warn if not — works without git, loses some lifecycle benefits). -3. Create =PROJECT/.ai/= directory. -4. Rsync canonical content: =protocols.org=, =workflows/=, =scripts/= (same three rsyncs as =audit=). -5. Seed =PROJECT/.ai/notes.org= from a canonical template with project-name placeholder. -6. Create empty =PROJECT/.ai/sessions/= (with =.gitkeep= for tracked projects). -7. Track or gitignore =.ai/=? Default: ask. Flag: =--track= / =--gitignore=. -8. Print next-steps banner: =make install-lang LANG= PROJECT==; open Claude Code in the project. 
+Added a new Phase 4 (Trust, Data, and Compliance) before the paradigm shortlist: trust boundaries, data classification, abuse/misuse cases, privacy constraints, compliance evidence, and operational ownership — surfaced early so the architecture is drawn around them, not retrofitted by a downstream =security-check=. Threaded into the workflow list, brief template (new §6), review checklist, and anti-patterns. -**** Symmetry with existing install targets +*** 2026-05-22 Fri @ 14:59:32 -0500 Split paradigms from tactical patterns in arch-design -#+begin_example -make install-lang LANG=python PROJECT=/path # language bundle (existing) -make install-ai PROJECT=/path # .ai/ template (new) -make install-lang # no args → fzf-pick -make install-ai # no args → fzf-pick from - # ~/projects/* + ~/code/* dirs - # without an existing .ai/ -#+end_example +Split Phase 5's single mixed table into Step 1 (pick one paradigm: monolith/microservices/layered/event-driven/serverless/pipeline/space-based) and Step 2 (compose tactical patterns: DDD, hexagonal, CQRS, event sourcing — several or none, often per-module), with composition examples and an anti-pattern against treating DDD/CQRS as alternatives to a paradigm. Recommendation + brief now name a paradigm plus composed patterns. -*** DONE [#A] Test plan for audit + install-ai before propagating to ratio -CLOSED: [2026-05-15 Fri] +*** 2026-05-22 Fri @ 14:59:32 -0500 Expanded arch-document quality scenarios to the Q42 six-part template -Test against the current state of this machine before pushing changes to ratio. +Replaced §10's thin "Under [condition]..." template with the arc42/Q42 six-part structure (source, stimulus, environment, artifact, response, response measure), each glossed, with the cart-checkout example rewritten across all six parts. A one-line prose form stays acceptable once all six parts are recoverable. 
-**** =make audit= tests +*** 2026-05-22 Fri @ 14:59:32 -0500 Added staleness/ownership metadata to arch-document output -1. Dry-run report only (no =--apply=). Should show: claude-templates current; per-project drift; correct =ok=/=drift= classifications; summary line and exit code match. -2. After the fold lands, every project should be reported as drift (their =startup.org= still points at the old path). Run =--apply= → rsync converges. Re-run audit → all =ok=. -3. Manually edit one =.ai/workflows/foo.org= in a tracked project. Re-run audit → should report =skipped: uncommitted .ai/=. Run =--apply --force= → rsync clobbers the edit. Verify the edit is gone. -4. Manually delete one =.ai/= dir. Re-run audit → =FAIL: .ai/ missing=. Loop continues. -5. Idempotency: =--apply= twice in a row converges to all =ok= on the second pass. +Added a per-section metadata block (owner, generated-against SHA + date, review cadence, "stale-when" conditions) as an HTML-comment header plus a visible Doc-status note, with field-fill guidance, and a whole-document Doc Status table replacing the README's "Last Updated" stub. Wired into the review checklist and an "Undated docs" anti-pattern. -**** =make install-ai= tests +*** 2026-05-22 Fri @ 14:59:32 -0500 Added confidence levels to arch-evaluate findings -1. Create =/tmp/test-fresh-project= as a git repo. Run =make install-ai PROJECT=/tmp/test-fresh-project=. Verify =.ai/= structure matches canonical, =notes.org= has placeholder, =sessions/= exists. -2. Run =make install-ai PROJECT=/tmp/test-fresh-project= again → should refuse (=.ai/= already exists). -3. Open Claude Code in the new project. Startup workflow runs cleanly (Phase A.0 + Phase A rsync should be a no-op since the install just ran). -4. fzf form: =make install-ai= with no args. Lists candidate dirs (=~/projects/*=, =~/code/*= without =.ai/=). 
+Added a "Confidence and Provenance" subsection: every framework-agnostic finding carries High/Medium/Low + how it was determined, with a required "Not fully checked because..." note when scale, runtime imports, reflection, or dynamic dispatch cap certainty. Updated the example findings and review checklist; a finding with no note now asserts a full read. -**** Pass criteria +*** 2026-05-22 Fri @ 14:59:32 -0500 Made arch-evaluate report skipped tool checks explicitly -- =audit= behavior matches the per-project flow spec for every classification path. -- =install-ai= produces a project indistinguishable from one that's been running sessions for a while. -- =make doctor= still passes 36/0/0 after all the work. -- =make test= (pytest + ERT) passes. +Replaced "skip silently" with explicit reporting: for each detected language whose tool isn't configured or can't run, emit an Info "tool not configured / not run" finding (with an example) so the audit shows what was and wasn't verified. A check that didn't run no longer reads as a pass. Updated workflow step 4 and the review checklist. -*** DONE [#A] Migrate projects on ratio (second machine) -CLOSED: [2026-05-15 Fri] +*** 2026-05-22 Fri @ 14:51:37 -0500 Added notation/output fallback to c4-analyze + c4-diagram -After local fold + audit + install-ai are working, propagate to ratio. +Both commands now treat C4 as notation-independent: a "Choosing a notation" section (draw.io XML, Structurizr DSL, Mermaid with native C4 types, PlantUML/C4-PlantUML) and a headless fallback that emits a text notation (Mermaid or Structurizr DSL) and skips PNG-export/desktop-open when =drawio= or a GUI is absent, rather than failing. draw.io is now one option, not the only one. -**** Steps +*** 2026-05-22 Fri @ 14:51:37 -0500 Clarified C4 abstraction boundaries in c4-analyze + c4-diagram -1. On ratio: =git -C ~/code/rulesets pull= — picks up the folded =claude-templates/= subdir and updated =Makefile= targets. -2. 
On ratio: archive or =mv= the standalone =~/projects/claude-templates/= aside, replace with symlink to =~/code/rulesets/claude-templates/= (same bridge mechanic as local). -3. On ratio: =make audit= → see drift across ratio's projects. -4. On ratio: =make audit --apply= → rsync into each tracked/gitignored project. Surface projects with uncommitted =.ai/= drift for manual handling. -5. On ratio: =make doctor= → catch any =~/.claude/= install drift (likely some, since ratio hasn't seen recent rulesets updates). -6. Verify by opening Claude Code in a few ratio projects. Startup should be a no-op or near-zero rsync. +Added an "Abstraction boundaries" section to both: a Container is a separately deployable/runnable unit (not synonymous with a Docker container — a SPA or managed DB counts), a Component lives inside one Container and isn't separately deployable. Added a 4e "Verify single abstraction level" check that walks every element and relationship to confirm it stays at the diagram's level, notation-independent. -**** Known unknowns +*** 2026-05-22 Fri @ 15:10:35 -0500 Added "When You Cannot Verify" standard to verification.md -- Ratio may have its own project list overlapping with this machine's but not identical. =audit= discovers projects via the walk, so this is automatic. -- Ratio might have uncommitted =.ai/= work in some projects that this machine doesn't. =audit= surfaces them; handle case-by-case. -- If anything goes wrong, ratio's archived =~/projects/claude-templates/= is the safety net — restore the symlink target and re-run audit. +Added a section requiring, when a verification command can't run, a four-part report: command attempted, why it couldn't run, risk left unverified, and the smallest next command for the user. States the principle that a check that didn't run is never reported as a pass — "unable to verify" is a required honest outcome, not silence. Placed after Red Flags. 
-**** Adjacent: cross-machine memory sync +*** 2026-05-22 Fri @ 15:10:35 -0500 Added property-based + mutation testing escalation to testing.md -The =[#A] DOING= memory-sync investigation (todo.org:10) is adjacent. Both involve "make my Claude setup portable across machines." Coordinate so the memory-sync stow approach (if approved) doesn't conflict with this fold's symlink mechanics. -** DONE [#B] Document startup pull-ordering rule in protocols.org -CLOSED: [2026-05-15 Fri] +Added an "Escalation Beyond Category and Pairwise" section: property-based testing for invariants over a broad input domain (round-trips, idempotence, ordering — Hypothesis/fast-check/proptest) and mutation testing for when high line coverage hides thin assertions (mutmut/cosmic-ray/Stryker). Both framed as escalation paths to reach for on a gap, not gates on every unit. -Phase A.0 of =startup.org= now pulls rulesets ff-only before the project repo -(shipped 2026-05-15 as part of the claude-templates fold — after the subtree -merge, there's no separate claude-templates pull, just rulesets-then-project). -The protocols.org paragraph stating the ordering and "resolve any issues -before proceeding" rule shipped 2026-05-15 in the =** Startup Pull Ordering= -subsection under =IMPORTANT - MUST DO=. -** DONE [#A] Build =/lint-org= skill + wrap-up integration -CLOSED: [2026-05-14 Thu] +*** 2026-05-22 Fri @ 15:10:35 -0500 Added a disciplined spike protocol to testing.md -Spec: [[file:.ai/specs/lint-org-skill-spec.md]] +Formalized the existing "I need to spike first" excuse-table row into a "Spike Exception (Disciplined)" subsection under TDD Discipline: TDD stays the default, but a spike is sanctioned when all three hold — timeboxed, spike code not committed, and the first failing test written before productionizing the discovered approach. Built on the existing row rather than contradicting it. 
-A two-mode skill (=interactive=, =mechanical-only=) that runs =org-lint=, -auto-fixes safe categories (item-number, missing-language-in-src-block, -misplaced-planning-info, markdown-bold → single-asterisk), and walks judgment -items (broken local-file links, invalid fuzzy links, verbatim-asterisk false -positives, suspicious-language blocks) inline. +*** 2026-05-22 Fri @ 15:10:35 -0500 Added pre-dispatch availability + cost checks to subagents.md -Wrap-up integration: =wrap-it-up.org= invokes -=/lint-org todo.org --mode=mechanical-only= after the existing -=todo-cleanup.el --archive-done= pass. Judgment items defer to a -carry-forward file that the next morning's daily-prep merges in, so -wrap-up never blocks on a judgment call. +Added a "Pre-Dispatch Checks" section with two gates: Availability (no Agent capability → do the work in the main thread under the same scope/constraints/output discipline the contract would enforce) and Cost (when writing the full contract costs more than the task, do it inline). Cross-references the existing "Don't Subagent At All" section and "Subagenting trivial work" anti-pattern rather than duplicating. -Baseline that motivated this: the 2026-05-14 manual pass took =todo.org= -from 55 → 1 lint warnings across two commits (=0d10458= signal, -=9ad5b30= cosmetic). A nightly mechanical sweep keeps the count near -zero forever — each day's drift is small. 
-** DONE [#C] Test harness for =make audit= + =make install-ai= edge cases :test: -CLOSED: [2026-05-15 Fri] +*** 2026-05-22 Fri @ 15:06:04 -0500 Revised python-testing SQLite guidance toward production-like DBs -Three edge cases from the fold-epic test plan were not exercised because they're destructive on real projects: +Replaced "prefer in-memory SQLite for speed" with: run ORM/query tests against a production-like DB (same engine as prod, often containerized), since SQLite diverges from Postgres/MySQL on query semantics, constraints, transactions, JSON, time zones, and indexes (a test can pass on SQLite and fail in prod). SQLite stays only for pure unit tests with no DB-semantics dependency. -- =audit --force= clobbers uncommitted =.ai/= work — needs a project with intentionally dirty =.ai/= to verify the override path. -- =audit= reports =FAIL= when =.ai/= is missing — needs a project where the directory was deleted to verify the loop continues past the failure. -- =install-ai= fzf-pick form (no =PROJECT= arg) — needs interactive testing. +*** 2026-05-22 Fri @ 15:06:04 -0500 Clarified python-testing ORM-mocking boundary -Build a self-contained test harness under =.ai/scripts/tests/= that spins up =/tmp/audit-test-projects/= with a known matrix of project states (clean, dirty, missing =.ai/=, pristine, etc.), runs the audit + install-ai targets against it, and asserts expected outputs. The harness should clean up after itself. +Changed the "never mock" bullet from "ORM queries" to "ORM internals (querysets, sessions, model internals)" and added a paragraph: domain services use real model methods/validation, but a thin orchestration unit can inject a fake at a deliberate data-access port (a repository/interface the code owns). That's still mocking at a boundary, not at ORM internals. -Pattern reference: bats or shell-based assertions (similar to the elisp ERT suites for =todo-cleanup= and =lint-org=, but for shell scripts). 
+*** 2026-05-22 Fri @ 15:06:04 -0500 Made elisp.md editing advice tool-agnostic -Triggered by: 2026-05-15 fold-epic, child 4 test plan; commits =94782ee= (audit) + =d364cf2= (install-ai). -** DONE [#A] wrap it up mentions github, which isn't the remote for many projects. :chore: -CLOSED: [2026-05-16 Sat] -For many of them, git.cjennings.net mirrors to github.com, and github.com isn't the remote. -For many others, git.cjennings.net is the remote with no mirror. -Remove or replace the reference to github.com -** DONE [#B] Phase A startup blind to =claude-templates/inbox/= post-fold :bug:fold: -CLOSED: [2026-05-19 Tue] +Rephrased the "prefer Write over repeated Edits" bullet around intent: land nontrivial Elisp as one cohesive change rather than dribbling it in over tiny partial edits (which accumulate paren mismatches), and run paren-balance + byte-compile checks immediately after, whatever editing mechanism the environment uses. -Resolved on inspection: the bug is moot in current state. =inbox-send.py='s discovery scans =~/code/*= and =~/projects/*= single-level only, so =claude-templates/= (two levels under =~/code/=) is never a routable target; the 2026-05-15 incident was a one-time manual workaround because =rulesets/inbox/= didn't exist yet, and that root inbox was added in =470085f=. =claude-templates/inbox/= was removed 2026-05-15 and is no longer on disk. +*** 2026-05-22 Fri @ 15:06:04 -0500 Added batch-mode + native-comp caveats to elisp-testing.md -Phase A's inbox check at =startup.org:107= runs =\ls -la inbox/= against the project root. Post-fold, the canonical's inbox sits inside the subtree at =claude-templates/inbox/= and never gets scanned. A 2026-05-15 cross-project handoff from a dotemacs session dropped a record there; the next rulesets session (this one) missed it at startup entirely. Picked up only when the working-tree drift surfaced during the publish flow. 
+Added three sections: Batch-Mode Reproducibility (=emacs --batch= as source of truth, no interactive-session state, no blocking prompts, deterministic), Isolating Emacs State (temp =user-emacs-directory=, explicit load-path, declared deps only, with an unwind-protect sandbox example), and Byte-Compile/Native-Comp Warnings (=byte-compile-error-on-warn=, native-comp gated on =native-comp-available-p= and kept opt-in/version-aware). -Fix: extend Phase A's discovery to also scan =claude-templates/inbox/= when the canonical lives in-repo (i.e., when =claude-templates/.ai/= exists alongside =./.ai/=). The Phase B/C inbox-processing flow already handles per-file routing once a file is surfaced; the gap is only in discovery. +*** 2026-05-22 Fri @ 15:16:22 -0500 Synced hooks/README install snippets with the destructive hook (opt-in) -Adjacent question worth answering at the same time: should cross-project handoffs file into =./inbox/= at the project root (matching what Phase A already scans), or stay in =claude-templates/inbox/= and rely on the discovery fix? The =inbox-send= script's target-project logic is the place to settle that. +Brought the README's manual-install and settings-JSON snippets in line with the canonical =hooks/settings-snippet.json= (which already wires all three) and the Makefile's opt-in design: added the destructive-bash-confirm.py symlink as an opt-in step, added its settings entry, and reworded the note to say all three are no-op-safe but the destructive gate is opt-in (=make install-hooks= excludes it by default — link manually before relying on the snippet entry). -Triggered by: 2026-05-15 evening session, surfaced when committing the test-harness work. 
-** DONE [#A] Implement task-review daily-habit per spec -CLOSED: [2026-05-20 Wed] -:PROPERTIES: -:LAST_REVIEWED: 2026-05-20 -:END: -Spec: [[file:docs/design/task-review.org]] +*** 2026-05-22 Fri @ 15:35:06 -0500 Hooks now scan file-backed commit/PR messages -Retires =wrap-it-up.org='s date-coverage scan and replaces it with a daily list-hygiene review (N=7 oldest-unreviewed top-level =[#A]= / =[#B]= / =[#C]= tasks per session, ~12-day rotation). Built as a pure Claude workflow — Shape B, no elisp; see the spec's Revision section for why the elisp approach was dropped. +Added =read_referenced_file()= to =_common.py= (safe local read: missing/oversize/non-UTF-8 → None) and wired it in: =git-commit-confirm.py= =extract_commit_message= now handles =-F=/=--file=/=--file==== (reads + scans the file, falls through to UNPARSEABLE → asks if unreadable), and =gh-pr-create-confirm.py= reads =--body-file= content instead of a placeholder. Attribution scanning now sees the real committed/posted text. Built a pytest harness (=hooks/tests/=, importlib-by-path loader for the hyphen-named hooks) and wired =hooks/tests= into =make test=. 54 hook tests pass; full suite green. -Status: -1. [X] =task-review-staleness.sh= + bats (count + =--list= modes). -2. [X] =wrap-it-up.org= health check (threshold 30). -3. [-] =task-review.el= — dropped (Shape B is a pure workflow, not an Emacs mode). -4. [X] New =task-review.org= workflow + INDEX entry (the existing listing workflow was renamed to =open-tasks.org= to free the name). -5. [X] Startup nudge in template =startup.org= (threshold 7), not the project-only startup-extras layer. -6. [X] Smoke test against live =todo.org= — first cycle run 2026-05-20 (7 tasks reviewed: 3 re-grades, 1 cancellation, 1 bump-and-tag). +*** 2026-05-22 Fri @ 15:35:06 -0500 Rewrote destructive-bash rm parsing on shlex -Triggered by: 2026-05-16 brainstorm on retiring the date-coverage scan. 
-** CANCELLED [#B] Build =ov-1= skill for DoDAF OV-1 (High-Level Operational Concept Graphic) -CLOSED: [2026-05-20 Wed] +=detect_rm_rf= now tokenizes with =shlex.split= instead of a whitespace split, so quoted/spaced paths and combined/separate/reordered flags (=-rf=, =-r -f=, =-fr=, =--recursive=/=--force=) all parse. Fails toward asking — returns a sentinel that still fires the modal — on unbalanced quotes or when a forced recursive rm coexists with a compound/pipeline/substitution/redirect construct. Documented the supported/unsupported shell constructs in the docstrings, and extended the dangerous-path banner to =$HOME=-prefixed and wildcard targets. Covered by 25 new tests. (Pre-existing, out-of-scope: path-prefixed =rm= like =/bin/rm= still isn't matched.) +** DONE [#B] Add =make remove= for interactive ruleset removal via fzf +CLOSED: [2026-05-22 Fri] +Shipped: =scripts/remove.sh= (three modes — =--list=, =--remove-selected= reading stdin, and the default fzf-multi interactive flow) + =make remove= target + =scripts/tests/remove.bats= (5 cases). Lists only symlinks resolving into the repo (foreign links left alone); rm's picked links while leaving repo sources untouched; reports-and-continues on a missing target; quiet no-op on empty selection. shellcheck clean, make test green. Dropped the stale =bridge= entry per the note below. -Cancelled during the 2026-05-20 task review. +Add a Makefile target that lists every currently-installed ruleset entry +and lets me pick one or more to remove via fzf. Granular alternative to +=make uninstall= (removes everything) and =make uninstall-hooks= (removes +only hooks). -Triggered by SOFWeek (May 2026, Tampa) — DeepSat attending; DoD attendees -may ask for architecture diagrams. OV-1 is the universal informal -currency in DoD briefings ("show me the architecture" → OV-1 by default). 
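A minimal sketch of the =shlex=-based =detect_rm_rf= behavior described in the 15:35:06 hooks entry, not the shipped hook code. The =UNPARSEABLE= sentinel name, the =SHELL_CONSTRUCTS= tuple, and the return shape (a list of targets) are assumptions made for illustration; only the use of =shlex.split=, the flag spellings, and the fail-toward-asking rule come from the entry itself.

```python
import shlex

# Hypothetical sentinel meaning "could not parse safely, ask the user".
UNPARSEABLE = object()

# Assumed markers for compound, pipeline, substitution, and redirect constructs.
SHELL_CONSTRUCTS = ("&&", "||", ";", "|", "$(", "`", ">", "<")


def detect_rm_rf(command):
    """Return the rm targets for a forced recursive rm, UNPARSEABLE when the
    command cannot be analyzed safely, or None when it is not such a command."""
    words = command.split(None, 1)
    if not words or words[0] != "rm":
        # Like the hook described above, a path-prefixed rm (/bin/rm) is not matched.
        return None
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: fail toward asking.
        return UNPARSEABLE

    recursive = force = options_done = False
    targets = []
    for tok in tokens[1:]:
        if options_done or not tok.startswith("-") or tok == "-":
            targets.append(tok)
        elif tok == "--":
            options_done = True
        elif tok.startswith("--"):
            recursive = recursive or tok == "--recursive"
            force = force or tok == "--force"
        else:
            # Short flags may be combined (-rf), separate (-r -f), or reordered (-fr).
            recursive = recursive or "r" in tok or "R" in tok
            force = force or "f" in tok

    if not (recursive and force):
        return None
    if any(marker in command for marker in SHELL_CONSTRUCTS):
        # A forced recursive rm next to a compound construct: ask rather than guess.
        return UNPARSEABLE
    return targets
```

The construct check runs against the raw command string, so a quoted path that happens to contain =;= or =|= also returns the sentinel. That errs on the side of asking, which matches the stated intent.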
+*** Why this matters -Priority upgrades to =[#A]= if Craig confirms scenario 2 below (personal -load-bearing need at the event); stays =[#B]= or drops to =[#C]= if -scenario 1 (team already covers it, future asset only). +Tearing down a single skill, rule, hook, or config file currently means +either running =make uninstall= and re-installing what I want to keep, +or =rm=ing the symlink directly and remembering the exact path. Both are +friction. An interactive picker lets me filter, multi-select with Tab, +and confirm with Enter — the typical fzf flow. Costs about 3-5 seconds +per teardown instead of 15+ seconds of "what's the exact name?". -*** Prior art (searched 2026-04-19) +*** Design -No existing Claude Code skill exists for DoDAF / OV-1 / SV-1 / SysML. +The recipe builds a tab-separated list of every currently-installed item, +categorized by type, and pipes it to =fzf --multi=. The user filters, +marks with Tab, and confirms with Enter. The recipe parses the selections +and =rm=s the matching symlinks. -- =anthropics/skills= — 17 skills, zero DoDAF/SysML/defense coverage. -- =awesome-claude-code= list — zero hits for DoDAF/OV-1/SysML/UAF. -- =mfsgr/sysml2dodaf= — empty repo (0 stars, no code). Vapor. -- =HowardKao-1130/mini-NEXEN= — broad SE methodology skill that - name-drops DoDAF as a trigger keyword; no artifact generation. 0 stars. -- =gaphor/gaphor= (Apache-2.0, 2.2k stars) — mature UML/SysML GUI - modeler. Not a skill; not a pipeline. Useful reference only. +#+begin_example + skill debug + rule commits.md + hook destructive-bash-confirm.py + config settings.json + commands commands + bridge claude-rules +#+end_example -Nearest prior art to lean on when building: -- DoDAF 2.02 Viewpoints & Models reference (dodcio.defense.gov) — - canonical OV-1 exemplars. Embed 3-5 layouts as skill =references/=. -- Pattern from existing =c4-diagram= skill — same shape (prose → diagram - spec), swap the viewpoint vocabulary to DoDAF. 
-- PlantUML for SV-1 (when that skill comes later); Mermaid or draw.io - XML for OV-1 lightweight visuals. +Each line is =\t=. The recipe maps == to the right path: -*** Build scope (when triggered) +- =skill= → =$(SKILLS_DIR)/= +- =rule= → =$(RULES_DIR)/= +- =hook= → =$(HOOKS_DIR)/= +- =config= → =$(CLAUDE_DIR)/= +- =commands= → =$(CLAUDE_DIR)/commands= +- =bridge= → =$(SKILLS_DIR)/claude-rules= -*In scope:* -- Input: prose description of a system + its operational context. -- Output: structured OV-1 *spec* — performers, external actors (other - systems, forces, adversaries), relationships (data/control flows), - narrative captions, classification marking, legend requirements. -- DoDAF 2.02 completeness checklist as a quality gate — verify the - produced spec contains every element a correct OV-1 requires. -- Optional lightweight visual: draw.io XML or Mermaid approximation for - quick review; NOT a finished rendering. +Source files in =rulesets/= stay untouched. =make install= re-creates the +removed links if needed (the install loop is idempotent). -*Out of scope:* -- Icon libraries, pictorial assets, finished PowerPoint export. OV-1 - final art belongs to a designer or Craig in Visio/PowerPoint; the - skill's job is the spec and the check, not the slide. -- SV-1, SV-2, UAF, IDEF1X, other viewpoints. Build only when a - concrete need triggers each. +*** Edge cases -Estimate: 4-6 hours. +- Esc instead of Enter → empty selection → clean exit, no removal. +- Filter to nothing then Enter → same as Esc. +- Selected item already gone → =rm= fails visibly, processing continues + on the rest. +- =fzf= not installed → fail fast with a clear error (matches the pattern + used by =install-lang=). -*** Craig's investigation before kickoff +*** Possible extensions -1. Does DeepSat's systems-engineering or marketing team already have an - OV-1 (or the equivalent briefing artifact) for SOFWeek? -2. If yes (scenario 1) — skill is a future asset, not event-load-bearing. 
- Ship after SOFWeek. Priority drops to =[#C]=. -3. If no, or if the scenario is "Craig may need to produce/iterate an - OV-1 on the fly during the event" (scenario 2) — skill is load-bearing - for the event. Priority upgrades to =[#A]=; build before SOFWeek. -4. Confirm the classification level the skill needs to handle - (unclassified-only? or FOUO markings? affects the classification - block in the spec). -5. Confirm the target rendering format DeepSat uses for OV-1 - deliverables (PowerPoint slide? Cameo? Visio? affects whether the - skill emits draw.io XML vs Mermaid vs pure structured spec). +- Parallel =make pick-install= target that lists not-yet-installed items + and installs the chosen ones. Symmetric UX, same fzf flow. +- Confirmation prompt when more than N items selected (defense against + accidental select-all). +- =--source= flag that also runs =git rm= against the rulesets source for + the selected item. Probably bad idea — too easy to lose work. +- The =bridge → $(SKILLS_DIR)/claude-rules= entry above is stale — the + bridge symlink got removed in a later commit. Drop that bullet when the + recipe lands. +** DONE [#B] Document the =mcp/= install pipeline in =mcp/README.org= +CLOSED: [2026-05-22 Fri] +Wrote =mcp/README.org= covering everything in the "what to cover" list: the file layout (tracked vs gitignored), the secrets-bundle shape (plain =${VAR}= secrets + base64-bundled OAuth artifacts, AES256 symmetric =gpg -c=), the install flow (decrypt → materialize keys/token caches at mode 600 → expand → register unregistered, idempotent), the http/sse-vs-stdio transport split, token rotation when a Google refresh token is revoked, and adding a new server. Grounded in a read of the actual =install.py= + =servers.json=. -*** Related +=mcp/= has =install.py=, =servers.json=, =secrets.env.gpg=, =gcp-oauth.keys.json= (gitignored, regenerated at install). No README. 
Coming back to this in three months I'll re-discover how the bundle is structured, what =install.py= does, and how to rotate tokens. Saving that re-discovery is the whole point. -See also the DoD-specific notations section under the later TODO -(=c4-*= rename revisit) — OV-1 is flagged there as the highest-value -starting point across the DoD notation landscape (SysML, DoDAF/UAF, -IDEF1X). This entry is the execution plan for that starting point. +*** What to cover + +- Layout: what each file is, which are tracked vs gitignored. +- Secrets bundle shape: how vars are listed in =secrets.env=, the symmetric-encryption pattern (=gpg -c --cipher-algo AES256=), the base64-bundled OAuth artifacts (=GCP_OAUTH_KEYS_JSON_B64=, =GOOGLE_DOCS_PERSONAL_TOKEN_B64=, =GOOGLE_DOCS_WORK_TOKEN_B64=). +- Install flow: =make install-mcp= → =install.py= decrypts, writes the keys file and Google Docs token caches at mode 600, expands =${VAR}= in =servers.json=, calls =claude mcp add --scope user= for unregistered servers. Idempotent. +- Token rotation: when a refresh token gets revoked, the recovery flow (re-auth on one machine, re-bundle, recommit). +- Adding a new server: edit =servers.json=, add any new =${VAR}= placeholders to the bundle, re-encrypt. +- The OAuth dance for HTTP-transport servers (linear, notion) versus stdio (google-docs-*) — different paths, different gotchas. -- cgit v1.2.3
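Two steps of the install flow listed above (expanding =${VAR}= placeholders and writing decoded artifacts at mode 600) could look roughly like the sketch below. It is an illustration under stated assumptions, not the contents of =install.py=: the function names =expand_vars= and =write_private=, the error handling, and the idea that secrets arrive as a plain dict are all hypothetical.

```python
import base64
import os
import re

# Matches ${NAME} placeholders such as those described for servers.json.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_vars(text, secrets):
    """Replace every ${VAR} in text with its value from the secrets dict.

    Raises KeyError on an unknown name so a missing secret fails loudly
    instead of leaving a literal placeholder in the registered config.
    """
    def substitute(match):
        name = match.group(1)
        if name not in secrets:
            raise KeyError(f"no secret named {name}")
        return secrets[name]

    return _PLACEHOLDER.sub(substitute, text)


def write_private(path, data):
    """Write bytes to path so that only the owner can read or write it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    # Enforce the mode even when the file already existed with looser bits.
    os.chmod(path, 0o600)


def materialize_b64(path, encoded):
    """Decode a base64-bundled artifact (assumed shape) and write it privately."""
    write_private(path, base64.b64decode(encoded))
```

Substituting through a function rather than a replacement string keeps backslashes in secret values from being interpreted as regex escapes.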