author Craig Jennings <c@cjennings.net> 2026-05-22 18:25:29 -0500
committer Craig Jennings <c@cjennings.net> 2026-05-22 18:25:29 -0500
commit 459d426a23f6a96b66c60f202b577d67547f34e8
tree 3db38848ddcce644717fef1b2f507a4c8137c5eb
parent 1e216dd170a46d99ef400ca6bf98f97b23c1db9b
chore(ai): archive session record, sweep completed tasks, queue follow-ups
Archived the session record. Moved six completed tasks from Open Work to Resolved: the 2026-05-04 audit-pass parent, the two commits.md overlay tasks, the make-remove feature, the mcp/ install-pipeline doc, and the wrap-it-up GitHub-host quick fix. Queued the one lint judgment and the task-review staleness note in the inbox for next-session processing.
-rw-r--r-- .ai/sessions/2026-05-22-18-23-audit-sweep-and-bundle-sync-tooling.org (renamed from .ai/session-context.org) 24
-rw-r--r-- inbox/lint-followups.org 5
-rw-r--r-- todo.org 915
3 files changed, 480 insertions, 464 deletions
diff --git a/.ai/session-context.org b/.ai/sessions/2026-05-22-18-23-audit-sweep-and-bundle-sync-tooling.org
index 2cb293b..e3a060a 100644
--- a/.ai/session-context.org
+++ b/.ai/sessions/2026-05-22-18-23-audit-sweep-and-bundle-sync-tooling.org
@@ -1,11 +1,11 @@
-#+TITLE: Session — language-bundle startup self-sync
+#+TITLE: Session — fleet reconcile, bundle-sync tooling, and the 2026-05-04 audit sweep
#+DATE: 2026-05-22
* Summary
** Active Goal
-Build =scripts/sync-language-bundle.sh=: a per-project language-bundle freshness check wired into startup Phase A. Detect the installed bundle by fingerprint, auto-fix rulesets-owned files (=.claude/rules/*.md=, =.claude/hooks/*=, =githooks/*=), surface drift in project-customizable files (=settings.json=, =CLAUDE.md=) without writing.
+A long, multi-strand session. (1) Reconciled all 26 AI projects against their remotes and synced =.ai/= templates fleet-wide. (2) Built the per-project bundle-sync tooling: =sync-language-bundle.sh= (startup freshness for language bundles, later generalized to team overlays), =make install-team=, and =make remove=. (3) Completed the entire 2026-05-04 audit cluster — all 55 items dispositioned, including splitting DeepSat's publishing rules out of the global =commits.md= into an installable team overlay and deploying it to =~/projects/work=. (4) Documented the =mcp/= install pipeline. (5) Committed Craig's in-flight WIP (emacs live-reload rule, task-audit workflow, voice + review-code refinements). The original entry point was the language-bundle self-sync (item below); everything else grew from "what's next / any ripe fruit" between completed pieces.
** Decisions
@@ -47,10 +47,26 @@ Done (all pushed through =1825226=):
- *Browser testing* (1 of 3) — headed/headless decision tables in both playwright skills (2e9d5b0). DEFERRED: #1 networkidle/locator refactor (touches helper code, no tests) and #3 emoji sweep (~30 occurrences, 7 files) — both spread-heavy, held for focused passes given the concurrent-edit warning.
- *Debugging/RCA* (3) — debug env/recent-change capture, root-cause-trace boundary-only defense, five-whys evidence+counterfactual per link (3916dc4).
+*** Most of the cluster cleared — 48 items, all pushed through efcc8e5
+
+Continued area-by-area with parallel subagents (batches of 3-4, every diff reviewed) for independent command/skill/rule files. Also done + pushed since the line above: Tests/TDD (2, 1 moot), Frontend (2), Security (2), Pairwise (2), V2MOM (3), Prompt-eng (2), Codify (1), Branch-workflow (4), C4 (2), Architecture (8), Languages (4), Global-rules verification/testing/subagents (4), Hooks README (1).
+
+** Audit cluster essentially complete — 53 of 55 items done + pushed (through 81280b7)
+
+Code items finished after the prose batches: Hooks #2/#3 (built a pytest harness under hooks/tests + read_referenced_file helper + shlex rm parsing; 54 tests, wired into make test, /review-code Approve), Browser #1 networkidle/locator refactor + #3 emoji sweep (node --check + py_compile clean), and the earlier-missed brainstorm item (timebox + fresh-sources + Assumptions section).
+
+** Audit cluster fully closed — both commits.md items shipped + deployed
+
+- *#1012 (split)* — shipped 3cb467e: DeepSat publishing steps moved out of global commits.md into =teams/deepsat/claude/rules/publishing.md=; commits.md uses seams (startup-extras pattern); added =install-team= (targeted copy, never globally symlinked) + generalized =sync-language-bundle.sh= to keep team overlays fresh (process_bundle function, team syncs only its own rule; 3 new bats). Both /review-code Approve.
+- *#1019 (/voice fallback)* — shipped ca6a213: Single-skill gate now handles /voice being unavailable (walk patterns inline, flag, don't block).
+- *Cross-project deploy (done from here)* — =make install-team TEAM=deepsat PROJECT=~/projects/work= + a sync refresh: work now has the split commits.md + publishing.md overlay (both match canonical; .claude/ gitignored there so nothing to commit). Handoff note dropped at =~/projects/work/inbox/2026-05-22-1708-from-rulesets-deepsat-publishing-overlay-installed.org=.
+
** Next Steps
-- Continue the cluster, next area my choice (~29 items left across ~12 areas; Browser testing, Security, Global rules, Hooks, Languages, Architecture, C4, etc.).
-- The 3 real bundle-bearing projects (chime, gloss = elisp; work = python) self-heal language-bundle drift at their own next startup via step 12 — Craig opted to let them.
+- *memory-sync [#A]* (=DOING=) — held by Craig: dotfiles are being split out of archsetup into their own repo, so the stow target is in flux. Revisit once that settles; the VERIFY now records the new dotfiles repo as the target, not =archsetup/dotfiles/common/=.
+- Open carryover: =create-documentation= skill, =/update-skills= skill (both [#C]).
+- Bundle-bearing projects (chime, gloss, work) self-heal language + team-overlay drift at their next startup via the step-12 sync.
+- The 2026-05-04 audit cluster is fully closed (parent archived). =teams/deepsat/= overlay is deployed to =~/projects/work= and kept fresh by startup sync.
* Session Log
diff --git a/inbox/lint-followups.org b/inbox/lint-followups.org
index 42b0eb4..ae8591e 100644
--- a/inbox/lint-followups.org
+++ b/inbox/lint-followups.org
@@ -8,3 +8,8 @@
** TODO line 2070 — misplaced-heading — Possibly misplaced heading line
* 2026-05-21 Thu — Task-review health: 12 top-level [#A]/[#B]/[#C] tasks unreviewed for >30 days (daily review may have slipped)
+
+* 2026-05-22 lint-org follow-ups — todo.org
+** TODO line 1454 — misplaced-heading — Possibly misplaced heading line
+
+* 2026-05-22 Fri — Task-review health: 10 top-level [#A]/[#B]/[#C] tasks unreviewed for >30 days (daily review may have slipped)
diff --git a/todo.org b/todo.org
index 347194a..f3c4b48 100644
--- a/todo.org
+++ b/todo.org
@@ -7,25 +7,6 @@ Project-scoped (not the global =~/sync/org/roam/inbox.org= list).
* Rulesets Open Work
-** DONE [#A] Split team-specific publishing rules out of commits.md :commits:
-CLOSED: [2026-05-22 Fri]
-Shipped 3cb467e. Moved the DeepSat publishing steps (Linear ticket-state, the Slack notification protocol + channel ID, the GHE host, the team merge norm, the Linear ticket-body structure) out of the global =claude-rules/commits.md= into =teams/deepsat/claude/rules/publishing.md=. The global file keeps the universal skeleton and uses seams ("run the project's publishing overlay here if present") like startup-extras. Added =install-team= (targeted per-project copy, keyed on PROJECT, never globally symlinked) and generalized =sync-language-bundle.sh= to keep team overlays fresh at startup (3 new bats; make test green).
-
-Remaining deploy step (cross-project, surfaced to Craig): install the overlay into the DeepSat work project — =make install-team TEAM=deepsat PROJECT=<deepsat-path>= — so it actually loads there.
-
-** DONE [#A] Define a /voice-unavailable fallback in the commits.md publish flow :commits:
-CLOSED: [2026-05-22 Fri]
-Added an "If =/voice= is unavailable" paragraph to the Single-skill gate in =commits.md=: walk the same patterns inline (the flow already names which matter), state the skill was unavailable and the pass was applied by hand ("/voice unavailable — patterns walked inline"), and flag the missing skill for install. The gate is the pattern walk, not the tooling. The original "=humanizer= unavailable" framing was moot (humanizer → /voice).
-
-** DONE [#A] wrap-it-up Step 3.5 assumes GitHub-family remote :chore:quick:
-CLOSED: [2026-05-22 Fri]
-:PROPERTIES:
-:LAST_REVIEWED: 2026-05-20
-:END:
-Documented the assumption inline at =wrap-it-up.org= Step 3.5 (chose the lightweight path over a provider-agnostic rewrite): the =gh= lookup expects a GitHub-family host, holds today via DeepSat on GHE, flagged for update if a future Linear project lands on GitLab/Gitea/Bitbucket.
-Triggered by: 2026-05-16 wrap-it-up github.com cleanup (audit of the same file).
-
-Step 3.5 (Linear ticket-state hygiene) at =wrap-it-up.org:207= says "the project's GitHub remote — use =gh pr list ...=". Currently fine in practice: the step is Linear-gated, and the only Linear-using project is DeepSat (on =deepsat.ghe.com=, a GitHub-family host where =gh= works). Would break if a future Linear-using project lived on a non-GitHub host (gitlab, gitea, bitbucket). Either drop the GitHub-family assumption (provider-agnostic lookup, harder) or document the assumption explicitly so future projects know the step needs an update if they don't fit.
** DOING [#A] Check that memories are sync'd across machines via git.m
:PROPERTIES:
:LAST_REVIEWED: 2026-05-20
@@ -697,365 +678,6 @@ The skill should reject:
public/library/API docs: =llms.txt= or markdown export is valuable, but normal
human navigation remains primary.
-** DONE [#C] Review pass: tighten skills and rulesets after 2026-05-04 audit
-CLOSED: [2026-05-22 Fri]
-:PROPERTIES:
-:LAST_REVIEWED: 2026-05-20
-:END:
-All 55 grouped-index items dispositioned (2026-05-22): ~49 edited across skills, commands, rule files, hooks, and the two playwright skills; several came out moot post-audit (humanizer→voice, skills→commands, typescript ruleset added); the two commits.md items shipped as the team-overlay split + /voice fallback. Freshness-checked each item against current reality before editing.
-
-Source notes used in this pass:
-- C4 official docs: C4 is notation-independent; System Context and Container
- diagrams are enough for most teams; every diagram needs title, key/legend,
- explicit element types, and audience-appropriate abstraction.
- [[https://c4model.com/diagrams][C4 diagrams]],
- [[https://c4model.com/diagrams/notation][C4 notation]],
- [[https://c4model.com/abstractions/component][C4 component]]
-- arc42 docs: quality requirements need measurable scenarios; section 10
- should reference top quality goals and capture lesser quality requirements
- with specific measures. [[https://docs.arc42.org/section-10/][arc42 section 10]],
- [[https://quality.arc42.org/articles/specify-quality-requirements][specifying quality requirements]]
-- ADR references: ADRs capture one justified architecturally significant
- decision and its rationale; Nygard's original guidance emphasizes short,
- numbered, repository-stored records and superseding rather than rewriting old
- decisions. [[https://adr.github.io/][adr.github.io]],
- [[https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions][Nygard ADR article]]
-- Playwright docs: prefer user-visible locators and web assertions; locators
- auto-wait and retry; =networkidle= is discouraged for testing readiness.
- [[https://playwright.dev/docs/best-practices][Playwright best practices]],
- [[https://playwright.dev/docs/locators][Playwright locators]],
- [[https://playwright.dev/docs/next/api/class-page][Playwright page API]]
-- OWASP references: Top 10 2021 includes Broken Access Control,
- Cryptographic Failures, Injection, Insecure Design, Security
- Misconfiguration, Vulnerable and Outdated Components, Identification and
- Authentication Failures, Software and Data Integrity Failures, Security
- Logging and Monitoring Failures, and SSRF; WSTG adds a broader testing map
- across configuration, identity, authn/z, sessions, input validation, error
- handling, cryptography, business logic, client-side, and API testing.
- [[https://owasp.org/Top10/2021/][OWASP Top 10 2021]],
- [[https://owasp.org/www-project-web-security-testing-guide/latest/4-Web_Application_Security_Testing/][OWASP WSTG]]
-- V2MOM references: Salesforce calls the last M "Measures" and emphasizes a
- simple alignment document with prioritized Methods, explicit Obstacles, and
- measurable outcomes. [[https://trailhead.salesforce.com/content/learn/modules/selfmotivation/get-focused-with-your-personal-v2mom][Salesforce Trailhead personal V2MOM]],
- [[https://www.salesforce.com/blog/?p=12][Salesforce V2MOM alignment]]
-- Prompt research: the cited Meincke paper is titled "Call Me A Jerk:
- Persuading AI to Comply with Objectionable Requests"; its scope is
- persuasion increasing compliance with objectionable requests, not a general
- proof that persuasion framing improves prompt quality.
- [[https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5357179][SSRN paper]]
-- Combinatorial testing references: NIST supports t-way combinatorial testing
- and notes pairwise is one covering strength, with higher-strength arrays
- useful for failures requiring more interacting factors.
- [[https://www.nist.gov/publications/practical-combinatorial-testing-beyond-pairwise][NIST beyond pairwise]],
- [[https://www.nist.gov/publications/combinatorial-software-testing][NIST combinatorial testing]]
-
-*** Grouped index (for batching by area)
-
-Each item below is a one-line summary of a sub-TODO further down. Tick the box when the matching sub-TODO is moved to =DONE=. Items are grouped by area so they can be batched (e.g., "do all Playwright items in one session").
-
-**** Browser testing
-- [X] [#A] =playwright-js=: locator/assertion-first guidance (replace raw CSS, =networkidle=)
-- [X] [#B] =playwright-js= + =playwright-py=: reconcile headless/visible defaults
-- [X] [#B] =playwright-js= + =playwright-py=: remove emoji console markers from examples
-
-**** Frontend / UI
-- [X] [#B] =frontend-design=: WCAG 2.2 alignment, accessibility non-optional
-- [X] [#B] =frontend-design=: harmonize aesthetic guidance with anti-pattern rules
-
-**** Security
-- [X] [#A] =security-check=: OWASP 2021 + WSTG coverage
-- [X] [#B] =security-check=: tooling and offline/network caveats
-
-**** Combinatorial testing
-- [X] [#B] =pairwise-tests=: t-way escalation guidance beyond pairwise
-- [X] [#B] =pairwise-tests=: clarify negative value syntax + generator availability
-
-**** V2MOM
-- [X] [#A] =create-v2mom=: rename Metrics → Measures (Salesforce alignment)
-- [X] [#B] =create-v2mom=: prevent task migration from turning V2MOM into a backlog
-- [X] [#B] =create-v2mom=: mitigation/owner fields for Obstacles
-
-**** Prompt engineering
-- [X] [#A] =prompt-engineering=: correct/narrow Meincke citation
-- [X] [#B] =prompt-engineering=: eval-harness requirement for production prompts
-
-**** Codify
-- [X] [#B] =codify=: stale-entry review + privacy checks before writing project =CLAUDE.md=
-
-**** Code review
-- [X] [#A] =review-code=: resolve local-verification vs CI boundary
-- [X] [#B] =review-code=: =CLAUDE.md= citation scope for public artifacts
-- [X] [#B] =review-code=: relax three-strengths rule for tiny/failing diffs
-
-**** PR / review responses
-- [X] [#A] =respond-to-review=: remove review-process language from commit messages
-- [X] [#B] =respond-to-review=: use unresolved threads + resolution state
-- [X] [#B] =respond-to-cj-comments=: drop personal absolute paths from public-writing (moot — already clean)
-- [X] [#B] =respond-to-cj-comments=: fallback when =humanizer= or =emacsclient= unavailable (moot — superseded by /voice + VERIFY pattern)
-
-**** Branch workflow
-- [X] [#A] =finish-branch=: fix base-branch detection
-- [X] [#B] =finish-branch=: worktree-aware pull/merge safety
-- [X] [#B] =start-work=: tool-availability + ceremony-scaling rules
-- [X] [#B] =start-work=: claim-before-justify rollback risk
-
-**** Tests / TDD
-- [X] [#B] =add-tests=: fix missing =typescript-testing.md= reference or add ruleset (moot — ruleset now exists)
-- [X] [#B] =add-tests=: explicit exceptions to "all three categories per function"
-
-**** Debugging / RCA
-- [X] [#B] =debug=: capture environment + recent-change context before hypotheses
-- [X] [#B] =root-cause-trace=: constrain defense-in-depth to trust boundaries
-- [X] [#B] =five-whys=: require evidence + counterfactual validation per why
-
-**** Brainstorming
-- [X] [#B] =brainstorm=: timebox + research/source rules for high-stakes designs
-
-**** Architecture
-- [X] [#B] =arch-decide=: timeless examples, drop unverifiable claims
-- [X] [#B] =arch-decide=: standardize statuses + immutability language
-- [X] [#B] =arch-design=: threat modeling + privacy/compliance as first-class inputs
-- [X] [#B] =arch-design=: separate paradigms from tactical patterns
-- [X] [#B] =arch-document=: arc42/Q42 quality scenarios
-- [X] [#B] =arch-document=: staleness + ownership metadata for generated docs
-- [X] [#B] =arch-evaluate=: confidence levels for framework-agnostic findings
-- [X] [#B] =arch-evaluate=: report skipped tool checks explicitly
-
-**** C4 modeling
-- [X] [#A] =c4-analyze= + =c4-diagram=: notation/output fallback (not draw.io-only)
-- [X] [#B] =c4-analyze= + =c4-diagram=: clarify abstraction boundaries
-
-**** Global rules
-- [X] [#B] =commits.md=: split DeepSat/Linear/Slack-specific from global rules → promoted to a top-level task (deferred for Craig)
-- [X] [#A] =commits.md= + publish flows: =humanizer=-unavailable fallback → promoted to a top-level task (deferred; humanizer premise moot)
-- [X] [#B] =verification.md=: explicit "unable to verify" reporting standard
-- [X] [#B] =testing.md=: property-based + mutation testing as escalation paths
-- [X] [#B] =testing.md=: soften absolute TDD with explicit spike protocol
-- [X] [#B] =subagents.md=: capability/availability + cost checks
-
-**** Languages
-- [X] [#A] =python-testing.md=: revisit in-memory SQLite guidance
-- [X] [#B] =python-testing.md=: separate "never mock ORM" from unit-test boundaries
-- [X] [#B] =elisp.md=: drop tool-specific advice
-- [X] [#B] =elisp-testing.md=: batch-mode + native-comp caveats
-
-**** Hooks
-- [X] [#A] =hooks/README.md=: include =destructive-bash-confirm.py= in install/settings snippets
-- [X] [#A] =hooks/git-commit-confirm.py= + =gh-pr-create-confirm.py=: inspect message/body files referenced by =-F= / =--body-file=
-- [X] [#B] =hooks/destructive-bash-confirm.py=: shell-aware command parsing (not regex)
-
-*** 2026-05-22 Fri @ 15:47:10 -0500 Made playwright guidance locator/assertion-first, dropped networkidle-as-readiness
-
-Rewrote the readiness guidance in both =playwright-js/SKILL.md= and =playwright-py/SKILL.md=: reconnaissance now waits for a visible app landmark via a web assertion or locator (=expect(...).toBeVisible()= / =get_by_role(...).wait_for()=), not =networkidle= (which Playwright discourages). Updated the login/form examples to =getByLabel=/=getByRole= + web assertions, the API_REFERENCE.md waiting section, and =lib/helpers.js= defaults (=waitForPageReady= now defaults to =load= and prefers a caller-supplied landmark; =authenticate= races the success indicator over a =load= navigation). node --check passes.
-
-*** 2026-05-22 Fri @ 14:23:02 -0500 Added headed/headless decision tables to both playwright skills
-
-Added matching purpose-based decision tables to =playwright-js/SKILL.md= (was "always visible") and =playwright-py/SKILL.md= Best Practices (was "always headless"). Each names its own default and points at the other skill, so the difference is deliberate, not a habit-flip: headed for interactive debugging, headless for CI/pytest. Also softened the absolutist "Always launch... headless" comment in the py example.
-
-*** 2026-05-22 Fri @ 15:47:10 -0500 Removed emoji console markers from the playwright skills
-
-Replaced every emoji status marker with a plain ASCII prefix across =playwright-js/= (run.js, lib/helpers.js, SKILL.md) and =playwright-py/= (SKILL.md, examples/*.py): 📦/⚡/📄/📥/🎭/🚀/📋/✅/❌/🔍/📸/✓/✗ → =[setup]=/=[run]=/=[ok]=/=[error]=/=[fail]= etc. Post-change emoji grep is clean (excluding node_modules); node --check and py_compile pass.
-
-*** 2026-05-22 Fri @ 14:35:16 -0500 Made accessibility a non-optional WCAG 2.2 gate in frontend-design
-
-Added an "Accessibility Gate (required before handoff)" section to =frontend-design/SKILL.md= covering keyboard operation, focus visibility, focus-not-obscured (2.2), target size (2.2), contrast, reduced motion, labels, and semantic structure — a baseline for all frontend work, not just interactive components. Rewrote the Build/Review phases to build accessibly as you go and clear the gate before handoff, and bumped =references/accessibility.md= from WCAG 2.1 to 2.2 with backing detail for the new criteria.
-
-*** 2026-05-22 Fri @ 14:35:16 -0500 Added a "creative but bounded" section to frontend-design
-
-Added a subsection under Frontend Aesthetics framing the bold/maximalist directions as tools, not obligations: domain fit, readability first, responsive stability, and no decorative effect that degrades the workflow. Reconciles rather than contradicts the maximalist encouragement (maximalism stays on the table as deliberate usable density), and ties the readability bullet to the new accessibility gate.
-
-*** 2026-05-22 Fri @ 14:35:16 -0500 Updated security-check to OWASP Top 10 2021 + WSTG mapping
-
-Replaced the older six-category list in =.claude/commands/security-check.md= with the full Top 10 2021 set, each finding mapped to a 2021 category or WSTG area. Added the four missing categories (Insecure Design, Software and Data Integrity Failures, Security Logging and Monitoring Failures, SSRF) plus explicit checks for object/function-level authorization, SSRF on URL-fetch paths, update/plugin/dependency integrity, and logging/monitoring gaps.
-
-*** 2026-05-22 Fri @ 14:35:16 -0500 Added scanner tooling + network caveats to security-check
-
-Added an optional configured-scanners step (=gitleaks=/=trufflehog= secrets, =semgrep= source patterns, OSV scanner, lockfile-diff review) that supplements the manual scans, plus a network caveat: dependency audits that can't run (offline, tool absent, DB unreachable) must report "not run" naming the tool and reason, never read as a pass. Carried that into the no-issues summary.
-
-*** 2026-05-22 Fri @ 14:35:16 -0500 Added t-way escalation guidance to pairwise-tests
-
-Added an "Escalating Beyond Pairwise (t-way)" subsection: start with pairwise across the whole space, then escalate specific high-risk clusters to 3-way+ when history, safety, security, or domain coupling says a fault needs more than two interacting factors. Lists escalation triggers and shows the sub-model order syntax (={ A, B, C } @ 3=) vs a blanket =/o:3= bump, stressing targeted not uniform escalation. Cites NIST combinatorial-testing work.
-
-*** 2026-05-22 Fri @ 14:35:16 -0500 Clarified PICT ~ syntax + honest generator-availability path in pairwise-tests
-
-Added a "~ prefix" explanation (PICT marker tagging a value as negative/invalid, not an arithmetic operator; PICT pairs negatives with valid values once and strips the marker before the SUT) and a stop-at-the-model rule: if neither the =pict= binary nor =pypict= is present, produce the model and stop rather than hand-writing a table and passing it off as PICT output.
-
-*** 2026-05-22 Fri @ 14:43:17 -0500 Renamed Metrics → Measures throughout create-v2mom
-
-Full rename across =.claude/commands/create-v2mom.md= (acronym expansions, Phase 7 heading, the "Measures must be measurable" principle, exit criteria, review questions, red flags, examples) to match Salesforce's official term. Kept the "vanity metrics" idiom intact — it's the anti-pattern term, not a section reference.
-
-*** 2026-05-22 Fri @ 14:43:17 -0500 Split strategy from execution in create-v2mom task migration
-
-Rewrote Phase 8 (and tightened Phase 5.5): tasks stay in the backlog grouped by method, and each method gains a one-line link to where its tasks live, instead of transplanting the task tree into the V2MOM. Strategy (V2MOM) and execution (backlog) are now explicitly separate sources of truth, keeping the V2MOM concise.
-
-*** 2026-05-22 Fri @ 14:43:17 -0500 Made create-v2mom obstacles operational (mitigation/owner/cadence)
-
-Phase 6 now captures, per obstacle: name, manifestation, stakes, mitigation, owner, and review cadence — with a worked example per domain (health/finance/software), a "good obstacle" characteristic, a Phase 9 review question, and a red flag for candid-but-not-operational obstacles. An obstacle without a countermove is now flagged as an observation, not a plan.
-
-*** 2026-05-22 Fri @ 14:43:17 -0500 Corrected and narrowed the Meincke citation in prompt-engineering
-
-Fixed the title to "Call Me A Jerk: Persuading AI to Comply with Objectionable Requests" (SSRN abstract_id=5357179) in all three spots (frontmatter, Seven Principles intro, References). Reframed the ~33%→72% result as what it is — a prompt-safety caution that persuasion raises compliance with objectionable requests — explicitly not evidence that persuasion framing improves engineering prompt quality. Kept the seven principles as a tone vocabulary.
-
-*** 2026-05-22 Fri @ 14:43:17 -0500 Added an eval-harness requirement to prompt-engineering critique mode
-
-Added critique step 7 + a checklist line: for fragile or reusable/production prompts, write 3-5 adversarial/edge inputs, run both the old and new prompt against each, and record the behavioral delta. A throwaway prompt can ship on the rewrite alone; a discipline/reused/production one can't. Without examples, "the rewrite is better" is an assertion, not a result.
-
-*** 2026-05-22 Fri @ 14:43:17 -0500 Added mandatory stale-entry + privacy pre-write checks to codify
-
-Added a "Mandatory pre-write checks" block at the top of Phase 3 (Write) in =.claude/commands/codify.md=: a stale-entry scan (update/remove no-longer-true entries in place, don't append contradictions around them) and a privacy/leak check carrying both questions verbatim — "safe if the project were public?" and "belongs in private memory instead?" — routing private content to auto-memory. Gates, not background guidance.
-
-*** 2026-05-22 Fri @ 14:06:41 -0500 Scoped review-code's CI-trust rule to reviewing, not shipping
-
-Expanded the False-Positive Filter bullet in =review-code/SKILL.md=: "trust CI, don't run builds" applies to reading a diff, not producing one. A pre-commit/pre-push flow still owes the local verification =verification.md= requires (run the suite or state "not run because..."). Closes the apparent contradiction with =verification.md= / =finish-branch=.
-
-*** 2026-05-22 Fri @ 14:06:41 -0500 Added private-vs-public CLAUDE.md citation modes to review-code
-
-Expanded the Content scope section in =review-code/SKILL.md= with two modes: a private/internal review cites =CLAUDE.md= directly; a public/team review translates the rule into the engineering reason it encodes and doesn't name the rules file (a teammate can act on the reason, not on a file they can't reach). Same principle =commits.md= states for personal tooling in public artifacts.
-
-*** 2026-05-22 Fri @ 13:48:14 -0500 Relaxed review-code "three strengths" to up-to-three-or-none
-
-Changed all three "three minimum" spots in =review-code/SKILL.md= (Strengths section, Critical Rules DO list, Anti-Patterns) to "up to three specific; say none found on a tiny or weak diff." Reframed the old "No Strengths section" anti-pattern as "Skipping strengths out of laziness" so a substantive diff still demands them while a weak one can honestly report nothing notable. Landed alongside Craig's adjacent edit telling reviewers not to explain why a strength is good (sycophantic padding).
-
-*** 2026-05-22 Fri @ 14:12:24 -0500 Removed review-process language from respond-to-review commit guidance
-
-Replaced the =fix: Address review — [description]= example (and the matching description-line phrasing) in =.claude/commands/respond-to-review.md= with "name the actual fix (=fix: validate export filename=), not the review that prompted it." Killed the non-ASCII dash and the process-in-commit pattern that conflicted with =commits.md=.
-
-*** 2026-05-22 Fri @ 14:12:24 -0500 Made respond-to-review fetch unresolved threads + resolve after verification
-
-Rewrote section 1 (Gather) in =.claude/commands/respond-to-review.md= to pull =reviewThreads= via =gh api graphql= with =isResolved=, skipping already-resolved threads so settled feedback isn't re-processed; top-level conversation comments still come from REST. Added a section-4 step: reply and resolve a thread only after the fix is verified, never before.
-
-*** 2026-05-22 Fri @ 14:12:24 -0500 Verified respond-to-cj-comments no longer embeds an absolute path (moot)
-
-Already resolved by a prior migration: =grep= for =/home/= and =/Users/= in =.claude/commands/respond-to-cj-comments.md= returns nothing. The public-writing section refers to the rules by name, not by local path. No edit needed.
-
-*** 2026-05-22 Fri @ 14:12:24 -0500 Closed respond-to-cj-comments humanizer/emacsclient fallback (largely moot)
-
-Overtaken by two later changes: =/humanizer= was replaced by =/voice personal= (no =/humanizer= invocation remains), and the mandatory =emacsclient= summary-open was replaced by the in-place VERIFY-task pattern (workflow line ~262, Craig's 2026-05-12 standing instruction). Only a stale descriptive phrase remained — tidied "humanizer's signs of AI writing" to "the signs of AI writing." The original fresh-environment-fallback concern no longer applies as written.
-
-*** 2026-05-22 Fri @ 14:51:37 -0500 Fixed finish-branch base-branch detection
-
-Rewrote Phase 2: resolve the base *branch name* in priority order (open PR's =baseRefName=, then =git symbolic-ref --short refs/remotes/origin/HEAD= stripped, then ask), and compute the merge-base *SHA* separately only where a commit range is needed. Made the branch-name-vs-merge-base distinction explicit, since the old command returned a SHA where a branch name was needed.
-
-*** 2026-05-22 Fri @ 14:51:37 -0500 Made finish-branch merge safer + worktree-aware
-
-Added pre-flight checks to Option 1 (Merge Locally): dirty-tree refusal with no auto-stash, protected-branch awareness, upstream-gated =git pull --ff-only=, and merge-commit-vs-rebase as a team-policy choice instead of a hardcoded =--no-ff=. Replaced the fragile =git worktree list | grep <branch>= detection with a =git rev-parse --git-dir= vs =--git-common-dir= comparison plus =git worktree list --porcelain= for the path.
-
-*** 2026-05-22 Fri @ 14:51:37 -0500 Added tool-availability + ceremony-scale paths to start-work
-
-Added a "Tool availability" section (graceful degradation when Linear MCP / =gh= / =/voice= / Playwright are missing — do what's available, surface what isn't, don't block) and a "Ceremony scale" section (trivial / small / standard tiers so a two-line fix skips ticket+branch+gates unless asked). The =humanizer= reference in the original item is moot — the file already uses =/voice= throughout.
-
-*** 2026-05-22 Fri @ 14:51:37 -0500 Resolved start-work claim-before-justify rollback risk
-
-Split the claim by tracker type: personal todo.org claims defer to after the Justify gate (a killed task needs no rollback), while team trackers (Linear/GitHub) still claim first to signal intent but record prior state (status, assignee, label) so the Phase 2 rollback restores exactly it. Updated the per-tracker rollback steps and the matching anti-pattern.
-
-*** 2026-05-22 Fri @ 14:28:41 -0500 Verified add-tests typescript-testing.md reference resolves (moot)
-
-Resolved since the audit: =languages/typescript/claude/rules/typescript-testing.md= now exists, and =add-tests/SKILL.md:68= references it by bare filename, the same way it references =python-testing.md= (both get copied into a project's =.claude/rules/=). The "missing file" premise no longer holds. No edit needed.
-
-*** 2026-05-22 Fri @ 14:28:41 -0500 Added a category-exception protocol to add-tests
-
-Added an exception note to step 7 (proposal) in =add-tests/SKILL.md=: pure adapters, generated code, tiny pass-through wrappers, and framework glue may skip a category that would only re-test the framework, but the skip must be stated and justified in the plan and the behavior covered at integration/E2E level — never a silent omission. Step 12 (write) now points back to "honor documented category exceptions."
-
-*** 2026-05-22 Fri @ 14:25:37 -0500 Added environment + recent-change capture to debug Phase 1
-
-Added a fourth Phase-1 step in =debug/SKILL.md=: record versions, feature-flag/config state, dataset/fixture, seed/clock, concurrency, and recent commits/config-infra changes. Noted that intermittent bugs usually live in environment/state transitions (and "what changed recently" is often the fastest route), while a deterministic local bug only needs a one-liner. Updated the phase's closing recap to include the context.
-
-*** 2026-05-22 Fri @ 14:25:37 -0500 Constrained root-cause-trace defense-in-depth to boundaries
-
-Rewrote step b in =root-cause-trace/SKILL.md=: instead of "add a check at each layer that could have caught it," add one only at a layer that owns a boundary or invariant — ingress/trust, persistence, invariant-owning service, final render. Added the explicit rule that a pass-through function owning neither shouldn't get a duplicate null check (validation spam). Recast the three example layers as the boundary types.
-
-*** 2026-05-22 Fri @ 14:25:37 -0500 Required evidence + counterfactual per why in five-whys
-
-Expanded step 2 in =five-whys/SKILL.md=: each link now owes an evidence field (a log/commit/metric/config you can point to) and a counterfactual check (remove this cause — does the symptom above plausibly not happen?). Framed the counterfactual as the main guard against monocausal storytelling, and updated the worked example to show both fields.
-
-*** 2026-05-22 Fri @ 15:51:59 -0500 Added timebox + fresh-sources rules to brainstorm
-
-Phase 1 gained a "Timebox the dialogue" rule (aim for the one-sentence restatement in ~5-8 questions, then move on and park the rest as open questions). Phase 2 gained "Ground high-stakes claims in fresh sources" (check load-bearing claims about markets/regulations/tools/vendors/APIs against a current source; mark unverified ones as assumptions). The design-doc skeleton gained an "## Assumptions" section that distinguishes researched facts (with source) from assumptions (to confirm before building).
-
-*** 2026-05-22 Fri @ 14:59:32 -0500 Made arch-decide examples timeless + required citations
-
-Dated the MongoDB multi-document-transaction example (scoped to 2024-01) with a backing reference, and added a "Cite, don't assert" Do: every concrete technical claim about a tool/version/platform carries a link, doc, version, or "checked YYYY-MM" date, or gets a domain-neutral placeholder — so unsourced "X can't do Y" doesn't rot into stale fact.
-
-*** 2026-05-22 Fri @ 14:59:32 -0500 Standardized arch-decide ADR statuses + immutability rule
-
-Declared a canonical five-status set (Proposed, Accepted, Rejected, Deprecated, Superseded) with an explicit "no synonyms" line, and spelled out the immutability rule in the Don'ts: an accepted ADR's body is frozen, only status/link metadata changes, a changed decision gets a new superseding ADR and the old one stays as the historical record.
-
-*** 2026-05-22 Fri @ 14:59:32 -0500 Added Trust/Data/Compliance phase to arch-design
-
-Added a new Phase 4 (Trust, Data, and Compliance) before the paradigm shortlist: trust boundaries, data classification, abuse/misuse cases, privacy constraints, compliance evidence, and operational ownership — surfaced early so the architecture is drawn around them, not retrofitted by a downstream =security-check=. Threaded into the workflow list, brief template (new §6), review checklist, and anti-patterns.
-
-*** 2026-05-22 Fri @ 14:59:32 -0500 Split paradigms from tactical patterns in arch-design
-
-Split Phase 5's single mixed table into Step 1 (pick one paradigm: monolith/microservices/layered/event-driven/serverless/pipeline/space-based) and Step 2 (compose tactical patterns: DDD, hexagonal, CQRS, event sourcing — several or none, often per-module), with composition examples and an anti-pattern against treating DDD/CQRS as alternatives to a paradigm. Recommendation + brief now name a paradigm plus composed patterns.
-
-*** 2026-05-22 Fri @ 14:59:32 -0500 Expanded arch-document quality scenarios to the Q42 six-part template
-
-Replaced §10's thin "Under [condition]..." template with the arc42/Q42 six-part structure (source, stimulus, environment, artifact, response, response measure), each glossed, with the cart-checkout example rewritten across all six parts. A one-line prose form stays acceptable once all six parts are recoverable.
-
-*** 2026-05-22 Fri @ 14:59:32 -0500 Added staleness/ownership metadata to arch-document output
-
-Added a per-section metadata block (owner, generated-against SHA + date, review cadence, "stale-when" conditions) as an HTML-comment header plus a visible Doc-status note, with field-fill guidance, and a whole-document Doc Status table replacing the README's "Last Updated" stub. Wired into the review checklist and an "Undated docs" anti-pattern.
-
-*** 2026-05-22 Fri @ 14:59:32 -0500 Added confidence levels to arch-evaluate findings
-
-Added a "Confidence and Provenance" subsection: every framework-agnostic finding carries High/Medium/Low + how it was determined, with a required "Not fully checked because..." note when scale, runtime imports, reflection, or dynamic dispatch cap certainty. Updated the example findings and review checklist; a finding with no note now asserts a full read.
-
-*** 2026-05-22 Fri @ 14:59:32 -0500 Made arch-evaluate report skipped tool checks explicitly
-
-Replaced "skip silently" with explicit reporting: for each detected language whose tool isn't configured or can't run, emit an Info "tool not configured / not run" finding (with an example) so the audit shows what was and wasn't verified. A check that didn't run no longer reads as a pass. Updated workflow step 4 and the review checklist.
-
-*** 2026-05-22 Fri @ 14:51:37 -0500 Added notation/output fallback to c4-analyze + c4-diagram
-
-Both commands now treat C4 as notation-independent: a "Choosing a notation" section (draw.io XML, Structurizr DSL, Mermaid with native C4 types, PlantUML/C4-PlantUML) and a headless fallback that emits a text notation (Mermaid or Structurizr DSL) and skips PNG-export/desktop-open when =drawio= or a GUI is absent, rather than failing. draw.io is now one option, not the only one.
-
-*** 2026-05-22 Fri @ 14:51:37 -0500 Clarified C4 abstraction boundaries in c4-analyze + c4-diagram
-
-Added an "Abstraction boundaries" section to both: a Container is a separately deployable/runnable unit (not synonymous with a Docker container — a SPA or managed DB counts), a Component lives inside one Container and isn't separately deployable. Added a 4e "Verify single abstraction level" check that walks every element and relationship to confirm it stays at the diagram's level, notation-independent.
-
-*** 2026-05-22 Fri @ 15:10:35 -0500 Added "When You Cannot Verify" standard to verification.md
-
-Added a section requiring, when a verification command can't run, a four-part report: command attempted, why it couldn't run, risk left unverified, and the smallest next command for the user. States the principle that a check that didn't run is never reported as a pass — "unable to verify" is a required honest outcome, not silence. Placed after Red Flags.
-
-*** 2026-05-22 Fri @ 15:10:35 -0500 Added property-based + mutation testing escalation to testing.md
-
-Added an "Escalation Beyond Category and Pairwise" section: property-based testing for invariants over a broad input domain (round-trips, idempotence, ordering — Hypothesis/fast-check/proptest) and mutation testing for when high line coverage hides thin assertions (mutmut/cosmic-ray/Stryker). Both framed as escalation paths to reach for on a gap, not gates on every unit.
-
-*** 2026-05-22 Fri @ 15:10:35 -0500 Added a disciplined spike protocol to testing.md
-
-Formalized the existing "I need to spike first" excuse-table row into a "Spike Exception (Disciplined)" subsection under TDD Discipline: TDD stays the default, but a spike is sanctioned when all three hold — timeboxed, spike code not committed, and the first failing test written before productionizing the discovered approach. Built on the existing row rather than contradicting it.
-
-*** 2026-05-22 Fri @ 15:10:35 -0500 Added pre-dispatch availability + cost checks to subagents.md
-
-Added a "Pre-Dispatch Checks" section with two gates: Availability (no Agent capability → do the work in the main thread under the same scope/constraints/output discipline the contract would enforce) and Cost (when writing the full contract costs more than the task, do it inline). Cross-references the existing "Don't Subagent At All" section and "Subagenting trivial work" anti-pattern rather than duplicating.
-
-*** 2026-05-22 Fri @ 15:06:04 -0500 Revised python-testing SQLite guidance toward production-like DBs
-
-Replaced "prefer in-memory SQLite for speed" with: run ORM/query tests against a production-like DB (same engine as prod, often containerized), since SQLite diverges from Postgres/MySQL on query semantics, constraints, transactions, JSON, time zones, and indexes (a test can pass on SQLite and fail in prod). SQLite stays only for pure unit tests with no DB-semantics dependency.
-
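One concrete divergence, shown with the standard-library =sqlite3= module: a non-STRICT SQLite table stores non-numeric text in an =INTEGER= column, whereas PostgreSQL rejects the same insert with a type error. The =orders= table and its columns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, quantity INTEGER NOT NULL)"
)

# SQLite's type affinity keeps non-numeric text as TEXT in an INTEGER column.
# PostgreSQL would refuse this insert with an invalid-input error for integer.
conn.execute("INSERT INTO orders (quantity) VALUES ('three')")

stored = conn.execute(
    "SELECT quantity, typeof(quantity) FROM orders"
).fetchone()
print(stored)
conn.close()
```

A test that relies on the database to reject bad input can therefore pass on SQLite and fail against the production engine.
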
-*** 2026-05-22 Fri @ 15:06:04 -0500 Clarified python-testing ORM-mocking boundary
-
-Changed the "never mock" bullet from "ORM queries" to "ORM internals (querysets, sessions, model internals)" and added a paragraph: domain services use real model methods/validation, but a thin orchestration unit can inject a fake at a deliberate data-access port (a repository/interface the code owns). That's still mocking at a boundary, not at ORM internals.
-
-*** 2026-05-22 Fri @ 15:06:04 -0500 Made elisp.md editing advice tool-agnostic
-
-Rephrased the "prefer Write over repeated Edits" bullet around intent: land nontrivial Elisp as one cohesive change rather than dribbling it in over tiny partial edits (which accumulate paren mismatches), and run paren-balance + byte-compile checks immediately after, whatever editing mechanism the environment uses.
-
-*** 2026-05-22 Fri @ 15:06:04 -0500 Added batch-mode + native-comp caveats to elisp-testing.md
-
-Added three sections: Batch-Mode Reproducibility (=emacs --batch= as source of truth, no interactive-session state, no blocking prompts, deterministic), Isolating Emacs State (temp =user-emacs-directory=, explicit load-path, declared deps only, with an unwind-protect sandbox example), and Byte-Compile/Native-Comp Warnings (=byte-compile-error-on-warn=, native-comp gated on =native-comp-available-p= and kept opt-in/version-aware).
-
-*** 2026-05-22 Fri @ 15:16:22 -0500 Synced hooks/README install snippets with the destructive hook (opt-in)
-
-Brought the README's manual-install and settings-JSON snippets in line with the canonical =hooks/settings-snippet.json= (which already wires all three) and the Makefile's opt-in design: added the destructive-bash-confirm.py symlink as an opt-in step, added its settings entry, and reworded the note to say all three are no-op-safe but the destructive gate is opt-in (=make install-hooks= excludes it by default — link manually before relying on the snippet entry).
-
-*** 2026-05-22 Fri @ 15:35:06 -0500 Hooks now scan file-backed commit/PR messages
-
-Added =read_referenced_file()= to =_common.py= (safe local read: missing/oversize/non-UTF-8 → None) and wired it in: =git-commit-confirm.py= =extract_commit_message= now handles =-F=/=--file=/=--file===<path>= (reads + scans the file, falls through to UNPARSEABLE → asks if unreadable), and =gh-pr-create-confirm.py= reads =--body-file= content instead of a placeholder. Attribution scanning now sees the real committed/posted text. Built a pytest harness (=hooks/tests/=, importlib-by-path loader for the hyphen-named hooks) and wired =hooks/tests= into =make test=. 54 hook tests pass; full suite green.
-
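The safe-read behaviour described above (missing, oversize, or non-UTF-8 input yields =None=) might be sketched as follows. This is not the actual =read_referenced_file()= from =_common.py=. The name =read_text_or_none= and the 64 KiB cap are assumptions, since the source does not state the real limit.

```python
from pathlib import Path
from typing import Optional

# Assumed cap for illustration only; the real limit is not given in the source.
MAX_BYTES = 64 * 1024


def read_text_or_none(path: str, max_bytes: int = MAX_BYTES) -> Optional[str]:
    """Return the file's text, or None if it is missing, too big, or not UTF-8."""
    candidate = Path(path)
    try:
        if not candidate.is_file():
            return None  # missing, or not a regular file
        if candidate.stat().st_size > max_bytes:
            return None  # oversize: do not load it into memory
        return candidate.read_bytes().decode("utf-8")
    except (OSError, UnicodeDecodeError):
        # Unreadable or non-UTF-8 content is reported as "nothing to scan".
        return None
```

A caller can treat =None= as unparseable and fall through to asking, rather than scanning a placeholder.
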
-*** 2026-05-22 Fri @ 15:35:06 -0500 Rewrote destructive-bash rm parsing on shlex
-
-=detect_rm_rf= now tokenizes with =shlex.split= instead of a whitespace split, so quoted/spaced paths and combined/separate/reordered flags (=-rf=, =-r -f=, =-fr=, =--recursive=/=--force=) all parse. Fails toward asking — returns a sentinel that still fires the modal — on unbalanced quotes or when a forced recursive rm coexists with a compound/pipeline/substitution/redirect construct. Documented the supported/unsupported shell constructs in the docstrings, and extended the dangerous-path banner to =$HOME=-prefixed and wildcard targets. Covered by 25 new tests. (Pre-existing, out-of-scope: path-prefixed =rm= like =/bin/rm= still isn't matched.)
-
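A simplified sketch of the tokenize-then-inspect approach, using =shlex.split= from the standard library. It is not the repo's =detect_rm_rf=: the name =classify_rm= and the =ASK= sentinel are hypothetical, and the compound, pipeline, substitution, and redirect handling mentioned above is left out.

```python
import shlex

# Hypothetical sentinel meaning "could not parse safely, so ask the user".
ASK = "ask"


def classify_rm(command: str):
    """Return True for a forced recursive rm, False otherwise, ASK if unparseable."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: fail toward asking instead of guessing.
        return ASK
    if not tokens or tokens[0] != "rm":
        # Like the limitation noted above, a path-prefixed /bin/rm is not matched.
        return False

    recursive = force = False
    for token in tokens[1:]:
        if token == "--":
            break  # everything after "--" is a path, not a flag
        if token == "--recursive":
            recursive = True
        elif token == "--force":
            force = True
        elif token.startswith("-") and not token.startswith("--"):
            # Short flags may be combined (-rf, -fr) or given separately (-r -f).
            recursive = recursive or "r" in token or "R" in token
            force = force or "f" in token
    return recursive and force
```

Because the tokens come from =shlex.split=, a quoted path with spaces stays one token and cannot be mistaken for a flag.
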
** TODO [#C] Build =/update-skills= skill for keeping forks in sync with upstream
:PROPERTIES:
:LAST_REVIEWED: 2026-05-20
@@ -1385,88 +1007,6 @@ having a skill to generate or check OV-1-shaped artifacts. Don't build
speculatively — defense-specific notations are narrow enough that each
skill should be driven by a concrete contract need, not aspiration.
-** DONE [#B] Add =make remove= for interactive ruleset removal via fzf
-CLOSED: [2026-05-22 Fri]
-Shipped: =scripts/remove.sh= (three modes — =--list=, =--remove-selected= reading stdin, and the default fzf-multi interactive flow) + =make remove= target + =scripts/tests/remove.bats= (5 cases). Lists only symlinks resolving into the repo (foreign links left alone); rm's picked links while leaving repo sources untouched; reports-and-continues on a missing target; quiet no-op on empty selection. shellcheck clean, make test green. Dropped the stale =bridge= entry per the note below.
-
-Add a Makefile target that lists every currently-installed ruleset entry
-and lets me pick one or more to remove via fzf. Granular alternative to
-=make uninstall= (removes everything) and =make uninstall-hooks= (removes
-only hooks).
-
-*** Why this matters
-
-Tearing down a single skill, rule, hook, or config file currently means
-either running =make uninstall= and re-installing what I want to keep,
-or =rm=ing the symlink directly and remembering the exact path. Both are
-friction. An interactive picker lets me filter, multi-select with Tab,
-and confirm with Enter — the typical fzf flow. Costs about 3-5 seconds
-per teardown instead of 15+ seconds of "what's the exact name?".
-
-*** Design
-
-The recipe builds a tab-separated list of every currently-installed item,
-categorized by type, and pipes it to =fzf --multi=. The user filters,
-marks with Tab, and confirms with Enter. The recipe parses the selections
-and =rm=s the matching symlinks.
-
-#+begin_example
- skill debug
- rule commits.md
- hook destructive-bash-confirm.py
- config settings.json
- commands commands
- bridge claude-rules
-#+end_example
-
-Each line is =<kind>\t<name>=. The recipe maps =<kind>= to the right path:
-
-- =skill= → =$(SKILLS_DIR)/<name>=
-- =rule= → =$(RULES_DIR)/<name>=
-- =hook= → =$(HOOKS_DIR)/<name>=
-- =config= → =$(CLAUDE_DIR)/<name>=
-- =commands= → =$(CLAUDE_DIR)/commands=
-- =bridge= → =$(SKILLS_DIR)/claude-rules=
-
-Source files in =rulesets/= stay untouched. =make install= re-creates the
-removed links if needed (the install loop is idempotent).
-
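The kind-to-path mapping in the design above could be expressed as in the sketch below. It is illustrative only: the shipped implementation is a shell script, the function name =target_path= is hypothetical, and the directory values are passed in because the real values of the Makefile variables are not shown here. The stale =bridge= kind is omitted.

```python
import posixpath


def target_path(line: str, dirs: dict) -> str:
    """Map one "<kind>\\t<name>" line to the symlink path that would be removed.

    dirs is a hypothetical stand-in for the Makefile variables, with keys
    "skills", "rules", "hooks", and "claude".
    """
    kind, name = line.rstrip("\n").split("\t", 1)
    if kind == "commands":
        # The commands entry is a single link directly under the Claude dir.
        return posixpath.join(dirs["claude"], "commands")
    base_for_kind = {
        "skill": dirs["skills"],
        "rule": dirs["rules"],
        "hook": dirs["hooks"],
        "config": dirs["claude"],
    }
    # An unknown kind raises KeyError, so nothing is removed by accident.
    return posixpath.join(base_for_kind[kind], name)
```

Only the resolved link path would be removed. The source files in the repo are never touched.
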
-*** Edge cases
-
-- Esc instead of Enter → empty selection → clean exit, no removal.
-- Filter to nothing then Enter → same as Esc.
-- Selected item already gone → =rm= fails visibly, processing continues
- on the rest.
-- =fzf= not installed → fail fast with a clear error (matches the pattern
- used by =install-lang=).
-
-*** Possible extensions
-
-- Parallel =make pick-install= target that lists not-yet-installed items
- and installs the chosen ones. Symmetric UX, same fzf flow.
-- Confirmation prompt when more than N items selected (defense against
- accidental select-all).
-- =--source= flag that also runs =git rm= against the rulesets source for
- the selected item. Probably a bad idea: too easy to lose work.
-- The =bridge → $(SKILLS_DIR)/claude-rules= entry above is stale — the
- bridge symlink got removed in a later commit. Drop that bullet when the
- recipe lands.
-
-** DONE [#B] Document the =mcp/= install pipeline in =mcp/README.org=
-CLOSED: [2026-05-22 Fri]
-Wrote =mcp/README.org= covering everything in the "what to cover" list: the file layout (tracked vs gitignored), the secrets-bundle shape (plain =${VAR}= secrets + base64-bundled OAuth artifacts, AES256 symmetric =gpg -c=), the install flow (decrypt → materialize keys/token caches at mode 600 → expand → register unregistered, idempotent), the http/sse-vs-stdio transport split, token rotation when a Google refresh token is revoked, and adding a new server. Grounded in a read of the actual =install.py= + =servers.json=.
-
-=mcp/= has =install.py=, =servers.json=, =secrets.env.gpg=, =gcp-oauth.keys.json= (gitignored, regenerated at install). No README. Coming back to this in three months I'll re-discover how the bundle is structured, what =install.py= does, and how to rotate tokens. Saving that re-discovery is the whole point.
-
-*** What to cover
-
-- Layout: what each file is, which are tracked vs gitignored.
-- Secrets bundle shape: how vars are listed in =secrets.env=, the symmetric-encryption pattern (=gpg -c --cipher-algo AES256=), the base64-bundled OAuth artifacts (=GCP_OAUTH_KEYS_JSON_B64=, =GOOGLE_DOCS_PERSONAL_TOKEN_B64=, =GOOGLE_DOCS_WORK_TOKEN_B64=).
-- Install flow: =make install-mcp= → =install.py= decrypts, writes the keys file and Google Docs token caches at mode 600, expands =${VAR}= in =servers.json=, calls =claude mcp add --scope user= for unregistered servers. Idempotent.
-- Token rotation: when a refresh token gets revoked, the recovery flow (re-auth on one machine, re-bundle, recommit).
-- Adding a new server: edit =servers.json=, add any new =${VAR}= placeholders to the bundle, re-encrypt.
-- The OAuth dance for HTTP-transport servers (linear, notion) versus stdio (google-docs-*) — different paths, different gotchas.
-
** TODO [#C] Add =make uninstall-mcp= + =mcp/install.py --check= for symmetry
Currently the MCP install pipeline only flows one direction. No way to remove rulesets-managed MCP servers in one command. No way to ask "what's the drift between =servers.json= and =claude mcp list=" without eyeballing.
@@ -2059,3 +1599,458 @@ See also the DoD-specific notations section under the later TODO
(=c4-*= rename revisit) — OV-1 is flagged there as the highest-value
starting point across the DoD notation landscape (SysML, DoDAF/UAF,
IDEF1X). This entry is the execution plan for that starting point.
+** DONE [#A] Split team-specific publishing rules out of commits.md :commits:
+CLOSED: [2026-05-22 Fri]
+Shipped 3cb467e. Moved the DeepSat publishing steps (Linear ticket-state, the Slack notification protocol + channel ID, the GHE host, the team merge norm, the Linear ticket-body structure) out of the global =claude-rules/commits.md= into =teams/deepsat/claude/rules/publishing.md=. The global file keeps the universal skeleton and uses seams ("run the project's publishing overlay here if present") like startup-extras. Added =install-team= (targeted per-project copy, keyed on PROJECT, never globally symlinked) and generalized =sync-language-bundle.sh= to keep team overlays fresh at startup (3 new bats; make test green).
+
+Remaining deploy step (cross-project, surfaced to Craig): install the overlay into the DeepSat work project — =make install-team TEAM=deepsat PROJECT=<deepsat-path>= — so it actually loads there.
+** DONE [#A] Define a /voice-unavailable fallback in the commits.md publish flow :commits:
+CLOSED: [2026-05-22 Fri]
+Added an "If =/voice= is unavailable" paragraph to the Single-skill gate in =commits.md=: walk the same patterns inline (the flow already names which matter), state the skill was unavailable and the pass was applied by hand ("/voice unavailable — patterns walked inline"), and flag the missing skill for install. The gate is the pattern walk, not the tooling. The original "=humanizer= unavailable" framing was moot (humanizer → /voice).
+** DONE [#A] wrap-it-up Step 3.5 assumes GitHub-family remote :chore:quick:
+CLOSED: [2026-05-22 Fri]
+:PROPERTIES:
+:LAST_REVIEWED: 2026-05-20
+:END:
+Documented the assumption inline at =wrap-it-up.org= Step 3.5 (chose the lightweight path over a provider-agnostic rewrite): the =gh= lookup expects a GitHub-family host, holds today via DeepSat on GHE, flagged for update if a future Linear project lands on GitLab/Gitea/Bitbucket.
+Triggered by: 2026-05-16 wrap-it-up github.com cleanup (audit of the same file).
+
+Step 3.5 (Linear ticket-state hygiene) at =wrap-it-up.org:207= says "the project's GitHub remote — use =gh pr list ...=". Currently fine in practice: the step is Linear-gated, and the only Linear-using project is DeepSat (on =deepsat.ghe.com=, a GitHub-family host where =gh= works). Would break if a future Linear-using project lived on a non-GitHub host (gitlab, gitea, bitbucket). Either drop the GitHub-family assumption (provider-agnostic lookup, harder) or document the assumption explicitly so future projects know the step needs an update if they don't fit.
+** DONE [#C] Review pass: tighten skills and rulesets after 2026-05-04 audit
+CLOSED: [2026-05-22 Fri]
+:PROPERTIES:
+:LAST_REVIEWED: 2026-05-20
+:END:
+All 55 grouped-index items dispositioned (2026-05-22): ~49 edited across skills, commands, rule files, hooks, and the two playwright skills; several came out moot post-audit (humanizer→voice, skills→commands, typescript ruleset added); the two commits.md items shipped as the team-overlay split + /voice fallback. Freshness-checked each item against current reality before editing.
+
+Source notes used in this pass:
+- C4 official docs: C4 is notation-independent; System Context and Container
+ diagrams are enough for most teams; every diagram needs title, key/legend,
+ explicit element types, and audience-appropriate abstraction.
+ [[https://c4model.com/diagrams][C4 diagrams]],
+ [[https://c4model.com/diagrams/notation][C4 notation]],
+ [[https://c4model.com/abstractions/component][C4 component]]
+- arc42 docs: quality requirements need measurable scenarios; section 10
+ should reference top quality goals and capture lesser quality requirements
+ with specific measures. [[https://docs.arc42.org/section-10/][arc42 section 10]],
+ [[https://quality.arc42.org/articles/specify-quality-requirements][specifying quality requirements]]
+- ADR references: ADRs capture one justified architecturally significant
+ decision and its rationale; Nygard's original guidance emphasizes short,
+ numbered, repository-stored records and superseding rather than rewriting old
+ decisions. [[https://adr.github.io/][adr.github.io]],
+ [[https://cognitect.com/blog/2011/11/15/documenting-architecture-decisions][Nygard ADR article]]
+- Playwright docs: prefer user-visible locators and web assertions; locators
+ auto-wait and retry; =networkidle= is discouraged for testing readiness.
+ [[https://playwright.dev/docs/best-practices][Playwright best practices]],
+ [[https://playwright.dev/docs/locators][Playwright locators]],
+ [[https://playwright.dev/docs/next/api/class-page][Playwright page API]]
+- OWASP references: Top 10 2021 includes Broken Access Control,
+ Cryptographic Failures, Injection, Insecure Design, Security
+ Misconfiguration, Vulnerable and Outdated Components, Identification and
+ Authentication Failures, Software and Data Integrity Failures, Security
+ Logging and Monitoring Failures, and SSRF; WSTG adds a broader testing map
+ across configuration, identity, authn/z, sessions, input validation, error
+ handling, cryptography, business logic, client-side, and API testing.
+ [[https://owasp.org/Top10/2021/][OWASP Top 10 2021]],
+ [[https://owasp.org/www-project-web-security-testing-guide/latest/4-Web_Application_Security_Testing/][OWASP WSTG]]
+- V2MOM references: Salesforce calls the last M "Measures" and emphasizes a
+ simple alignment document with prioritized Methods, explicit Obstacles, and
+ measurable outcomes. [[https://trailhead.salesforce.com/content/learn/modules/selfmotivation/get-focused-with-your-personal-v2mom][Salesforce Trailhead personal V2MOM]],
+ [[https://www.salesforce.com/blog/?p=12][Salesforce V2MOM alignment]]
+- Prompt research: the cited Meincke paper is titled "Call Me A Jerk:
+ Persuading AI to Comply with Objectionable Requests"; its scope is
+ persuasion increasing compliance with objectionable requests, not a general
+ proof that persuasion framing improves prompt quality.
+ [[https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5357179][SSRN paper]]
+- Combinatorial testing references: NIST supports t-way combinatorial testing
+ and notes pairwise is one covering strength, with higher-strength arrays
+ useful for failures requiring more interacting factors.
+ [[https://www.nist.gov/publications/practical-combinatorial-testing-beyond-pairwise][NIST beyond pairwise]],
+ [[https://www.nist.gov/publications/combinatorial-software-testing][NIST combinatorial testing]]
+
+*** Grouped index (for batching by area)
+
+Each item below is a one-line summary of a sub-TODO further down. Tick the box when the matching sub-TODO is moved to =DONE=. Items are grouped by area so they can be batched (e.g., "do all Playwright items in one session").
+
+**** Browser testing
+- [X] [#A] =playwright-js=: locator/assertion-first guidance (replace raw CSS, =networkidle=)
+- [X] [#B] =playwright-js= + =playwright-py=: reconcile headless/visible defaults
+- [X] [#B] =playwright-js= + =playwright-py=: remove emoji console markers from examples
+
+**** Frontend / UI
+- [X] [#B] =frontend-design=: WCAG 2.2 alignment, accessibility non-optional
+- [X] [#B] =frontend-design=: harmonize aesthetic guidance with anti-pattern rules
+
+**** Security
+- [X] [#A] =security-check=: OWASP 2021 + WSTG coverage
+- [X] [#B] =security-check=: tooling and offline/network caveats
+
+**** Combinatorial testing
+- [X] [#B] =pairwise-tests=: t-way escalation guidance beyond pairwise
+- [X] [#B] =pairwise-tests=: clarify negative value syntax + generator availability
+
+**** V2MOM
+- [X] [#A] =create-v2mom=: rename Metrics → Measures (Salesforce alignment)
+- [X] [#B] =create-v2mom=: prevent task migration from turning V2MOM into a backlog
+- [X] [#B] =create-v2mom=: mitigation/owner fields for Obstacles
+
+**** Prompt engineering
+- [X] [#A] =prompt-engineering=: correct/narrow Meincke citation
+- [X] [#B] =prompt-engineering=: eval-harness requirement for production prompts
+
+**** Codify
+- [X] [#B] =codify=: stale-entry review + privacy checks before writing project =CLAUDE.md=
+
+**** Code review
+- [X] [#A] =review-code=: resolve local-verification vs CI boundary
+- [X] [#B] =review-code=: =CLAUDE.md= citation scope for public artifacts
+- [X] [#B] =review-code=: relax three-strengths rule for tiny/failing diffs
+
+**** PR / review responses
+- [X] [#A] =respond-to-review=: remove review-process language from commit messages
+- [X] [#B] =respond-to-review=: use unresolved threads + resolution state
+- [X] [#B] =respond-to-cj-comments=: drop personal absolute paths from public-writing (moot — already clean)
+- [X] [#B] =respond-to-cj-comments=: fallback when =humanizer= or =emacsclient= unavailable (moot — superseded by /voice + VERIFY pattern)
+
+**** Branch workflow
+- [X] [#A] =finish-branch=: fix base-branch detection
+- [X] [#B] =finish-branch=: worktree-aware pull/merge safety
+- [X] [#B] =start-work=: tool-availability + ceremony-scaling rules
+- [X] [#B] =start-work=: claim-before-justify rollback risk
+
+**** Tests / TDD
+- [X] [#B] =add-tests=: fix missing =typescript-testing.md= reference or add ruleset (moot — ruleset now exists)
+- [X] [#B] =add-tests=: explicit exceptions to "all three categories per function"
+
+**** Debugging / RCA
+- [X] [#B] =debug=: capture environment + recent-change context before hypotheses
+- [X] [#B] =root-cause-trace=: constrain defense-in-depth to trust boundaries
+- [X] [#B] =five-whys=: require evidence + counterfactual validation per why
+
+**** Brainstorming
+- [X] [#B] =brainstorm=: timebox + research/source rules for high-stakes designs
+
+**** Architecture
+- [X] [#B] =arch-decide=: timeless examples, drop unverifiable claims
+- [X] [#B] =arch-decide=: standardize statuses + immutability language
+- [X] [#B] =arch-design=: threat modeling + privacy/compliance as first-class inputs
+- [X] [#B] =arch-design=: separate paradigms from tactical patterns
+- [X] [#B] =arch-document=: arc42/Q42 quality scenarios
+- [X] [#B] =arch-document=: staleness + ownership metadata for generated docs
+- [X] [#B] =arch-evaluate=: confidence levels for framework-agnostic findings
+- [X] [#B] =arch-evaluate=: report skipped tool checks explicitly
+
+**** C4 modeling
+- [X] [#A] =c4-analyze= + =c4-diagram=: notation/output fallback (not draw.io-only)
+- [X] [#B] =c4-analyze= + =c4-diagram=: clarify abstraction boundaries
+
+**** Global rules
+- [X] [#B] =commits.md=: split DeepSat/Linear/Slack-specific from global rules → promoted to a top-level task (deferred for Craig)
+- [X] [#A] =commits.md= + publish flows: =humanizer=-unavailable fallback → promoted to a top-level task (deferred; humanizer premise moot)
+- [X] [#B] =verification.md=: explicit "unable to verify" reporting standard
+- [X] [#B] =testing.md=: property-based + mutation testing as escalation paths
+- [X] [#B] =testing.md=: soften absolute TDD with explicit spike protocol
+- [X] [#B] =subagents.md=: capability/availability + cost checks
+
+**** Languages
+- [X] [#A] =python-testing.md=: revisit in-memory SQLite guidance
+- [X] [#B] =python-testing.md=: separate "never mock ORM" from unit-test boundaries
+- [X] [#B] =elisp.md=: drop tool-specific advice
+- [X] [#B] =elisp-testing.md=: batch-mode + native-comp caveats
+
+**** Hooks
+- [X] [#A] =hooks/README.md=: include =destructive-bash-confirm.py= in install/settings snippets
+- [X] [#A] =hooks/git-commit-confirm.py= + =gh-pr-create-confirm.py=: inspect message/body files referenced by =-F= / =--body-file=
+- [X] [#B] =hooks/destructive-bash-confirm.py=: shell-aware command parsing (not regex)
+
+*** 2026-05-22 Fri @ 15:47:10 -0500 Made playwright guidance locator/assertion-first, dropped networkidle-as-readiness
+
+Rewrote the readiness guidance in both =playwright-js/SKILL.md= and =playwright-py/SKILL.md=: reconnaissance now waits for a visible app landmark via a web assertion or locator (=expect(...).toBeVisible()= / =get_by_role(...).wait_for()=), not =networkidle= (which Playwright discourages). Updated the login/form examples to =getByLabel=/=getByRole= + web assertions, the API_REFERENCE.md waiting section, and =lib/helpers.js= defaults (=waitForPageReady= now defaults to =load= and prefers a caller-supplied landmark; =authenticate= races the success indicator over a =load= navigation). node --check passes.
+
+*** 2026-05-22 Fri @ 14:23:02 -0500 Added headed/headless decision tables to both playwright skills
+
+Added matching purpose-based decision tables to =playwright-js/SKILL.md= (was "always visible") and =playwright-py/SKILL.md= Best Practices (was "always headless"). Each names its own default and points at the other skill, so the difference is deliberate, not a habit-flip: headed for interactive debugging, headless for CI/pytest. Also softened the absolutist "Always launch... headless" comment in the py example.
+
+*** 2026-05-22 Fri @ 15:47:10 -0500 Removed emoji console markers from the playwright skills
+
+Replaced every emoji status marker with a plain ASCII prefix across =playwright-js/= (run.js, lib/helpers.js, SKILL.md) and =playwright-py/= (SKILL.md, examples/*.py): 📦/⚡/📄/📥/🎭/🚀/📋/✅/❌/🔍/📸/✓/✗ → =[setup]=/=[run]=/=[ok]=/=[error]=/=[fail]= etc. Post-change emoji grep is clean (excluding node_modules); node --check and py_compile pass.
+
+*** 2026-05-22 Fri @ 14:35:16 -0500 Made accessibility a non-optional WCAG 2.2 gate in frontend-design
+
+Added an "Accessibility Gate (required before handoff)" section to =frontend-design/SKILL.md= covering keyboard operation, focus visibility, focus-not-obscured (2.2), target size (2.2), contrast, reduced motion, labels, and semantic structure — a baseline for all frontend work, not just interactive components. Rewrote the Build/Review phases to build accessibly as you go and clear the gate before handoff, and bumped =references/accessibility.md= from WCAG 2.1 to 2.2 with backing detail for the new criteria.
+
+*** 2026-05-22 Fri @ 14:35:16 -0500 Added a "creative but bounded" section to frontend-design
+
+Added a subsection under Frontend Aesthetics framing the bold/maximalist directions as tools, not obligations: domain fit, readability first, responsive stability, and no decorative effect that degrades the workflow. Reconciles rather than contradicts the maximalist encouragement (maximalism stays on the table as deliberate usable density), and ties the readability bullet to the new accessibility gate.
+
+*** 2026-05-22 Fri @ 14:35:16 -0500 Updated security-check to OWASP Top 10 2021 + WSTG mapping
+
+Replaced the older six-category list in =.claude/commands/security-check.md= with the full Top 10 2021 set, each finding mapped to a 2021 category or WSTG area. Added the four missing categories (Insecure Design, Software and Data Integrity Failures, Security Logging and Monitoring Failures, SSRF) plus explicit checks for object/function-level authorization, SSRF on URL-fetch paths, update/plugin/dependency integrity, and logging/monitoring gaps.
+
+*** 2026-05-22 Fri @ 14:35:16 -0500 Added scanner tooling + network caveats to security-check
+
+Added an optional configured-scanners step (=gitleaks=/=trufflehog= secrets, =semgrep= source patterns, OSV scanner, lockfile-diff review) that supplements the manual scans, plus a network caveat: dependency audits that can't run (offline, tool absent, DB unreachable) must report "not run" naming the tool and reason, never read as a pass. Carried that into the no-issues summary.
+
+*** 2026-05-22 Fri @ 14:35:16 -0500 Added t-way escalation guidance to pairwise-tests
+
+Added an "Escalating Beyond Pairwise (t-way)" subsection: start with pairwise across the whole space, then escalate specific high-risk clusters to 3-way+ when history, safety, security, or domain coupling says a fault needs more than two interacting factors. Lists escalation triggers and shows the sub-model order syntax (={ A, B, C } @ 3=) vs a blanket =/o:3= bump, stressing targeted not uniform escalation. Cites NIST combinatorial-testing work.
+
+*** 2026-05-22 Fri @ 14:35:16 -0500 Clarified PICT ~ syntax + honest generator-availability path in pairwise-tests
+
+Added a "~ prefix" explanation (PICT marker tagging a value as negative/invalid, not an arithmetic operator; PICT pairs negatives with valid values once and strips the marker before the SUT) and a stop-at-the-model rule: if neither the =pict= binary nor =pypict= is present, produce the model and stop rather than hand-writing a table and passing it off as PICT output.
+
+*** 2026-05-22 Fri @ 14:43:17 -0500 Renamed Metrics → Measures throughout create-v2mom
+
+Full rename across =.claude/commands/create-v2mom.md= (acronym expansions, Phase 7 heading, the "Measures must be measurable" principle, exit criteria, review questions, red flags, examples) to match Salesforce's official term. Kept the "vanity metrics" idiom intact — it's the anti-pattern term, not a section reference.
+
+*** 2026-05-22 Fri @ 14:43:17 -0500 Split strategy from execution in create-v2mom task migration
+
+Rewrote Phase 8 (and tightened Phase 5.5): tasks stay in the backlog grouped by method, and each method gains a one-line link to where its tasks live, instead of transplanting the task tree into the V2MOM. Strategy (V2MOM) and execution (backlog) are now explicitly separate sources of truth, keeping the V2MOM concise.
+
+*** 2026-05-22 Fri @ 14:43:17 -0500 Made create-v2mom obstacles operational (mitigation/owner/cadence)
+
+Phase 6 now captures, per obstacle: name, manifestation, stakes, mitigation, owner, and review cadence — with a worked example per domain (health/finance/software), a "good obstacle" characteristic, a Phase 9 review question, and a red flag for candid-but-not-operational obstacles. An obstacle without a countermove is now flagged as an observation, not a plan.
+
+*** 2026-05-22 Fri @ 14:43:17 -0500 Corrected and narrowed the Meincke citation in prompt-engineering
+
+Fixed the title to "Call Me A Jerk: Persuading AI to Comply with Objectionable Requests" (SSRN abstract_id=5357179) in all three spots (frontmatter, Seven Principles intro, References). Reframed the ~33%→72% result as what it is — a prompt-safety caution that persuasion raises compliance with objectionable requests — explicitly not evidence that persuasion framing improves engineering prompt quality. Kept the seven principles as a tone vocabulary.
+
+*** 2026-05-22 Fri @ 14:43:17 -0500 Added an eval-harness requirement to prompt-engineering critique mode
+
+Added critique step 7 + a checklist line: for fragile or reusable/production prompts, write 3-5 adversarial/edge inputs, run both the old and new prompt against each, and record the behavioral delta. A throwaway prompt can ship on the rewrite alone; a discipline/reused/production one can't. Without examples, "the rewrite is better" is an assertion, not a result.
+
+*** 2026-05-22 Fri @ 14:43:17 -0500 Added mandatory stale-entry + privacy pre-write checks to codify
+
+Added a "Mandatory pre-write checks" block at the top of Phase 3 (Write) in =.claude/commands/codify.md=: a stale-entry scan (update/remove no-longer-true entries in place, don't append contradictions around them) and a privacy/leak check carrying both questions verbatim — "safe if the project were public?" and "belongs in private memory instead?" — routing private content to auto-memory. Gates, not background guidance.
+
+*** 2026-05-22 Fri @ 14:06:41 -0500 Scoped review-code's CI-trust rule to reviewing, not shipping
+
+Expanded the False-Positive Filter bullet in =review-code/SKILL.md=: "trust CI, don't run builds" applies to reading a diff, not producing one. A pre-commit/pre-push flow still owes the local verification =verification.md= requires (run the suite or state "not run because..."). Closes the apparent contradiction with =verification.md= / =finish-branch=.
+
+*** 2026-05-22 Fri @ 14:06:41 -0500 Added private-vs-public CLAUDE.md citation modes to review-code
+
+Expanded the Content scope section in =review-code/SKILL.md= with two modes: a private/internal review cites =CLAUDE.md= directly; a public/team review translates the rule into the engineering reason it encodes and doesn't name the rules file (a teammate can act on the reason, not on a file they can't reach). Same principle =commits.md= states for personal tooling in public artifacts.
+
+*** 2026-05-22 Fri @ 13:48:14 -0500 Relaxed review-code "three strengths" to up-to-three-or-none
+
+Changed all three "three minimum" spots in =review-code/SKILL.md= (Strengths section, Critical Rules DO list, Anti-Patterns) to "up to three specific; say none found on a tiny or weak diff." Reframed the old "No Strengths section" anti-pattern as "Skipping strengths out of laziness" so a substantive diff still demands them while a weak one can honestly report nothing notable. Landed alongside Craig's adjacent edit telling reviewers not to explain why a strength is good (sycophantic padding).
+
+*** 2026-05-22 Fri @ 14:12:24 -0500 Removed review-process language from respond-to-review commit guidance
+
+Replaced the =fix: Address review — [description]= example (and the matching description-line phrasing) in =.claude/commands/respond-to-review.md= with "name the actual fix (=fix: validate export filename=), not the review that prompted it." Killed the non-ASCII dash and the process-in-commit pattern that conflicted with =commits.md=.
+
+*** 2026-05-22 Fri @ 14:12:24 -0500 Made respond-to-review fetch unresolved threads + resolve after verification
+
+Rewrote section 1 (Gather) in =.claude/commands/respond-to-review.md= to pull =reviewThreads= via =gh api graphql= with =isResolved=, skipping already-resolved threads so settled feedback isn't re-processed; top-level conversation comments still come from REST. Added a section-4 step: reply and resolve a thread only after the fix is verified, never before.
+
+*** 2026-05-22 Fri @ 14:12:24 -0500 Verified respond-to-cj-comments no longer embeds an absolute path (moot)
+
+Already resolved by a prior migration: =grep= for =/home/= and =/Users/= in =.claude/commands/respond-to-cj-comments.md= returns nothing. The public-writing section refers to the rules by name, not by local path. No edit needed.
+
+*** 2026-05-22 Fri @ 14:12:24 -0500 Closed respond-to-cj-comments humanizer/emacsclient fallback (largely moot)
+
+Overtaken by two later changes: =/humanizer= was replaced by =/voice personal= (no =/humanizer= invocation remains), and the mandatory =emacsclient= summary-open was replaced by the in-place VERIFY-task pattern (workflow line ~262, Craig's 2026-05-12 standing instruction). Only a stale descriptive phrase remained — tidied "humanizer's signs of AI writing" to "the signs of AI writing." The original fresh-environment-fallback concern no longer applies as written.
+
+*** 2026-05-22 Fri @ 14:51:37 -0500 Fixed finish-branch base-branch detection
+
+Rewrote Phase 2: resolve the base *branch name* in priority order (open PR's =baseRefName=, then the output of =git symbolic-ref --short refs/remotes/origin/HEAD= with the =origin/= prefix stripped, then ask), and compute the merge-base *SHA* separately only where a commit range is needed. Made the branch-name-vs-merge-base distinction explicit, since the old command returned a SHA where a branch name was needed.
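
The priority order above can be sketched as a small pure function. This is an illustrative sketch only: =resolve_base_branch= and its parameters are hypothetical names, and the caller is assumed to have already collected the PR's =baseRefName= and the =git symbolic-ref= output.

```python
from typing import Optional


def resolve_base_branch(pr_base_ref: Optional[str],
                        origin_head: Optional[str]) -> Optional[str]:
    """Pick the base branch name; None means "ask the user".

    pr_base_ref: the open PR's baseRefName, if a PR exists.
    origin_head: output of `git symbolic-ref --short refs/remotes/origin/HEAD`,
                 for example "origin/main", if that ref is set.
    """
    if pr_base_ref:
        return pr_base_ref
    if origin_head:
        # Strip the remote name so "origin/main" becomes "main".
        # Only the first "/" is split on, so "origin/release/1.0" keeps its slash.
        return origin_head.split("/", 1)[1] if "/" in origin_head else origin_head
    return None
```

The merge-base SHA is deliberately not part of this function; it would be computed separately, and only where a commit range is needed.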
+
+*** 2026-05-22 Fri @ 14:51:37 -0500 Made finish-branch merge safer + worktree-aware
+
+Added pre-flight checks to Option 1 (Merge Locally): dirty-tree refusal with no auto-stash, protected-branch awareness, upstream-gated =git pull --ff-only=, and merge-commit-vs-rebase as a team-policy choice instead of a hardcoded =--no-ff=. Replaced the fragile =git worktree list | grep <branch>= detection with a =git rev-parse --git-dir= vs =--git-common-dir= comparison plus =git worktree list --porcelain= for the path.
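
A minimal sketch of the =--git-dir= vs =--git-common-dir= comparison, assuming =git= is on PATH. The function names are hypothetical; the comparison relies on git reporting the same directory for both flags in the main checkout and different directories in a linked worktree.

```python
import os
import subprocess


def is_linked_worktree(git_dir: str, common_dir: str) -> bool:
    """True when the two paths differ, i.e. this is a linked worktree."""
    return os.path.realpath(git_dir) != os.path.realpath(common_dir)


def current_checkout_is_linked_worktree() -> bool:
    """Ask git for both directories and compare them (needs git on PATH)."""
    def rev_parse(flag: str) -> str:
        result = subprocess.run(
            ["git", "rev-parse", flag],
            check=True, capture_output=True, text=True,
        )
        return result.stdout.strip()

    return is_linked_worktree(rev_parse("--git-dir"),
                              rev_parse("--git-common-dir"))
```

The worktree's path itself would still come from =git worktree list --porcelain=, which this sketch does not parse.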
+
+*** 2026-05-22 Fri @ 14:51:37 -0500 Added tool-availability + ceremony-scale paths to start-work
+
+Added a "Tool availability" section (graceful degradation when Linear MCP / =gh= / =/voice= / Playwright are missing — do what's available, surface what isn't, don't block) and a "Ceremony scale" section (trivial / small / standard tiers so a two-line fix skips ticket+branch+gates unless asked). The =humanizer= reference in the original item is moot — the file already uses =/voice= throughout.
+
+*** 2026-05-22 Fri @ 14:51:37 -0500 Resolved start-work claim-before-justify rollback risk
+
+Split the claim by tracker type: personal todo.org claims defer to after the Justify gate (a killed task needs no rollback), while team trackers (Linear/GitHub) still claim first to signal intent but record prior state (status, assignee, label) so the Phase 2 rollback restores exactly it. Updated the per-tracker rollback steps and the matching anti-pattern.
+
+*** 2026-05-22 Fri @ 14:28:41 -0500 Verified add-tests typescript-testing.md reference resolves (moot)
+
+Resolved since the audit: =languages/typescript/claude/rules/typescript-testing.md= now exists, and =add-tests/SKILL.md:68= references it by bare filename, the same way it references =python-testing.md= (both get copied into a project's =.claude/rules/=). The "missing file" premise no longer holds. No edit needed.
+
+*** 2026-05-22 Fri @ 14:28:41 -0500 Added a category-exception protocol to add-tests
+
+Added an exception note to step 7 (proposal) in =add-tests/SKILL.md=: pure adapters, generated code, tiny pass-through wrappers, and framework glue may skip a category that would only re-test the framework, but the skip must be stated and justified in the plan and the behavior covered at integration/E2E level — never a silent omission. Step 12 (write) now points back to "honor documented category exceptions."
+
+*** 2026-05-22 Fri @ 14:25:37 -0500 Added environment + recent-change capture to debug Phase 1
+
+Added a fourth Phase-1 step in =debug/SKILL.md=: record versions, feature-flag/config state, dataset/fixture, seed/clock, concurrency, and recent commits/config-infra changes. Noted that intermittent bugs usually live in environment/state transitions (and "what changed recently" is often the fastest route), while a deterministic local bug only needs a one-liner. Updated the phase's closing recap to include the context.
+
+*** 2026-05-22 Fri @ 14:25:37 -0500 Constrained root-cause-trace defense-in-depth to boundaries
+
+Rewrote step b in =root-cause-trace/SKILL.md=: instead of "add a check at each layer that could have caught it," add one only at a layer that owns a boundary or invariant — ingress/trust, persistence, invariant-owning service, final render. Added the explicit rule that a pass-through function owning neither shouldn't get a duplicate null check (validation spam). Recast the three example layers as the boundary types.
+
+*** 2026-05-22 Fri @ 14:25:37 -0500 Required evidence + counterfactual per why in five-whys
+
+Expanded step 2 in =five-whys/SKILL.md=: each link now owes an evidence field (a log/commit/metric/config you can point to) and a counterfactual check (remove this cause — does the symptom above plausibly not happen?). Framed the counterfactual as the main guard against monocausal storytelling, and updated the worked example to show both fields.
+
+*** 2026-05-22 Fri @ 15:51:59 -0500 Added timebox + fresh-sources rules to brainstorm
+
+Phase 1 gained a "Timebox the dialogue" rule (aim for the one-sentence restatement in ~5-8 questions, then move on and park the rest as open questions). Phase 2 gained "Ground high-stakes claims in fresh sources" (check load-bearing claims about markets/regulations/tools/vendors/APIs against a current source; mark unverified ones as assumptions). The design-doc skeleton gained an "## Assumptions" section that distinguishes researched facts (with source) from assumptions (to confirm before building).
+
+*** 2026-05-22 Fri @ 14:59:32 -0500 Made arch-decide examples timeless + required citations
+
+Dated the MongoDB multi-document-transaction example (scoped to 2024-01) with a backing reference, and added a "Cite, don't assert" Do: every concrete technical claim about a tool/version/platform carries a link, doc, version, or "checked YYYY-MM" date, or gets a domain-neutral placeholder — so unsourced "X can't do Y" doesn't rot into stale fact.
+
+*** 2026-05-22 Fri @ 14:59:32 -0500 Standardized arch-decide ADR statuses + immutability rule
+
+Declared a canonical five-status set (Proposed, Accepted, Rejected, Deprecated, Superseded) with an explicit "no synonyms" line, and spelled out the immutability rule in the Don'ts: an accepted ADR's body is frozen, only status/link metadata changes, a changed decision gets a new superseding ADR and the old one stays as the historical record.
+
+*** 2026-05-22 Fri @ 14:59:32 -0500 Added Trust/Data/Compliance phase to arch-design
+
+Added a new Phase 4 (Trust, Data, and Compliance) before the paradigm shortlist: trust boundaries, data classification, abuse/misuse cases, privacy constraints, compliance evidence, and operational ownership — surfaced early so the architecture is drawn around them, not retrofitted by a downstream =security-check=. Threaded into the workflow list, brief template (new §6), review checklist, and anti-patterns.
+
+*** 2026-05-22 Fri @ 14:59:32 -0500 Split paradigms from tactical patterns in arch-design
+
+Split Phase 5's single mixed table into Step 1 (pick one paradigm: monolith/microservices/layered/event-driven/serverless/pipeline/space-based) and Step 2 (compose tactical patterns: DDD, hexagonal, CQRS, event sourcing — several or none, often per-module), with composition examples and an anti-pattern against treating DDD/CQRS as alternatives to a paradigm. Recommendation + brief now name a paradigm plus composed patterns.
+
+*** 2026-05-22 Fri @ 14:59:32 -0500 Expanded arch-document quality scenarios to the Q42 six-part template
+
+Replaced §10's thin "Under [condition]..." template with the arc42/Q42 six-part structure (source, stimulus, environment, artifact, response, response measure), each glossed, with the cart-checkout example rewritten across all six parts. A one-line prose form stays acceptable once all six parts are recoverable.
+
+*** 2026-05-22 Fri @ 14:59:32 -0500 Added staleness/ownership metadata to arch-document output
+
+Added a per-section metadata block (owner, generated-against SHA + date, review cadence, "stale-when" conditions) as an HTML-comment header plus a visible Doc-status note, with field-fill guidance, and a whole-document Doc Status table replacing the README's "Last Updated" stub. Wired into the review checklist and an "Undated docs" anti-pattern.
+
+*** 2026-05-22 Fri @ 14:59:32 -0500 Added confidence levels to arch-evaluate findings
+
+Added a "Confidence and Provenance" subsection: every framework-agnostic finding carries High/Medium/Low + how it was determined, with a required "Not fully checked because..." note when scale, runtime imports, reflection, or dynamic dispatch cap certainty. Updated the example findings and review checklist; a finding with no note now asserts a full read.
+
+*** 2026-05-22 Fri @ 14:59:32 -0500 Made arch-evaluate report skipped tool checks explicitly
+
+Replaced "skip silently" with explicit reporting: for each detected language whose tool isn't configured or can't run, emit an Info "tool not configured / not run" finding (with an example) so the audit shows what was and wasn't verified. A check that didn't run no longer reads as a pass. Updated workflow step 4 and the review checklist.
+
+*** 2026-05-22 Fri @ 14:51:37 -0500 Added notation/output fallback to c4-analyze + c4-diagram
+
+Both commands now treat C4 as notation-independent: a "Choosing a notation" section (draw.io XML, Structurizr DSL, Mermaid with native C4 types, PlantUML/C4-PlantUML) and a headless fallback that emits a text notation (Mermaid or Structurizr DSL) and skips PNG-export/desktop-open when =drawio= or a GUI is absent, rather than failing. draw.io is now one option, not the only one.
+
+*** 2026-05-22 Fri @ 14:51:37 -0500 Clarified C4 abstraction boundaries in c4-analyze + c4-diagram
+
+Added an "Abstraction boundaries" section to both: a Container is a separately deployable/runnable unit (not synonymous with a Docker container — a SPA or managed DB counts), a Component lives inside one Container and isn't separately deployable. Added a 4e "Verify single abstraction level" check that walks every element and relationship to confirm it stays at the diagram's level, notation-independent.
+
+*** 2026-05-22 Fri @ 15:10:35 -0500 Added "When You Cannot Verify" standard to verification.md
+
+Added a section requiring, when a verification command can't run, a four-part report: command attempted, why it couldn't run, risk left unverified, and the smallest next command for the user. States the principle that a check that didn't run is never reported as a pass — "unable to verify" is a required honest outcome, not silence. Placed after Red Flags.
+
+*** 2026-05-22 Fri @ 15:10:35 -0500 Added property-based + mutation testing escalation to testing.md
+
+Added an "Escalation Beyond Category and Pairwise" section: property-based testing for invariants over a broad input domain (round-trips, idempotence, ordering — Hypothesis/fast-check/proptest) and mutation testing for when high line coverage hides thin assertions (mutmut/cosmic-ray/Stryker). Both framed as escalation paths to reach for on a gap, not gates on every unit.
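
To illustrate the round-trip kind of invariant mentioned above, here is a hand-rolled stand-in using only the standard library. It is not Hypothesis and has none of its shrinking or strategy machinery; the helper names and the choice of JSON as the subject are assumptions made for the example.

```python
import json
import random


def random_payload(rng: random.Random) -> dict:
    """Build a small random JSON-safe dict (string keys; int, str, bool values)."""
    def value():
        return rng.choice([
            rng.randint(-1000, 1000),
            rng.choice("abcxyz") * rng.randint(0, 5),
            rng.random() < 0.5,
        ])
    return {f"k{i}": value() for i in range(rng.randint(0, 6))}


def check_round_trip(trials: int = 200, seed: int = 0) -> int:
    """Property: decoding an encoded payload returns the original payload."""
    rng = random.Random(seed)  # fixed seed keeps any failure reproducible
    for _ in range(trials):
        original = random_payload(rng)
        assert json.loads(json.dumps(original)) == original
    return trials
```

A real property-based library would generate a far wider input domain and shrink failing cases; the point here is only the shape of the assertion, which holds for every generated input rather than for a few hand-picked ones.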
+
+*** 2026-05-22 Fri @ 15:10:35 -0500 Added a disciplined spike protocol to testing.md
+
+Formalized the existing "I need to spike first" excuse-table row into a "Spike Exception (Disciplined)" subsection under TDD Discipline: TDD stays the default, but a spike is sanctioned when all three hold — timeboxed, spike code not committed, and the first failing test written before productionizing the discovered approach. Built on the existing row rather than contradicting it.
+
+*** 2026-05-22 Fri @ 15:10:35 -0500 Added pre-dispatch availability + cost checks to subagents.md
+
+Added a "Pre-Dispatch Checks" section with two gates: Availability (no Agent capability → do the work in the main thread under the same scope/constraints/output discipline the contract would enforce) and Cost (when writing the full contract costs more than the task, do it inline). Cross-references the existing "Don't Subagent At All" section and "Subagenting trivial work" anti-pattern rather than duplicating.
+
+*** 2026-05-22 Fri @ 15:06:04 -0500 Revised python-testing SQLite guidance toward production-like DBs
+
+Replaced "prefer in-memory SQLite for speed" with: run ORM/query tests against a production-like DB (same engine as prod, often containerized), since SQLite diverges from Postgres/MySQL on query semantics, constraints, transactions, JSON, time zones, and indexes (a test can pass on SQLite and fail in prod). SQLite stays only for pure unit tests with no DB-semantics dependency.
+
+*** 2026-05-22 Fri @ 15:06:04 -0500 Clarified python-testing ORM-mocking boundary
+
+Changed the "never mock" bullet from "ORM queries" to "ORM internals (querysets, sessions, model internals)" and added a paragraph: domain services use real model methods/validation, but a thin orchestration unit can inject a fake at a deliberate data-access port (a repository/interface the code owns). That's still mocking at a boundary, not at ORM internals.
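
A sketch of what "inject a fake at a deliberate data-access port" could look like. Every name here (=User=, =UserRepository=, =InMemoryUserRepository=, =email_domain_for=) is hypothetical and not taken from =python-testing.md=.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Protocol


@dataclass
class User:
    id: int
    email: str


class UserRepository(Protocol):
    """The port the code owns; production would back it with the real ORM."""
    def get(self, user_id: int) -> Optional[User]: ...


class InMemoryUserRepository:
    """Test fake that satisfies the port without touching ORM internals."""
    def __init__(self, users: Iterable[User]):
        self._users = {u.id: u for u in users}

    def get(self, user_id: int) -> Optional[User]:
        return self._users.get(user_id)


def email_domain_for(repo: UserRepository, user_id: int) -> Optional[str]:
    """Thin orchestration unit: depends on the port, not on querysets or sessions."""
    user = repo.get(user_id)
    return user.email.split("@", 1)[1] if user else None
```

The fake replaces the repository, never a queryset or session, so the boundary being mocked is one the code itself defines.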
+
+*** 2026-05-22 Fri @ 15:06:04 -0500 Made elisp.md editing advice tool-agnostic
+
+Rephrased the "prefer Write over repeated Edits" bullet around intent: land nontrivial Elisp as one cohesive change rather than dribbling it in over tiny partial edits (which accumulate paren mismatches), and run paren-balance + byte-compile checks immediately after, whatever editing mechanism the environment uses.
+
+*** 2026-05-22 Fri @ 15:06:04 -0500 Added batch-mode + native-comp caveats to elisp-testing.md
+
+Added three sections: Batch-Mode Reproducibility (=emacs --batch= as source of truth, no interactive-session state, no blocking prompts, deterministic), Isolating Emacs State (temp =user-emacs-directory=, explicit load-path, declared deps only, with an unwind-protect sandbox example), and Byte-Compile/Native-Comp Warnings (=byte-compile-error-on-warn=, native-comp gated on =native-comp-available-p= and kept opt-in/version-aware).
+
+*** 2026-05-22 Fri @ 15:16:22 -0500 Synced hooks/README install snippets with the destructive hook (opt-in)
+
+Brought the README's manual-install and settings-JSON snippets in line with the canonical =hooks/settings-snippet.json= (which already wires all three) and the Makefile's opt-in design: added the destructive-bash-confirm.py symlink as an opt-in step, added its settings entry, and reworded the note to say all three are no-op-safe but the destructive gate is opt-in (=make install-hooks= excludes it by default — link manually before relying on the snippet entry).
+
+*** 2026-05-22 Fri @ 15:35:06 -0500 Hooks now scan file-backed commit/PR messages
+
+Added =read_referenced_file()= to =_common.py= (safe local read: missing/oversize/non-UTF-8 → None) and wired it in: =git-commit-confirm.py= =extract_commit_message= now handles =-F=/=--file=/=--file===<path>= (reads + scans the file, falls through to UNPARSEABLE → asks if unreadable), and =gh-pr-create-confirm.py= reads =--body-file= content instead of a placeholder. Attribution scanning now sees the real committed/posted text. Built a pytest harness (=hooks/tests/=, importlib-by-path loader for the hyphen-named hooks) and wired =hooks/tests= into =make test=. 54 hook tests pass; full suite green.
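
A minimal sketch of a safe local read with the behavior described above (missing, oversize, or non-UTF-8 input yields =None=). This is not the code in =_common.py=: the function name carries a =_sketch= suffix to mark it as illustrative, and the 64 KiB cap is an assumption because the source does not state the real limit.

```python
from pathlib import Path
from typing import Optional

MAX_BYTES = 64 * 1024  # assumed cap; the real limit is not stated in the source


def read_referenced_file_sketch(path: str,
                                max_bytes: int = MAX_BYTES) -> Optional[str]:
    """Return the file's text, or None if it is missing, oversize, or not UTF-8."""
    p = Path(path)
    try:
        if not p.is_file() or p.stat().st_size > max_bytes:
            return None
        return p.read_bytes().decode("utf-8")
    except (OSError, UnicodeDecodeError):
        # Unreadable or undecodable: the caller falls through to asking.
        return None
```

Returning =None= instead of raising lets the caller treat every failure the same way and fall through to the confirmation prompt.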
+
+*** 2026-05-22 Fri @ 15:35:06 -0500 Rewrote destructive-bash rm parsing on shlex
+
+=detect_rm_rf= now tokenizes with =shlex.split= instead of a whitespace split, so quoted/spaced paths and combined/separate/reordered flags (=-rf=, =-r -f=, =-fr=, =--recursive=/=--force=) all parse. Fails toward asking — returns a sentinel that still fires the modal — on unbalanced quotes or when a forced recursive rm coexists with a compound/pipeline/substitution/redirect construct. Documented the supported/unsupported shell constructs in the docstrings, and extended the dangerous-path banner to =$HOME=-prefixed and wildcard targets. Covered by 25 new tests. (Pre-existing, out-of-scope: path-prefixed =rm= like =/bin/rm= still isn't matched.)
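
The tokenize-then-fail-toward-asking idea can be sketched as follows. This is an illustration, not the hook's actual =detect_rm_rf=: the function name, the =UNPARSEABLE= sentinel, and the =RISKY_MARKERS= list are assumptions, and the check for compound constructs is a deliberately crude substring scan.

```python
import shlex

# Sentinel meaning "could not parse safely; ask the user anyway".
UNPARSEABLE = object()

# Substrings hinting at compound, pipeline, substitution, or redirect constructs.
RISKY_MARKERS = ("|", ";", "&", "$(", "`", ">", "<")


def detect_forced_recursive_rm(command: str):
    """Return True, False, or UNPARSEABLE when the command is ambiguous."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: fail toward asking.
        return UNPARSEABLE
    if not tokens or tokens[0] != "rm":
        return False

    recursive = force = False
    for tok in tokens[1:]:
        if tok == "--":
            break  # everything after "--" is a path, not a flag
        if tok == "--recursive":
            recursive = True
        elif tok == "--force":
            force = True
        elif tok.startswith("-") and not tok.startswith("--"):
            flags = tok[1:]
            recursive = recursive or "r" in flags or "R" in flags
            force = force or "f" in flags

    # Conservative: a marker inside a quoted path also triggers the sentinel.
    if recursive and force and any(m in command for m in RISKY_MARKERS):
        return UNPARSEABLE
    return recursive and force
```

Like the hook as described, this only matches a bare =rm= as the first token, so a path-prefixed =/bin/rm= is not detected.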
+** DONE [#B] Add =make remove= for interactive ruleset removal via fzf
+CLOSED: [2026-05-22 Fri]
+Shipped: =scripts/remove.sh= (three modes: =--list=, =--remove-selected= reading stdin, and the default fzf-multi interactive flow), the =make remove= target, and =scripts/tests/remove.bats= (5 cases). It lists only symlinks that resolve into the repo (foreign links are left alone), removes the picked links while leaving repo sources untouched, reports and continues on a missing target, and is a quiet no-op on an empty selection. shellcheck is clean and =make test= is green. Dropped the stale =bridge= entry per the note below.
+
+Add a Makefile target that lists every currently-installed ruleset entry
+and lets me pick one or more to remove via fzf. Granular alternative to
+=make uninstall= (removes everything) and =make uninstall-hooks= (removes
+only hooks).
+
+*** Why this matters
+
+Tearing down a single skill, rule, hook, or config file currently means
+either running =make uninstall= and re-installing what I want to keep,
+or =rm=ing the symlink directly and remembering the exact path. Both are
+friction. An interactive picker lets me filter, multi-select with Tab,
+and confirm with Enter — the typical fzf flow. Costs about 3-5 seconds
+per teardown instead of 15+ seconds of "what's the exact name?".
+
+*** Design
+
+The recipe builds a tab-separated list of every currently-installed item,
+categorized by type, and pipes it to =fzf --multi=. The user filters,
+marks with Tab, and confirms with Enter. The recipe parses the selections
+and =rm=s the matching symlinks.
+
+#+begin_example
+ skill debug
+ rule commits.md
+ hook destructive-bash-confirm.py
+ config settings.json
+ commands commands
+ bridge claude-rules
+#+end_example
+
+Each line is =<kind>\t<name>=. The recipe maps =<kind>= to the right path:
+
+- =skill= → =$(SKILLS_DIR)/<name>=
+- =rule= → =$(RULES_DIR)/<name>=
+- =hook= → =$(HOOKS_DIR)/<name>=
+- =config= → =$(CLAUDE_DIR)/<name>=
+- =commands= → =$(CLAUDE_DIR)/commands=
+- =bridge= → =$(SKILLS_DIR)/claude-rules=
+
+Source files in =rulesets/= stay untouched. =make install= re-creates the
+removed links if needed (the install loop is idempotent).
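
The kind-to-path mapping in this design can be sketched in a few lines. The directory values below are hypothetical stand-ins for the Makefile variables (=$(SKILLS_DIR)= and friends), whose real values are not shown here, and the stale =bridge= kind is left out.

```python
from pathlib import Path

# Hypothetical stand-ins for the Makefile variables; real values may differ.
CLAUDE_DIR = Path.home() / ".claude"
KIND_TO_DIR = {
    "skill": CLAUDE_DIR / "skills",
    "rule": CLAUDE_DIR / "rules",
    "hook": CLAUDE_DIR / "hooks",
    "config": CLAUDE_DIR,
    "commands": CLAUDE_DIR,
}


def resolve_selection(line: str) -> Path:
    """Map one "<kind>\\t<name>" selection line to the symlink path to remove."""
    kind, name = line.rstrip("\n").split("\t", 1)
    return KIND_TO_DIR[kind.strip()] / name.strip()
```

An unknown kind raises =KeyError=, which suits a removal tool: refusing an unrecognized line is safer than guessing a path.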
+
+*** Edge cases
+
+- Esc instead of Enter → empty selection → clean exit, no removal.
+- Filter to nothing then Enter → same as Esc.
+- Selected item already gone → =rm= fails visibly, processing continues
+ on the rest.
+- =fzf= not installed → fail fast with a clear error (matches the pattern
+ used by =install-lang=).
+
+*** Possible extensions
+
+- Parallel =make pick-install= target that lists not-yet-installed items
+ and installs the chosen ones. Symmetric UX, same fzf flow.
+- Confirmation prompt when more than N items selected (defense against
+ accidental select-all).
+- =--source= flag that also runs =git rm= against the rulesets source for
+ the selected item. Probably a bad idea: too easy to lose work.
+- The =bridge → $(SKILLS_DIR)/claude-rules= entry above is stale — the
+ bridge symlink got removed in a later commit. Drop that bullet when the
+ recipe lands.
+** DONE [#B] Document the =mcp/= install pipeline in =mcp/README.org=
+CLOSED: [2026-05-22 Fri]
+Wrote =mcp/README.org= covering everything in the "what to cover" list: the file layout (tracked vs gitignored), the secrets-bundle shape (plain =${VAR}= secrets + base64-bundled OAuth artifacts, AES256 symmetric =gpg -c=), the install flow (decrypt → materialize keys/token caches at mode 600 → expand → register unregistered, idempotent), the http/sse-vs-stdio transport split, token rotation when a Google refresh token is revoked, and adding a new server. Grounded in a read of the actual =install.py= + =servers.json=.
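
The "expand" step of the install flow can be illustrated with the standard library's =string.Template=, which understands the =${VAR}= form. This is a sketch under assumptions, not a copy of =install.py=: the function name, the server entry, and the variable name in the usage below are all hypothetical.

```python
import json
from string import Template


def expand_placeholders(raw_json: str, secrets: dict) -> dict:
    """Substitute ${VAR} placeholders, then parse the result as JSON.

    Template.substitute raises KeyError for a placeholder with no matching
    secret, so a missing variable fails loudly instead of being registered
    as a literal "${VAR}" string.
    """
    return json.loads(Template(raw_json).substitute(secrets))
```

For example, =expand_placeholders('{"example-server": {"token": "${EXAMPLE_TOKEN}"}}', {"EXAMPLE_TOKEN": "abc123"})= yields a dict whose token field is ="abc123"=. Substituting before parsing is the simplest approach, but it assumes secret values contain no characters that need JSON escaping.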
+
+=mcp/= has =install.py=, =servers.json=, =secrets.env.gpg=, =gcp-oauth.keys.json= (gitignored, regenerated at install). No README. Coming back to this in three months I'll re-discover how the bundle is structured, what =install.py= does, and how to rotate tokens. Saving that re-discovery is the whole point.
+
+*** What to cover
+
+- Layout: what each file is, which are tracked vs gitignored.
+- Secrets bundle shape: how vars are listed in =secrets.env=, the symmetric-encryption pattern (=gpg -c --cipher-algo AES256=), the base64-bundled OAuth artifacts (=GCP_OAUTH_KEYS_JSON_B64=, =GOOGLE_DOCS_PERSONAL_TOKEN_B64=, =GOOGLE_DOCS_WORK_TOKEN_B64=).
+- Install flow: =make install-mcp= → =install.py= decrypts, writes the keys file and Google Docs token caches at mode 600, expands =${VAR}= in =servers.json=, calls =claude mcp add --scope user= for unregistered servers. Idempotent.
+- Token rotation: when a refresh token gets revoked, the recovery flow (re-auth on one machine, re-bundle, recommit).
+- Adding a new server: edit =servers.json=, add any new =${VAR}= placeholders to the bundle, re-encrypt.
+- The OAuth dance for HTTP-transport servers (linear, notion) versus stdio (google-docs-*) — different paths, different gotchas.