| -rw-r--r-- | todo.org | 393 |
1 files changed, 199 insertions, 194 deletions
@@ -5,7 +5,9 @@ Tracking TODOs for the rulesets repo that span more than one commit. Project-scoped (not the global =~/sync/org/roam/inbox.org= list). -* TODO [#A] Build =create-documentation= skill for high-quality project/product docs +* Rulesets Open Work + +** TODO [#A] Build =create-documentation= skill for high-quality project/product docs Create a Claude skill named =create-documentation= that can plan, write, refresh, and review software documentation across README files, project docs, @@ -19,7 +21,7 @@ documentation system around a product or repo: onboarding, tutorials, how-to guides, reference, explanation, operations, troubleshooting, contribution, release/upgrade, and publication format. -** Why this matters +*** Why this matters The repo currently has strong skills for architecture, testing, review, debugging, and workflow. It does not have a general documentation skill that: @@ -33,9 +35,9 @@ debugging, and workflow. It does not have a general documentation skill that: - Treats docs as a maintained product surface with verification, ownership, navigation, accessibility, and freshness checks. -** Research notes +*** Research notes -*** Documentation frameworks and best-practice sources +**** Documentation frameworks and best-practice sources - Diataxis separates documentation by reader need: - Tutorials: learning-oriented, take the reader by the hand. @@ -79,9 +81,9 @@ debugging, and workflow. It does not have a general documentation skill that: and release-note patterns. Do not vendor wholesale; use as prior art. Source: [[https://www.thegooddocsproject.dev/][The Good Docs Project]] -*** Praised project docs to analyze and steal from +**** Praised project docs to analyze and steal from -**** Django +***** Django Why it works: - It labels the doc types directly and explains when to use each. 
@@ -100,7 +102,7 @@ Patterns to use: Source: [[https://docs.djangoproject.com/en/5.2/][Django documentation]] -**** Kubernetes +***** Kubernetes Why it works: - It has a large, complex product but maintains separate lanes for Concepts, @@ -122,7 +124,7 @@ Patterns to use: Sources: [[https://kubernetes.io/docs/home/][Kubernetes docs home]], [[https://kubernetes.io/docs/tasks/][Kubernetes tasks]] -**** Rust +***** Rust Why it works: - Rust has a "bookshelf" rather than one overloaded manual: The Book, Rust by @@ -144,7 +146,7 @@ Patterns to use: Source: [[https://doc.rust-lang.org/][Rust documentation]] -**** Stripe API docs +***** Stripe API docs Why it works: - The API reference is organized around resources and common cross-cutting @@ -169,7 +171,7 @@ Patterns to use: Source: [[https://docs.stripe.com/api][Stripe API Reference]] -**** FastAPI +***** FastAPI Why it works: - Documentation is part of the framework's value proposition: OpenAPI and JSON @@ -189,9 +191,9 @@ Patterns to use: Source: [[https://fastapi.tiangolo.com/features/][FastAPI features]] -** Format and presentation decisions +*** Format and presentation decisions -*** Default source format: Markdown +**** Default source format: Markdown Use =.md= as the default for shared project documentation when: - The repo is on GitHub/GitLab/Forgejo and readers browse docs in the web UI. @@ -208,7 +210,7 @@ MkDocs is a good reference point: Markdown source, YAML config, built-in dev server, static HTML output, and easy hosting. Source: [[https://www.mkdocs.org/][MkDocs]] -*** Use Org when the document is Emacs-native or personal/planning-heavy +**** Use Org when the document is Emacs-native or personal/planning-heavy Use =.org= when: - The user's workflow is explicitly Emacs/org-mode. @@ -226,7 +228,7 @@ format for non-Emacs contributors. 
Sources: [[https://orgmode.org/org.html][Org manual]], [[https://orgmode.org/worg/org-tutorials/org-publish-html-tutorial.html][Org publish HTML tutorial]] -*** Use HTML as generated/published output, rarely as hand-authored source +**** Use HTML as generated/published output, rarely as hand-authored source Use =.html= when: - The deliverable is a published static documentation site. @@ -240,7 +242,7 @@ Prefer generated HTML from Markdown/Org/reStructuredText/AsciiDoc/OpenAPI over hand-authored HTML. Hand-edit HTML only for standalone artifacts, custom landing pages, or cases where the project already treats HTML templates as docs source. -*** Consider generated/spec-backed formats +**** Consider generated/spec-backed formats Use generated reference when possible: - API reference: OpenAPI/Swagger/ReDoc/Scalar from code/spec. @@ -253,7 +255,7 @@ The skill should not duplicate generated reference by hand. It should improve source comments, schema descriptions, examples, front matter, and surrounding guides. -*** Presentation requirements +**** Presentation requirements Every generated doc set should have: - A docs home or README that routes by reader intent. @@ -274,9 +276,9 @@ Every generated doc set should have: - Optional LLM-friendly surfaces for larger doc sets: =llms.txt=, "copy as Markdown" equivalents, concise page summaries, and stable anchors. -** Proposed skill design +*** Proposed skill design -*** Skill name and trigger +**** Skill name and trigger Name: =create-documentation= @@ -296,7 +298,7 @@ Do not trigger for: - inline code comments/docstrings only, unless the user asks to create docs from them. -*** V1 should be one orchestrating skill, not many separate skills +**** V1 should be one orchestrating skill, not many separate skills Build v1 as one skill with explicit phases and subcommands rather than a set of separate skills. 
Rationale: @@ -320,7 +322,7 @@ Support discoverable subcommands inside one skill: The default =/create-documentation <scope>= runs audit -> plan -> write -> review, asking for confirmation before broad rewrites. -*** Future split if v1 gets too large +**** Future split if v1 gets too large If the skill grows past a manageable size, split into a discoverable =documentation-*= chain. Names and order: @@ -341,9 +343,9 @@ Keep =create-documentation= as the orchestrator and user-facing entry point. The chain is discoverable because every helper starts with =documentation-= and the orchestrator prints the next command at each handoff. -** V1 workflow details +*** V1 workflow details -*** Phase 1: Intake and classification +**** Phase 1: Intake and classification Ask only what is missing from local context: - Who is the reader? New user, evaluator, integrator, maintainer, operator, @@ -374,7 +376,7 @@ Classify the work into one or more doc types: - Security/compliance docs. - Examples/cookbook. -*** Phase 2: Audit existing material +**** Phase 2: Audit existing material Inventory: - =README*=, =docs/=, =doc/=, =site/=, =mkdocs.yml=, =docusaurus.config.*=, @@ -394,7 +396,7 @@ Use =rg= first. For API/CLI reference, prefer structured sources: OpenAPI/JSON Schema, package metadata, command =--help= output, docstrings, or language-native documentation tooling. -*** Phase 3: Documentation plan +**** Phase 3: Documentation plan Write a short plan before broad edits: - Audiences and priority order. @@ -409,7 +411,7 @@ Write a short plan before broad edits: Stop for confirmation when the plan moves or rewrites more than one file. -*** Phase 4: Write or update docs +**** Phase 4: Write or update docs Writing rules: - Lead with the reader's goal, not the implementation history. 
@@ -519,7 +521,7 @@ Runbook: ## Post-incident notes #+end_example -*** Phase 5: Presentation and publishing +**** Phase 5: Presentation and publishing If docs are repo-local only: - Ensure links render on GitHub/GitLab. @@ -538,7 +540,7 @@ If docs are web-published: - Check links, nav, search, mobile viewport, and accessibility basics. - Do not commit generated =site/= output unless the project already does. -*** Phase 6: Verification +**** Phase 6: Verification Verification should match doc type: - Commands in quickstarts/how-tos: run them or mark not run with reason. @@ -558,7 +560,7 @@ Final report must say: - What could not be verified. - Known gaps/follow-ups. -** Relationship to existing skills +*** Relationship to existing skills - =arch-document=: use when the requested docs are specifically architecture docs from brief + ADRs + C4/arc42. =create-documentation= may call it, then @@ -576,7 +578,7 @@ Final report must say: - =codify=: use after a documentation session reveals reusable project-specific documentation rules. -** Quality bar and anti-patterns +*** Quality bar and anti-patterns The skill should reject: - A giant README that mixes tutorial, reference, architecture, and operations. @@ -594,7 +596,7 @@ The skill should reject: - Publishing generated HTML as source unless the project explicitly owns HTML docs that way. -** Acceptance criteria for building the skill +*** Acceptance criteria for building the skill - [ ] Directory =create-documentation/= with =SKILL.md=. - [ ] Frontmatter description includes positive and negative triggers. @@ -620,7 +622,7 @@ The skill should reject: references for progressive disclosure. - [ ] Run =./scripts/lint.sh= after adding the skill. 
-** Open design questions before implementation +*** Open design questions before implementation - Should the user-facing command be exactly =/create-documentation= while internal helper names use =documentation-*=, or should all names share the @@ -639,7 +641,7 @@ The skill should reject: public/library/API docs: =llms.txt= or markdown export is valuable, but normal human navigation remains primary. -* TODO [#A] Review pass: tighten skills and rulesets after 2026-05-04 audit +** TODO [#A] Review pass: tighten skills and rulesets after 2026-05-04 audit Source notes used in this pass: - C4 official docs: C4 is notation-independent; System Context and Container @@ -686,69 +688,69 @@ Source notes used in this pass: [[https://www.nist.gov/publications/practical-combinatorial-testing-beyond-pairwise][NIST beyond pairwise]], [[https://www.nist.gov/publications/combinatorial-software-testing][NIST combinatorial testing]] -** Grouped index (for batching by area) +*** Grouped index (for batching by area) Each item below is a one-line summary of a sub-TODO further down. Tick the box when the matching sub-TODO is moved to =DONE=. Items are grouped by area so they can be batched (e.g., "do all Playwright items in one session"). 
-*** Browser testing +**** Browser testing - [ ] [#A] =playwright-js=: locator/assertion-first guidance (replace raw CSS, =networkidle=) - [ ] [#B] =playwright-js= + =playwright-py=: reconcile headless/visible defaults - [ ] [#B] =playwright-js= + =playwright-py=: remove emoji console markers from examples -*** Frontend / UI +**** Frontend / UI - [ ] [#B] =frontend-design=: WCAG 2.2 alignment, accessibility non-optional - [ ] [#B] =frontend-design=: harmonize aesthetic guidance with anti-pattern rules -*** Security +**** Security - [ ] [#A] =security-check=: OWASP 2021 + WSTG coverage - [ ] [#B] =security-check=: tooling and offline/network caveats -*** Combinatorial testing +**** Combinatorial testing - [ ] [#B] =pairwise-tests=: t-way escalation guidance beyond pairwise - [ ] [#B] =pairwise-tests=: clarify negative value syntax + generator availability -*** V2MOM +**** V2MOM - [ ] [#A] =create-v2mom=: rename Metrics → Measures (Salesforce alignment) - [ ] [#B] =create-v2mom=: prevent task migration from turning V2MOM into a backlog - [ ] [#B] =create-v2mom=: mitigation/owner fields for Obstacles -*** Prompt engineering +**** Prompt engineering - [ ] [#A] =prompt-engineering=: correct/narrow Meincke citation - [ ] [#B] =prompt-engineering=: eval-harness requirement for production prompts -*** Codify +**** Codify - [ ] [#B] =codify=: stale-entry review + privacy checks before writing project =CLAUDE.md= -*** Code review +**** Code review - [ ] [#A] =review-code=: resolve local-verification vs CI boundary - [ ] [#B] =review-code=: =CLAUDE.md= citation scope for public artifacts - [ ] [#B] =review-code=: relax three-strengths rule for tiny/failing diffs -*** PR / review responses +**** PR / review responses - [ ] [#A] =respond-to-review=: remove review-process language from commit messages - [ ] [#B] =respond-to-review=: use unresolved threads + resolution state - [ ] [#B] =respond-to-cj-comments=: drop personal absolute paths from public-writing - [ ] [#B] 
=respond-to-cj-comments=: fallback when =humanizer= or =emacsclient= unavailable -*** Branch workflow +**** Branch workflow - [ ] [#A] =finish-branch=: fix base-branch detection - [ ] [#B] =finish-branch=: worktree-aware pull/merge safety - [ ] [#B] =start-work=: tool-availability + ceremony-scaling rules - [ ] [#B] =start-work=: claim-before-justify rollback risk -*** Tests / TDD +**** Tests / TDD - [ ] [#B] =add-tests=: fix missing =typescript-testing.md= reference or add ruleset - [ ] [#B] =add-tests=: explicit exceptions to "all three categories per function" -*** Debugging / RCA +**** Debugging / RCA - [ ] [#B] =debug=: capture environment + recent-change context before hypotheses - [ ] [#B] =root-cause-trace=: constrain defense-in-depth to trust boundaries - [ ] [#B] =five-whys=: require evidence + counterfactual validation per why -*** Brainstorming +**** Brainstorming - [ ] [#B] =brainstorm=: timebox + research/source rules for high-stakes designs -*** Architecture +**** Architecture - [ ] [#B] =arch-decide=: timeless examples, drop unverifiable claims - [ ] [#B] =arch-decide=: standardize statuses + immutability language - [ ] [#B] =arch-design=: threat modeling + privacy/compliance as first-class inputs @@ -758,11 +760,11 @@ Each item below is a one-line summary of a sub-TODO further down. 
Tick the box w - [ ] [#B] =arch-evaluate=: confidence levels for framework-agnostic findings - [ ] [#B] =arch-evaluate=: report skipped tool checks explicitly -*** C4 modeling +**** C4 modeling - [ ] [#A] =c4-analyze= + =c4-diagram=: notation/output fallback (not draw.io-only) - [ ] [#B] =c4-analyze= + =c4-diagram=: clarify abstraction boundaries -*** Global rules +**** Global rules - [ ] [#B] =commits.md=: split DeepSat/Linear/Slack-specific from global rules - [ ] [#A] =commits.md= + publish flows: =humanizer=-unavailable fallback - [ ] [#B] =verification.md=: explicit "unable to verify" reporting standard @@ -770,18 +772,18 @@ Each item below is a one-line summary of a sub-TODO further down. Tick the box w - [ ] [#B] =testing.md=: soften absolute TDD with explicit spike protocol - [ ] [#B] =subagents.md=: capability/availability + cost checks -*** Languages +**** Languages - [ ] [#A] =python-testing.md=: revisit in-memory SQLite guidance - [ ] [#B] =python-testing.md=: separate "never mock ORM" from unit-test boundaries - [ ] [#B] =elisp.md=: drop tool-specific advice - [ ] [#B] =elisp-testing.md=: batch-mode + native-comp caveats -*** Hooks +**** Hooks - [ ] [#A] =hooks/README.md=: include =destructive-bash-confirm.py= in install/settings snippets - [ ] [#A] =hooks/git-commit-confirm.py= + =gh-pr-create-confirm.py=: inspect message/body files referenced by =-F= / =--body-file= - [ ] [#B] =hooks/destructive-bash-confirm.py=: shell-aware command parsing (not regex) -** TODO [#A] =playwright-js=: replace raw CSS/page actions and =networkidle= defaults with locator/assertion-first guidance +*** TODO [#A] =playwright-js=: replace raw CSS/page actions and =networkidle= defaults with locator/assertion-first guidance Current examples lean on =page.click=, =page.fill=, =waitForSelector=, and =waitForLoadState('networkidle')=. Official Playwright guidance prefers @@ -790,20 +792,20 @@ calls =networkidle= discouraged for testing. 
Keep reconnaissance, but revise it to wait for a visible app-specific landmark instead of treating network quiet as readiness. -** TODO [#B] =playwright-js= and =playwright-py=: reconcile headless/visible-browser defaults +*** TODO [#B] =playwright-js= and =playwright-py=: reconcile headless/visible-browser defaults =playwright-js= says visible Chromium by default; =playwright-py= says headless by default. That may be intentional, but the difference should be explicit: interactive visual debugging -> headed, CI/pytest smoke tests -> headless. Add a small decision table so agents don't flip modes by habit. -** TODO [#B] =playwright-js= and =playwright-py=: remove emoji console markers from examples +*** TODO [#B] =playwright-js= and =playwright-py=: remove emoji console markers from examples The broader rules discourage emojis in shared engineering output. The Playwright examples print camera/check/cross emoji. Replace with plain ASCII status prefixes. -** TODO [#B] =frontend-design=: make accessibility non-optional and align with WCAG 2.2 +*** TODO [#B] =frontend-design=: make accessibility non-optional and align with WCAG 2.2 The workflow only loads =references/accessibility.md= for interactive components. Accessibility should be a baseline for all frontend work: keyboard @@ -811,7 +813,7 @@ operation, focus visibility/not-obscured, target size, contrast, reduced motion, labels, and semantic structure. Add WCAG 2.2-oriented gates before handoff. -** TODO [#B] =frontend-design=: harmonize aesthetic guidance with current UI anti-pattern rules +*** TODO [#B] =frontend-design=: harmonize aesthetic guidance with current UI anti-pattern rules The skill encourages gradient meshes, heavy texture, custom cursors, overlap, and maximalist directions. Those can conflict with the repo's newer frontend @@ -820,7 +822,7 @@ single-hue palettes, unreadable layouts, and marketing-style dashboards. 
Add a "creative but bounded" section: domain fit, readability, responsive stability, and no decorative effects that degrade the task workflow. -** TODO [#A] =security-check=: update OWASP coverage to the 2021 categories and WSTG test areas +*** TODO [#A] =security-check=: update OWASP coverage to the 2021 categories and WSTG test areas The current security checklist uses older category names and misses several current Top 10 items: Insecure Design, Software and Data Integrity Failures, @@ -829,7 +831,7 @@ each finding maps to either OWASP Top 10 2021 or a WSTG area, and add explicit checks for authorization object/function-level access, SSRF URL fetches, integrity of update/plugin paths, and security-relevant logging gaps. -** TODO [#B] =security-check=: add practical tooling and offline/network caveats +*** TODO [#B] =security-check=: add practical tooling and offline/network caveats Add optional use of project-configured scanners such as =gitleaks= or =trufflehog= for secrets, =semgrep= for source patterns, =pip-audit= / =npm @@ -837,7 +839,7 @@ audit= / OSV where configured, and lockfile diff review. Note that dependency audits may need network access and should report "not run" clearly rather than silently passing. -** TODO [#B] =pairwise-tests=: add t-way escalation guidance beyond pairwise +*** TODO [#B] =pairwise-tests=: add t-way escalation guidance beyond pairwise Pairwise is a pragmatic default, but NIST's combinatorial testing work covers higher-strength t-way arrays too. Add a rule: start with pairwise for broad @@ -845,7 +847,7 @@ coverage, escalate selected high-risk parameter clusters to 3-way or higher when history, safety, security, or domain reasoning suggests faults require more than two interacting factors. 
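As a rough illustration of the escalation rule above, the following Python sketch counts how many t-way value combinations a covering array must hit at a given strength. The model, its parameter names, and the choice of "high-risk" cluster are hypothetical, and =tway_targets= is not part of PICT or any NIST tool. It only shows why 3-way coverage is usually reserved for selected clusters and not applied to the whole model.

```python
from itertools import combinations, product


def tway_targets(params: dict[str, list[str]], t: int) -> set[tuple[tuple[str, str], ...]]:
    """Return every t-way (parameter, value) combination that a strength-t
    covering array has to include at least once."""
    targets = set()
    # Pick each group of t parameters, then every value assignment within it.
    for names in combinations(sorted(params), t):
        for values in product(*(params[n] for n in names)):
            targets.add(tuple(zip(names, values)))
    return targets


# Hypothetical model: parameters and values are illustrative only.
model = {
    "browser": ["chromium", "firefox", "webkit"],
    "auth": ["password", "sso"],
    "locale": ["en", "de"],
    "network": ["online", "offline"],
}

# Broad coverage: every pair of values across all parameters.
pairwise = tway_targets(model, 2)

# Escalate only the cluster judged high-risk (an assumption for this example).
high_risk = {name: model[name] for name in ("auth", "locale", "network")}
three_way = tway_targets(high_risk, 3)

print(len(pairwise), len(three_way))
```

The sketch enumerates coverage targets only; producing a compact set of test cases that covers them is still the generator's job.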
-** TODO [#B] =pairwise-tests=: clarify negative value syntax and actual generator availability +*** TODO [#B] =pairwise-tests=: clarify negative value syntax and actual generator availability The examples use =~0= style values that are PICT-specific and easy to misread. Add a short "negative testing values are labels, not operators unless @@ -853,26 +855,26 @@ PICT treats them specially" explanation, and make the run path honest: if PICT or =pypict= is unavailable, produce the model and stop instead of implying cases were generated. -** TODO [#A] =create-v2mom=: rename "Metrics" to Salesforce's "Measures" or explicitly justify the deviation +*** TODO [#A] =create-v2mom=: rename "Metrics" to Salesforce's "Measures" or explicitly justify the deviation V2MOM's final M is officially "Measures." The skill uses "Metrics" throughout. Either rename the section and description to "Measures" or add a clear note that this fork intentionally says "Metrics" while preserving the V2MOM concept. -** TODO [#B] =create-v2mom=: prevent task migration from turning V2MOM into a backlog +*** TODO [#B] =create-v2mom=: prevent task migration from turning V2MOM into a backlog Salesforce presents V2MOM as a simple alignment framework. This skill's optional task-migration phase can make the V2MOM the entire todo system. Split strategy from execution: keep the V2MOM concise, and link to method-specific backlogs instead of embedding every task under the strategic document. -** TODO [#B] =create-v2mom=: add mitigation/owner fields for Obstacles +*** TODO [#B] =create-v2mom=: add mitigation/owner fields for Obstacles The current Obstacles phase captures barriers but not consistently how each will be overcome. Add "mitigation, owner, and review cadence" per obstacle so the section becomes operational instead of just candid. 
-** TODO [#A] =prompt-engineering=: correct and narrow the Meincke citation +*** TODO [#A] =prompt-engineering=: correct and narrow the Meincke citation The skill cites "Persuasion and Compliance in Large Language Models" but the paper found in research is "Call Me A Jerk: Persuading AI to Comply with @@ -881,21 +883,21 @@ result: it shows persuasion can raise compliance with objectionable requests, which is a cautionary prompt-safety finding, not broad evidence that persuasion principles improve engineering prompt quality. -** TODO [#B] =prompt-engineering=: add an evaluation harness requirement for production prompts +*** TODO [#B] =prompt-engineering=: add an evaluation harness requirement for production prompts Prompt critique currently ends with a rewrite and checklist. Add a requirement for fragile or reusable prompts: create 3-5 adversarial/edge examples, run the old and new prompt against them, and record the observed behavioral delta. Without examples, prompt quality remains asserted rather than verified. -** TODO [#B] =codify=: add stale-entry review and privacy checks before writing project =CLAUDE.md= +*** TODO [#B] =codify=: add stale-entry review and privacy checks before writing project =CLAUDE.md= The skill has good gates, but it should explicitly scan for stale entries, private context, and team-visible leakage before appending. Add "would this be safe if the project were public?" and "does this belong in private memory instead?" as mandatory checks, not just table background. -** TODO [#A] =review-code=: resolve the local-verification vs CI boundary +*** TODO [#A] =review-code=: resolve the local-verification vs CI boundary =review-code= says "Trust CI for lint, typecheck, test runs; don't re-run them." =verification.md= and =finish-branch= require fresh local evidence @@ -903,7 +905,7 @@ before completion. 
Clarify: code review should not duplicate CI while reading a PR, but pre-commit/pre-push workflows still need local verification or a clear "not run because..." statement. -** TODO [#B] =review-code=: handle public-artifact scope when citing =CLAUDE.md= +*** TODO [#B] =review-code=: handle public-artifact scope when citing =CLAUDE.md= The skill requires auditing and reporting =CLAUDE.md= adherence, while =commits.md= says personal tooling files should not be cited as authority in @@ -911,41 +913,41 @@ public artifacts. Add two output modes: private/internal review may cite =CLAUDE.md= directly; public/team review should translate the rule into the underlying engineering reason without naming personal rulesets. -** TODO [#B] =review-code=: relax mandatory "three strengths" for tiny or failing diffs +*** TODO [#B] =review-code=: relax mandatory "three strengths" for tiny or failing diffs "Three minimum" strengths can force filler on small diffs or bad PRs. Adjust to "up to three specific strengths; say none found when appropriate" so the review stays honest and avoids synthetic praise. -** TODO [#A] =respond-to-review=: remove review-process language from commit messages +*** TODO [#A] =respond-to-review=: remove review-process language from commit messages The skill suggests commits like =fix: Address review — [description]=, which conflicts with =commits.md='s "what changed and why, not the process" rule and also uses a non-ASCII dash. Replace with conventional subjects that name the actual fix, e.g. =fix: validate export filename=. -** TODO [#B] =respond-to-review=: use unresolved review threads and resolution state, not only flat comments +*** TODO [#B] =respond-to-review=: use unresolved review threads and resolution state, not only flat comments Fetching inline and top-level comments via REST misses thread resolution and can re-process already-resolved feedback. 
Add the same thread-level workflow as the GitHub comment-addressing skill: gather unresolved threads, group by requested change, implement, reply, and resolve only after verification. -** TODO [#B] =respond-to-cj-comments=: remove personal absolute path references from public-writing instructions +*** TODO [#B] =respond-to-cj-comments=: remove personal absolute path references from public-writing instructions The skill embeds =/home/cjennings/code/rulesets/claude-rules/commits.md= in the public-writing section. That contradicts the public-artifact scope rule. Refer to "the commit/public-writing rules" internally, and ensure any emitted public text never cites the local path. -** TODO [#B] =respond-to-cj-comments=: add fallback when =humanizer= or =emacsclient= is unavailable +*** TODO [#B] =respond-to-cj-comments=: add fallback when =humanizer= or =emacsclient= is unavailable The workflow requires =/humanizer= and opens long summaries in =emacsclient=. Neither is guaranteed in a fresh environment. Add tool-availability checks and fallbacks: apply the style passes inline if =humanizer= is absent, and write the summary file path without opening an editor if =emacsclient= fails. -** TODO [#A] =finish-branch=: fix base-branch detection +*** TODO [#A] =finish-branch=: fix base-branch detection Phase 2 says "determine base branch" but the command shown returns a merge-base commit SHA, not the branch name to check out, pull, merge into, or pass as PR @@ -953,14 +955,14 @@ base. Replace with explicit branch detection: upstream PR base if present, configured default branch from =origin/HEAD=, or user-selected branch, then compute merge-base separately. -** TODO [#B] =finish-branch=: make pull/merge steps safer and worktree-aware +*** TODO [#B] =finish-branch=: make pull/merge steps safer and worktree-aware Option 1 runs =git pull= and =git merge --no-ff= after checkout. Add checks for dirty worktree, upstream tracking, protected branches, and rebase-vs-merge team policy. 
Worktree detection via grepping branch names is fragile; use =git worktree list --porcelain= or =git rev-parse --git-common-dir= based checks. -** TODO [#B] =start-work=: add tool-availability and ceremony-scaling rules +*** TODO [#B] =start-work=: add tool-availability and ceremony-scaling rules The workflow assumes Linear MCP, GitHub CLI, =humanizer=, Playwright skills, and multi-commit TDD ceremony. Add a first-class "tools unavailable" path and a @@ -968,158 +970,158 @@ ceremony scale: trivial local fixes should not require the full ticket, branch, three approval gates, and commit-per-phase flow unless the user wants that process. -** TODO [#B] =start-work=: resolve the "claim before justify" rollback risk +*** TODO [#B] =start-work=: resolve the "claim before justify" rollback risk The skill marks Linear/GitHub/todo tasks in progress before the Justify gate, then says rolling back is required if justification fails. Consider moving claiming after Gate 1 for personal todo tasks, or make the rollback steps explicit per tracker with stored prior state. -** TODO [#B] =add-tests=: fix missing =typescript-testing.md= reference or add the ruleset +*** TODO [#B] =add-tests=: fix missing =typescript-testing.md= reference or add the ruleset Phase 3 references =typescript-testing.md=, but this repo currently has Python and Elisp testing rules only. Either add the TypeScript ruleset or change the skill to discover project-local JS/TS testing conventions instead of pointing to a missing file. -** TODO [#B] =add-tests=: add explicit exceptions to "all three categories per function" +*** TODO [#B] =add-tests=: add explicit exceptions to "all three categories per function" The Normal/Boundary/Error rule is useful, but some functions are pure adapters, generated code, tiny wrappers, or framework glue. Add an exception protocol: state why a category does not apply, and cover the behavior at the integration or E2E level when unit categories would test framework behavior. 
-** TODO [#B] =debug=: capture environment and recent-change context before hypotheses +*** TODO [#B] =debug=: capture environment and recent-change context before hypotheses The debugging workflow covers reproduction and logs, but should explicitly record environment, versions, feature flags, data set, seed/time, concurrency, and recent commits/config changes. Many intermittent failures are environment or state transitions, not just local code paths. -** TODO [#B] =root-cause-trace=: constrain defense-in-depth to trust boundaries and invariants +*** TODO [#B] =root-cause-trace=: constrain defense-in-depth to trust boundaries and invariants The skill says add defense at each intermediate layer that could have caught the bad value. That risks validation spam. Tighten it: add checks at ingress, trust boundaries, persistence boundaries, and invariant-owning layers; avoid duplicative null checks in every pass-through function. -** TODO [#B] =five-whys=: require evidence and counterfactual validation per why +*** TODO [#B] =five-whys=: require evidence and counterfactual validation per why The skill says "one best-supported answer" but should require an evidence field for each link and a counterfactual check: if this cause were removed, would the next symptom plausibly disappear? This reduces monocausal storytelling. -** TODO [#B] =brainstorm=: add timebox and research/source rules for high-stakes designs +*** TODO [#B] =brainstorm=: add timebox and research/source rules for high-stakes designs The one-question-at-a-time flow can run long. Add a timebox and a rule that claims about markets, regulations, tools, vendors, or current APIs require fresh sources. The design doc should distinguish researched facts from assumptions. 
-** TODO [#B] =arch-decide=: make examples technically timeless and avoid unverifiable claims +*** TODO [#B] =arch-decide=: make examples technically timeless and avoid unverifiable claims The sample ADRs include claims such as MongoDB lacking ACID for multi-document transactions "at decision time." Examples age and can teach stale facts. Replace with either clearly dated examples or domain-neutral placeholders, and require references for real technical claims in generated ADRs. -** TODO [#B] =arch-decide=: standardize statuses and immutability language +*** TODO [#B] =arch-decide=: standardize statuses and immutability language The skill mixes Accepted, Decided, Deprecated, Superseded, Rejected, and "Not Accepted." Pick a canonical status set and state that accepted ADR content is not edited except for status/link metadata; changed decisions get new ADRs that supersede old ones. -** TODO [#B] =arch-design=: add threat modeling and privacy/compliance as first-class design inputs +*** TODO [#B] =arch-design=: add threat modeling and privacy/compliance as first-class design inputs Security appears as one quality attribute, but architecture design should also ask about trust boundaries, data classification, abuse cases, privacy constraints, compliance evidence, and operational ownership. These influence architecture early and should not wait for =security-check=. -** TODO [#B] =arch-design=: separate architecture paradigms from tactical patterns +*** TODO [#B] =arch-design=: separate architecture paradigms from tactical patterns The candidate table mixes paradigms (modular monolith, microservices, event-driven) with tactical or partial patterns (DDD, CQRS, event sourcing). Revise the matrix so candidates can compose patterns rather than treating each as a mutually exclusive architecture choice. 
-** TODO [#B] =arch-document=: strengthen quality scenarios using arc42/Q42 structure +*** TODO [#B] =arch-document=: strengthen quality scenarios using arc42/Q42 structure Section 10 currently says "Under [condition], the system should [response] within [measure]." Expand to a compact quality-scenario template: source, stimulus, environment, artifact, response, response measure. This better matches architecture-quality practice and makes requirements testable. -** TODO [#B] =arch-document=: add staleness and ownership metadata to generated docs +*** TODO [#B] =arch-document=: add staleness and ownership metadata to generated docs arc42 docs are living documents. Add owner, source commit/date, review cadence, and "known stale when..." notes per section or in the README so generated docs do not become authoritative after the code has moved on. -** TODO [#B] =arch-evaluate=: add confidence levels for framework-agnostic findings +*** TODO [#B] =arch-evaluate=: add confidence levels for framework-agnostic findings Claude-read import graphs and public API comparisons can be incomplete in large or dynamic languages. Add confidence/provenance per finding and require "not fully checked because..." when scale or dynamic imports limit certainty. -** TODO [#B] =arch-evaluate=: report skipped tool checks explicitly +*** TODO [#B] =arch-evaluate=: report skipped tool checks explicitly The workflow says skip unconfigured language-specific tools silently, but the review checklist also wants checks run. For audit usefulness, list detected languages and "tool not configured" entries under Info instead of silent skips. -** TODO [#A] =c4-analyze= and =c4-diagram=: add notation/output fallback instead of draw.io-only +*** TODO [#A] =c4-analyze= and =c4-diagram=: add notation/output fallback instead of draw.io-only C4 is notation-independent. These skills hard-require draw.io XML, PNG export, and opening draw.io desktop. 
Add supported outputs (Structurizr DSL, Mermaid, PlantUML, draw.io) and a fallback path when =drawio= or a GUI is unavailable. -** TODO [#B] =c4-analyze= and =c4-diagram=: clarify C4 abstraction boundaries +*** TODO [#B] =c4-analyze= and =c4-diagram=: clarify C4 abstraction boundaries Emphasize that C4 Containers are deployable/runnable units, not necessarily Docker containers, and that Components are not separately deployable. Add a check that every relationship and element stays at one abstraction level. -** TODO [#B] =commits.md=: split DeepSat/Linear/Slack-specific publishing rules from global commit rules +*** TODO [#B] =commits.md=: split DeepSat/Linear/Slack-specific publishing rules from global commit rules The global commit rule file includes Linear status transitions and a hard-coded Slack channel. That is team-specific and may leak or misfire in unrelated projects. Move those steps to a project/team overlay, leaving global rules for author identity, attribution, commit format, review gate, and verification. -** TODO [#A] =commits.md= and publish flows: define fallback when =humanizer= is unavailable +*** TODO [#A] =commits.md= and publish flows: define fallback when =humanizer= is unavailable Several workflows make =humanizer= mandatory, but no =humanizer= skill exists in this repo. Either add the skill, install instructions, or a fallback plain-English pass that satisfies the same checks without an external skill. -** TODO [#B] =verification.md=: add explicit "unable to verify" reporting standard +*** TODO [#B] =verification.md=: add explicit "unable to verify" reporting standard The rule says run tests/lint/typecheck/build before claiming done. Add the required final wording when a command cannot be run: command attempted, reason it could not run, risk left unverified, and the smallest next command for the user to run. 
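A minimal sketch of the "unable to verify" wording the =verification.md= item asks for, assuming the four required parts are carried as plain strings. The class name =UnverifiedCheck=, the function =format_unverified=, and the label text are hypothetical illustrations, not names or wording taken from the rule file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnverifiedCheck:
    """One check that could not be run (hypothetical field names)."""

    command: str       # the command that was attempted
    reason: str        # why it could not run
    risk: str          # what is left unverified as a result
    next_command: str  # smallest next command for the user to run


def format_unverified(check: UnverifiedCheck) -> str:
    """Render the four required parts in a fixed order."""
    return "\n".join(
        [
            "Unable to verify:",
            f"- Command attempted: {check.command}",
            f"- Could not run because: {check.reason}",
            f"- Risk left unverified: {check.risk}",
            f"- Next command to run: {check.next_command}",
        ]
    )


if __name__ == "__main__":
    print(
        format_unverified(
            UnverifiedCheck(
                command="make test",
                reason="make is not installed in this sandbox",
                risk="unit tests were not executed against the change",
                next_command="make test",
            )
        )
    )
```

Keeping the order fixed means a reader can scan several such reports the same way; the exact labels would be settled when the rule is written.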
-** TODO [#B] =testing.md=: add property-based and mutation testing as escalation paths +*** TODO [#B] =testing.md=: add property-based and mutation testing as escalation paths The testing rules cover categories and pairwise matrices. Add guidance for property-based testing when invariants matter across broad input domains, and mutation testing when test quality is suspect despite high coverage. -** TODO [#B] =testing.md=: soften absolute TDD with an explicit spike protocol +*** TODO [#B] =testing.md=: soften absolute TDD with an explicit spike protocol The rule currently treats TDD as non-negotiable. Keep TDD as the default, but define a disciplined spike exception: timebox, do not commit spike code, write the first failing test before productionizing the discovered approach. -** TODO [#B] =subagents.md=: add capability/availability and cost checks +*** TODO [#B] =subagents.md=: add capability/availability and cost checks The rule assumes subagents exist and should handle failures. Add "if the environment lacks subagents, continue locally and preserve the same scope boundaries" plus a cost check for tasks where context handoff exceeds the work. -** TODO [#A] =languages/python/claude/rules/python-testing.md=: revisit in-memory SQLite guidance +*** TODO [#A] =languages/python/claude/rules/python-testing.md=: revisit in-memory SQLite guidance "Prefer in-memory SQLite for speed in unit tests" is risky for Django or SQLAlchemy projects whose production database is PostgreSQL/MySQL; query @@ -1127,33 +1129,33 @@ semantics, constraints, transactions, JSON, time zones, and indexes differ. Recommend production-like DBs for ORM/query behavior and reserve SQLite for pure unit tests that do not depend on database semantics. 
-** TODO [#B] =languages/python/claude/rules/python-testing.md=: separate "never mock ORM" from true unit-test boundaries +*** TODO [#B] =languages/python/claude/rules/python-testing.md=: separate "never mock ORM" from true unit-test boundaries For domain services, real model methods and validation are usually right. For thin orchestration units, a repository/interface fake may be cleaner than hitting a real database. Clarify the boundary: do not mock ORM internals, but do inject fakes at deliberate data-access ports. -** TODO [#B] =languages/elisp/claude/rules/elisp.md=: update editing workflow to avoid tool-specific advice +*** TODO [#B] =languages/elisp/claude/rules/elisp.md=: update editing workflow to avoid tool-specific advice The rule says prefer Write over repeated Edits. That advice is Claude-tooling specific and can conflict with environments that require patch-based edits. Rephrase around the intent: for nontrivial Elisp, make cohesive edits and run paren/byte-compile checks immediately. -** TODO [#B] =languages/elisp/claude/rules/elisp-testing.md=: add batch-mode and native-comp caveats +*** TODO [#B] =languages/elisp/claude/rules/elisp-testing.md=: add batch-mode and native-comp caveats ERT guidance is solid, but add rules for =emacs --batch= reproducibility, isolating =user-emacs-directory= / package state, and optionally catching native-comp or byte-compile warnings depending on the project's Emacs version. -** TODO [#A] =hooks/README.md=: include =destructive-bash-confirm.py= in install/settings snippets +*** TODO [#A] =hooks/README.md=: include =destructive-bash-confirm.py= in install/settings snippets The table documents the destructive-command hook, but the manual install and settings JSON snippets only include the commit and PR hooks. Add the destructive hook to both snippets so documented installation matches the listed hooks. 
-** TODO [#A] =hooks/git-commit-confirm.py= and =hooks/gh-pr-create-confirm.py=: inspect message/body files +*** TODO [#A] =hooks/git-commit-confirm.py= and =hooks/gh-pr-create-confirm.py=: inspect message/body files =commits.md= uses =git commit -F /tmp/commit-*.md= and =gh pr create --body-file ...=. The hooks currently treat file-backed messages as @@ -1161,14 +1163,14 @@ unparseable or just display the file path, so attribution scanning may miss the actual committed/posted text. Read safe local files referenced by =-F=, =--file=, and =--body-file= before deciding whether the command is clean. -** TODO [#B] =hooks/destructive-bash-confirm.py=: replace regex command parsing with shell-aware parsing where possible +*** TODO [#B] =hooks/destructive-bash-confirm.py=: replace regex command parsing with shell-aware parsing where possible The hook's regexes can miss quoted paths, variables, aliases, =env= wrappers, or compound commands, and can misidentify targets. Use =shlex= for simple commands, document unsupported shell constructs, and fail toward asking when a destructive pattern is ambiguous. -* TODO [#B] Build =ov-1= skill for DoDAF OV-1 (High-Level Operational Concept Graphic) +** TODO [#B] Build =ov-1= skill for DoDAF OV-1 (High-Level Operational Concept Graphic) Triggered by SOFWeek (May 2026, Tampa) — DeepSat attending; DoD attendees may ask for architecture diagrams. OV-1 is the universal informal @@ -1178,7 +1180,7 @@ Priority upgrades to =[#A]= if Craig confirms scenario 2 below (personal load-bearing need at the event); stays =[#B]= or drops to =[#C]= if scenario 1 (team already covers it, future asset only). -** Prior art (searched 2026-04-19) +*** Prior art (searched 2026-04-19) No existing Claude Code skill exists for DoDAF / OV-1 / SV-1 / SysML. @@ -1198,7 +1200,7 @@ Nearest prior art to lean on when building: - PlantUML for SV-1 (when that skill comes later); Mermaid or draw.io XML for OV-1 lightweight visuals. 
-** Build scope (when triggered) +*** Build scope (when triggered) *In scope:* - Input: prose description of a system + its operational context. @@ -1219,7 +1221,7 @@ Nearest prior art to lean on when building: Estimate: 4-6 hours. -** Craig's investigation before kickoff +*** Craig's investigation before kickoff 1. Does DeepSat's systems-engineering or marketing team already have an OV-1 (or the equivalent briefing artifact) for SOFWeek? @@ -1235,14 +1237,14 @@ Estimate: 4-6 hours. deliverables (PowerPoint slide? Cameo? Visio? affects whether the skill emits draw.io XML vs Mermaid vs pure structured spec). -** Related +*** Related See also the DoD-specific notations section under the later TODO (=c4-*= rename revisit) — OV-1 is flagged there as the highest-value starting point across the DoD notation landscape (SysML, DoDAF/UAF, IDEF1X). This entry is the execution plan for that starting point. -* TODO [#A] Build =/update-skills= skill for keeping forks in sync with upstream +** TODO [#A] Build =/update-skills= skill for keeping forks in sync with upstream The rulesets repo has a growing set of forks (=arch-decide= from wshobson/agents, =playwright-js= from lackeyjb/playwright-skill, =playwright-py= @@ -1251,7 +1253,7 @@ new templates, or scope expansions that we'd want to pull in without losing our local modifications. A skill should handle this deliberately rather than by manual re-cloning. -** Design decisions (agreed) +*** Design decisions (agreed) - *Upstream tracking:* per-fork manifest =.skill-upstream= (YAML or JSON): - =url= (GitHub URL) @@ -1269,7 +1271,7 @@ by manual re-cloning. or corrupt (can't run 3-way merge): write =.local=, =.upstream=, =.baseline= files side-by-side and surface as manual review. -** V1 Scope +*** V1 Scope - [ ] Skill at =~/code/rulesets/update-skills/= - [ ] Discovery: scan sibling skill dirs for =.skill-upstream= manifests @@ -1289,7 +1291,7 @@ by manual re-cloning. 
- [ ] On successful sync: update =last_synced_commit= in the manifest - [ ] =--dry-run= to preview without writing -** V2+ (deferred) +*** V2+ (deferred) - [ ] Track upstream *releases* (tags) not just branches, so skill can propose "upgrade from v1.2 to v1.3" with release notes pulled in @@ -1304,13 +1306,13 @@ by manual re-cloning. =M-x smerge-ediff= as an alternate path for users who prefer ediff over per-hunk prompts -** Initial forks to enumerate (for manifest bootstrap) +*** Initial forks to enumerate (for manifest bootstrap) - [ ] =arch-decide= → =wshobson/agents= :: =plugins/documentation-generation/skills/architecture-decision-records= :: MIT - [ ] =playwright-js= → =lackeyjb/playwright-skill= :: =skills/playwright-skill= :: MIT - [ ] =playwright-py= → =anthropics/skills= :: =skills/webapp-testing= :: Apache-2.0 -** Open questions +*** Open questions - [ ] What happens when upstream *renames* a file we fork? Skill would see "file gone from upstream, still present locally" — drop, keep, or prompt? @@ -1319,7 +1321,7 @@ by manual re-cloning. - [ ] Rate-limit / offline mode: if GitHub is unreachable, should skill fail or degrade gracefully? Likely degrade; print warning per fork. -* TODO [#B] Build /research-writer — clean-room synthesis for research-backed long-form +** TODO [#B] Build /research-writer — clean-room synthesis for research-backed long-form SCHEDULED: <2026-05-15 Fri> Gap in current rulesets: between =brainstorm= (idea refinement → design doc) @@ -1374,7 +1376,7 @@ Triggers that would prompt "let's build it now": Upstream reference (do not vendor): ComposioHQ/awesome-claude-skills =content-research-writer/SKILL.md=. -* TODO [#C] Try Skill Seekers on a real DeepSat docs-briefing need +** TODO [#C] Try Skill Seekers on a real DeepSat docs-briefing need SCHEDULED: <2026-05-15 Fri> =Skill Seekers= ([[https://github.com/yusufkaraaslan/Skill_Seekers]]) is a Python @@ -1419,7 +1421,7 @@ discard and stick with hand briefing. 
- Companion =skill-seekers-configs= community repo has only 8 stars despite main's 12.9k — ecosystem thinner than headline adoption -* TODO [#C] Revisit =c4-*= rename if a second notation skill ships +** TODO [#C] Revisit =c4-*= rename if a second notation skill ships Current naming keeps =c4-analyze= and =c4-diagram= as-is (framework prefix encodes the notation; "C4" is a discoverable brand). Suite membership is @@ -1461,13 +1463,13 @@ Each answers a different question: Deferred pending an actual need that's blocked on not having one of these. -** DoD-specific notations (DeepSat context) +*** DoD-specific notations (DeepSat context) Defense-contractor work uses a narrower, different notation set than commercial software. Document the trigger conditions and starting point so a future decision to build doesn't have to re-derive the landscape. -*** SysML (Systems Modeling Language) +**** SysML (Systems Modeling Language) UML 2 profile, dominant in DoD systems engineering. Six diagrams account for ~all practical use: @@ -1491,7 +1493,7 @@ and Enterprise Architect. Text-based option: PlantUML + =plantuml-sysml= =sysml-sequence=. Three or more in this cluster triggers the =arch-*-<notation>= rename discussion from the parent entry. -*** DoDAF / UAF (architecture frameworks) +**** DoDAF / UAF (architecture frameworks) Not notations themselves — frameworks that specify *which* viewpoints a program must deliver. Viewpoints are rendered using UML/SysML diagrams. @@ -1525,7 +1527,7 @@ reviewer pushback if delivering C4-shaped artifacts to a DoD audience. *Candidate skills*: =dodaf-ov1=, =dodaf-sv1= first (highest-value); =uaf-viewpoint= if newer contracts require UAF. -*** IDEF1X (data modeling) +**** IDEF1X (data modeling) FIPS 184 — federal standard for data modeling. Used in classified DoD data systems, intelligence databases, and anywhere the government @@ -1538,7 +1540,7 @@ contractor work → Crow's Foot unless the contract specifies otherwise. 
*Candidate skills*: =idef1x-diagram= / =idef1x-analyze= (parallel to a future =erd-diagram= / =erd-analyze= pair). -*** Tooling baseline +**** Tooling baseline - *Cameo Systems Modeler / MagicDraw* (Dassault) — commercial SysML dominant in DoD programs. @@ -1549,7 +1551,7 @@ future =erd-diagram= / =erd-analyze= pair). - *PlantUML + plantuml-sysml* — text-based, version-controllable. Fits a git-centric workflow better than any GUI tool. -*** Highest-value starting point +**** Highest-value starting point If DeepSat contracts regularly require architecture deliverables, the highest-ROI first skill is =dodaf-ov1= (or whatever naming convention @@ -1562,14 +1564,14 @@ having a skill to generate or check OV-1-shaped artifacts. Don't build speculatively — defense-specific notations are narrow enough that each skill should be driven by a concrete contract need, not aspiration. -* TODO [#B] Add =make remove= for interactive ruleset removal via fzf +** TODO [#B] Add =make remove= for interactive ruleset removal via fzf Add a Makefile target that lists every currently-installed ruleset entry and lets me pick one or more to remove via fzf. Granular alternative to =make uninstall= (removes everything) and =make uninstall-hooks= (removes only hooks). -** Why this matters +*** Why this matters Tearing down a single skill, rule, hook, or config file currently means either running =make uninstall= and re-installing what I want to keep, @@ -1578,7 +1580,7 @@ friction. An interactive picker lets me filter, multi-select with Tab, and confirm with Enter — the typical fzf flow. Costs about 3-5 seconds per teardown instead of 15+ seconds of "what's the exact name?". -** Design +*** Design The recipe builds a tab-separated list of every currently-installed item, categorized by type, and pipes it to =fzf --multi=. The user filters, @@ -1606,7 +1608,7 @@ Each line is =<kind>\t<name>=. The recipe maps =<kind>= to the right path: Source files in =rulesets/= stay untouched. 
=make install= re-creates the removed links if needed (the install loop is idempotent). -** Edge cases +*** Edge cases - Esc instead of Enter → empty selection → clean exit, no removal. - Filter to nothing then Enter → same as Esc. @@ -1615,7 +1617,7 @@ removed links if needed (the install loop is idempotent). - =fzf= not installed → fail fast with a clear error (matches the pattern used by =install-lang=). -** Possible extensions +*** Possible extensions - Parallel =make pick-install= target that lists not-yet-installed items and installs the chosen ones. Symmetric UX, same fzf flow. @@ -1627,34 +1629,11 @@ removed links if needed (the install loop is idempotent). bridge symlink got removed in a later commit. Drop that bullet when the recipe lands. -* DONE [#A] Add =make doctor= — verify ~/.claude/ matches repo + settings.json :feature: - -A drift detector that scans =~/.claude/= and reports anything inconsistent with what the repo expects. Single-command answer to "is my machine consistent with rulesets?" - -** Why this matters - -A 2026-05-06 sweep found =~/.claude/hooks/= didn't exist on this machine even though =settings.json= referenced =~/.claude/hooks/precompact-priorities.sh= as a PreCompact hook. Compaction would have silently failed to invoke the hook. The fix was =make install-hooks=, but the breakage was invisible until I happened to grep for it. =make doctor= run regularly (or even as part of session start) would catch this kind of drift in seconds instead of after the fact. - -** Checks - -- Every entry in =settings.json= ="hooks"= block points at a file that exists. -- Every entry in =enabledPlugins= has a matching install under =~/.claude/plugins/data/=. -- Every skill in =$(SKILLS)= has a working symlink at =~/.claude/skills/<name>=. -- Every rule in =$(RULES)= has a working symlink at =~/.claude/rules/<name>=. -- Every default hook has a symlink at =~/.claude/hooks/<name>= (warn-only — opt-out is legitimate). 
-- =settings.json= and =.mcp.json= symlinks resolve to the rulesets versions. -- =mcp/install.py= state matches =claude mcp list= (every server in =servers.json= is registered). -- No dangling symlinks anywhere under =~/.claude/=. - -** Output - -One line per check: =ok= / =WARN= / =FAIL=. Final summary: =N ok, M warnings, K failures=. Exit non-zero on any failure so it can ride a pre-flight check. - -* TODO [#B] Document the =mcp/= install pipeline in =mcp/README.org= +** TODO [#B] Document the =mcp/= install pipeline in =mcp/README.org= =mcp/= has =install.py=, =servers.json=, =secrets.env.gpg=, =gcp-oauth.keys.json= (gitignored, regenerated at install). No README. Coming back to this in three months I'll re-discover how the bundle is structured, what =install.py= does, and how to rotate tokens. Saving that re-discovery is the whole point. -** What to cover +*** What to cover - Layout: what each file is, which are tracked vs gitignored. - Secrets bundle shape: how vars are listed in =secrets.env=, the symmetric-encryption pattern (=gpg -c --cipher-algo AES256=), the base64-bundled OAuth artifacts (=GCP_OAUTH_KEYS_JSON_B64=, =GOOGLE_DOCS_PERSONAL_TOKEN_B64=, =GOOGLE_DOCS_WORK_TOKEN_B64=). @@ -1663,15 +1642,15 @@ One line per check: =ok= / =WARN= / =FAIL=. Final summary: =N ok, M warnings, K - Adding a new server: edit =servers.json=, add any new =${VAR}= placeholders to the bundle, re-encrypt. - The OAuth dance for HTTP-transport servers (linear, notion) versus stdio (google-docs-*) — different paths, different gotchas. -* TODO [#C] Add =make uninstall-mcp= + =mcp/install.py --check= for symmetry +** TODO [#C] Add =make uninstall-mcp= + =mcp/install.py --check= for symmetry Currently the MCP install pipeline only flows one direction. No way to remove rulesets-managed MCP servers in one command. No way to ask "what's the drift between =servers.json= and =claude mcp list=" without eyeballing. 
-** =make uninstall-mcp= +*** =make uninstall-mcp= Iterate over =servers.json=, run =claude mcp remove <name> -s user= for each. Ignore "not registered" errors. Idempotent. -** =mcp/install.py --check= +*** =mcp/install.py --check= Dry-run mode. Decrypt secrets, but instead of registering, print the drift report: @@ -1681,15 +1660,15 @@ Dry-run mode. Decrypt secrets, but instead of registering, print the drift repor Useful for diagnosing connection failures and for the eventual =make doctor= integration. -* TODO [#C] Update =README.org= with MCP install pipeline section +** TODO [#C] Update =README.org= with MCP install pipeline section =README.org= covers global install, per-project language bundles, and design principles, but doesn't mention =make install-mcp= or the =mcp/= directory. Add a short section after "Per-project language bundles" describing the user-scope MCP install pattern (decrypt → expand → register) and pointing at the eventual =mcp/README.org=. -* TODO [#C] Token-rotation helper for =@a-bonus/google-docs-mcp= OAuth refresh +** TODO [#C] Token-rotation helper for =@a-bonus/google-docs-mcp= OAuth refresh When a Google refresh token gets revoked (re-grant scopes, removed Connected App, account password reset), recovery is currently manual: run =npx -y @a-bonus/google-docs-mcp= with the right env, follow the URL in a browser, kill the process, base64-encode the new =token.json=, decrypt =secrets.env.gpg=, replace the var, re-encrypt. A small =mcp/refresh-google-docs-token.sh <profile>= would chain that into one command. -** Sketch +*** Sketch #+begin_src bash # usage: mcp/refresh-google-docs-token.sh personal @@ -1707,17 +1686,61 @@ rm /tmp/secrets.env.tmp The flow tonight worked but took a handful of manual steps. One script collapses it. 
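For the =make uninstall-mcp= item above, a rough sketch of the loop it describes. It assumes, without confirmation from the repo, that =servers.json= is a JSON object keyed by server name. The helper names are hypothetical. The only CLI call used is the =claude mcp remove <name> -s user= form quoted in the entry.

```python
import json
import subprocess


def server_names(servers_json_text: str) -> list[str]:
    """Return server names from servers.json content.

    ASSUMPTION: the file is a JSON object whose keys are server names.
    """
    data = json.loads(servers_json_text)
    return sorted(data)


def remove_commands(names: list[str]) -> list[list[str]]:
    """Build one `claude mcp remove <name> -s user` argv per server."""
    return [["claude", "mcp", "remove", name, "-s", "user"] for name in names]


def uninstall_all(names: list[str]) -> None:
    """Run each removal command, ignoring non-zero exits.

    check=False ignores every failure, which is broader than ignoring only
    "not registered" errors; it keeps repeated runs from aborting.
    """
    for cmd in remove_commands(names):
        subprocess.run(cmd, check=False)
```

Splitting command construction from execution lets the argv list be inspected or printed without touching the registered servers.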
-* DONE [#A] Build =voice= skill — combine =humanizer= with universal + personal style passes :feature: +** TODO [#C] Decide on category-3 rule copies in the deepsat tree + +While symlinking personal-project =.claude/rules/= mirrors to the rulesets canonical on 2026-05-07, two locations didn't fit the "personal mirror → symlink" pattern and were left untouched pending judgment: + +- =~/projects/work/deepsat/code/coding-rulesets/claude-rules/{testing,verification}.md= — looks like a vendored team-shared copy. +- =~/projects/work/deepsat/code/orchestration_dashboard_mvp/.claude/rules/{testing,verification}.md= — could be project-specific overrides. + +For each: read the file, diff against the rulesets canonical, decide whether it's an intentional diverge (leave alone), stale (sync content), or should canonicalize (replace with symlink and accept the cross-repo dependency). The orchestration_dashboard_mvp pair is the project where Vrezh's PR review surfaced this whole thread, so any decision there has team-visibility implications. + +** TODO [#C] Audit language-specific rule files for cross-project duplication + +The four canonical rules (=commits=, =testing=, =verification=, =subagents=) are now symlinked across the five personal-project mirrors as of 2026-05-07. But several language-specific rule files exist in multiple project mirrors and may be duplicated or drifted: + +- =python-testing.md= in =~/projects/work/.claude/rules/= +- =typescript-testing.md= in =~/projects/work/deepsat/code/.claude/rules/= +- =elisp-testing.md= and =elisp.md= in =~/.emacs.d/=, =~/code/gloss/=, =~/code/chime/= + +The Elisp pair is the most suspicious — three repos using essentially the same rules. Audit: diff these across the projects, check for drift, then decide whether to canonicalize them under =~/code/rulesets/claude-rules/languages/<lang>/= and symlink, or leave them as project-local. 
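One way the drift audit above could start is by grouping copies of the same rule file by content hash. This is a sketch under assumptions: the function names are hypothetical, and the caller supplies the full paths to each copy, since the entry lists only the containing directories.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_content(paths):
    """Map SHA-256 digest -> list of paths whose bytes are identical."""
    groups = defaultdict(list)
    for path in paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        groups[digest].append(str(path))
    return dict(groups)


def has_drifted(paths):
    """True when the copies do not all share the same content."""
    return len(group_by_content(paths)) > 1
```

A single group means the copies are byte-identical and are candidates for a symlink; more than one group means a manual diff is still needed to judge which variant is intended.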
+ + +* Rulesets Resolved +** DONE [#A] Add =make doctor= — verify ~/.claude/ matches repo + settings.json :feature: + +A drift detector that scans =~/.claude/= and reports anything inconsistent with what the repo expects. Single-command answer to "is my machine consistent with rulesets?" + +*** Why this matters + +A 2026-05-06 sweep found =~/.claude/hooks/= didn't exist on this machine even though =settings.json= referenced =~/.claude/hooks/precompact-priorities.sh= as a PreCompact hook. Compaction would have silently failed to invoke the hook. The fix was =make install-hooks=, but the breakage was invisible until I happened to grep for it. =make doctor= run regularly (or even as part of session start) would catch this kind of drift in seconds instead of after the fact. + +*** Checks + +- Every entry in =settings.json= ="hooks"= block points at a file that exists. +- Every entry in =enabledPlugins= has a matching install under =~/.claude/plugins/data/=. +- Every skill in =$(SKILLS)= has a working symlink at =~/.claude/skills/<name>=. +- Every rule in =$(RULES)= has a working symlink at =~/.claude/rules/<name>=. +- Every default hook has a symlink at =~/.claude/hooks/<name>= (warn-only — opt-out is legitimate). +- =settings.json= and =.mcp.json= symlinks resolve to the rulesets versions. +- =mcp/install.py= state matches =claude mcp list= (every server in =servers.json= is registered). +- No dangling symlinks anywhere under =~/.claude/=. + +*** Output + +One line per check: =ok= / =WARN= / =FAIL=. Final summary: =N ok, M warnings, K failures=. Exit non-zero on any failure so it can ride a pre-flight check. + +** DONE [#A] Build =voice= skill — combine =humanizer= with universal + personal style passes :feature: Combine =humanizer= with universal good-writing passes (Strunk & White, Orwell, Plain English) and the personal-style passes from =commits.md=. 
Two modes — =general= for arbitrary writing, =personal= for commits/PRs/comments — share a foundation and diverge on register. Built and shipped 2026-05-07: =voice/SKILL.md= with 39 numbered patterns walked sequentially. Patterns 1-25 carried over from humanizer, 26-31 are universal good-writing additions, 32-39 are personal-only. Migrated three callers (=commits.md=, =respond-to-cj-comments.md=, =start-work.md=). Removed the standalone =humanizer= skill since voice supersedes it. -** Why this matters +*** Why this matters Three transformations want to run together for personal-mode artifacts (commits, PR titles + bodies, PR comments) but lived in three places: =humanizer= as a skill, S&W-style universal rules nowhere (applied ad-hoc), and the personal-style passes as prose steps in =commits.md= that got re-applied by hand each time. Costs: (1) the "I forgot pass (e)" failure mode — skipping a pass without flagging is a defect but happens in practice. (2) No single-call invocation of the full transform. (3) General-mode writing (research notes, philosophy, history) got only humanizer with no universal-prose pass at all. Combining brings them under one skill with one invocation. -** Design +*** Design Two modes: @@ -1731,7 +1754,7 @@ Two modes: The 8 personal-only passes are explicitly *not* in general mode. They conflict with academic / literary / philosophical register. Forcing first-person on a Foucault essay or stripping felt-experience from a journal entry would damage the writing. -** Tier 1 universals (v1) +*** Tier 1 universals (v1) From Strunk & White, Orwell's "Politics and the English Language", Plain English Campaign, and Garner's Modern English Usage. Each is a detection-pattern + rewrite-rule pair, mechanical enough to apply consistently across runs. 
@@ -1741,7 +1764,7 @@ From Strunk & White, Orwell's "Politics and the English Language", Plain English - *Comma splices* — detect independent clauses joined only by comma; rewrite to period or semicolon-then-period. - *Cliché flag* — small curated list (=at the end of the day=, =moving forward=, =going forward=, =at this juncture=, =circle back=, =low-hanging fruit=, =deep dive=, =leverage= as verb). -** Tier 2 universals (v2) +*** Tier 2 universals (v2) - *Positive over negative form* (S&W) — =not unlike= → =like=, =do not fail to= → =remember to=, =did not pay any attention= → =ignored= - *Garner-style word-pair corrections* — comprise/compose, less/fewer, that/which (restrictive vs nonrestrictive), affect/effect, principal/principle @@ -1749,14 +1772,14 @@ From Strunk & White, Orwell's "Politics and the English Language", Plain English - *Tense consistency* — flag mid-paragraph tense shifts - *Acronym definition on first use* — detect uppercase tokens used before being expanded -** Tier 3 (v3, may not land) +*** Tier 3 (v3, may not land) - *Concrete-over-abstract* preference - *Emphatic word at sentence end* (S&W rule 18) - *Vary sentence length / rhythm* - *Reading-grade-level scoring* (Hemingway-style) -** Personal-style pass placement +*** Personal-style pass placement | # | Pass | Mode | Why | |---|------|------|-----| @@ -1771,11 +1794,11 @@ From Strunk & White, Orwell's "Politics and the English Language", Plain English | 9 | Terse cut (rhetorical padding: "worth noting", "it's important to understand") | personal only | Tier 1 omit-needless-words covers the worst offenders universally; aggressive cut conflicts with academic register | | 10 | Public-artifact scope check (local paths, private repos, personal tooling) | personal only — *flag-only*, no auto-rewrite | Operational/safety check, not stylistic; auto-masking risks silently editing meaningful text | -** Inclusive-language pass — explicitly excluded +*** Inclusive-language pass — explicitly excluded 
Considered and rejected. Conflicts with planned writing on philosophy/history topics (Foucault on sexuality and gender, history of slavery in New Orleans). Wordlist substitutions would override deliberate vocabulary choices in those genres. -** V1 scope +*** V1 scope - [ ] Skill at =~/code/rulesets/voice/= with =SKILL.md= - [ ] Frontmatter with positive triggers (commit, PR, comment, "humanize", "voice pass") and negative triggers (code, structured data, plain bullet lists) @@ -1791,22 +1814,22 @@ Considered and rejected. Conflicts with planned writing on philosophy/history to - [X] =make doctor= still passes - [X] =make lint= clean -** v2 (deferred) +*** v2 (deferred) - [ ] Tier 2 universals (positive form, word-pair corrections, parallelism, tense consistency, acronym definition) - [ ] Per-pass severity flags for Tier 1 active-voice (suggestion-only when actor is implicit; auto-rewrite when actor is named) - [ ] Reporting mode: list which passes fired and which were no-ops -** v3 (aspirational, may not land) +*** v3 (aspirational, may not land) - [ ] Tier 3 (concrete-over-abstract, emphatic-word position, sentence-length variation, reading-grade scoring) - [ ] Progressive disclosure split: =voice/SKILL.md= orchestrator + =voice/passes/<pass-name>.md= per pass with worked examples -** Migration (resolved) +*** Migration (resolved) Decision: deleted =humanizer/= entirely. Three callers (=commits.md=, =respond-to-cj-comments.md=, =start-work.md=) all updated to invoke =/voice= directly. No alias needed since nothing outside the repo invoked humanizer. -** Naming alternatives considered +*** Naming alternatives considered - =voice= — chosen. Captures both modes; broad enough. - =polish= — descriptive of multi-pass nature; less prescriptive about whose voice. @@ -1814,7 +1837,7 @@ Decision: deleted =humanizer/= entirely. Three callers (=commits.md=, =respond-t - =commit-voice= — too narrow (passes apply to research notes, emails, etc. in general mode). 
 - =humanize= (extending current) — undersells the universal + personal additions.
 
-** Open questions before implementation
+*** Open questions before implementation
 
 Resolved during implementation:
 - Default mode when =/voice= is invoked bare: =general=. Personal-context callers (=commits.md= publish flow, =respond-to-cj-comments.md=) invoke =/voice personal= explicitly. Avoids accidentally first-person-ifying research notes.
@@ -1822,21 +1845,3 @@ Resolved during implementation:
 - Public-artifact scope check (#39): flag-only, user resolves manually. Blocking would frustrate on legitimate path mentions.
 - Tier 1 active-voice detection: suggestion-only in v1. Auto-rewrite for unambiguous cases deferred to v2.
 
-* TODO [#C] Decide on category-3 rule copies in the deepsat tree
-
-While symlinking personal-project =.claude/rules/= mirrors to the rulesets canonical on 2026-05-07, two locations didn't fit the "personal mirror → symlink" pattern and were left untouched pending judgment:
-
-- =~/projects/work/deepsat/code/coding-rulesets/claude-rules/{testing,verification}.md= — looks like a vendored team-shared copy.
-- =~/projects/work/deepsat/code/orchestration_dashboard_mvp/.claude/rules/{testing,verification}.md= — could be project-specific overrides.
-
-For each: read the file, diff against the rulesets canonical, decide whether it's an intentional diverge (leave alone), stale (sync content), or should canonicalize (replace with symlink and accept the cross-repo dependency). The orchestration_dashboard_mvp pair is the project where Vrezh's PR review surfaced this whole thread, so any decision there has team-visibility implications.
-
-* TODO [#C] Audit language-specific rule files for cross-project duplication
-
-The four canonical rules (=commits=, =testing=, =verification=, =subagents=) are now symlinked across the five personal-project mirrors as of 2026-05-07. But several language-specific rule files exist in multiple project mirrors and may be duplicated or drifted:
-
-- =python-testing.md= in =~/projects/work/.claude/rules/=
-- =typescript-testing.md= in =~/projects/work/deepsat/code/.claude/rules/=
-- =elisp-testing.md= and =elisp.md= in =~/.emacs.d/=, =~/code/gloss/=, =~/code/chime/=
-
-The Elisp pair is the most suspicious — three repos using essentially the same rules. Audit: diff these across the projects, check for drift, then decide whether to canonicalize them under =~/code/rulesets/claude-rules/languages/<lang>/= and symlink, or leave them as project-local.
