-rw-r--r-- .ai/sessions/2026-05-14-21-43-lint-org-build-and-memory-sync-investigation.org | 91
-rw-r--r-- todo.org | 116
2 files changed, 160 insertions, 47 deletions
diff --git a/.ai/sessions/2026-05-14-21-43-lint-org-build-and-memory-sync-investigation.org b/.ai/sessions/2026-05-14-21-43-lint-org-build-and-memory-sync-investigation.org
new file mode 100644
index 0000000..cee2868
--- /dev/null
+++ b/.ai/sessions/2026-05-14-21-43-lint-org-build-and-memory-sync-investigation.org
@@ -0,0 +1,91 @@
+#+TITLE: Session Context — /lint-org skill build
+#+AUTHOR: Craig Jennings & Claude
+#+DATE: 2026-05-14
+
+* Summary
+
+** Active Goal
+
+Two arcs. First (main): build the =/lint-org= command end to end per the spec at =.ai/specs/lint-org-skill-spec.md= — TDD elisp script with mechanical fixers + ERT suite, a =.claude/commands/lint-org.md= orchestrator that walks judgments inline, wrap-up integration so each evening's wrap-it-up runs the mechanical pass on =todo.org=, ship across claude-templates + rulesets in 4 commits. Second (follow-on): run =/respond-to-cj-comments= against the one cj annotation on todo.org line 11 — turned out to be a clarifying note on the memory-sync TODO, investigated the actual storage layout, filed a VERIFY for the proposed stow-based fix.
+
+** Decisions
+
+- Build as a *command* (=.claude/commands/lint-org.md=) not a model-invocable skill. Convention post =aa69245= — user-invoked entry points are commands.
+- Script emits structured stdout (one summary line + plist per issue) so the command layer parses cleanly without re-running org-lint.
+- =--check= for preview, =--followups-file=PATH= for the wrap-up's deferred-judgment routing. The script handles the append itself, so wrap-up doesn't need bash glue.
+- Followups path defaults to =~/projects/work/inbox/lint-followups.org= (where daily-prep merges in) with project-local =.ai/lint-followups.org= fallback. Override via =$LINT_ORG_FOLLOWUPS=.
+- Mechanical fixers process in descending line order so additions/deletions don't perturb earlier line numbers.
+- =misplaced-heading= is conditionally mechanical: =**X.**= at line start → =*X.*= is auto-fixed; =*** Foo= inside =verbatim= markup stays judgment. Classifier checks line N and N-1 since org-lint reports the blank line *after* the offender, not the offender itself.
+- Backup before any write (=/tmp/<basename>.before-lint-pass.<timestamp>=). Skipped in =--check= mode.
+- Canonical-source procedure followed: script + workflow edits in claude-templates first, then rsync to rulesets, then commit in both (saved memory =project_ai_scripts_canonical_source.md=).
+- Lint-org walk on the live todo.org: picked option 5 (skip) for the line-11 =cj:= block warnings since the block is actually a personal-note convention, not source code — the right tool was =/respond-to-cj-comments=, not =/lint-org=.
+
+** Data Collected / Findings
+
+- org-lint result shape on Emacs 30.2: =(id [marker trust msg checker])=. The marker is a propertized string of the line number (text property =org-lint-marker= holds the actual marker); the checker is the =org-lint-checker= struct (=org-lint-checker-name= → symbol). =org-lint--get-line-number= doesn't exist in this version; the name was a wrong guess from training data.
+- org-lint's =misplaced-heading= marker points at the *blank line after* the heading-like text, not the offending line. Both the markdown-bold case and the verbatim-asterisk case behave this way. The classifier has to look at LINE-1 first, then LINE.
+- =timestamp-syntax= warning fires on bare YYYY-MM-DD timestamps without a day-name (=<2026-05-20>= → "Parsed as: <2026-05-20 Wed>"). Not in the mechanical category list; falls through to judgment.
+- =todo.org= currently has 3 lint warnings, all on line 11: =empty-header-argument=, =wrong-header-argument=, =suspicious-language-in-src-block= — all from the same =#+begin_src cj: comment= block. Left in place by user choice (option 5 in the live walk); the cj block was handled separately via =/respond-to-cj-comments=.
+- =~/.claude/projects/-home-cjennings-code-rulesets/memory/= contains four files (=MEMORY.md=, =feedback_never_guess.md=, =project_ai_scripts_canonical_source.md=, =reference_pdftools_venv.md=). Plain dir, no symlinks, no enclosing git checkout. Memory does *not* currently sync across machines. Proposed fix: stow =~/.claude/projects= via =archsetup/dotfiles/common/.claude/projects/=. Filed as a VERIFY for Craig's approval before any moves.
+- Test totals after this session: 240 pytest + 22 lint-org ERT + 23 todo-cleanup ERT = 285 green. Byte-compile clean on the new elisp.
+
+** Files Modified
+
+claude-templates (canonical, both commits pushed to =origin/main=):
+- =138f35f feat(lint-org): add script with mechanical fixers and ERT suite= — =.ai/scripts/lint-org.el= (365 lines) + =.ai/scripts/tests/test-lint-org.el= (465 lines).
+- =4eba98c docs(wrap-it-up): run lint-org on todo.org at wrap-up= — =.ai/workflows/wrap-it-up.org= new =*** Lint org files= subsection + validation-checklist line.
+
+rulesets (both pushed to =origin/main=):
+- =f5b8688 chore(ai): sync lint-org script and wrap-it-up from claude-templates= — byte-identical pull of the script, tests, and wrap-it-up section.
+- =9f62a7c feat(lint-org): add /lint-org command + file design spec= — =.claude/commands/lint-org.md= (162 lines), =.ai/specs/lint-org-skill-spec.md= (moved from =inbox/=), =todo.org= new =[#A] Build /lint-org= entry.
+- =todo.org= (uncommitted before wrap): cj-source-block on line 11 resolved via =/respond-to-cj-comments=. Parent =[#A]= memory-sync TODO flipped to =DOING=, dated work-log subheader added under it with the memory storage investigation findings, =VERIFY= child added asking for approval on the proposed stow-based sync.
+
+** Next Steps
+
+- Carryover: =/update-skills= skill, =create-documentation= skill (large, ~600 lines of research notes already drafted in todo.org), 2026-05-04 audit review pass (12 [#A] + 38 [#B] sub-items), memory-sync setup pending the =VERIFY= answer on the proposed stow approach, =[#B]= fold-claude-templates-into-rulesets, =[#B]= =make audit=.
+- =/lint-org= goes live in the next wrap-up: =.ai/workflows/wrap-it-up.org= Step 3 runs the mechanical pass on =todo.org= and routes judgments to the follow-ups file (this session's wrap is the first one to exercise it).
+- The 3 lint warnings on todo.org line 11 stay until Craig either fixes the =cj:= src-block by hand or registers =cj:= as a known org-babel language in Emacs init. Won't auto-fix.
+
+* Session Log
+
+** Startup + spec triage (13:05 CDT)
+
+Clean startup — no interrupted session, no cross-agent traffic, no reminders. Inbox had one file: =lint-org-skill-spec.md= dropped today at 12:54, a full spec for a =/lint-org= skill that wraps =org-lint=. Spec defines two modes (interactive, mechanical-only), four mechanical-fix categories (item-number, missing-language-in-src-block, misplaced-planning-info, markdown-bold), four judgment categories (link-to-local-file, invalid-fuzzy-link, verbatim-asterisk, suspicious-language), and a wrap-up integration that runs the mechanical pass on todo.org each night.
+
+Filed: moved spec to =.ai/specs/lint-org-skill-spec.md= and added =[#A] Build /lint-org skill + wrap-up integration= to todo.org (just before the =/update-skills= entry). Craig picked "build it now."
+
+** API probe + repo conventions
+
+Probed =org-lint= output format in =emacs --batch -Q=. Each report item is =(id [marker trust msg checker])= — the marker is a propertized string of the line number, the checker is an =org-lint-checker= struct (name is the field that identifies the category). Earlier attempt used =org-lint--get-line-number= which doesn't exist in this Emacs version (30.2); the line number is just the marker string content.
+
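+Current todo.org state: 3 warnings, all on line 11, from a single =cj:= src block (empty header argument "comment", missing colon, unknown source-block language), reported by =empty-header-argument=, =wrong-header-argument=, and =suspicious-language-in-src-block=. None is a mechanical category, so all three go to judgment; only suspicious-language is a named judgment category in the spec.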
+
+Conventions confirmed:
+- Commands live in =.claude/commands/= (rulesets-only). Patterns: =start-work.md=, =review-code.md=, =respond-to-cj-comments.md=, etc.
+- Elisp scripts live in =.ai/scripts/=, canonical in =~/projects/claude-templates/=, rsync'd to rulesets per the canonical-source procedure.
+- ERT tests live in =.ai/scripts/tests/test-<name>.el=. Makefile target =test-scripts= globs =tests/test-*.el= and runs each.
+- =todo-cleanup.el= is the precedent — =lexical-binding=, =defconst= for module constants, =defvar= for state, CLI dispatch behind a =tc--cli-invocation-p= guard so =require= from tests doesn't fire it.
+
+** Build plan (in flight)
+
+Eight tasks tracked. Currently on #1 (creating this file). Next: ERT tests first (TDD), then implementation, then command file, then smoke test, then wrap-up integration, then commit/push in both repos.
+
+Key design calls:
+- It's a command, not a model-invocable skill. Convention post =aa69245=.
+- Script emits structured stdout (s-expressions or JSON-ish lines) so the command layer can walk judgments without re-parsing org-lint output.
+- For mechanical-only mode, the script applies fixes and emits remaining judgments on stdout; the command layer (or wrap-up workflow) routes them to a carry-forward file. Don't bake the carry-forward path into the script — keep routing decoupled.
+- Defer multi-file invocation. Defer the carry-forward-file location decision (spec lean: per-project =.ai/lint-followups.org= or similar; finalize during wrap-up integration).
+
+** Build complete (18:31 CDT)
+
+Tasks 1-7 done. Quick log:
+
+- =.ai/scripts/lint-org.el= shipped — 4 mechanical fixers (item-number, missing-language-in-src-block, misplaced-planning-info, misplaced-heading markdown-bold case), judgment-emission for the rest, =--check= preview mode, =--followups-file=PATH= for daily-prep handoff. Descending line-order processing so fixes don't perturb earlier line numbers.
+- =.ai/scripts/tests/test-lint-org.el= shipped — 22 ERT cases: Normal/Boundary/Error per category, idempotency for each mechanical fixer, =--check= mode, mixed-fixture integration, backup-file creation, followups-file behavior (append on judgments, no-op when none, skipped in check mode).
+- Caught two implementation bugs via the TDD loop: (1) =org-lint= reports =misplaced-heading= at the *blank line after* the offender, not the offender itself — fixer now scans LINE and LINE-1; (2) =lo-fix-misplaced-planning= wasn't positioning point before =re-search-backward= — now goes to the reported line first.
+- =.claude/commands/lint-org.md= shipped — interactive mode walks each judgment with inline numbered options (per =interaction.md=); mechanical-only mode defers via =--followups-file=. Edge cases documented.
+- =.ai/workflows/wrap-it-up.org= updated — new =*** Lint org files= subsection in Step 3 runs the script with =--followups-file=$LINT_ORG_FOLLOWUPS= defaulting to =~/projects/work/inbox/lint-followups.org=, project-local fallback. Validation checklist gained one line.
+- All canonical edits in claude-templates first, then rsync to rulesets. =make test=: 240 pytest + 22 lint-org ERT + 23 todo-cleanup ERT = 285 green.
+- Smoke against live todo.org: =--check= reports 3 judgments on line 11 (the =cj:= src block — empty-header-argument, wrong-header-argument, suspicious-language), file MD5 unchanged. End-to-end smoke on a synthetic fixture: 4 mechanicals applied correctly, 1 judgment emitted, backup landed in =/tmp/=.
+
+Task 8 (commits in both repos) — pausing here to confirm before pushing.
diff --git a/todo.org b/todo.org
index 3aa6514..5ddf40b 100644
--- a/todo.org
+++ b/todo.org
@@ -7,10 +7,32 @@ Project-scoped (not the global =~/sync/org/roam/inbox.org= list).
* Rulesets Open Work
-** TODO [#A] Check that memories are sync'd across machines via git.
-#+begin_src cj: comment
-this means we need to link the memory file in ~/.claude if it's not already
-#+end_src
+** DOING [#A] Check that memories are sync'd across machines via git.
+
+*** 2026-05-14 Thu @ 19:14:11 -0500 Investigate current memory storage
+
+Memory files live at
+[[file:/home/cjennings/.claude/projects/-home-cjennings-code-rulesets/memory/][~/.claude/projects/-home-cjennings-code-rulesets/memory/]]
+— four files including =MEMORY.md= and three individual entries
+(=feedback_never_guess.md=, =project_ai_scripts_canonical_source.md=,
+=reference_pdftools_venv.md=). The directory is a plain unmanaged dir
+(no symlink, no enclosing git checkout). Neither
+[[file:/home/cjennings/.claude/][~/.claude/]] itself nor any subtree
+containing the project-memory dirs is tracked in
+[[file:/home/cjennings/code/archsetup/][archsetup]] or
+[[file:/home/cjennings/code/rulesets/][rulesets]]. Without a symlink
+into a stowed or tracked location, memory files don't survive a new
+machine setup or a dotfiles restore.
+
+Proposed setup: stow =~/.claude/projects= →
+=archsetup/dotfiles/common/.claude/projects/= (path doesn't exist yet
+— it's the target location pending VERIFY).
+Create the destination in archsetup, move existing per-project
+=projects/<encoded-cwd>/memory/= dirs there, run =stow= to link, then
+commit + push archsetup. After that, every machine running =stow=
+picks up the same memory tree.
+
+*** VERIFY Approve stow-based sync of ~/.claude/projects via archsetup/dotfiles/common/
** TODO [#B] Document rulesets + claude-templates pull-before-project ordering in protocols.org
Startup currently pulls claude-templates in Phase A.0 and fast-forwards the
@@ -815,20 +837,20 @@ calls =networkidle= discouraged for testing. Keep reconnaissance, but revise it
to wait for a visible app-specific landmark instead of treating network quiet
as readiness.
-*** TODO [#B] =playwright-js= and =playwright-py=: reconcile headless/visible-browser defaults
+*** TODO [#A] =playwright-js= and =playwright-py=: reconcile headless/visible-browser defaults
=playwright-js= says visible Chromium by default; =playwright-py= says
headless by default. That may be intentional, but the difference should be
explicit: interactive visual debugging -> headed, CI/pytest smoke tests ->
headless. Add a small decision table so agents don't flip modes by habit.
-*** TODO [#B] =playwright-js= and =playwright-py=: remove emoji console markers from examples
+*** TODO [#A] =playwright-js= and =playwright-py=: remove emoji console markers from examples
The broader rules discourage emojis in shared engineering output. The
Playwright examples print camera/check/cross emoji. Replace with plain ASCII
status prefixes.
-*** TODO [#B] =frontend-design=: make accessibility non-optional and align with WCAG 2.2
+*** TODO [#A] =frontend-design=: make accessibility non-optional and align with WCAG 2.2
The workflow only loads =references/accessibility.md= for interactive
components. Accessibility should be a baseline for all frontend work: keyboard
@@ -836,7 +858,7 @@ operation, focus visibility/not-obscured, target size, contrast, reduced
motion, labels, and semantic structure. Add WCAG 2.2-oriented gates before
handoff.
-*** TODO [#B] =frontend-design=: harmonize aesthetic guidance with current UI anti-pattern rules
+*** TODO [#A] =frontend-design=: harmonize aesthetic guidance with current UI anti-pattern rules
The skill encourages gradient meshes, heavy texture, custom cursors, overlap,
and maximalist directions. Those can conflict with the repo's newer frontend
@@ -854,7 +876,7 @@ each finding maps to either OWASP Top 10 2021 or a WSTG area, and add explicit
checks for authorization object/function-level access, SSRF URL fetches,
integrity of update/plugin paths, and security-relevant logging gaps.
-*** TODO [#B] =security-check=: add practical tooling and offline/network caveats
+*** TODO [#A] =security-check=: add practical tooling and offline/network caveats
Add optional use of project-configured scanners such as =gitleaks= or
=trufflehog= for secrets, =semgrep= for source patterns, =pip-audit= / =npm
@@ -862,7 +884,7 @@ audit= / OSV where configured, and lockfile diff review. Note that dependency
audits may need network access and should report "not run" clearly rather than
silently passing.
-*** TODO [#B] =pairwise-tests=: add t-way escalation guidance beyond pairwise
+*** TODO [#A] =pairwise-tests=: add t-way escalation guidance beyond pairwise
Pairwise is a pragmatic default, but NIST's combinatorial testing work covers
higher-strength t-way arrays too. Add a rule: start with pairwise for broad
@@ -870,7 +892,7 @@ coverage, escalate selected high-risk parameter clusters to 3-way or higher
when history, safety, security, or domain reasoning suggests faults require
more than two interacting factors.
-*** TODO [#B] =pairwise-tests=: clarify negative value syntax and actual generator availability
+*** TODO [#A] =pairwise-tests=: clarify negative value syntax and actual generator availability
The examples use =~0= style values that are PICT-specific and easy to
misread. Add a short "negative testing values are labels, not operators unless
@@ -884,14 +906,14 @@ V2MOM's final M is officially "Measures." The skill uses "Metrics" throughout.
Either rename the section and description to "Measures" or add a clear note
that this fork intentionally says "Metrics" while preserving the V2MOM concept.
-*** TODO [#B] =create-v2mom=: prevent task migration from turning V2MOM into a backlog
+*** TODO [#A] =create-v2mom=: prevent task migration from turning V2MOM into a backlog
Salesforce presents V2MOM as a simple alignment framework. This skill's
optional task-migration phase can make the V2MOM the entire todo system. Split
strategy from execution: keep the V2MOM concise, and link to method-specific
backlogs instead of embedding every task under the strategic document.
-*** TODO [#B] =create-v2mom=: add mitigation/owner fields for Obstacles
+*** TODO [#A] =create-v2mom=: add mitigation/owner fields for Obstacles
The current Obstacles phase captures barriers but not consistently how each
will be overcome. Add "mitigation, owner, and review cadence" per obstacle so
@@ -906,14 +928,14 @@ result: it shows persuasion can raise compliance with objectionable requests,
which is a cautionary prompt-safety finding, not broad evidence that persuasion
principles improve engineering prompt quality.
-*** TODO [#B] =prompt-engineering=: add an evaluation harness requirement for production prompts
+*** TODO [#A] =prompt-engineering=: add an evaluation harness requirement for production prompts
Prompt critique currently ends with a rewrite and checklist. Add a requirement
for fragile or reusable prompts: create 3-5 adversarial/edge examples, run the
old and new prompt against them, and record the observed behavioral delta.
Without examples, prompt quality remains asserted rather than verified.
-*** TODO [#B] =codify=: add stale-entry review and privacy checks before writing project =CLAUDE.md=
+*** TODO [#A] =codify=: add stale-entry review and privacy checks before writing project =CLAUDE.md=
The skill has good gates, but it should explicitly scan for stale entries,
private context, and team-visible leakage before appending. Add "would this be
@@ -928,7 +950,7 @@ before completion. Clarify: code review should not duplicate CI while reading a
PR, but pre-commit/pre-push workflows still need local verification or a clear
"not run because..." statement.
-*** TODO [#B] =review-code=: handle public-artifact scope when citing =CLAUDE.md=
+*** TODO [#A] =review-code=: handle public-artifact scope when citing =CLAUDE.md=
The skill requires auditing and reporting =CLAUDE.md= adherence, while
=commits.md= says personal tooling files should not be cited as authority in
@@ -936,7 +958,7 @@ public artifacts. Add two output modes: private/internal review may cite
=CLAUDE.md= directly; public/team review should translate the rule into the
underlying engineering reason without naming personal rulesets.
-*** TODO [#B] =review-code=: relax mandatory "three strengths" for tiny or failing diffs
+*** TODO [#A] =review-code=: relax mandatory "three strengths" for tiny or failing diffs
"Three minimum" strengths can force filler on small diffs or bad PRs. Adjust to
"up to three specific strengths; say none found when appropriate" so the review
@@ -949,21 +971,21 @@ conflicts with =commits.md='s "what changed and why, not the process" rule and
also uses a non-ASCII dash. Replace with conventional subjects that name the
actual fix, e.g. =fix: validate export filename=.
-*** TODO [#B] =respond-to-review=: use unresolved review threads and resolution state, not only flat comments
+*** TODO [#A] =respond-to-review=: use unresolved review threads and resolution state, not only flat comments
Fetching inline and top-level comments via REST misses thread resolution and
can re-process already-resolved feedback. Add the same thread-level workflow as
the GitHub comment-addressing skill: gather unresolved threads, group by
requested change, implement, reply, and resolve only after verification.
-*** TODO [#B] =respond-to-cj-comments=: remove personal absolute path references from public-writing instructions
+*** TODO [#A] =respond-to-cj-comments=: remove personal absolute path references from public-writing instructions
The skill embeds =/home/cjennings/code/rulesets/claude-rules/commits.md= in
the public-writing section. That contradicts the public-artifact scope rule.
Refer to "the commit/public-writing rules" internally, and ensure any emitted
public text never cites the local path.
-*** TODO [#B] =respond-to-cj-comments=: add fallback when =humanizer= or =emacsclient= is unavailable
+*** TODO [#A] =respond-to-cj-comments=: add fallback when =humanizer= or =emacsclient= is unavailable
The workflow requires =/humanizer= and opens long summaries in =emacsclient=.
Neither is guaranteed in a fresh environment. Add tool-availability checks and
@@ -978,14 +1000,14 @@ base. Replace with explicit branch detection: upstream PR base if present,
configured default branch from =origin/HEAD=, or user-selected branch, then
compute merge-base separately.
-*** TODO [#B] =finish-branch=: make pull/merge steps safer and worktree-aware
+*** TODO [#A] =finish-branch=: make pull/merge steps safer and worktree-aware
Option 1 runs =git pull= and =git merge --no-ff= after checkout. Add checks for
dirty worktree, upstream tracking, protected branches, and rebase-vs-merge team
policy. Worktree detection via grepping branch names is fragile; use =git
worktree list --porcelain= or =git rev-parse --git-common-dir= based checks.
-*** TODO [#B] =start-work=: add tool-availability and ceremony-scaling rules
+*** TODO [#A] =start-work=: add tool-availability and ceremony-scaling rules
The workflow assumes Linear MCP, GitHub CLI, =humanizer=, Playwright skills, and
multi-commit TDD ceremony. Add a first-class "tools unavailable" path and a
@@ -993,102 +1015,102 @@ ceremony scale: trivial local fixes should not require the full ticket,
branch, three approval gates, and commit-per-phase flow unless the user wants
that process.
-*** TODO [#B] =start-work=: resolve the "claim before justify" rollback risk
+*** TODO [#A] =start-work=: resolve the "claim before justify" rollback risk
The skill marks Linear/GitHub/todo tasks in progress before the Justify gate,
then says rolling back is required if justification fails. Consider moving
claiming after Gate 1 for personal todo tasks, or make the rollback steps
explicit per tracker with stored prior state.
-*** TODO [#B] =add-tests=: fix missing =typescript-testing.md= reference or add the ruleset
+*** TODO [#A] =add-tests=: fix missing =typescript-testing.md= reference or add the ruleset
Phase 3 references =typescript-testing.md=, but this repo currently has Python
and Elisp testing rules only. Either add the TypeScript ruleset or change the
skill to discover project-local JS/TS testing conventions instead of pointing
to a missing file.
-*** TODO [#B] =add-tests=: add explicit exceptions to "all three categories per function"
+*** TODO [#A] =add-tests=: add explicit exceptions to "all three categories per function"
The Normal/Boundary/Error rule is useful, but some functions are pure adapters,
generated code, tiny wrappers, or framework glue. Add an exception protocol:
state why a category does not apply, and cover the behavior at the integration
or E2E level when unit categories would test framework behavior.
-*** TODO [#B] =debug=: capture environment and recent-change context before hypotheses
+*** TODO [#A] =debug=: capture environment and recent-change context before hypotheses
The debugging workflow covers reproduction and logs, but should explicitly
record environment, versions, feature flags, data set, seed/time, concurrency,
and recent commits/config changes. Many intermittent failures are environment
or state transitions, not just local code paths.
-*** TODO [#B] =root-cause-trace=: constrain defense-in-depth to trust boundaries and invariants
+*** TODO [#A] =root-cause-trace=: constrain defense-in-depth to trust boundaries and invariants
The skill says add defense at each intermediate layer that could have caught
the bad value. That risks validation spam. Tighten it: add checks at ingress,
trust boundaries, persistence boundaries, and invariant-owning layers; avoid
duplicative null checks in every pass-through function.
-*** TODO [#B] =five-whys=: require evidence and counterfactual validation per why
+*** TODO [#A] =five-whys=: require evidence and counterfactual validation per why
The skill says "one best-supported answer" but should require an evidence
field for each link and a counterfactual check: if this cause were removed,
would the next symptom plausibly disappear? This reduces monocausal storytelling.
-*** TODO [#B] =brainstorm=: add timebox and research/source rules for high-stakes designs
+*** TODO [#A] =brainstorm=: add timebox and research/source rules for high-stakes designs
The one-question-at-a-time flow can run long. Add a timebox and a rule that
claims about markets, regulations, tools, vendors, or current APIs require
fresh sources. The design doc should distinguish researched facts from
assumptions.
-*** TODO [#B] =arch-decide=: make examples technically timeless and avoid unverifiable claims
+*** TODO [#A] =arch-decide=: make examples technically timeless and avoid unverifiable claims
The sample ADRs include claims such as MongoDB lacking ACID for multi-document
transactions "at decision time." Examples age and can teach stale facts. Replace
with either clearly dated examples or domain-neutral placeholders, and require
references for real technical claims in generated ADRs.
-*** TODO [#B] =arch-decide=: standardize statuses and immutability language
+*** TODO [#A] =arch-decide=: standardize statuses and immutability language
The skill mixes Accepted, Decided, Deprecated, Superseded, Rejected, and "Not
Accepted." Pick a canonical status set and state that accepted ADR content is
not edited except for status/link metadata; changed decisions get new ADRs that
supersede old ones.
-*** TODO [#B] =arch-design=: add threat modeling and privacy/compliance as first-class design inputs
+*** TODO [#A] =arch-design=: add threat modeling and privacy/compliance as first-class design inputs
Security appears as one quality attribute, but architecture design should also
ask about trust boundaries, data classification, abuse cases, privacy
constraints, compliance evidence, and operational ownership. These influence
architecture early and should not wait for =security-check=.
-*** TODO [#B] =arch-design=: separate architecture paradigms from tactical patterns
+*** TODO [#A] =arch-design=: separate architecture paradigms from tactical patterns
The candidate table mixes paradigms (modular monolith, microservices,
event-driven) with tactical or partial patterns (DDD, CQRS, event sourcing).
Revise the matrix so candidates can compose patterns rather than treating each
as a mutually exclusive architecture choice.
-*** TODO [#B] =arch-document=: strengthen quality scenarios using arc42/Q42 structure
+*** TODO [#A] =arch-document=: strengthen quality scenarios using arc42/Q42 structure
Section 10 currently says "Under [condition], the system should [response]
within [measure]." Expand to a compact quality-scenario template: source,
stimulus, environment, artifact, response, response measure. This better
matches architecture-quality practice and makes requirements testable.
-*** TODO [#B] =arch-document=: add staleness and ownership metadata to generated docs
+*** TODO [#A] =arch-document=: add staleness and ownership metadata to generated docs
arc42 docs are living documents. Add owner, source commit/date, review cadence,
and "known stale when..." notes per section or in the README so generated docs
do not become authoritative after the code has moved on.
-*** TODO [#B] =arch-evaluate=: add confidence levels for framework-agnostic findings
+*** TODO [#A] =arch-evaluate=: add confidence levels for framework-agnostic findings
Claude-read import graphs and public API comparisons can be incomplete in large
or dynamic languages. Add confidence/provenance per finding and require "not
fully checked because..." when scale or dynamic imports limit certainty.
-*** TODO [#B] =arch-evaluate=: report skipped tool checks explicitly
+*** TODO [#A] =arch-evaluate=: report skipped tool checks explicitly
The workflow says skip unconfigured language-specific tools silently, but the
review checklist also wants checks run. For audit usefulness, list detected
@@ -1100,13 +1122,13 @@ C4 is notation-independent. These skills hard-require draw.io XML, PNG export,
and opening draw.io desktop. Add supported outputs (Structurizr DSL, Mermaid,
PlantUML, draw.io) and a fallback path when =drawio= or a GUI is unavailable.
-*** TODO [#B] =c4-analyze= and =c4-diagram=: clarify C4 abstraction boundaries
+*** TODO [#A] =c4-analyze= and =c4-diagram=: clarify C4 abstraction boundaries
Emphasize that C4 Containers are deployable/runnable units, not necessarily
Docker containers, and that Components are not separately deployable. Add a
check that every relationship and element stays at one abstraction level.
-*** TODO [#B] =commits.md=: split DeepSat/Linear/Slack-specific publishing rules from global commit rules
+*** TODO [#A] =commits.md=: split DeepSat/Linear/Slack-specific publishing rules from global commit rules
The global commit rule file includes Linear status transitions and a hard-coded
Slack channel. That is team-specific and may leak or misfire in unrelated
@@ -1119,26 +1141,26 @@ Several workflows make =humanizer= mandatory, but no =humanizer= skill exists
in this repo. Either add the skill, install instructions, or a fallback
plain-English pass that satisfies the same checks without an external skill.
-*** TODO [#B] =verification.md=: add explicit "unable to verify" reporting standard
+*** TODO [#A] =verification.md=: add explicit "unable to verify" reporting standard
The rule says run tests/lint/typecheck/build before claiming done. Add the
required final wording when a command cannot be run: command attempted, reason
it could not run, risk left unverified, and the smallest next command for the
user to run.
-*** TODO [#B] =testing.md=: add property-based and mutation testing as escalation paths
+*** TODO [#A] =testing.md=: add property-based and mutation testing as escalation paths
The testing rules cover categories and pairwise matrices. Add guidance for
property-based testing when invariants matter across broad input domains, and
mutation testing when test quality is suspect despite high coverage.
-*** TODO [#B] =testing.md=: soften absolute TDD with an explicit spike protocol
+*** TODO [#A] =testing.md=: soften absolute TDD with an explicit spike protocol
The rule currently treats TDD as non-negotiable. Keep TDD as the default, but
define a disciplined spike exception: timebox, do not commit spike code, write
the first failing test before productionizing the discovered approach.
-*** TODO [#B] =subagents.md=: add capability/availability and cost checks
+*** TODO [#A] =subagents.md=: add capability/availability and cost checks
The rule assumes subagents exist and should handle failures. Add "if the
environment lacks subagents, continue locally and preserve the same scope
@@ -1152,21 +1174,21 @@ semantics, constraints, transactions, JSON, time zones, and indexes differ.
Recommend production-like DBs for ORM/query behavior and reserve SQLite for
pure unit tests that do not depend on database semantics.
-*** TODO [#B] =languages/python/claude/rules/python-testing.md=: separate "never mock ORM" from true unit-test boundaries
+*** TODO [#A] =languages/python/claude/rules/python-testing.md=: separate "never mock ORM" from true unit-test boundaries
For domain services, real model methods and validation are usually right. For
thin orchestration units, a repository/interface fake may be cleaner than
hitting a real database. Clarify the boundary: do not mock ORM internals, but
do inject fakes at deliberate data-access ports.
-*** TODO [#B] =languages/elisp/claude/rules/elisp.md=: update editing workflow to avoid tool-specific advice
+*** TODO [#A] =languages/elisp/claude/rules/elisp.md=: update editing workflow to avoid tool-specific advice
The rule says prefer Write over repeated Edits. That advice is Claude-tooling
specific and can conflict with environments that require patch-based edits.
Rephrase around the intent: for nontrivial Elisp, make cohesive edits and run
paren/byte-compile checks immediately.
-*** TODO [#B] =languages/elisp/claude/rules/elisp-testing.md=: add batch-mode and native-comp caveats
+*** TODO [#A] =languages/elisp/claude/rules/elisp-testing.md=: add batch-mode and native-comp caveats
ERT guidance is solid, but add rules for =emacs --batch= reproducibility,
isolating =user-emacs-directory= / package state, and optionally catching
@@ -1186,7 +1208,7 @@ unparseable or just display the file path, so attribution scanning may miss the
actual committed/posted text. Read safe local files referenced by =-F=,
=--file=, and =--body-file= before deciding whether the command is clean.
-*** TODO [#B] =hooks/destructive-bash-confirm.py=: replace regex command parsing with shell-aware parsing where possible
+*** TODO [#A] =hooks/destructive-bash-confirm.py=: replace regex command parsing with shell-aware parsing where possible
The hook's regexes can miss quoted paths, variables, aliases, =env= wrappers,
or compound commands, and can misidentify targets. Use =shlex= for simple