docs(specs): pilot the spec-sort retrofit on rulesets' own pile

Five specs moved to docs/specs/ with confirmed lifecycle keywords: agent-knowledge-base IMPLEMENTED (with the stated reason in its history line), inbox-workflow-consolidation READY, autonomous-batch-execution READY, encourage-kb-contribution READY, and wrapup-routing DOING. spec-sort recomputed twelve todo.org links and the moved specs' own outbound links, and stamped :LAST_SPEC_SORT:. The status board now answers "what's live" in one grep. The four -spec.org-named files in docs/design without a spec spine stay put as notes.
author: Craig Jennings <c@cjennings.net> 2026-07-02 00:19:56 -0400
committer: Craig Jennings <c@cjennings.net> 2026-07-02 00:19:56 -0400
commit: f4b64d6141156cf0ee2a2c2a13cda256f0bf0c84
tree: 76534c13c9b8f07d8f5315cf437c1aed0e0e1b67 /docs/specs
parent: 80ca5d00c4ddd481308ed8ce0c2f270bd34604c0
5 files changed, 1340 insertions, 0 deletions
diff --git a/docs/specs/2026-06-16-autonomous-batch-execution-spec.org b/docs/specs/2026-06-16-autonomous-batch-execution-spec.org
new file mode 100644
index 0000000..5e7b853
--- /dev/null
+++ b/docs/specs/2026-06-16-autonomous-batch-execution-spec.org
@@ -0,0 +1,391 @@
+#+TITLE: Autonomous-Batch Task Execution — Spec
+#+AUTHOR: Craig Jennings & Claude
+#+DATE: 2026-06-16
+#+TODO: TODO | DONE
+#+TODO: DRAFT READY DOING | IMPLEMENTED SUPERSEDED CANCELLED
+
+* READY Autonomous-Batch Task Execution — Spec
+:PROPERTIES:
+:ID:       90f623cd-fdbe-4f5c-b63d-b2f84d9151cf
+:END:
+- 2026-07-02 Thu @ 00:17:01 -0400 — retrofitted by spec-sort; status set to READY (evidence-based, human-confirmed)
+
+* Metadata
+| Status   | ready                                                              |
+|----------+--------------------------------------------------------------------|
+| Owner    | Craig Jennings                                                     |
+|----------+--------------------------------------------------------------------|
+| Reviewer | Craig Jennings                                                     |
+|----------+--------------------------------------------------------------------|
+| Date     | 2026-06-16                                                         |
+|----------+--------------------------------------------------------------------|
+| Related  | [[file:../../working/inbox-zero-phase-e/proposed-inbox-zero.org][Phase E proposal]]; [[file:../design/2026-06-15-fix-speedrun-workflow-proposal.org][speedrun proposal]] |
+|----------+--------------------------------------------------------------------|
+
+* Summary
+
+Two proposals arrived within a day of each other describing the same capability: have Claude work a batch of small, well-marked tasks autonomously, with a full quality bar per task and no per-step approval gate. The inbox-zero "Phase E" proposal drives it from a tag/priority query on a recurring loop; the "speedrun" proposal drives it from an explicit ordered list a human dictates in-session. This spec reconciles both into one feature: a single dedicated workflow, =work-the-backlog.org=, that holds the task-execution logic, with two thin callers feeding it. It also designs the instrumentation that measures whether the autonomy is actually paying off.
+
+* Problem / Context
+
+Craig has a standing backlog of small, solo-doable fixes across several projects, already marked with a tag convention (=:next:=, =:quick:+:solo:=). Doing them by hand one at a time is the bottleneck — the context-switch and the per-commit approval ceremony dominate the actual work. He wants Claude to burn these down unattended: on a recurring loop for the routed inbox case, and on demand when he batches a named list and says "speedrun, no approvals until done." The speedrun is the away-from-desk / working-on-something-else mode, so it must be able to take on larger tasks too — not only sub-30-minute ones — or it forces him to stay at the desk for anything non-trivial.
+
+Two separate proposals tried to answer this:
+
+- *Phase E* (in =inbox-zero.org=, edited in =.emacs.d= as a stopgap) bolted autonomous execution onto the inbox-zero workflow's on-demand and loop callers. The sender flagged the seam as the open question: coupling capture-routing with autonomous-implementation pollutes inbox-zero's three existing callers (startup, wrap-up, on-demand), two of which must never execute anything.
+- *speedrun* (a =.emacs.d= theme-studio session that worked well) is the same execution loop driven by an explicit ordered task set, with end-of-set paging and always-push.
+
+They overlap almost entirely. The execution loop — eligibility gate, act-vs-file decision, per-task quality bar, bounded run — is identical. Only the *input* differs (tag query vs explicit list) and the *session mode* differs (loop default vs no-approvals + always-push + page). Building them as two features would duplicate the execution logic and let the two copies drift. The forces: keep inbox-zero's callers clean, share one execution loop, and make the autonomy safe enough to run unattended on a 30-minute timer without Craig watching.
+
+A second, explicit ask from Craig: instrument this so its effectiveness is measurable. "Gather data on this and create some org-roam articles we can look at later." Autonomous execution that silently makes bad commits is worse than no autonomy, and the only way to tell the two apart is to measure over time: tasks completed vs deferred vs reverted, plus human corrections in the following session.
+
+* Goals and Non-Goals
+
+** Goals
+- One workflow, =work-the-backlog.org=, owns the task-execution loop. Both input shapes (tag query, explicit list) and both session modes feed it.
+- inbox-zero's three existing callers stay clean: the loop caller chains into =work-the-backlog= *after* routing; startup and wrap-up never touch it.
+- The *no-approvals speedrun* is a thin named preset, not a second implementation: autonomous-commit + always-push + end-of-set page, fed an explicit ordered list, with all approvals front-loaded into a single pre-flight step (below) so the run itself is uninterrupted.
+- Eligibility is decided by *crisp, checkable criteria*, not adjectives: a mechanical tag/status gate (=:solo:= + status =TODO=), then a per-task defer checklist whose keystone is "can I write the failing test from the task text without inventing a requirement?" Task *size* is explicitly not a gate — a large task is decomposed into per-logical-commit chunks, not deferred.
+- The autonomy tags (=:solo:=, =:quick:=) carry hard definitions in =todo-format.md= and are applied + enforced as a mandatory step in the task-review and task-audit workflows, so the run-time gate trusts the author's tag instead of re-deriving it.
+- Commit autonomy defaults to file-only (surface a diff, no auto-commit). A project opts into autonomous commit+push explicitly via its per-project waiver.
+- Hard guardrails: refuse any task carrying data-loss / irreversible / external-state risk without a checkpoint; gather any one-or-two quick decisions a task needs *up front* (speedrun) rather than guessing; file a =VERIFY= for anything underspecified or needing design deliberation; a per-run cap / kill switch beyond "one task per run."
+- A lightweight per-run metrics log plus a periodic synthesis step that writes org-roam KB articles summarizing the trend.
+
+** Non-Goals
+- *Not* a replacement for =/start-work=. Tasks needing deliberation or design stay with =/start-work= and its approval gates. This feature only touches the marked, solo set — regardless of size.
+- *Not* a new tag convention. It reads the project's own priority/tag scheme header; it never invents or hardcodes tags across projects.
+- *Not* an inbox-routing change. =inbox-zero.org= keeps its A-D phases. The Phase E text added in =.emacs.d= as a stopgap is *removed* and its logic moves here.
+- *Not* a multi-project orchestrator. One run works one project's backlog. Cross-project handoff stays with =inbox-send= and the paging reply.
+- *Not* a credential-handling or external-API feature. Tasks that touch secrets or external mutations are out of the eligible set by the guardrail.
+
+** Scope tiers
+- *v1:* =work-the-backlog.org=; crisp =:solo:= / =:quick:= definitions in =todo-format.md= plus their mandatory application in task-review and task-audit; the eligibility gate (=:solo:= + status =TODO=, read against the project's scheme header); the act-vs-file *defer checklist* (test-writability keystone, enumerated data-loss list, already-satisfied, design-deliberation); the no-approvals speedrun's pre-flight decision-gathering step; file-only commit default with per-project opt-in; the loop caller wiring and inbox-zero Phase E removal; the speedrun preset with end-of-set =notify --persist= page; the per-run metrics log (structured JSONL).
+- *Out of scope:* a token-budget kill switch (cap is a task count in v1); cross-project batch runs; a dashboard or live UI over the metrics.
+- *vNext (log to todo.org):* the periodic org-roam synthesis step if it doesn't make v1; a token/cost budget alongside the task-count cap (more pressing now that task size is uncapped — a single large task can run long in the unattended loop); auto-detection of "human corrected my autonomous commit" from the next session's diff.
+
+* Design
+
+** Overview
+
+The architecture is one execution workflow with two callers and one preset, plus an instrumentation sidecar.
+
+#+begin_example
+  inbox-zero loop caller  ──(after Phase D routing)──┐
+                                                     ├──▶  work-the-backlog.org  ──▶ metrics log (JSONL)
+  no-approvals speedrun   ──(explicit ordered list)──┘                                      │
+   = pre-flight Q&A + autonomous-commit + push + page                                       ▼
+                                                                          periodic synthesis ──▶ org-roam KB articles
+#+end_example
+
+=work-the-backlog.org= is the only place the execution loop lives. It takes a *task set* (however assembled) and a *session mode* (which gates commit autonomy and paging), and works the set under a fixed safety contract. The two callers differ only in how they build the task set and which session mode they pass.
+
+This is the seam the Phase E sender asked for: separating capture-routing (inbox-zero) from autonomous-implementation (work-the-backlog) keeps inbox-zero's startup and wrap-up callers — which must never execute anything — untouched. The loop caller is the only one of inbox-zero's callers that chains forward into execution, and it does so as an explicit second step after routing completes, not as a phase buried inside inbox-zero.
+
+** The execution loop (two-altitude: caller's view)
+
+A caller hands =work-the-backlog= three things:
+
+1. *A task set* — either an explicit ordered list of task headings (speedrun), or the result of a tag/priority query against =todo.org= (the loop). The workflow does not care which; it receives an ordered list of candidate tasks.
+2. *A session mode* — =file-only= (default) or =autonomous-commit= (requires the project's per-project waiver), and a paging flag.
+3. *A run cap* — the maximum number of tasks to complete this run.
+
+It returns a per-task outcome (=implemented-committed= / =implemented-diff-surfaced= / =deferred-VERIFY= / =dropped-by-craig= / =skipped-ineligible=) and a metrics record per task.
+
+** The execution loop (implementer's view)
+
+For the task set, in order, until the run cap is hit:
+
+1. *Eligibility gate* (below). Ineligible → record =skipped-ineligible=, next task.
+2. *Scope read* of the relevant code. Cheap; just enough to run the defer checklist.
+3. *Defer checklist* (below). Any hit → record the deferral reason (or, under the speedrun preset, route the quick-question gap to the pre-flight Q&A), next task.
+4. *Implement* under the project's commit discipline: TDD red→green→refactor, then =/review-code --staged=, fix all Critical/Important, then close the task per =todo-format.md=. Decompose into as many logical commits as the change needs — size is not capped.
+5. *Commit autonomy branch:*
+   - =file-only= → surface the diff, do *not* commit. Record =implemented-diff-surfaced=.
+   - =autonomous-commit= → =/voice personal= on the message, commit individually, push per the project's flow. Record =implemented-committed=.
+6. *Record metrics* for the task (the JSONL append, below).
+7. Decrement the cap. At zero, stop.
+
+After the set: if the paging flag is set, fire the end-of-set page (below). Surface the run summary.
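+
+A minimal Python sketch of the loop above, under stated assumptions. The function name =work_the_backlog= and the three callables (=is_eligible=, =defer_reasons=, =implement=) are hypothetical stand-ins for the gate, the checklist, and the quality-bar implementation step; the spec defines the loop as workflow prose, not code. The sketch also assumes that only completed tasks count against the cap, which is one reading of step 7.
+
+#+begin_src python
+def work_the_backlog(tasks, mode, cap, is_eligible, defer_reasons, implement):
+    """Work an ordered task set until `cap` tasks have been completed.
+
+    Returns a list of (task, outcome) pairs in the order they were decided.
+    All parameter names are illustrative, not part of the spec.
+    """
+    outcomes = []
+    remaining = cap
+    for task in tasks:
+        if remaining <= 0:
+            break  # kill switch: the cap is a hard ceiling
+        if not is_eligible(task):
+            outcomes.append((task, "skipped-ineligible"))
+            continue
+        if defer_reasons(task):
+            # Any checklist hit (or "unsure") defers the task.
+            outcomes.append((task, "deferred-VERIFY"))
+            continue
+        implement(task)
+        if mode == "autonomous-commit":
+            outcomes.append((task, "implemented-committed"))
+        else:
+            # file-only: surface the diff, do not commit
+            outcomes.append((task, "implemented-diff-surfaced"))
+        remaining -= 1
+    return outcomes
+#+end_src
+
+With a cap of 1, the sketch skips or defers candidates until one task is implemented, then stops and leaves the rest for the next tick.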
+
+** Eligibility gate (mechanical — no judgment)
+
+A task is autonomous-safe when *both* hold. This layer is a lookup, not a judgment; all the judgment lives in the defer checklist below.
+
+1. *Status is =TODO=* — never =VERIFY=, =DOING=, =DONE=, or =CANCELLED=. =VERIFY= is the "awaiting Craig's manual confirmation" marker; auto-implementing one defeats the manual check it represents. The do-not-implement set is safe-by-omission: anything not plainly =TODO= (plus any project-declared "hold" marker) is out.
+2. *Tagged =:solo:=* — the autonomy tag, resolved against the project's priority/tag scheme header (not hardcoded). =:solo:= carries a hard definition (see Tag definitions, below): the task is completable without Craig's involvement beyond at most one or two quick decisions answerable up front, with no design deliberation. A project whose scheme declares a different autonomous-safe tag set overrides the default. Priority / =:next:= drive *ordering* within the eligible set, not eligibility.
+
+Task *size* is deliberately absent from this gate. The old "≤ ~30 minutes / one logical commit" criterion is removed: a large but well-specified, decision-free task is in scope and is decomposed into per-logical-commit chunks during implementation. Size never sends a task to =/start-work=; only *deliberation* or *risk* does (the checklist below). This is what makes the speedrun usable as an away-from-desk mode rather than a sub-30-minute-only mode.
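+
+The gate can be read as a pure lookup. The Python sketch below is illustrative only: the =Task= shape, the =safe_tags= and =hold_markers= parameters, and the choice to model hold markers as tags are assumptions, since the spec leaves the scheme-header format to =todo-format.md=. It also assumes that carrying any one tag from a project's autonomous-safe set is enough.
+
+#+begin_src python
+from dataclasses import dataclass
+
+# Default autonomous-safe tag set; a project's scheme header may override it.
+DEFAULT_SAFE_TAGS = frozenset({"solo"})
+
+
+@dataclass(frozen=True)
+class Task:
+    """Hypothetical minimal view of a todo.org heading."""
+    title: str
+    status: str
+    tags: frozenset
+
+
+def is_eligible(task, safe_tags=DEFAULT_SAFE_TAGS, hold_markers=frozenset()):
+    """Mechanical gate: status is TODO and an autonomous-safe tag is present."""
+    if task.status != "TODO":
+        return False  # VERIFY, DOING, DONE, CANCELLED are out by omission
+    if task.tags & hold_markers:
+        return False  # assumption: project-declared hold markers are tags
+    return bool(task.tags & safe_tags)
+#+end_src
+
+Size and priority do not appear anywhere in the function, which matches the gate's intent: priority orders the eligible set, and size is handled during implementation.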
+
+*** Tag definitions (land in =todo-format.md=, enforced in task-review + task-audit)
+
+- *=:solo:= — autonomy.* The task can be completed without Craig's involvement, except for at most one or two quick decisions that can be stated and answered before the run starts. No open design question, no "weigh these approaches," no waiting on Craig mid-task. This is the eligibility tag.
+- *=:quick:= — effort hint only.* A small, fast task. Informational for batching and estimating a run's duration; *not* an eligibility gate (size no longer gates).
+
+Both tags are applied at task creation and *re-checked as a mandatory step* in the task-review and task-audit workflows, so the run-time gate can trust the author's tag rather than re-derive autonomy and effort from the task body. A task-review or task-audit that skips the =:solo:= / =:quick:= assessment is incomplete.
+
+** Act-vs-file decision (the defer checklist)
+
+After the scope read, run each eligible candidate through the checklist below. Each item is a concrete, answerable question, not an adjective. *Any* hit — or any "unsure" — sends the task to defer (or, for a quick-decision gap under the speedrun preset, to the pre-flight Q&A). Only a task that clears every item is implemented.
+
+1. *Test-writability (the keystone).* Can I write the failing test from the task text — plus any decisions gathered up front — without inventing a requirement? *No / unsure* → underspecified. Under the speedrun preset, if the gap is one or two quick answerable questions, route it to the pre-flight Q&A; otherwise file a =VERIFY= noting what's missing. Under the unattended loop, file the =VERIFY= (no one to ask). This replaces the old "clear / bounded / underspecified" adjectives with an action that fails loudly: if the red test isn't writable, the task isn't ready.
+2. *Data-loss / irreversible / external operation.* Does implementing it require any of: =rm= of non-scratch data, =git reset --hard= / force-push, =DROP= / =DELETE= / =TRUNCATE=, file truncate/overwrite of persisted content, a schema or data migration, any external or shared-state mutation, any credential touch? *Yes* → do NOT implement; file a =VERIFY= naming the risk. This is the hard safety gate; an upfront answer never overrides it without an explicit checkpoint. Replaces the vague "data-loss risk" with an enumerated, greppable set.
+3. *Already-satisfied.* Does the scope read show the desired end-state already holds? *Yes* → file a =VERIFY= noting it (the "raise max spans to 5 — every cap was already 8" case) and move on. Don't make a no-op change.
+4. *Design deliberation.* Does the task carry an unresolved design question, a "weigh these approaches" with real tradeoffs, or a TBD that isn't a quick factual answer? *Yes* → under the speedrun preset, if it collapses to one or two quick questions, route to pre-flight Q&A; otherwise file and surface as a =/start-work= candidate. Under the loop, file. The discriminator is now *quick-answerable question* vs *deliberation* — not task size.
+
+A task that clears 1–4 is implemented under the project's commit discipline, decomposed into as many logical commits as the change needs. When genuinely unsure which side a task falls on, defer — a wrong auto-implement costs a revert *and* the next-session correction the metrics are designed to catch.
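+
+One way to picture the checklist is as four three-valued answers, where "unsure" counts as a hit. In the Python sketch below the =Assessment= fields and the reason strings are hypothetical names chosen for illustration. The sketch returns every hit rather than the first, so a data-loss hit is never masked by an earlier item.
+
+#+begin_src python
+from dataclasses import dataclass
+from typing import Optional
+
+
+@dataclass
+class Assessment:
+    """Answers from the scope read; None means "unsure"."""
+    test_writable: Optional[bool]
+    risky_operation: Optional[bool]
+    already_satisfied: Optional[bool]
+    needs_deliberation: Optional[bool]
+
+
+def defer_reasons(a):
+    """Return the checklist items that fire, in checklist order.
+
+    An empty list means the task clears items 1-4 and may be implemented.
+    """
+    reasons = []
+    if a.test_writable is not True:
+        reasons.append("underspecified")
+    if a.risky_operation is not False:
+        reasons.append("data-loss-risk")
+    if a.already_satisfied is not False:
+        reasons.append("already-satisfied")
+    if a.needs_deliberation is not False:
+        reasons.append("design-deliberation")
+    return reasons
+#+end_src
+
+Treating =None= as a hit is what encodes "when genuinely unsure, defer."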
+
+** Pre-flight decision gathering (the no-approvals speedrun's only interaction)
+
+The speedrun preset front-loads every approval into one step before the run, so the run itself is uninterrupted — that is what "no approvals" means. It is *not* "no input ever"; it is "all input first, then hands-off."
+
+When Craig kicks off a speedrun over an explicit list:
+
+1. *Gather* the named task set.
+2. *Scope-read and classify* each task against the eligibility gate + defer checklist: ready (clears the checklist), needs-quick-decisions (one or two upfront-answerable questions — checklist item 1 or 4), or drop (data-loss / irreversible, or design deliberation that isn't a quick question).
+3. *Order* the list (priority, then the author's ordering / =:next:=).
+4. *Intro the work* — present the ordered plan: what will run, what was dropped and why, and the batched questions for the needs-quick-decisions tasks.
+5. *Craig answers each question, or says "skip this"* → a skipped task is removed from the run (recorded =dropped-by-craig=); an answered task has the answer recorded so implementation works from the decision, not a guess.
+6. *Run the finalized list autonomously* — no further approvals until done.
+7. *End-of-set page* with completed + remaining + skipped.
+
+The unattended *loop* caller has no human at kickoff, so it cannot gather decisions: there, a needs-quick-decisions task simply defers (files its note) like any other checklist hit. The pre-flight Q&A is a speedrun-preset capability, not a loop one.
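+
+Steps 2 through 5 amount to turning a classified list plus Craig's answers into a final run list. The Python sketch below is a rough illustration: =finalize_run=, the =SKIP= sentinel, and the tuple shapes are hypothetical. It assumes an unanswered question is treated like "skip this", and that a task classified as drop is recorded as =deferred-VERIFY=.
+
+#+begin_src python
+SKIP = "skip this"  # hypothetical sentinel for Craig's "skip this" reply
+
+
+def finalize_run(classified, answers):
+    """Split an ordered, classified task list into (run, dropped).
+
+    classified: list of (task, klass), klass being "ready",
+                "needs-quick-decisions", or "drop".
+    answers:    dict mapping task -> recorded answer, or SKIP.
+    """
+    run, dropped = [], []
+    for task, klass in classified:
+        if klass == "ready":
+            run.append((task, None))
+        elif klass == "needs-quick-decisions":
+            answer = answers.get(task)
+            if answer is None or answer == SKIP:
+                # Assumption: no answer is handled like an explicit skip.
+                dropped.append((task, "dropped-by-craig"))
+            else:
+                # Carry the decision so implementation works from it.
+                run.append((task, answer))
+        else:
+            dropped.append((task, "deferred-VERIFY"))
+    return run, dropped
+#+end_src
+
+The run list keeps the pre-flight ordering, and each answered task carries its decision forward so the implementer does not have to guess.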
+
+** Session modes and the no-approvals speedrun preset
+
+Two orthogonal session-mode dimensions feed the loop:
+
+- *Commit autonomy:* =file-only= (default) or =autonomous-commit=. =autonomous-commit= is honored only when the project carries the per-project waiver (=.emacs.d= and =rulesets= have it; most projects do not). Absent the waiver, a request for =autonomous-commit= degrades to =file-only= and says so.
+- *Paging:* on or off. End-of-set only.
+
+The *no-approvals speedrun* is the named preset: =autonomous-commit= + always-push + paging-on, fed an *explicit ordered list*, run after the pre-flight decision-gathering step above. It is not a separate code path; it is a label for that combination of mode flags plus the explicit-list input, with the pre-flight Q&A as its only interactive moment. The loop caller, by contrast, runs =file-only= (unless the project has the waiver and opts the loop into commits) with paging off, fed the *tag query*, with no pre-flight step.
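+
+The waiver rule can be sketched as a small resolver. This Python is illustrative only: =resolve_commit_mode= and its return shape are hypothetical, and how the waiver is actually detected is left to the opt-in decision further down.
+
+#+begin_src python
+VALID_MODES = ("file-only", "autonomous-commit")
+
+
+def resolve_commit_mode(requested, has_waiver):
+    """Return (effective_mode, notice).
+
+    notice is None unless the request was degraded, so the caller can
+    surface the degrade instead of silently changing behavior.
+    """
+    if requested not in VALID_MODES:
+        raise ValueError(f"unknown commit mode: {requested!r}")
+    if requested == "autonomous-commit" and not has_waiver:
+        return (
+            "file-only",
+            "autonomous-commit requested without a project waiver; "
+            "degraded to file-only",
+        )
+    return requested, None
+#+end_src
+
+Returning the notice alongside the mode keeps the "degrades and says so" behavior in one place.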
+
+** Bounding the run and the kill switch
+
+Default cap: one task per run for the loop caller — implement the highest-priority eligible candidate (=[#A]= before =[#B]= before =[#C]=), record, then stop and let the next tick continue. The speedrun preset works the whole explicit list in order (the human bounded it by naming it), still one commit per logical change.
+
+The kill switch is a hard per-run task cap passed by the caller, independent of "one per run": even the speedrun stops at the cap and pages with the remainder listed. A loop that fires every 30 minutes and commits unattended needs a ceiling that a runaway can't exceed. With task size now uncapped, the count cap no longer bounds *cost* — a single large task can run long — so a token/cost budget is the most pressing vNext addition.
+
+** End-of-set paging
+
+When the set is done (or the cap is hit), if paging is on, fire one page — end-of-set only, never per-task:
+
+#+begin_src sh
+notify alarm "Page" "<project>: <N> done, <M> remaining — <one-line summary>" --persist
+#+end_src
+
+=--persist= keeps it on screen until dismissed (the page-me convention). The message carries the project name, the completed count, and the remaining count, so Craig can reply confirming ready + naming the next project in one turn. The page-signal wrapper removed 2026-06-12 is reconciled to =notify= here — there is no separate page-signal call.
+
+* Alternatives Considered
+
+** Fold execution into inbox-zero (the Phase E stopgap shape)
+- Good, because it's the smallest diff — the loop caller already runs inbox-zero, so execution is "one more phase."
+- Bad, because it couples capture-routing with implementation. inbox-zero has three callers; startup and wrap-up must never execute. A Phase E inside inbox-zero forces both to carry a "skip Phase E" caveat and risks a future caller running it by accident.
+- Neutral, because the eligibility-gate and defer-checklist text is identical either way — only its *home* differs.
+
+** Two separate features (keep Phase E and speedrun distinct)
+- Good, because each proposal ships as written with no reconciliation work.
+- Bad, because the execution loop is duplicated in two places and will drift; a guardrail tightened in one won't reach the other. Two ways to do autonomous execution is two things to audit.
+- Neutral, because the input and session-mode differences are real — but they're thin caller-level differences, not a reason to fork the engine.
+
+** Keep the task-size gate (defer anything over ~30 minutes)
+- Good, because it bounds per-task cost and blast radius with a single number.
+- Bad, because it defeats the away-from-desk use case — anything non-trivial bounces back to Craig, so he can't actually leave. Size correlates poorly with risk; a large mechanical refactor is safer than a tiny change to persisted state.
+- Neutral, because the things size was a proxy for (risk, cost) are covered directly — risk by the data-loss checklist, cost by the run cap (and the vNext token budget). The defer checklist's deliberation item, not size, is what routes genuine =/start-work= tasks out.
+
+** Autonomous-commit as the default
+- Good, because it's faster end-to-end with no diff to review.
+- Bad, because most projects lack the per-project waiver, and an unattended loop committing to a project that never opted in is exactly the failure the file-only default prevents. The blast radius of a bad autonomous commit is a revert plus lost trust in the loop.
+- Neutral, because the projects that *do* want it (=.emacs.d=, =rulesets=) opt in explicitly, so the capability is available where it's wanted without being the default everywhere.
+
+* Decisions [8/8]
+
+** DONE Eligibility tag set and where it's read
+- Owner / by-when: Craig / spec-review
+- Context: Projects' priority/tag schemes vary, and the =todo-format.md= scheme header is the declared per-project source of truth. Task size is no longer a gate, so eligibility rests on the autonomy tag, not an effort cap.
+- Decision: Eligibility = status =TODO= AND the =:solo:= autonomy tag, resolved against the project's scheme header (a project may declare a different autonomous-safe set). Priority / =:next:= drive ordering, not eligibility. =:quick:= is an effort hint, never a gate.
+- Consequences: easier — one workflow works across projects with different vocab, and the gate is a pure lookup; harder — a project with no/malformed scheme header needs a fallback, and the default (=:solo:=) must be defined precisely enough that two projects agree.
+
+** DONE Crisp =:solo:= / =:quick:= definitions, enforced in task-review + task-audit
+- Owner / by-when: Craig / spec-review
+- Context: The run-time gate is only as crisp as the tags. Today =:quick:= / =:solo:= are listed in the scheme header with no hard definition, and nothing enforces that tasks get assessed for them.
+- Decision: Define =:solo:= (completable without Craig beyond at most one-or-two upfront-answerable quick decisions; no design deliberation) and =:quick:= (small/fast effort hint only) in =todo-format.md=, and make assessing both a *mandatory step* in the task-review and task-audit workflows. A review/audit that skips the assessment is incomplete.
+- Consequences: easier — authoring-time judgment by the human who knows the answer, and the run-time gate trusts the tag; harder — task-review and task-audit grow a required step, and existing untagged tasks need a back-fill pass.
+
+** DONE The do-not-auto-implement marker set
+- Owner / by-when: Craig / spec-review
+- Context: =VERIFY= means "awaiting Craig's manual confirmation"; other projects may use markers differently.
+- Decision: Do-not-implement = any status that is not =TODO=, plus any project-declared "hold" marker. Safe-by-omission: exclude anything not plainly =TODO=.
+- Consequences: easier — portable, and manual-check tasks can't auto-run; harder — richer per-project overrides need marker semantics in the scheme header, which most lack, so the default must stay conservative.
+
+** DONE Pre-flight decision gathering for the speedrun preset
+- Owner / by-when: Craig / spec-review
+- Context: Forcing every decision-needing task to defer wastes the away-from-desk use case — many tasks need only one or two quick answers Craig could give at kickoff. The speedrun is interactive at its start but must be hands-off after.
+- Decision: The speedrun preset gathers + orders the set, intros the work, and batches all needed quick decisions into one pre-flight Q&A; Craig answers or says "skip this" (drops the task); the run then proceeds with zero further approvals. The unattended loop has no kickoff human, so it defers decision-needing tasks instead.
+- Consequences: easier — "no approvals" becomes "all approvals first," which fits working-while-away, and larger / lightly-underspecified tasks become runnable; harder — the classifier must reliably split quick-question vs real-deliberation, and the recorded answers must reach the implementer so it works from the decision, not a guess.
+
+** DONE Commit-autonomy opt-in mechanism
+- Owner / by-when: Craig / spec-review
+- Context: =file-only= is the default; =.emacs.d= and =rulesets= have a per-project waiver allowing autonomous commits. Where does the workflow *read* that a project has opted in?
+- Decision: Read the opt-in from the project's existing per-project waiver location (=notes.org= Workflow State or =CLAUDE.md=), not a new config file. Track two separate flags, "has commit waiver" and "loop may commit", because the two can differ.
+- Consequences: easier — no new config surface, reuses the existing waiver concept; harder — the waiver location/format must be pinned for deterministic detection, and "waiver yes, loop-commit no" needs the two-flag split.
+
+** DONE Run-cap default and the kill switch shape
+- Owner / by-when: Craig / spec-review
+- Context: The loop default is one task per run; the speedrun works an explicit list. Both need a hard ceiling. Task size is now uncapped, so a single task can be large.
+- Decision: The caller passes a hard per-run task cap (loop default 1; speedrun = length of the explicit list, capped at a ceiling); stop + page with the remainder when the cap is hit. v1 caps by task count, not token budget.
+- Consequences: easier — a simple caller-controlled integer with a bounded task count; harder — a count cap doesn't bound *cost*, and with size uncapped a single large task can run long, so a token budget is vNext and more pressing than before.
+
+** DONE Metrics log location and format
+- Owner / by-when: Craig / spec-review
+- Context: Per-run metrics must land somewhere structured and queryable, per-project, and survive across sessions for the synthesis step to read.
+- Decision: Append one JSONL record per task to a per-project log at =.ai/metrics/work-the-backlog.jsonl=, git-tracked, with the synthesis step reading the union across projects.
+- Consequences: easier — append-only JSONL is trivial to write and =jq=-queryable, and per-project keeps it local to the work; harder — a git-tracked log adds commit churn, and "union across projects" needs the synthesis step to know where every log lives.
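+
+A minimal Python sketch of the append, assuming the log path from the decision above. The helper name =append_metric= is hypothetical, and any record fields passed to it are placeholders rather than the spec's schema.
+
+#+begin_src python
+import json
+from pathlib import Path
+
+# Path taken from the decision above, relative to the project root.
+LOG_PATH = Path(".ai/metrics/work-the-backlog.jsonl")
+
+
+def append_metric(project_root, record):
+    """Append one task record as a single JSON line; return the log path."""
+    path = Path(project_root) / LOG_PATH
+    path.parent.mkdir(parents=True, exist_ok=True)
+    with path.open("a", encoding="utf-8") as fh:
+        # One compact object per line keeps the file jq-queryable.
+        fh.write(json.dumps(record, sort_keys=True) + "\n")
+    return path
+#+end_src
+
+Because each call only appends a line, a run interrupted partway still leaves every earlier record intact and parseable.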
+
+** DONE Synthesis cadence and trigger
+- Owner / by-when: Craig / spec-review
+- Context: Craig wants periodic org-roam articles summarizing the data. What triggers synthesis, and how often?
+- Decision: Run synthesis on an explicit trigger ("synthesize backlog metrics") and optionally a weekly scheduled run, writing one KB node per synthesis under =~/org/roam/agents/= per the knowledge-base rule.
+- Consequences: easier — an explicit trigger means no surprise writes, and the KB rule already governs node shape; harder — a weekly run needs a scheduler entry, and the personal-only write-classification must gate it so work-project metrics never land in the KB.
+
+* Implementation phases
+
+** Phase 0 — Tag definitions + task-review/audit enforcement
+Add the hard =:solo:= / =:quick:= definitions to =todo-format.md=, and add the mandatory tag-assessment step to the task-review and task-audit workflows. Independent of the workflow build; lands first so the eligibility gate has crisp tags to read and existing tasks start getting assessed. Tree stays working: these are rule + workflow prose additions.
+
+** Phase 1 — Extract the execution loop into work-the-backlog.org
+Write =work-the-backlog.org= holding the eligibility gate, defer checklist, per-task quality bar, and run-cap logic — taking a task set + session mode + cap as input. Remove the stopgap "Phase E" text from =inbox-zero.org= (restore it to its A-D shape) in the same change so there's one home, not two. Tree stays working: inbox-zero reverts to routing-only, and the new workflow is callable but not yet wired to the loop.
+
+** Phase 2 — Wire the two callers
+Add the loop caller's chain step (after inbox-zero Phase D, invoke work-the-backlog with the tag query + file-only + cap 1) and the no-approvals speedrun preset (pre-flight decision-gathering → explicit list + autonomous-commit + always-push + paging-on). Both go through the same workflow; only the speedrun runs the pre-flight Q&A. Tree stays working: each caller is independently testable.
+
+** Phase 3 — File-only vs autonomous-commit gate
+Implement the commit-autonomy branch: read the per-project waiver, degrade =autonomous-commit= to =file-only= when absent, surface the degrade. Tree stays working: default file-only behavior is the safe path even before the waiver-read lands.
+
+** Phase 4 — The defer checklist, pre-flight Q&A, and the page
+Implement the act-vs-file defer checklist (test-writability keystone, enumerated data-loss list, already-satisfied, design-deliberation), the speedrun pre-flight decision-gathering (gather → classify → order → intro → batch-ask → skip/answer), the =VERIFY=-on-ambiguity filing, and the end-of-set =notify alarm ... --persist= page. Tree stays working: the checklist only ever *reduces* what runs, and the pre-flight step only runs under the speedrun preset.
+
+** Phase 5 — Metrics log
+Append the per-task JSONL record at each task outcome. Tree stays working: logging is a side effect that doesn't alter execution.
+
+** Phase 6 — Synthesis to org-roam
+Write the synthesis step: read the JSONL union, compute the per-run and trend metrics (below), write a KB node under =~/org/roam/agents/= per the knowledge-base rule, personal-projects-only classification enforced. Tree stays working: synthesis is read-only over the logs plus a KB write.
+
+* Acceptance criteria
+- [ ] =work-the-backlog.org= exists and is the only home for the execution loop; =inbox-zero.org= is back to its A-D routing-only shape with no Phase E.
+- [ ] The loop caller chains into work-the-backlog after routing; startup and wrap-up never invoke it.
+- [ ] The no-approvals speedrun runs as the preset (pre-flight Q&A → autonomous-commit + always-push + end-page) over an explicit ordered list, one commit per logical change.
+- [ ] =:solo:= and =:quick:= carry hard definitions in =todo-format.md=, and task-review + task-audit both refuse to complete without assessing them.
+- [ ] Eligibility = status =TODO= AND =:solo:=, read from the project's scheme header, not hardcoded; a =VERIFY= / =DOING= / =DONE= / =CANCELLED= task is skipped by the gate.
+- [ ] Task size never sends a task to =/start-work=; a large but =:solo:=, well-specified task runs and is decomposed into per-logical-commit chunks.
+- [ ] The defer checklist fires correctly: a task whose red test isn't writable (and isn't a quick-question gap), one carrying an enumerated data-loss operation, an already-satisfied one, and one needing design deliberation are each deferred (or routed to pre-flight Q&A under the speedrun), not implemented.
+- [ ] Under the speedrun preset, a task needing one or two quick decisions is surfaced in the pre-flight Q&A; "skip this" drops it, an answer is recorded and used; the run then proceeds with no further approvals.
+- [ ] Under the unattended loop, a decision-needing task defers (no pre-flight Q&A).
+- [ ] In a project without the commit waiver, an =autonomous-commit= request degrades to file-only and says so; no commit is made.
+- [ ] The run stops at the per-run cap and pages with the remaining tasks listed.
+- [ ] Each task outcome appends one JSONL record to =.ai/metrics/work-the-backlog.jsonl=.
+- [ ] The synthesis step reads the logs and writes a KB node under =~/org/roam/agents/=; it refuses to write for work-classified projects.
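The eligibility and run-cap criteria above can be sketched as one selection step. This is an illustration under stated assumptions, not the workflow itself (which is prose): the `Task` shape and `select_for_run` are hypothetical names, and the eligible status and autonomous-safe tag are passed in as parameters to stand in for values read from the project's scheme header.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Task:
    heading: str
    status: str
    tags: Set[str] = field(default_factory=set)


def select_for_run(tasks: List[Task], eligible_status: str, safe_tag: str,
                   cap: int) -> Tuple[List[Task], List[Task]]:
    """Split tasks into (to_run, remaining) for one run.

    A task is eligible only when its status matches eligible_status AND
    it carries safe_tag. Neither value is hardcoded here. cap is assumed
    to be a non-negative integer; tasks past the cap are returned as
    remaining so the end-of-run page can list them.
    """
    eligible = [t for t in tasks
                if t.status == eligible_status and safe_tag in t.tags]
    return eligible[:cap], eligible[cap:]
```

With a cap of 1 (the loop default), a backlog holding two eligible tasks yields one task to run and one listed as remaining; a =VERIFY= or untagged task appears in neither list.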
+
+* Effectiveness measurement
+
+This section answers Craig's explicit ask: measure whether autonomous-batch execution is actually effective, and build the "gather data → org-roam articles" loop.
+
+** What "effective" means here
+
+The autonomy is effective if it completes real work that *stays* completed — i.e. tasks land green and the next session doesn't have to undo or fix them. The two failure modes to catch are (1) the loop defers everything (over-cautious, no value delivered) and (2) the loop implements badly (commits that get reverted or hand-corrected next session). Both are measurable.
+
+** Per-run metrics (the JSONL record)
+
+One record per task, appended to =.ai/metrics/work-the-backlog.jsonl= at each task outcome:
+
+| Field              | Meaning                                                             |
+|--------------------+---------------------------------------------------------------------|
+| =ts=               | ISO timestamp of the task outcome                                   |
+|--------------------+---------------------------------------------------------------------|
+| =run_id=           | UUID shared by all tasks in one run                                 |
+|--------------------+---------------------------------------------------------------------|
+| =project=          | project basename                                                    |
+|--------------------+---------------------------------------------------------------------|
+| =caller=           | =loop= or =speedrun=                                                |
+|--------------------+---------------------------------------------------------------------|
+| =task=             | task heading (slug)                                                 |
+|--------------------+---------------------------------------------------------------------|
+| =outcome=          | implemented-committed / implemented-diff / deferred-verify /        |
+|                    | skipped-ineligible / dropped-by-craig (skipped at pre-flight)       |
+|--------------------+---------------------------------------------------------------------|
+| =defer_reason=     | underspecified / data-loss / already-satisfied / needs-deliberation |
+|--------------------+---------------------------------------------------------------------|
+| =upfront_decision= | true if a pre-flight answer was recorded and used for this task     |
+|--------------------+---------------------------------------------------------------------|
+| =wall_clock_s=     | seconds from task start to outcome                                  |
+|--------------------+---------------------------------------------------------------------|
+| =commit_sha=       | for committed tasks; empty otherwise                                |
+|--------------------+---------------------------------------------------------------------|
+| =review_findings=  | count of /review-code Critical+Important findings on this task      |
+|--------------------+---------------------------------------------------------------------|
+
+Per-run rollups computed at synthesis (not stored per record): tasks attempted, completed, VERIFY-deferred, dropped-by-craig, reverted; wall-clock total; commits landed; review findings per commit.
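As a rough illustration of the record shape in the table above, the append could look like the following. The helper name `append_outcome` and its keyword defaults are assumptions made for this sketch; only the field names and the log path come from the spec.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_outcome(log_path: Path, run_id: str, project: str, caller: str,
                   task: str, outcome: str, defer_reason: str = "",
                   upfront_decision: bool = False, wall_clock_s: float = 0.0,
                   commit_sha: str = "", review_findings: int = 0) -> dict:
    """Append one task-outcome record as a single JSON line and return it."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "run_id": run_id,
        "project": project,
        "caller": caller,
        "task": task,
        "outcome": outcome,
        "defer_reason": defer_reason,
        "upfront_decision": upfront_decision,
        "wall_clock_s": wall_clock_s,
        "commit_sha": commit_sha,
        "review_findings": review_findings,
    }
    # Create .ai/metrics/ on first use, then append; never rewrite the log.
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

Opening the file in append mode and writing exactly one line per call is what keeps the log append-only and line-oriented, so it stays =jq=-queryable.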
+
+** The corrections signal (the key metric)
+
+The hardest and most valuable metric is *human corrections in the following session* — did Craig revert or hand-fix an autonomous commit? v1 captures the cheap proxy: at synthesis, for each =commit_sha=, check whether a later commit touching the same files reverted it or carries a "fix"/"revert" of that change within N days. A clean run is one where the autonomous commits survive untouched. (Auto-detecting "this later commit corrected that autonomous one" precisely is a vNext refinement; the proxy — reverted-or-touched-soon-after — is good enough to flag a problem run for human review.)
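One way the cheap proxy might be expressed, assuming the commit history has already been loaded into memory. The `Commit` shape, the `flag_corrected` name, and the subject-line keyword match are all assumptions of this sketch; the spec leaves the window N and the detection mechanics open.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Commit:
    sha: str
    when: datetime
    files: FrozenSet[str]
    subject: str


def flag_corrected(autonomous: Commit, history: Iterable[Commit],
                   window_days: int) -> List[str]:
    """Return SHAs of later commits that look like corrections.

    A commit is flagged when it lands after the autonomous commit and
    within window_days, touches at least one of the same files, and its
    subject mentions "fix" or "revert" (case-insensitive substring).
    """
    deadline = autonomous.when + timedelta(days=window_days)
    flagged = []
    for c in history:
        if not (autonomous.when < c.when <= deadline):
            continue
        if not (c.files & autonomous.files):
            continue
        subject = c.subject.lower()
        if "revert" in subject or "fix" in subject:
            flagged.append(c.sha)
    return flagged
```

The substring match deliberately over-counts (a subject containing "prefix" would match), which is in keeping with the proxy's role: it flags a run for human review and does not deliver a verdict.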
+
+** Where the data lands
+
+Per-project git-tracked JSONL at =.ai/metrics/work-the-backlog.jsonl=. Append-only, =jq=-queryable, survives across sessions and machines via the normal project sync. Git-tracked so the history is auditable and the synthesis step can read it from any clone.
+
+** The synthesis loop (gather → article)
+
+On the "synthesize backlog metrics" trigger (and optionally a weekly scheduled run):
+
+1. Read the JSONL union across the personal projects the synthesizer can see.
+2. Compute the rollups and the trend: completion rate over time, defer-reason distribution, review-findings-per-commit trend, and the corrections-signal flag count.
+3. Write one org-roam KB node under =~/org/roam/agents/YYYYMMDDHHMMSS-backlog-metrics-<window>.org= per the knowledge-base rule — filetags =:agent:metrics:=, a concise title, the rollup table, the trend narrative, and =[[id:...]]= links to prior synthesis nodes so the series is traceable.
+4. Enforce the KB write-classification: *personal projects only*. A work-classified project's metrics never write to the KB — they stay in that project's own =.ai/metrics/= log and the synthesizer reports the refusal per the KB refusal contract.
+
+The KB node is the artifact Craig reviews later — "are the autonomous runs completing more and getting corrected less over the last month?" reads off the trend table without re-querying raw logs.
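Step 2 of the synthesis loop can be sketched as a pure function over the JSONL lines. The function name `rollup` and the counting conventions are assumptions: here every outcome except =skipped-ineligible= counts as attempted, and the two =implemented-*= outcomes count as completed. The spec does not define those denominators.

```python
import json
from collections import Counter
from typing import Dict, Iterable

# Assumed convention: both implemented outcomes count as completed.
COMPLETED = {"implemented-committed", "implemented-diff"}


def rollup(jsonl_lines: Iterable[str]) -> Dict[str, object]:
    """Compute per-window rollups from raw JSONL lines (blank lines ignored)."""
    records = [json.loads(line) for line in jsonl_lines if line.strip()]
    attempted = [r for r in records if r["outcome"] != "skipped-ineligible"]
    completed = [r for r in attempted if r["outcome"] in COMPLETED]
    committed = [r for r in completed if r.get("commit_sha")]
    findings = sum(r.get("review_findings", 0) for r in committed)
    reasons = Counter(r["defer_reason"] for r in attempted if r.get("defer_reason"))
    return {
        "attempted": len(attempted),
        "completed": len(completed),
        "completion_rate": len(completed) / len(attempted) if attempted else 0.0,
        "defer_reasons": dict(reasons),
        "commits_landed": len(committed),
        "findings_per_commit": findings / len(committed) if committed else 0.0,
    }
```

Keeping the rollup free of file and KB access means the personal-projects-only check can sit entirely in the caller that decides whether to write the node.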
+
+* Readiness dimensions
+
+- *Data model & ownership:* The task set is read from =todo.org= (project-owned, user-authored). The metrics JSONL is generated, append-only, git-tracked, project-owned. KB nodes are agent-generated under =~/org/roam/agents/= (never overwriting Craig's hand-authored nodes — link only). No editable region is co-owned.
+- *Errors, empty states & failure:* Empty task set → report "nothing eligible" and stop. Malformed scheme header → fall back to the default tag reading and surface the fallback. A task that fails mid-implementation → leave the tree working (don't commit a broken state), record the failure outcome, surface it, continue to the next task. No silent data loss: the data-loss guardrail refuses irreversible tasks outright.
+- *Security & privacy:* Tasks touching credentials or external mutations are excluded by the data-loss / external-state checklist item. The KB write is personal-projects-only; work metrics never leave the project. No secrets in the JSONL (task slugs and SHAs only).
+- *Observability:* The end-of-set page surfaces the run outcome. The per-task surface (implemented / deferred + reason / dropped / skipped) is the live progress view. The metrics log + KB synthesis is the long-run observability. A bad run is isolable from the JSONL (which task, which outcome, which review findings).
+- *Performance & scale:* Expected counts are small — a handful of tasks per run, one run per 30-min tick. No bottleneck at this scale. The cap bounds the worst case on task count; with size uncapped, a single large task is the cost outlier the vNext token budget addresses. Synthesis over months of JSONL is still a small file (one record per task).
+- *Reuse & lost opportunities:* Reuses =todo-format.md= for task close + the tag definitions, =/review-code= and =/voice personal= for the quality bar, =notify= for paging, the knowledge-base rule for KB writes, the per-project waiver for commit-autonomy, and task-review / task-audit for tag enforcement. No new config file (the opt-in rides the existing waiver). The execution loop is the one new shared asset.
+- *Architecture fit & weak points:* Integration points — inbox-zero loop caller (chain after Phase D), the per-project waiver location, =todo.org= scheme header, task-review / task-audit, =~/org/roam/agents/=. Weak point: the commit-autonomy gate depends on deterministically reading the waiver; mitigated by defaulting to file-only when the read is ambiguous (fail safe, not open). Second weak point: a 30-min loop committing unattended with uncapped task size; mitigated by the hard count cap and file-only default, with the token budget as the vNext backstop.
+- *Config surface:* Per-project — commit-autonomy opt-in (via existing waiver), optional loop-commit flag, optional autonomous-safe tag override in the scheme header. Per-call — task set, session mode, run cap. Defaults: file-only, paging-off (loop) / paging-on (speedrun), cap 1 (loop).
+- *Documentation plan:* The workflow file itself is the user/operator doc (matches inbox-zero.org's self-documenting style). The =.emacs.d= stopgap note and the speedrun proposal are superseded by this spec; no separate migration doc needed beyond removing the Phase E text.
+- *Dev tooling:* N/A for new build targets — the workflows are prose, exercised by invocation. The metrics JSONL is =jq=-inspectable by hand; a tiny rollup helper may be added under =.ai/scripts/= if the synthesis prose proves to need it (decided at Phase 6, not a v1 prerequisite).
+- *Rollout, compatibility & rollback:* Rollout is removing Phase E from inbox-zero and adding work-the-backlog — both prose changes, instantly reversible. Compatibility: inbox-zero's three callers are unchanged except the loop caller gaining a forward chain. Rollback: delete work-the-backlog and the loop chain step; inbox-zero is already back to A-D. The file-only default means the worst pre-rollback state is surfaced diffs, not committed changes.
+- *External APIs & deps:* =notify alarm "Page" "<msg>" --persist= verified against =/home/cjennings/.local/bin/notify= and the page-me workflow. =~/org/roam/= KB write path and node shape verified against the knowledge-base rule. No external API calls.
+
+* Risks, Rabbit Holes, and Drawbacks
+
+- *The corrections signal is a proxy, not ground truth.* "A later commit touched the same files" over-counts (legitimate follow-up work) and under-counts (a correction in a different file). It's a flag for human review, not a verdict. Don't rabbit-hole on making it precise in v1 — the proxy plus a human glance is the design.
+- *Waiver detection drift.* If the per-project waiver location moves or its format changes, the commit-autonomy gate could mis-read. Mitigation: fail safe to file-only. Pin the waiver format in the Phase 3 decision before building.
+- *Unattended-commit blast radius.* The headline risk. Mitigated four ways: file-only default, the hard cap, the data-loss checklist item, and the metrics loop (which makes a bad run visible after the fact even if the first three let something through). With task size uncapped, the cost dimension of this risk grows — the vNext token budget is the planned fifth layer.
+- *Scope creep into /start-work territory.* Size is intentionally no longer the brake. The brake is the defer checklist's design-deliberation item plus the "when unsure, defer" rule — keep item 4 strict so genuine deliberation-class tasks still route out even when they're tagged =:solo:= by mistake.
+- *Pre-flight classifier error.* The speedrun's gather step has to split quick-answerable-question from real-deliberation. Misclassifying a deliberation task as a quick question puts a half-baked decision into an autonomous run. Mitigation: when the question isn't answerable in one or two lines, treat it as deliberation and drop it from the run, not as a pre-flight question.
+
+* Testing / Verification / Rollout
+
+Verification is by invocation against a project's real =todo.org=:
+
+- Run the loop caller in file-only mode and confirm it surfaces diffs without committing.
+- Run the speedrun against a small explicit list in a waiver-carrying project and confirm the pre-flight Q&A fires, "skip this" drops a task, an answer is recorded and used, and the run ends with one commit per logical change plus the end page.
+- Plant a =VERIFY=-status task, a data-loss task, an already-satisfied task, and a large-but-=:solo:= task, and confirm the first three are skipped or refused while the large one runs and decomposes.
+- Confirm the JSONL grows one record per task.
+- Run synthesis and confirm a KB node lands (personal project) or is refused (work project).
+
+Rollout is the Phase 0-6 sequence, each leaving the tree working; the file-only default makes early phases safe to ship before the commit and paging phases land.
+
+* References / Appendix
+
+- [[file:../../working/inbox-zero-phase-e/proposed-inbox-zero.org][Phase E proposal (inbox-zero stopgap)]] and [[file:../../working/inbox-zero-phase-e/sender-note.org][its sender note with the 5 open questions]].
+- [[file:../design/2026-06-15-fix-speedrun-workflow-proposal.org][speedrun proposal]] (file retains its original on-disk name pending a rename pass).
+- [[file:../../.ai/workflows/inbox-zero.org][inbox-zero.org (canonical, A-D)]] — the routing workflow this feature decouples from.
+- =~/code/rulesets/claude-rules/knowledge-base.md= — the org-roam write contract the synthesis step follows.
+
+* Review and iteration history
+** 2026-06-16 Tue — author
+- What: initial draft reconciling the Phase E and fix-speedrun proposals into one work-the-backlog.org feature, plus the effectiveness-measurement instrumentation.
+- Why: two overlapping proposals arrived within a day; building them separately would duplicate the execution loop and let it drift. Craig also asked explicitly for measurement + org-roam synthesis.
+- Artifacts: this spec; the two source proposals under docs/design/ and working/inbox-zero-phase-e/.
+** 2026-06-28 Sun — revision (Craig)
+- What: removed the task-size gate (size no longer defers; large tasks decompose into per-commit chunks); recast the act-vs-file rule as a crisp four-item defer checklist keyed on test-writability; added crisp =:solo:= / =:quick:= definitions destined for =todo-format.md= and made their assessment mandatory in task-review + task-audit; added the speedrun's pre-flight decision-gathering step (batch the quick questions up front, "skip this" drops a task, then run hands-off); renamed "fix speedrun" → "no-approvals speedrun" in prose. Status stays draft pending ratification of the revised decisions.
+- Why: the original criteria were adjectives, not checkable; the size gate forced Craig to stay at his desk for anything non-trivial, defeating the away-from-desk use case; and decision-needing tasks were over-deferred when many need only a quick upfront answer.
+** 2026-06-29 Mon — ratified
+- What: Craig ratified all eight revised decisions; Status → ready. Implementation-ready across Phase 0 (tag definitions + task-review/audit enforcement) through Phase 6 (synthesis).
+- Why: the crisp defer checklist and the pre-flight-Q&A design resolved the "criteria too soft" and "size shouldn't gate" concerns that held the spec in draft.
diff --git a/docs/specs/2026-06-16-encourage-kb-contribution-spec.org b/docs/specs/2026-06-16-encourage-kb-contribution-spec.org
new file mode 100644
index 0000000..cfbfe79
--- /dev/null
+++ b/docs/specs/2026-06-16-encourage-kb-contribution-spec.org
@@ -0,0 +1,206 @@
+#+TITLE: Encourage Org-Roam KB Contribution Across Workflows — Spec
+#+AUTHOR: Craig Jennings & Claude
+#+DATE: 2026-06-16
+#+TODO: TODO | DONE
+#+TODO: DRAFT READY DOING | IMPLEMENTED SUPERSEDED CANCELLED
+
+* READY Encourage Org-Roam KB Contribution Across Workflows — Spec
+:PROPERTIES:
+:ID:       f67f5f45-5aa1-4a5a-8704-d636e4e16f75
+:END:
+- 2026-07-02 Thu @ 00:17:01 -0400 — retrofitted by spec-sort; status set to READY (evidence-based, human-confirmed)
+
+* Metadata
+| Status   | ready                                          |
+|----------+------------------------------------------------|
+| Owner    | Craig Jennings                                 |
+|----------+------------------------------------------------|
+| Reviewer | Craig Jennings                                 |
+|----------+------------------------------------------------|
+| Date     | 2026-06-16                                     |
+|----------+------------------------------------------------|
+| Related  | [[file:../../todo.org][rulesets todo.org]]     |
+|----------+------------------------------------------------|
+
+* Summary
+
+The org-roam KB already exists (=knowledge-base.md=: =~/org/roam/agents/=, =:agent:= filetag, capture-then-promote, personal-vs-work write boundary), but nothing in the daily workflow loop encourages agents to use it. The wrap-up's =KB: promoted N / consulted yes-no= receipt is the only touchpoint, and it fires at the very end when the session's learnings have already faded. This feature wires four light prompts into the synced template workflows — startup, triage-intake, inbox-zero, wrap-it-up — plus one curated best-practices node in the KB, so contributing durable knowledge becomes a habit the workflows nudge rather than a rule agents forget.
+
+* Problem / Context
+
+The KB rule is sound but passive. An agent reads =knowledge-base.md= once at rule-load and then never gets reminded to consult or contribute, so the KB stays nearly empty and never reaches the critical mass where consulting it pays off. The compounding asset Craig wants — a cross-project store that gets more valuable as it grows — needs a contribution habit, and habits in this system come from workflow prompts, not from a rule sitting in the background.
+
+Three gaps:
+
+1. *No quality guidance.* =knowledge-base.md= says what goes in (durable facts) and where (=agents/= nodes), but not /how/ to write a good node — atomic, descriptively titled, linked. An agent following the rule literally can still produce a junk drawer of vague, unlinked notes that no future agent can find or trust.
+2. *No mid-session capture prompts.* Triage-intake and inbox-zero both surface durable signal (a recurring pattern across messages, a reference pointer worth keeping) and then drop it. Nothing tells the agent "that was worth a node."
+3. *The only contribution prompt is too late.* Wrap-up's KB promotion check runs in Step 1, after the session, when the agent is reconstructing learnings from the log rather than capturing them while fresh.
+
+* Goals and Non-Goals
+
+** Goals
+- Curate a best-practices node in the KB that teaches agents how to write good nodes, drawing on established note-taking guidance.
+- Link that node from startup with a light, one-line encouragement to contribute through the session.
+- Add a short end-of-flow KB reminder to triage-intake and inbox-zero.
+- Add an early wrap-up prompt that asks what the agent learned worth remembering, feeding the existing =KB: promoted N= receipt.
+- Keep every prompt light and non-blocking — encouragement, never a gate.
+
+** Non-Goals
+- *Not* changing =knowledge-base.md='s write boundary, schema, or the work/personal classification. The feature builds on that rule unchanged.
+- *Not* adding a blocking gate anywhere. No workflow stalls or fails because a node wasn't written.
+- *Not* automating node creation. The agent decides what's durable; the prompts only ask the question.
+- *Not* a second receipt or metric. Wrap-up's =KB: promoted N / consulted yes-no= line stays the single instrumentation point.
+- *Not* touching the schema of the wrap-up's existing Step 1 KB-promotion sub-section: the new early prompt /feeds/ it rather than replacing it.
+
+** Scope tiers
+- v1: the four workflow edits + the one curated best-practices node. All synced templates, so the edits propagate to every project on next startup.
+- Out of scope: a contribution-rate dashboard, per-project KB stats, auto-suggesting nodes from session content.
+- vNext: a "consult the KB before this task" prompt in start-work / spec-create (deferred — log to todo.org).
+
+* Design
+
+The feature is four small prompt insertions plus one authored artifact. The design work is mostly about /placement/ and /wording/: these are synced templates, so a prompt that reads as nagging gets paid forward to every project on every run. The governing constraint is "light enough that an agent welcomes it, specific enough that it actually fires."
+
+** The best-practices node (the artifact)
+
+The node lives at =~/org/roam/agents/<timestamp>-agent-kb-best-practices.org=, authored by hand (not agent-generated), with the standard =:agent:reference:= filetags so it's a first-class KB node agents can find by the same =rg= the rule already documents. It is the one node startup links to, and the substance the workflow prompts point at instead of re-explaining note-taking inline.
+
+Its content is curated from the established note-taking literature — Sönke Ahrens' systematization of Luhmann's Zettelkasten, Andy Matuschak's evergreen-notes practice, and the org-roam community's own guidance — distilled to the handful of principles that matter for an /agent/ writing /durable facts/, not a human building a thinking environment. Proposed outline:
+
+1. *Why the KB exists* — one paragraph: a cross-project, cross-machine asset that compounds. Consulting it saves re-deriving; contributing to it pays the next agent forward.
+2. *One idea per node (atomicity).* Each node holds a single durable fact. Atomicity is what makes a note linkable and findable — a node about three things links cleanly to none of them. (Ahrens; zettelkasten.de atomicity guide.)
+3. *Descriptive, declarative titles.* The title states the claim, not the topic: "SSH auth routes through gpg-agent with a separate cache TTL" beats "SSH notes." A title you can read as a standalone statement is one a future agent can scan and trust without opening the node. (Matuschak evergreen notes; org-roam community practice.)
+4. *Link liberally.* Use =[[id:...]]= to connect a new node to related ones; the value is in the network, not the isolated note. Link to Craig's hand-authored nodes, never edit them. (Matuschak "densely linked"; the linking principle.)
+5. *Capture, then promote.* Harness memory is the fast capture layer; the KB is for facts that cleared the durability bar. Don't promote everything — promote what transfers. (Mirrors =knowledge-base.md='s capture-then-promote.)
+6. *What goes in / what stays out.* Restate the rule's inclusion bar tersely (durable, cross-project, the why behind a decision, environment gotchas, reference pointers) and the exclusion bar (session state, task state, high-churn facts, secrets, anything the repo already records).
+7. *The write boundary.* One line pointing at =knowledge-base.md=: personal projects only, work and unknown projects never write — with the refusal contract. The node /defers/ to the rule here rather than restating the denylist, so there's one source of truth for the boundary.
+8. *Sources.* The citations below, as a reference footer.
+
+Two-altitude note: for a /reading/ agent the node is "how do I tell a good node from a bad one before I trust it?"; for a /writing/ agent it's "what shape should this fact take before I commit it?" The outline serves both — principles 2-4 are the writing checklist, 6-7 are the reading/eligibility filter.
+
+** The four workflow prompts (placement + wording)
+
+Each is the minimum that fires reliably without nagging. Exact insertion points and proposed copy are in Implementation phases below; the design rationale for each prompt follows:
+
+- *Startup (link + light encouragement).* Startup already reads =notes.org= and surfaces nudges in Phase C. The KB encouragement rides there as one line, not a new phase — it points at the best-practices node and frames the session's contribution as welcome, not required. It fires once per session at the top, setting the frame; the other three prompts collect on it.
+- *Triage-intake (end-of-flow reminder).* Placed at the very end of Phase D / Exit Criteria, after actions ship — the moment the agent has just seen a sweep's worth of signal and might recognize a durable pattern. One line, conditional in spirit ("if anything here was durable…"), never a blocking step before close-out.
+- *Inbox-zero (end-of-flow reminder).* Same shape, placed in Phase D (Surface) after the moved/folded/dropped report — the agent has just triaged a batch and may have spotted a reference pointer worth keeping.
+- *Wrap-up (early prompt feeding the existing receipt).* Placed at the /start/ of Step 1, before the Summary is finalized, while the session is fresh — "what did you learn worth remembering, for yourself or a future agent?" The answer flows into the existing Step 1 KB-promotion sub-section and its =KB: promoted N / consulted yes-no= receipt. The early prompt and the existing check are one pipeline: the prompt captures while fresh, the existing sub-section does the promotion and writes the receipt. No second receipt.
+
+** How the early wrap-up prompt feeds the existing receipt
+
+The existing wrap-up Step 1 already has a "KB promotion check" sub-section that asks the promotion question and writes =KB: promoted N / consulted yes-no=. The new early prompt is not a second check — it's a /relocation of the asking/ to the top of Step 1 so the question lands while the session is fresh rather than after the Summary is reconstructed. The existing sub-section keeps ownership of the actual promotion (writing the =agents/= nodes per schema) and the receipt line. Concretely: the early prompt asks and collects candidate facts into the session's working notes; the existing sub-section consumes those candidates, writes the nodes, and emits the one receipt. This avoids duplication by making the early prompt a /capture/ step and the existing check the /commit + receipt/ step of the same pipeline.
+
+* Alternatives Considered
+
+** A blocking gate ("you must write ≥1 node to wrap up")
+- Good, because it would guarantee contributions and grow the KB fast.
+- Bad, because it manufactures junk — agents would write a throwaway node to clear the gate, polluting exactly the asset the feature is meant to grow. It also fights the "light, non-nagging" constraint head-on.
+- Neutral, because the receipt already gives visibility into contribution rate without forcing it.
+
+** Inlining the best-practices guidance into each workflow prompt
+- Good, because the guidance is right there at the point of use; no indirection.
+- Bad, because it's four copies of the same note-taking advice in four synced templates — duplication that drifts, and four times the prompt length, which reads as nagging. One linked node keeps each prompt to one line.
+- Neutral, because a one-node-plus-links shape is exactly what the best-practices node /teaches/, so the design eats its own dogfood.
+
+** Putting the encouragement only in =knowledge-base.md= (no workflow edits)
+- Good, because it's the least change — one rule edit, no template churn.
+- Bad, because that's the status quo that produced the problem: a rule read once at load and then forgotten. Habits in this system come from workflow prompts, not background rules.
+- Neutral, because the rule still carries the authoritative boundary; the workflow prompts are the habit layer on top.
+
+* Decisions [6/6]
+
+** DONE Where exactly does the startup link land — Phase A read, Phase C nudge, or notes.org?
+- Owner / by-when: Craig / before implementation
+- Context: Startup has three candidate homes for the KB encouragement: a Phase A parallel read of the best-practices node (costs context every session), a Phase C surfaced nudge (one line, conditional, consistent with the existing roam-inbox and task-review nudges), or a static line in each project's =notes.org= Active Reminders (per-project, not synced, drifts). The Phase C nudge matches the established nudge pattern and costs nothing when there's nothing to say.
+- Decision: We will add the encouragement as a one-line Phase C nudge in startup.org, pointing at the best-practices node by its KB path, surfaced once near the other Phase C nudges.
+- Consequences: easier — consistent with existing nudge mechanics, synced to every project, no per-session read cost; harder — one more line competing for attention in the Phase C surface, so the wording has to earn its place and stay terse.
+
+** DONE Is the startup nudge unconditional, or gated on the KB clone being present?
+- Owner / by-when: Craig / before implementation
+- Context: =~/org/roam/= isn't on every machine. The existing roam-inbox nudge already guards on the clone's presence (=[ -f ~/org/roam/inbox.org ]=). An unconditional KB nudge would fire on machines where the agent can't act on it.
+- Decision: We will gate the startup nudge on the roam clone being present, reusing the existing presence check, so the encouragement only appears where the agent can act on it.
+- Consequences: easier — no dead nudge on KB-less machines, mirrors the roam-inbox guard; harder — one more conditional in Phase C, and a machine without the clone gets no encouragement at all (acceptable — it can't contribute there anyway).
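The gating decision above amounts to the following check, shown as a sketch. The `kb_nudge` name and the nudge wording are hypothetical; the only detail taken from the spec is that presence is detected by the same `inbox.org` file test the roam-inbox nudge uses.

```python
from pathlib import Path
from typing import Optional


def kb_nudge(roam_dir: Path, best_practices_node: str) -> Optional[str]:
    """Return the one-line startup nudge, or None when the roam clone is absent.

    Presence is tested the same way as the roam-inbox guard: the clone
    counts as present only if inbox.org exists as a regular file.
    """
    if not (roam_dir / "inbox.org").is_file():
        return None
    return f"KB: contributions welcome this session; see {best_practices_node}"
```

Returning `None` instead of an empty string lets the Phase C surface skip the line entirely on machines without the clone.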
+
+** DONE Does the early wrap-up prompt stop and ask Craig, or self-answer silently?
+- Owner / by-when: Craig / before implementation
+- Context: Wrap-up is meant to be quick — Craig already authorized the wrap, and the existing KB-promotion check self-answers (the agent decides what's durable; work projects skip the write). An early prompt that /stops and asks Craig/ "what did you learn?" would add an interactive turn to a flow designed not to have them. But a purely silent self-answer risks the agent skipping the reflection.
+- Decision: We will have the agent self-answer the early prompt — reflect on session learnings and stage candidate facts — without stopping to ask Craig, matching the wrap-up's no-extra-turns design; the candidates flow into the existing promotion check which writes the nodes and receipt.
+- Consequences: easier — preserves wrap-up cadence, no new interactive gate, one pipeline from reflect to receipt; harder — relies on the agent actually reflecting rather than rubber-stamping "nothing learned," which the receipt makes visible over time but doesn't enforce.
+
+** DONE Do triage-intake and inbox-zero reminders fire every run, or only when the run surfaced something durable?
+- Owner / by-when: Craig / before implementation
+- Context: Both workflows run frequently (triage-intake between meetings, inbox-zero twice a session). A reminder on /every/ run is the textbook nag-fatigue failure — a line the agent learns to skip. A reminder gated on "this run surfaced a pattern / reference pointer worth keeping" fires rarely and stays meaningful, but requires the agent to make that judgment, which is softer than a mechanical condition.
+- Decision: We will make both reminders conditional in spirit — a single line phrased as "if anything here was durable, write it to the KB" that the agent acts on only when the run actually surfaced something, rather than an unconditional step; an all-quiet triage sweep or an empty inbox-zero run emits no KB line.
+- Consequences: easier — the reminder stays rare and credible, never pads a no-change sweep, fits triage-intake's deltas-only discipline; harder — "durable-looking" is an agent judgment with no mechanical check, so the reminder's effectiveness rides on the best-practices node teaching that judgment well.
+
+** DONE Best-practices node: agent-authored once, or hand-authored by Craig?
+- Owner / by-when: Craig / before implementation
+- Context: =knowledge-base.md= says agents never edit Craig's hand-authored nodes. The best-practices node is /about/ how agents write nodes — if an agent authors it, future agents may treat it as fair game to edit; if Craig hand-authors it, it's protected and stable but he writes it. Given it's a foundational reference the whole feature points at, stability matters.
+- Decision: We will have Craig hand-author the best-practices node from the outline in this spec, so it's a protected, stable reference; the spec supplies the full drafted content for him to review and commit.
+- Consequences: easier — the node is stable and protected from agent edits, one authoritative reference; harder — Craig writes (or reviews-and-commits) it rather than delegating, and updates to it are his call, not an agent's.
+
+** DONE Read side: how does startup surface lessons to consult, not just encourage contribution?
+- Owner / by-when: Craig / ratified 2026-06-20
+- Context: The original spec only strengthened the /write/ side — startup encourages contributing (D1) but never surfaces existing KB lessons to /read/. The wrap-up receipt data shows "consulted no" across recent sessions: agents don't reach for the KB because nothing brings it to their attention at the moment work starts. =knowledge-base.md='s "search the KB first" is reactive and read-once-at-rule-load. A proactive surfacing at startup is the missing counterpart to D1. The cost constraint is the same one D1 dodged: a full Phase A read of matching nodes would spend context every session.
+- Decision: We will add a second startup Phase C nudge (alongside D1's contribute-link, gated on the same roam-clone presence check) that surfaces KB lessons relevant to the current project — a count plus the nodes' declarative /titles only/ (no full-node read), capped at ~5. Relevance is matched cheaply on the project basename and obvious topic words against node titles/filetags/paths, with a most-recent fallback when nothing matches. The agent opens a node on demand. Titles are declarative by the best-practices node's own rule, so a title alone tells the agent whether to open it.
+- Consequences: easier — closes the "consulted no" half with near-zero context cost (titles only), reuses the Phase C nudge pattern and the roam guard, and the consult and contribute nudges sit together as one KB surface; harder — relevance matching is a heuristic that can miss or mis-surface, and it adds a second KB line to Phase C, so both must stay terse to avoid nudge fatigue. If the receipt shows consults rising but the surfaced titles are noise, tighten the match.
+
+* Implementation phases
+
+** Phase 1 — Author the best-practices node
+Write =~/org/roam/agents/<timestamp>-agent-kb-best-practices.org= from the outline in Design, with a generated =:ID:=, =#+title:=, =#+filetags: :agent:reference:=, the eight content sections, =[[id:...]]= links to any existing related =:agent:= nodes, and the sources footer. Commit + push the roam repo per =knowledge-base.md='s session discipline. Leaves the KB with one new reference node and nothing else touched.
+
+** Phase 2 — Wire the startup encouragement (contribute + consult)
+Add two one-line Phase C nudges to =claude-templates/.ai/workflows/startup.org= (canonical side), both gated on the roam-clone presence check: (1) D1's contribute-link pointing at the best-practices node by path, and (2) D6's consult-surface listing project-relevant KB node titles (count + titles only, capped ~5, project-basename match with recent fallback). A Phase A read counts =:agent:= nodes cheaply so Phase C only does the title surfacing when there's something to show. Run =scripts/sync-check.sh --fix=, commit both canonical + mirror. Propagates to every project on next startup.
+
+** Phase 3 — Wire the three remaining prompts
+Add the end-of-flow KB reminder to =triage-intake.org= (end of Phase D / Exit Criteria) and =inbox-zero.org= (Phase D Surface), and the early KB prompt to =wrap-it-up.org= (top of Step 1, feeding the existing promotion check). All on the canonical side, then sync-check + commit. Each edit is one short block; the tree stays working after each.
+
+** Phase 4 — Verify propagation + receipt linkage
+Confirm the four edits survive a startup sync into a test project, the wrap-up early prompt's output reaches the existing =KB: promoted N / consulted yes-no= receipt (no duplicate receipt), and the best-practices node is reachable by the =rg= the rule documents.
+
+* Acceptance criteria
+- [ ] Best-practices node exists at =~/org/roam/agents/= with =:agent:reference:= tags, is found by =rg '#\+filetags:.*:agent:' ~/org/roam/=, and cites its sources.
+- [ ] Startup surfaces a single KB-contribution line in Phase C, gated on the roam clone, pointing at the node — and stays silent when the clone is absent.
+- [ ] Startup also surfaces a KB-consult line in Phase C (D6): project-relevant node titles (count + titles only, capped ~5), gated on the clone, falling back to the most recent titles when nothing matches, and silent when the clone is absent or there are no =:agent:= nodes to show.
+- [ ] Triage-intake and inbox-zero each emit one KB reminder line only when the run surfaced something durable; an all-quiet run emits none.
+- [ ] Wrap-up asks the "what did you learn?" reflection early in Step 1, and its candidates feed the existing promotion check — producing exactly one =KB: promoted N / consulted yes-no= receipt, not two.
+- [ ] No workflow blocks, stalls, or fails because a node wasn't written.
+- [ ] All four workflow edits are on the canonical =claude-templates/.ai/= side, mirror synced, sync-check clean.
+
+* Readiness dimensions
+- Data model & ownership: KB nodes are agent-written under =agents/=; the best-practices node is Craig-authored and protected. No new persisted state beyond the one node and the four template edits. Wrap-up receipt ownership unchanged.
+- Errors, empty states & failure: roam clone absent → all KB prompts silently no-op (reuse existing presence guards). Work/unknown project → write boundary in =knowledge-base.md= still refuses with its contract; prompts fire but the agent declines to write per the rule. No silent data loss — nothing is deleted.
+- Security & privacy: no secrets in nodes (rule's exclusion bar). Work-confidential facts never written (the boundary). The best-practices node is reference-only, no sensitive content.
+- Observability: the existing =KB: promoted N / consulted yes-no= receipt is the single metric; grepping session archives for =KB:= answers "are agents using this?" No new instrumentation added.
+- Performance & scale: four one-line prompts; negligible. The startup nudge is a Phase C surface line, not a Phase A read, so no per-session context cost from loading the node.
+- Reuse & lost opportunities: reuses the existing Phase C nudge pattern, the roam-clone presence guard, the wrap-up promotion check + receipt, and =knowledge-base.md='s boundary. Nothing reinvented.
+- Architecture fit & weak points: the four workflows are synced templates; canonical-vs-mirror edit discipline applies (CLAUDE.md). Weak point — nag fatigue if the reminders fire unconditionally; mitigated by the conditional-in-spirit decision. Weak point — the reminders rely on agent judgment ("durable-looking"); mitigated by the best-practices node teaching that judgment.
+- Config surface: none. No new knobs; the prompts are unconditional copy gated only on the existing roam-clone check.
+- Documentation plan: the best-practices node /is/ the user-facing doc. =knowledge-base.md= stays the authoritative rule; this feature adds no new rule file. No migration doc needed.
+- Dev tooling: =scripts/sync-check.sh --fix= keeps canonical + mirror aligned (enforced by =githooks/pre-commit=). =make test= covers the repo's existing gates; no new test target needed for prose-only workflow edits.
+- Rollout, compatibility & rollback: edits propagate via the startup rsync to every project on next session — no migration. Rollback is reverting the four template edits + deleting the node; nothing persisted depends on them. Fully reversible.
+- External APIs & deps: none — no API calls, no new dependencies. The only external surface is the =~/org/roam/= git repo, already in use by the rule.
+
+* Risks, Rabbit Holes, and Drawbacks
+- *Nag fatigue* — the central risk. Four prompts across four frequently-run workflows can train agents to skip them. Dodge: one line each, conditional in spirit, the startup line gated, the triage/inbox reminders firing only on real signal. If the receipt shows agents tuning them out, cut the lowest-value prompt rather than adding more.
+- *Junk-node accumulation* — encouraging contribution without a quality bar grows a junk drawer. Dodge: the best-practices node /is/ the quality bar, and the exclusion list keeps high-churn / session-state facts out. Craig prunes at will (the rule already grants this).
+- *Receipt double-counting* — if the early wrap-up prompt writes its own receipt, the metric breaks. Dodge: the early prompt is explicitly a capture step feeding the existing check; only the existing sub-section emits the receipt. Acceptance criterion guards this.
+
+* References / Appendix
+Sources for the best-practices node's curated content:
+- Sönke Ahrens, /How to Take Smart Notes/ — atomicity, own-words, linking: [[https://www.soenkeahrens.de/en/takesmartnotes][soenkeahrens.de]]; principle of atomicity: [[https://zettelkasten.de/atomicity/guide/][zettelkasten.de atomicity guide]].
+- Andy Matuschak, /Evergreen notes/ — concept-oriented, densely linked, write for yourself: [[https://notes.andymatuschak.org/Evergreen_notes_should_be_concept-oriented][notes.andymatuschak.org]].
+- Org-roam community practice — declarative titles, atomic nodes, capture-then-refine: [[https://www.orgroam.com/manual.html][Org-roam manual]]; [[https://lucidmanager.org/productivity/taking-notes-with-emacs-org-mode-and-org-roam/][lucidmanager.org org-roam guide]].
+- Existing rule this builds on: =~/code/rulesets/claude-rules/knowledge-base.md=.
+
+* Review and iteration history
+** 2026-06-16 Tue — author
+- What: initial draft.
+- Why: Craig wants the org-roam KB to compound into a cross-project asset; needs the workflow wiring + curated best-practices node speced before building.
+- Artifacts: this spec; four target workflows (startup, triage-intake, inbox-zero, wrap-it-up); =knowledge-base.md=.
+** 2026-06-20 Sat — ratified + read-side added
+- What: ratified all five original decisions; added decision D6 (read-side startup consult-nudge) and threaded it through Design, Phase 2, and acceptance. Status draft → approved.
+- Why: receipt data showed the write-only design left "consulted no" across recent sessions. Craig asked for the reverse of contribution — surfacing relevant lessons to read at startup. D6 is that counterpart.
+- Artifacts: this spec; startup.org (now two Phase C nudges); the lint level-2-dated-header checker tracked separately.
diff --git a/docs/specs/agent-knowledge-base-spec.org b/docs/specs/agent-knowledge-base-spec.org
new file mode 100644
index 0000000..e36d897
--- /dev/null
+++ b/docs/specs/agent-knowledge-base-spec.org
@@ -0,0 +1,319 @@
+#+TITLE: Agent Knowledge Base on Org-roam — Spec
+#+AUTHOR: Craig Jennings & Claude
+#+DATE: 2026-06-10
+#+TODO: TODO | DONE
+#+TODO: DRAFT READY DOING | IMPLEMENTED SUPERSEDED CANCELLED
+
+* IMPLEMENTED Agent Knowledge Base on Org-roam — Spec
+:PROPERTIES:
+:ID:       08a5ec99-9e1e-40e4-8241-e8a41e9de49f
+:END:
+- 2026-07-02 Thu @ 00:17:01 -0400 — retrofitted by spec-sort; status set to IMPLEMENTED (reason: v1 (Phases 0-4) shipped 2026-06-10 on Craig's go; KB live at ~/org/roam with the knowledge-base rule installed machine-wide)
+
+* Metadata
+| Status   | implemented                                                                                   |
+| Owner    | Craig Jennings                                                                                |
+| Reviewer | Craig Jennings; Codex (2026-06-10)                                                            |
+| Related  | [[file:../../todo.org][todo.org — "Check that memories are sync'd across machines via git"]] |
+
+This spec supersedes the 2026-06-05 draft (formerly docs/design/2026-06-05-org-roam-knowledge-base-spec.org, removed; content in git history), folding in Craig's 2026-06-10 ratification answers and restructuring to the spec-create format.
+
+* Summary
+
+Agents adopt Craig's existing org-roam knowledge base (~490 org files, curated since 2023) as the shared, cross-project store for durable knowledge. The KB moves from Syncthing to git (D8): it relocates out of the =~/sync/org= share into its own repo with a cjennings.net remote, synced the way every other repo is. Per-project harness memory stays as a fast capture layer; durable facts get promoted into the KB. This replaces two abandoned designs (a dedicated memory repo, a two-tier rules split) with the substrate that already exists — now with git history and revertability on every write.
+
+* Problem / Context
+
+Per-project agent memory lives at =~/.claude/projects/<encoded-cwd>/memory/=, a harness-owned path that is unmanaged and unsynced, so it doesn't survive a new-machine setup or dotfiles restore. Anything durable an agent learns is at-risk by default.
+
+Two fixes were built or designed and dropped. A dedicated =claude-memory.git= repo was built then reversed because it pooled work-confidential and personal memory into one store. A two-tier split (general lessons to rules, project memory to each project's =.ai/=) left public-remote code projects' memory at-risk by design.
+
+The simplification: the knowledge base already exists and already syncs across machines. The task stops being "build a memory-sync system" and becomes "point agents at the existing KB." The 2026-06-10 transport revision (D8) moves that sync from Syncthing to git — history, revertability, and explicit per-machine replication, with the KB syncing like every other repo instead of through a second mechanism.
+
+* Goals and Non-Goals
+
+** Goals
+- Durable, cross-machine agent knowledge synced the same way as every other repo (git, cjennings.net remote).
+- Agents query the KB before relying on remembered project facts, prior decisions, or reference material.
+- Agent-written notes are distinguishable from Craig's, and index cleanly in his org-roam.
+- The work/personal confidentiality boundary is explicit and enforced on the write side.
+
+** Non-Goals
+- The rules layer (=claude-rules/=, =CLAUDE.md=) is untouched. The KB replaces the memory tier, not the rules tier.
+- No Emacs/org-roam package integration; agents never touch the SQLite cache.
+- No autonomy expansion. Free agent writes apply to the KB only — email, Linear comments, PRs, and every other public or external channel still require Craig's review and consent (D6).
+
+** Scope tiers
+- v1: the transport migration (Phase 0), the pointer rule, the write schema, the boundary, the guided memory sweep (Phase 1.5), one verified seed node, the promotion cadence with usage instrumentation, and the monthly hygiene pass (Phase 4).
+- Out of scope: wholesale import of historical harness memory (the sweep is guided and approved, not bulk); auto-promotion.
+- vNext: a =/promote= command if the wrap-up prompt proves insufficient; an =:agent:inbox:= staging tag if free writes prove too noisy.
+
+* Design
+
+The KB is a directory of plain org files with =#+title=, =#+filetags=, =:ID:= property drawers, and =[[id:UUID]]= links. That is the entire interface.
+
+For a *reader* (any agent): query with ripgrep over content and tags; follow a link by grepping for its target =:ID:=. The backlink graph and database are Craig's Emacs conveniences — the files are the agent's interface. Before relying on a remembered project fact, a prior decision, or reference material, search the KB first.
+
+Conflict-file exclusion is part of the command contract, not a prose reminder — the KB carries dozens of Syncthing =*.sync-conflict-*= files (63 at last count) whose contents are stale or duplicated. The canonical commands:
+
+#+begin_src sh
+# content/tag search
+rg --glob '*.org' --glob '!*sync-conflict*' '<query>' <kb-path>/
+# follow an [[id:UUID]] link to its node
+rg --glob '*.org' --glob '!*sync-conflict*' ':ID:[[:space:]]+<uuid>' <kb-path>/
+#+end_src
+
+For a *writer* (personal-project agents only, per D5): create one node per fact, following roam conventions so the next =org-roam-db-sync= indexes it:
+
+#+begin_example
+:PROPERTIES:
+:ID:       <generated uuid>
+:END:
+#+title: <concise title>
+#+filetags: :agent:<scope>:
+
+<the fact, with [[id:...]] links to related nodes>
+#+end_example
+
+Filename follows roam's timestamp-prefix convention (=YYYYMMDDHHMMSS-slug.org=), under the =agents/= subdirectory (see Transport and layout below). The =:agent:= filetag makes =rg '#\+filetags:.*:agent:'= a clean inventory of everything agents wrote, so Craig can review or prune at will.
+
+** Transport and layout (D8)
+
+The KB is a git repo with a cjennings.net remote, not a Syncthing subtree. It relocates out of the =~/sync/org= share (that share's =.stfolder= root sits at =~/sync/org=) to a standalone path — proposed =~/org/roam=, final path set at migration — and =roam-dir= in Craig's =org-roam-config.el= follows it.
+
+- Craig's edits sync via a systemd user timer (the git-sync pattern: =pull --rebase=, commit, push every 15-30 minutes), preserving his zero-touch flow.
+- Agents need no new plumbing: pull before query, commit + push after write — the same session discipline as every other repo. Agent commits are attributable in history.
+- Per-machine replication is opt-in by clone. The work machine simply doesn't clone, which retires D5's replication exposure.
+- The phone needs no roam leg: Craig's on-demand pattern (an agent drops a doc into the separate =~/sync/phone= share, Syncthing pushes it to the phone) replaces syncing all of roam to mobile.
+- Agent writes land under the =agents/= subdirectory. org-roam scans recursively, so it's one database and one search surface with id-links intact, while the agent corpus stays physically corralled — hideable later with one =org-roam-file-exclude-regexp= line, and a future split is a =mv=. The =:agent:= filetag stays on every node regardless; the subdirectory is bulk management, the tag is identity.
+- Once migrated, the Syncthing =*.sync-conflict-*= file class disappears (git merges or conflicts loudly; it never forks silent copies). The query commands keep their exclusion globs through the transition; they're harmless afterward.
+
+** Inclusion criteria — what goes in, what stays out
+
+=knowledge-base.md= carries these as rules, so every project applies the same bar.
+
+In: durable facts with cross-project or cross-machine value — decisions and their why, environment and tooling gotchas, reference pointers (URLs, dashboards, key paths), lessons that transfer beyond the project that learned them.
+
+Out: anything the repo already records (code structure, git history, CLAUDE.md content), session state, task state (todo.org owns that), high-churn facts that will be stale in a month, secrets and credentials, and anything work-confidential (the D5 boundary).
+
+Existing knowledge migrates through a one-time guided sweep per project (Phase 1.5), not a wholesale import: the project's agent reads its harness-memory dir, classifies each fact against the criteria above (KB-worthy / stays local / stale-delete), and proposes the batch for Craig's approval. The memory frontmatter helps: =reference=-type and durable =feedback=-type memories are natural candidates; most =project=-type entries stay local.
+
+** Project classification and write routing (v1)
+
+D5's boundary needs an executable answer to "is this project allowed to write?" — inference from cwd names, remotes, or task content is too much discretion for a confidentiality boundary. The v1 source of truth is an explicit *work-root denylist* carried in =knowledge-base.md= (=~/projects/work= — confirmed complete by Craig, 2026-06-10; archangel is not work-scoped). Classification:
+
+- *Work* — the project root is, or sits under, a denylisted work root. No KB write, ever. The agent records durable facts per that project's own conventions (work already keeps its knowledge in its project tree); v1 adds no new work-side store.
+- *Personal* — the project root sits under a known project parent (=~/code/=, =~/projects/=, =~/.emacs.d=) and is not denylisted. KB writes allowed per D6.
+- *Unknown* — anything else. No KB write. Refuse and report.
+
+The refusal message contract (work and unknown alike): state the classification, name the durable fact in a one-line redacted summary, and say where it was or wasn't written — so Craig can re-route it deliberately instead of losing it silently.
+
+** Harness memory: capture, then promote
+
+Harness memory keeps its current role but is redefined as an ephemeral working set: fast, automatic, per-project, relevance-recalled, and allowed to be at-risk because nothing durable depends on it surviving. Durable or cross-machine-valuable facts get promoted into the KB as a deliberate step (wrap-up, a task audit, or an explicit prompt) — the same capture-on-landing / promote-on-review cadence the pattern catalog uses.
+
+A new =claude-rules/knowledge-base.md= rule (auto-installs via the Makefile RULES glob, like =patterns.md= — no Makefile change expected) is the bridge: it carries the KB path, the query commands, the write schema, the classification denylist, and the D5/D6 boundary with its refusal contract.
+
+* Alternatives Considered
+
+** Dedicated private git repo (=claude-memory.git=)
+- Good, because git gives history, review, and a deliberate sync step.
+- Bad, because it pooled work-confidential and personal memory into one all-machines store — the reason it was built and then reversed (2026-05-23/24).
+- Bad, because it added a new clone + symlink mechanism every machine must maintain.
+- Neutral, because the pooling objection is solved by D5's write boundary, not by avoiding git — D8 (2026-06-10) adopts git as the transport for the existing KB, keeping this alternative's history and revert benefits without its new-store cost.
+
+** Two-tier split (rules file + per-project =.ai/memory/=)
+- Good, because general lessons would load natively into every session via the rules layer.
+- Bad, because project memory in gitignored-=.ai/= projects stays at-risk by design — it solved sync only where =.ai/= was tracked.
+- Neutral, because the promote-general-lessons instinct survives in this design as KB promotion.
+
+** Org-roam KB (chosen)
+- Good, because the substrate already exists and already syncs — no new store to build.
+- Good, because the KB is already Craig's curated knowledge home; agent knowledge lands where he actually looks.
+- Bad as originally specced on Syncthing — no review gate, no history, silent conflict forks. D8 moves the transport to git, which restores history and revert and softens the D6 risk to "revertable after push."
+- Neutral, because agents read it as plain files; no org-roam tooling required or used.
+
+* Decisions
+
+** D1 — The KB is a queried substrate, accessed as files
+- State: accepted (Craig, 2026-06-10)
+- Context: an agent is a harness process, not an Emacs session; it cannot call org-roam's Elisp API or read its SQLite cache.
+- Decision: We will treat the KB directory (pre-migration =~/sync/org/roam/=; post-Phase-0 the D8 git checkout) as a directory of plain org files — ripgrep for search, grep-for-=:ID:= to follow links.
+- Consequences: easier — works in every runtime, no Emacs dependency; harder — no backlink graph or db-backed queries for agents (acceptable: ~490 tagged, linked text files grep well).
+
+** D2 — Capture in harness memory, promote into the KB
+- State: accepted (Craig, 2026-06-10)
+- Context: harness memory is fast and auto-recalled but unsynced; the KB is durable but query-only.
+- Decision: We will keep both with distinct roles — harness memory captures, the KB holds what's promoted. Promotion is deliberate (wrap-up, task audit, or explicit prompt), never automatic.
+- Consequences: easier — the at-risk problem dissolves (what stays unsynced is by definition the regenerable hot set); harder — promotion is a discipline that has to actually happen, or value silts up in the capture layer. D7 (resolved: keep) confirms the capture layer stays, so this decision stands as written.
+
+** D3 — Surfacing via a pointer rule
+- State: accepted (Craig, 2026-06-10)
+- Context: agents need to know the KB exists, where it lives, and how to use it — in every project, on every machine.
+- Decision: We will ship =claude-rules/knowledge-base.md= carrying path, query method, write schema, and boundary. It auto-installs via the existing Makefile RULES glob.
+- Consequences: easier — one rule, machine-wide, same mechanism as =patterns.md=; harder — nothing material.
+
+** D4 — Write schema: roam-valid, =:agent:=-tagged, one node per fact
+- State: accepted (Craig, 2026-06-10: "per fact")
+- Context: agent writes must index cleanly in Craig's org-roam and stay distinguishable from his hand-authored notes. Granularity was open: per-fact nodes vs a per-project appended notes file.
+- Decision: We will write one roam-valid node per fact (=:ID:= drawer, =#+title=, =#+filetags: :agent:<scope>:=, timestamp-prefixed filename), linking related nodes by =[[id:]]=.
+- Consequences: easier — roam-native, linkable, =rg :agent:= inventories everything agents wrote; harder — more files (accepted; that's what roam is).
+
+** D5 — Write boundary: read-shared, write-scoped (option C)
+- State: accepted (Craig, 2026-06-10: "Your recommendation C is the right one.")
+- Context: the KB is personal and replicates to every Syncthing machine — Craig confirmed that includes a work machine. The leak risk is asymmetric and lives on the write side: a work agent writing confidential facts would pool them into the personal store.
+- Decision: We will let any project read the shared KB; only personal projects write to it. Work agents write to work's own project tree, never the shared KB. Craig confirmed C handles the work-machine replication acceptably.
+- Consequences: easier — reading value lands everywhere, confidential work data stays physically out of the KB; harder — work knowledge has no shared home (status quo, unchanged), and the inverse risk (personal facts surfacing in work artifacts) remains governed by the existing content-scope rules in =commits.md=. Under D8, machine replication is opt-in by clone, so the work machine holds no KB copy at all unless deliberately cloned.
+
+** D6 — Agent writes land freely in the KB, and only there
+- State: accepted (Craig, 2026-06-10)
+- Context: Syncthing has no git-style review gate; the alternative was a staging tag (=:agent:inbox:=) Craig promotes from.
+- Decision: We will let agent writes land freely in the KB without a review gate. This autonomy is scoped to the KB alone — it is not permission to send email, comment on Linear tickets, or post to any public or external channel; those still require Craig's review and consent.
+- Consequences: easier — no promotion queue to tend, knowledge lands immediately; harder — a bad write lands without pre-review (mitigated by the =:agent:= inventory, Craig's normal roam curation, and — under D8 — git history making every write attributable and revertable).
+
+** D7 — Harness memory stays as the capture layer
+- State: accepted (Craig, 2026-06-10: "keep")
+- Context: with the KB live, harness memory (=~/.claude/projects/<enc>/memory/= — the per-project store the harness auto-loads into context at session start) could either stay or retire. *Keep* preserves automatic relevance recall at zero query cost, at the price of two stores plus the promotion habit. *Retire* would mean one store and no promotion step, but recall stops being automatic and session start gets heavier.
+- Decision: We will keep harness memory as the ephemeral capture layer. D2 stands as written, and Phase 3's promotion cadence is required, not optional — it's what keeps the capture layer from silting up.
+- Consequences: easier — automatic recall keeps working, no harness behavior changes; harder — two stores and a promotion discipline (mitigated by Phase 3's mechanical wrap-up trigger).
+
+** D8 — Transport: git on cjennings.net, not Syncthing
+- State: accepted (Craig, 2026-06-10: "shape a looks good")
+- Context: the KB lived inside the =~/sync/org= Syncthing share — no history, no review gate, silent =*.sync-conflict-*= forks (63 at last count), and share-level replication that pushed roam to a work machine whether wanted or not. The phone constraint dissolved on inspection: Craig's mobile pattern is on-demand (an agent drops a doc into the =~/sync/phone= share), not whole-roam sync, and he prefers it that way. The earlier dedicated-memory-repo design was rejected for pooling work and personal facts, which D5 now solves independent of transport.
+- Decision: We will move the KB out of the Syncthing share into its own git repo (proposed =~/org/roam=, cjennings.net remote). Craig's edits auto-sync via a systemd user timer; agents pull before query and commit + push after write; machines replicate by cloning, and the work machine doesn't clone.
+- Consequences: easier — per-fact history and blame, revertable and attributable agent writes, the conflict-file class disappears, explicit per-machine replication, one sync model everywhere; harder — a one-time migration (conflict-file cleanup, the move, the Emacs =roam-dir= update, a link sweep for the old path, timer setup) and staleness windows between timer runs where Syncthing was near-instant (mitigated by pull-at-session-start on both Craig's and agents' sides). D4's one-node-per-fact files make merge collisions rare by construction.
+
+* Implementation phases
+
+All five phases shipped 2026-06-10 (Craig's go, no-approvals batch). Completion details per phase live under the parent task in todo.org; what remains is the manual-testing child and the other personal machines' one-time clone + timer setup.
+
+** Phase 0 — Transport migration (D8)
+Resolve or delete the 63 =*.sync-conflict-*= files, move the directory out of the =~/sync/org= share to the new path, =git init= + =.gitignore= + initial commit + push to the cjennings.net remote, clone on the other personal machines (not the work machine), and set up the auto-sync timer. Update =roam-dir= in =org-roam-config.el= (handoff to the =.emacs.d= project) and sweep references to the old path (the =protocols.org= task-list pointer at =~/sync/org/roam/inbox.org=, among others) per the keep-links-current rule; a transition symlink at the old location covers stragglers. Verify: Syncthing no longer tracks the tree, org-roam indexes the new location, and an edit round-trips between two machines via the timer.
+
+** Phase 1 — Pointer rule
+The work-root denylist is confirmed (=~/projects/work= only, Craig 2026-06-10). Write =claude-rules/knowledge-base.md=: the KB path (post-migration), the canonical query commands, the D4 schema with the =agents/= subdirectory, the pull-before-query / commit-and-push-after-write discipline, the inclusion criteria, the classification + write-routing rules, the refusal contract, and the D5/D6 boundary. =make install= links it machine-wide via the existing RULES glob — no Makefile change. Tree stays working throughout (pure addition).
+
+** Phase 1.5 — Guided memory migration (one-time, per project)
+A small =migrate-memories= pass each project runs once: read the project's harness-memory dir, classify each fact against the inclusion criteria (KB-worthy / stays local / stale-delete), propose the batch, and write approved facts as =agents/= nodes. Craig approves per project; no wholesale import.
+
+** Phase 2 — Seed node + index verification
+The seed node is the KB's own user-facing documentation: a "How the agent knowledge base works" node — what agents write, the =:agent:= tag, the inventory command, what Craig can prune. A genuine durable fact, it doubles as Craig's doc and validates the schema end-to-end: Craig confirms it indexes and displays (=org-roam-db-autosync-mode= is on, so no manual sync step). Rollback if the schema fails: delete or revert that one node.
+
+** Phase 3 — Promotion cadence + usage instrumentation
+Wire the promotion prompt into the wrap-up workflow (an "anything worth promoting to the KB?" check), and have wrap-up record one line in the session summary: "KB: promoted N / consulted yes-no." That single line makes usage measurable by grepping session archives — the input to the Success metrics checkpoint. Note the cadence in =knowledge-base.md=. Resolves D2's discipline risk with a mechanical trigger.
+
+** Phase 4 — Maintenance automation
+A monthly agent-run hygiene pass: the =:agent:= inventory, orphan and duplicate detection, node-count trend — report dropped in the rulesets inbox, deletions proposed for approval (auto-cleanup allowed only for =:agent:=-tagged nodes). Extends the filed hygiene-reports task; under D8 there is no conflict-file purge left to schedule. Craig's recurring duty collapses to approving one short report a month.
+
+* Acceptance criteria
+
+- [ ] =claude-rules/knowledge-base.md= exists with path, query method, write schema, and the ratified D5/D6 boundary.
+- [ ] An agent in a personal project can find a relevant prior note by querying the KB.
+- [ ] An agent-written node indexes cleanly in Craig's org-roam on the next =org-roam-db-sync= and is identifiable via the =:agent:= filetag.
+- [ ] The capture/promote split and its trigger are documented.
+- [ ] A work-project agent, asked to store a durable fact, writes it to work's own tree, not the KB.
+- [ ] An unknown-classification project, asked to store a durable fact, refuses the KB write and reports the redacted fact per the refusal contract rather than guessing.
+- [ ] The documented query commands find a known note and exclude =*.sync-conflict-*= files.
+- [ ] The KB is a git repo with a cjennings.net remote; Syncthing no longer tracks it; the auto-sync timer round-trips Craig's edits between machines.
+- [ ] Agent writes land under =agents/= as attributable commits.
+- [ ] =knowledge-base.md= states the inclusion criteria and the pull / commit-push discipline.
+- [ ] The 30-day Success-metrics checkpoint is scheduled with its thresholds recorded.
+
+* Success metrics
+
+The acceptance criteria prove the mechanism; these prove the design. Measured from Phase 3's wrap-up instrumentation line and the =:agent:= inventory:
+
+- Usage: node count grows steadily, and promotions appear in wrap-up lines (most multi-session weeks promote at least one fact).
+- Recall wins: a KB query answers something that would otherwise be re-derived or re-asked. The decisive form is cross-project — a fact written in project A used in project B. Target: at least one cross-project win inside 30 days.
+- Boundary integrity: zero work-classified content in the KB, ever. Auditable via the inventory and the git log.
+- Hygiene: orphan and duplicate rate among =:agent:= nodes stays low enough that the monthly report is boring.
+
+30-day checkpoint: review the four together. Writes-but-no-recall-wins means agents aren't querying — strengthen the query trigger in =knowledge-base.md= rather than writing more. No writes at all means the promotion prompt isn't firing or the inclusion bar is set wrong. Either failure revises the rule, not the substrate.
+
+* Readiness dimensions
+
+- Data model & ownership: KB nodes are user-curated (Craig) or agent-authored (=:agent:= tag); agents never edit Craig's hand-authored nodes, only link to them. Harness memory stays agent-owned and ephemeral.
+- Errors, empty states & failure: a missing KB checkout (machine without the clone) means the rule's query step finds nothing — agents proceed without the KB and say so rather than fabricate recall. No write occurs to a nonexistent path.
+- Security & privacy: D5/D6 are the whole story — work never writes; KB writes never extend to public channels. No credentials live in the KB.
+- Observability: =rg '#\+filetags:.*:agent:'= inventories all agent writes; roam's UI shows them tagged in Craig's normal browsing; under D8 the git log is the per-write audit trail.
+- Performance & scale: ~490 org files today; ripgrep over a few thousand org files takes milliseconds. N/A as a concern.
+- Reuse & lost opportunities: maximal — the entire design is reusing an existing synced, curated store instead of building one.
+- Architecture fit & weak points: mirrors the patterns.md pointer-rule shape; the existing Makefile RULES glob installs the new rule with no Makefile change. The legacy weak point (Syncthing conflict files, 63 today) is retired by D8 at migration; the query commands keep their exclusion globs through the transition.
+- Config surface: one path constant in =knowledge-base.md=. No knobs.
+- Documentation plan: =knowledge-base.md= is the documentation; the spec records the why.
+- Dev tooling: N/A because the interface is ripgrep and Write — no build, no tests beyond Phase 2's manual index check.
+- Rollout, compatibility & rollback: pure addition past Phase 0; rollback is deleting the rule file and (optionally) the =:agent:=-tagged nodes — a one-command sweep by tag, or a revert by commit under D8. Phase 0 itself is reversible until the Syncthing copy is retired.
+- External APIs & deps: git + the cjennings.net remote, the same dependency every other repo carries; otherwise plain files. Verified 2026-06-05/10: ~490 org files plus 63 conflict files at the pre-migration location inside the =~/sync/org= share; =org-roam-db-autosync-mode= is on in Craig's config.
+
+* Risks, Rabbit Holes, and Drawbacks
+
+- Un-reviewed writes land without a gate (D6 accepted this). Dodge: under D8 every write is an attributable, revertable commit, and the =:agent:= inventory keeps cleanup cheap.
+- Promotion discipline may not stick (D2). Dodge: Phase 3 makes it a mechanical wrap-up step rather than a memory burden, and its instrumentation line makes a lapse visible at the 30-day checkpoint.
+- Staleness between machines in the gap between auto-sync timer runs (Syncthing was near-instant). Dodge: pull at session start on both Craig's and agents' sides; D4's per-fact files make collisions rare when concurrent edits do happen.
+- The Phase 0 move breaks references to the old path (the Emacs =roam-dir=, the =protocols.org= task-list pointer). Dodge: the migration includes a link sweep plus a transition symlink at the old location.
+- An incomplete work-root denylist would let a work project classify as personal. Dodge: Craig confirmed the denylist (=~/projects/work= only, 2026-06-10), and the classification's safe default (unknown → refuse) covers anything outside the known parents.
+
+* Testing / Verification
+
+From the 2026-06-10 review, the verification surface for v1:
+
+- =make install= links =knowledge-base.md= into =~/.claude/rules/=.
+- In a personal repo, the documented =rg= command finds a known note.
+- In a work repo, a durable-storage request produces no write in the KB and the refusal report names the fact.
+- In an unknown project, the agent refuses or asks rather than guessing.
+- One approved seed node indexes via =org-roam-db-sync= and appears in the =rg '#\+filetags:.*:agent:'= inventory.
+- A =*.sync-conflict-*= file containing a unique token is excluded by the documented query.
+
+- After Phase 0: an edit made on one machine appears on another within the timer interval, no new =*.sync-conflict-*= files appear, and the work machine has no clone.
+
+The install-link, known-note query, and conflict-file exclusion checks (the first, second, and sixth above) are agent-runnable; the org-roam display check, the work/unknown behavioral checks, and the cross-machine round-trip are Craig's manual validation (tracked in todo.org).
+
+* Review dispositions
+
+Modified recommendations from the 2026-06-10 Codex review, with reasons. Everything else was accepted as written.
+
+- *Drop-in =[#B]= implementation tasks as standalone top-level TODOs* — modified: the phase tasks hang as children under the existing parent task ("Check that memories are sync'd across machines via git"), per the one-parent-owns-the-effort convention in the response workflow. Content carried over intact.
+- *Update the fact count to the exact recursive number* — modified: the spec now says ~490 (Codex's own alternative), so routine KB growth doesn't churn the spec.
+- *Define the exact work-side write destination* — modified within the review's own options: v1 adds no new work-side store. Work projects keep their existing project-tree conventions, and the KB rule's only work-side behavior is the refusal + report.
+
+* Review and iteration history
+
+** 2026-06-05 Fri @ 05:57:35 -0500 — Claude (rulesets session) — author
+- What: initial one-page draft (five decisions, mechanics recommended), after Craig redirected the memory-sync task onto the existing org-roam KB.
+- Why: the dedicated-repo and two-tier designs both failed on the work/personal boundary or left memory at-risk; the KB already syncs.
+- Artifacts: original draft at docs/design/2026-06-05-org-roam-knowledge-base-spec.org (superseded by this file; content in git history).
+
+** 2026-06-10 Wed @ 14:29:20 -0500 — Craig Jennings (cj annotations) + Claude — author revision
+- What: folded in Craig's ratification answers (D5 = option C; Syncthing does replicate to a work machine and C stands; per-fact node granularity; free KB-only writes with the explicit no-public-channels boundary) and rewrote into the spec-create format. D7 (harness memory's fate) held open with the fuller explanation Craig requested.
+- Why: all but one decision ratified; the 2026-06-05 draft predated the spec-create template.
+- Artifacts: this file; implementation explicitly deferred pending Craig's go-ahead.
+
+** 2026-06-10 Wed @ 14:35:40 -0500 — Codex — reviewer
+- What changed or was recommended: reviewed implementation readiness and wrote a blocking review. The main blockers are unresolved D7 and the missing executable personal/work/unknown write-boundary classifier; medium notes cover concrete =rg= commands for conflict-file exclusion and seed-node approval/rollback mechanics.
+- Why: implementation would otherwise force the agent to invent memory architecture and confidentiality-boundary behavior at write time.
+- Artifacts: docs/agent-knowledge-base-spec-review.org (deleted on disposition completion per the response workflow; content summarized here and in Review dispositions).
+
+** 2026-06-10 Wed @ 14:39:41 -0500 — Claude Code (rulesets) — responder
+- What: processed the Codex review with Craig's D7 ratification ("keep") as a pre-agreed input. Both blockers cleared: D7 accepted (harness memory stays the capture layer, Phase 3 mandatory) and a new "Project classification and write routing" design subsection added (work-root denylist as source of truth, unknown → refuse, refusal message contract, no new work-side store). Mediums accepted: canonical =rg= commands with conflict-file exclusion baked in, Phase 2 approval/rollback mechanics, Makefile no-change note, ~490 fact count, Testing/Verification section. Three recommendations modified (see Review dispositions); none rejected.
+- Why: converge to implementation-ready. Rubric: ready with caveats — the one caveat is confirming the work-root denylist contents with Craig before Phase 1 ships the rule.
+- Artifacts: this file; implementation-task breakdown under the parent task in todo.org; review file deleted.
+
+** 2026-06-10 Wed @ 17:29:37 -0500 — Craig Jennings — caveat resolved
+- What: confirmed the work-root denylist is complete at =~/projects/work= alone; archangel is not work-scoped.
+- Why: this was the single "ready with caveats" caveat. The spec is now ready. Implementation still awaits Craig's explicit go.
+- Artifacts: this file (status flipped to ready); the denylist VERIFY in todo.org resolved to a dated entry.
+
+** 2026-06-10 Wed @ 17:57:08 -0500 — Craig Jennings + Claude — amendment: D8 git transport + five design folds
+- What: Craig's five design questions answered and ratified. New D8: the KB moves out of the =~/sync/org= Syncthing share into its own git repo on cjennings.net (Shape A — whole roam), with an =agents/= subdirectory for agent writes, a systemd auto-sync timer for Craig's edits, opt-in-by-clone replication (work machine doesn't clone), and the phone staying on the on-demand =~/sync/phone= pattern. Folded in: inclusion criteria + a Phase 1.5 guided memory migration, a Success metrics section with a 30-day checkpoint, the seed node redefined as the KB's own user-facing documentation, and Phase 4 maintenance automation. Phases renumbered 0-4.
+- Why: the original verification proved mechanism but not success; migration and maintenance were unspecified; and Syncthing's no-history / no-gate / conflict-fork costs were the design's weakest accepted risks — git removes them for the price of a one-time migration. The phone constraint that might have blocked Shape A dissolved: Craig's mobile pattern is on-demand doc drops, not whole-roam sync.
+- Artifacts: this file; todo.org phase tasks updated (Phase 0, 1.5, and 4 added). Implementation remains held pending Craig's go.
+
+** 2026-06-10 Wed @ 18:21:33 -0500 — Claude (rulesets, no-approvals batch) — v1 implemented
+- What: all five phases shipped on Craig's go. Phase 0: roam migrated to =~/org/roam= as a git repo (roam.git on cjennings.net), 63 conflict files resolved by deletion (tarball backup kept), transition symlink at the old path, Emacs constants updated live, roam-sync timer running, old-path references swept. Phase 1: =knowledge-base.md= written and linked machine-wide. Phase 1.5: rulesets memories swept (3 promoted, 2 rule-encoded kept local, 1 de-staled); sweep handoff broadcast to the 10 other memory-bearing personal projects. Phase 2: the seed/doc node written and verified indexed (420 nodes). Phase 3: wrap-up promotion check + the =KB:= receipt line. Phase 4: =kb-hygiene.sh= + monthly timer, live-run verified.
+- Why: spec status was ready with all decisions ratified; Craig authorized the full batch.
+- Artifacts: rulesets commits fcf554a, d071f1f, 242b95e, b014095 (+ todo.org/spec bookkeeping); roam.git initial history; handoffs to .emacs.d, archsetup, and the 10 sweep targets. Outstanding: the manual-testing checklist and other-machine clones.
+
+** 2026-06-10 Wed @ 17:31:10 -0500 — Codex — reviewer
+- What changed or was recommended: re-ran the spec-review workflow after the caveat resolution. Rubric: ready. No new blocking or medium-priority findings; no review file written. Confirmed the implementation phases and test-surface tasks are already represented under the existing parent task in todo.org.
+- Why: the prior blockers are dispositioned, the work-root denylist is confirmed, the pointer-rule install path matches the current Makefile RULES glob, and v1's manual/agent-runnable verification surface is explicit.
+- Artifacts: this file; [[file:../../todo.org][todo.org]] parent task "Check that memories are sync'd across machines via git".
diff --git a/docs/specs/inbox-workflow-consolidation-spec.org b/docs/specs/inbox-workflow-consolidation-spec.org
new file mode 100644
index 0000000..4543e77
--- /dev/null
+++ b/docs/specs/inbox-workflow-consolidation-spec.org
@@ -0,0 +1,199 @@
+#+TITLE: Inbox Workflow Consolidation — Spec
+#+AUTHOR: Craig Jennings & Claude
+#+DATE: 2026-06-23
+#+TODO: TODO | DONE
+#+TODO: DRAFT READY DOING | IMPLEMENTED SUPERSEDED CANCELLED
+
+* READY Inbox Workflow Consolidation — Spec
+:PROPERTIES:
+:ID:       a7fe2a10-dfa8-4ba3-a11a-e7b1288b7573
+:END:
+- 2026-07-02 Thu @ 00:17:01 -0400 — retrofitted by spec-sort; status set to READY (evidence-based, human-confirmed)
+
+* Metadata
+| Status   | ready                                                       |
+|----------+-------------------------------------------------------------|
+| Owner    | Craig                                                       |
+|----------+-------------------------------------------------------------|
+| Reviewer | Craig                                                       |
+|----------+-------------------------------------------------------------|
+| Related  | [[file:../../todo.org][Consolidate inbox/triage workflows + scheduled inbox check]] |
+|----------+-------------------------------------------------------------|
+
+* Summary
+
+Four inbox-named workflows (=inbox-zero=, =process-inbox=, =monitor-inbox=, plus the startup/wrap-up nudges) circle the same disposition logic across three different surfaces. This spec consolidates them into one =inbox= engine with explicit modes, keeps =triage-intake= (external accounts) and =no-approvals= (session mode) separate, and adds an interactive recurring roam check (=auto inbox zero=). The fully-unattended cron pass is named but deferred to vNext.
+
+* Problem / Context
+
+"Too many inbox related workflows" (Craig, roam capture 2026-06-23). The word "inbox" is overloaded onto three genuinely different surfaces, and the workflows that serve them have grown to circle the same logic:
+
+- *Project-local =inbox/= dir* (handoffs from other projects/scripts/Craig) → =process-inbox.org= owns the value gate and disposition; =monitor-inbox.org= is a thin cadence layer on top of it ("loop process-inbox every 15 min + act-vs-file + reply discipline").
+- *Global roam inbox* (=~/org/roam/inbox.org=, GTD capture) → =inbox-zero.org=, whose Phase A already *calls* =process-inbox= for the local dir before doing the roam-routing part.
+- *External accounts* (email / calendar / PRs) → =triage-intake.org= + six source plugins.
+
+So a reader (or a non-Claude agent) facing "deal with my inbox" has to know which of four files to invoke, and the shared concepts — the three-question value gate, the skeptical review, the implement/fold/file/defer/reject disposition, the reply-to-sender discipline, the capture-guard before a roam write, the priority-scheme check before filing — are spread across and cross-referenced between them. The duplication is real (=monitor-inbox= and =inbox-zero= both lean on =process-inbox='s machinery) and the count is the symptom Craig named.
+
+A second gap surfaced in the same capture: there's no documented way to run a *recurring* inbox check, and Craig wants a keyword trigger for it. v1 answers this with an interactive in-session loop (=auto inbox zero=); a fully-unattended cron pass that fires while Craig is away is a larger contract (mutation safety, surfacing-when-away, cross-run state) and is deferred.
+
+* Goals and Non-Goals
+
+** Goals
+- One engine is the single entry point for the inbox surfaces, with the shared value-gate / disposition / reply / capture-guard / priority-scheme logic living in exactly one place.
+- Mode selection is unambiguous from the trigger phrase and the caller (startup, wrap-up, on-demand).
+- Every existing trigger phrase still works, routing to the right mode — no relearning.
+- A documented interactive recurring check (=auto inbox zero=, =/loop=-based). The fully-unattended cron pass (=/schedule=) is vNext, not v1.
+- INDEX.org, protocols.org, and the startup/wrap-up callers reconciled to the new shape with no dangling references.
+- No behavior regression: the value gate, disposition rules, capture-guard, and reply discipline behave exactly as today.
+
+** Non-Goals
+- *Not* merging =triage-intake.org=. External-account triage ("what's new across my email/cal/PRs") is a different domain from "my inbox dirs"; keeping it distinct is correct, not redundancy.
+- *Not* merging =no-approvals.org=. It's a session mode, not an inbox workflow (it's referenced by the monitor cadence, not part of it).
+- *Not* changing value-gate semantics or disposition rules. This is a structural merge, behavior-preserving.
+- *Not* the domain-aware whole-roam-inbox routing (still deferred, unchanged).
+- *Not* the agent-neutral language sweep over these files — that is the parked half of the agent-source task and runs *after* this merge, over fewer files.
+- *Not* renaming =CLAUDE.md=, =.claude/=, or other structural paths.
+
+** Scope tiers
+- v1: merge =process-inbox= + =monitor-inbox= + =inbox-zero= into one =inbox.org= engine with =process= / =monitor= / =roam= modes; preserve all trigger phrases; reconcile INDEX + protocols + startup + wrap-up; add the interactive =auto inbox zero= recurring check (=/loop=).
+- Out of scope: =triage-intake= merge, =no-approvals= merge, domain-aware roam routing, the agent-neutrality sweep.
+- vNext (log to todo.org): the fully-unattended =/schedule= cron pass — needs its own contract (read-only vs may-mutate =todo.org= / =~/org/roam/inbox.org=, how a find surfaces when Craig is away, how dedup state survives across runs, auth/session constraints); a later umbrella unifying =triage-intake='s "what's new" with the inbox engine; the agent-neutrality pass over the consolidated =inbox.org=.
+
+* Design
+
+The consolidation produces one engine file, =inbox.org=, structured as a shared core plus three thin modes. The core holds every concept that today is duplicated or cross-referenced: the three-question value gate, the skeptical review (with the cross-project battery for shared-asset proposals), the disposition ladder (implement-now / fold / file / defer / reject-by-source / park), the reply-to-sender discipline, the capture-guard before any roam-inbox disk write, and the priority-scheme check before filing. A mode is a short front section that says which surface it reads, how it enters and exits, and which core steps it runs.
+
+*Two altitudes.*
+
+For the *user*: the trigger phrase picks the mode, and the phrases are unchanged. "process inbox" / "handle the inbox" → process mode (the local =inbox/= dir). "monitor the inbox" / "watch the inbox" → monitor mode (process mode on a loop, with the act-vs-file and reply discipline and the clean-tree/green-suite gates). "inbox zero" / "process the roam inbox" → roam mode (route the global roam inbox by =<project>:= prefix, sweep empties, capture-guard the write). Startup calls process mode for the local dir and the read-only roam nudge; wrap-up calls process mode then the roam sweep.
+
+For the *implementer*: =inbox.org= is one file. The core sections are written once. Each mode is a section that references core steps by name rather than restating them ("run the value gate (core §X) on each item", "guard and reconcile the roam write (core §Y)"). The old three files are deleted; their content is absorbed, not copied. The =triage-intake= engine and its plugins are untouched and keep their own namespace.
+
+*Routing and callers.* protocols.org's terminology section and the startup workflow's INDEX-driven routing both key off trigger phrases, so the phrase→mode map is the contract. Each caller that today names =process-inbox.org= / =monitor-inbox.org= / =inbox-zero.org= (startup Phase C, wrap-up Step 3, protocols, INDEX) is repointed at =inbox.org= and the relevant mode. INDEX gets one entry for =inbox.org= listing every trigger phrase, grouped by mode.
+
+*Auto inbox zero (the scheduled mode).* The trigger phrase =auto inbox zero= starts a recurring roam-mode pass. On invocation the engine *asks Craig for the interval* (e.g. 30 min, 2 hours), then drives the loop with =/loop <interval>= running roam mode. It's in-session and interactive by design — each cycle reports, and a find waits for Craig's go before any work happens. Per cycle:
+
+- *Nothing found* → no inbox summary. A single acknowledgement line: ran at =HH:MM=, nothing found. Nothing else.
+- *Items found* → summarize the found items, file them as tasks, and *append them to a displayed queue* (the harness task list, =TaskCreate=) so the queue accumulates across cycles. Then ask: "run this batch next?" If Craig says yes, the engine launches into implementing the found items (each through the normal disposition + verify flow); if no, they stay queued for a later go. Subsequent cycles add only newly-found items to the same displayed queue, never re-surfacing what's already there.
+
+The acknowledge-only-on-empty rule keeps a quiet inbox quiet — no noise when there's nothing to do — while a find is always surfaced and gated on Craig's yes. =auto inbox zero= is the interactive =/loop= shape because its execute step waits for a yes, so it is inherently in-session.
+
+A fully-unattended =/schedule= cron pass (firing while Craig is away) is a different contract and is *vNext, not v1*: it can't wait for a yes, so it has to decide up front whether it may mutate =todo.org= and the roam inbox or stays read-only, how a find reaches Craig asynchronously, how dedup state persists between runs that don't share a session, and what session/auth context a cron run carries. v1 ships only the interactive loop; the unattended contract is logged to =todo.org= for its own design pass.
+
+* Alternatives Considered
+
+** Option A — One engine with modes (chosen)
+- Good, because it cuts four inbox-named files to one and puts the shared logic in a single authoritative place, which is exactly the "too many" complaint.
+- Good, because every trigger phrase can re-home to a mode with no user relearning.
+- Bad, because =inbox.org= becomes a larger file with internal mode branching.
+- Neutral, because =triage-intake= and =no-approvals= stay separate either way.
+
+** Option B — Keep three files, extract a shared include
+- Good, because the diffs are smaller and the per-surface entry points stay familiar.
+- Bad, because it does not reduce the file count — Craig's actual complaint is the number of files, and this keeps three plus adds an include.
+- Neutral, because the dedup of logic happens, just without the count reduction.
+
+** Option C — Merge only process-inbox + monitor-inbox, leave inbox-zero
+- Good, because it fixes the tightest, least-ambiguous redundancy (monitor is literally a loop over process) at the lowest risk.
+- Bad, because roam vs local stays two files; the consolidation is partial (4→3, not 4→1).
+- Neutral, because it could be a first phase of Option A rather than a competing end state.
+
+** Option D — Do nothing, just document which file is which
+- Good, because zero risk to load-bearing synced workflows.
+- Bad, because it doesn't reduce the count at all; the complaint stands.
+
+* Decisions [4/4]
+
+** DONE Engine shape — one file with modes vs partial merge
+- Context: Option A (one =inbox.org=, 4→1) maximally addresses "too many" but is the biggest single change to load-bearing synced files. Option C (merge the process/monitor pair only, 4→3) is lower-risk and could be A's first phase.
+- Decision: We will build Option A — one =inbox.org= engine with =process= / =monitor= / =roam= modes. (Craig, 2026-06-23.)
+- Consequences: easier discovery and one home for the logic; harder single-file size and a bigger, higher-blast-radius diff, mitigated by the shared-core + thin-mode structure and the plugin-namespace escape hatch for a mode that wants depth.
+
+** DONE Trigger-phrase routing — preserve all existing phrases
+- Context: protocols + startup route by phrase; users have these in muscle memory.
+- Decision: We will keep every existing trigger phrase, re-homing each to its mode on the one engine, adding only the new =auto inbox zero= phrase. (Craig, 2026-06-23 — accepted as recommended.)
+- Consequences: easier — no relearning, no broken muscle memory; harder — the engine must document a longer phrase→mode table and guard against collisions.
+
+** DONE triage-intake stays separate
+- Context: external-account triage is a different surface; folding it in would re-bloat the engine.
+- Decision: We will leave =triage-intake.org= and its plugins untouched, out of this consolidation. (Craig, 2026-06-23 — accepted as recommended.)
+- Consequences: easier — smaller, coherent inbox engine; harder — two "what's arriving" entry points remain (inbox engine vs triage-intake), documented so the boundary is clear.
+
+** DONE Scheduled-check mechanism + behavior + keyword
+- Context: Craig wants a recurring inbox check with a keyword, an interactive find-then-execute flow, and a running queue.
+- Decision: The trigger phrase is =auto inbox zero=. On invocation it asks Craig for the interval, then runs roam-mode on =/loop <interval>=. Empty cycle → one acknowledgement line (ran at HH:MM, nothing found), no inbox summary. Find → summarize, file as tasks, append to the displayed task queue (=TaskCreate=), and ask "run this batch next?"; on yes, implement the found items; subsequent cycles append only new finds to the same queue. =/schedule= stays available for a fully-unattended pass. (Craig, 2026-06-23.)
+- Consequences: easier — a quiet inbox stays quiet, a find is always gated on a yes, and the queue is one accumulating view; harder — the loop must dedup against already-queued items so it doesn't re-surface them, and the in-session =/loop= shape means the unattended case still needs =/schedule=.
+
+* Review findings [2/2]
+
+** DONE Fully unattended scheduled behavior is not specified :blocking:
+Disposition: accepted via the narrow option. v1 ships only the interactive =auto inbox zero= (=/loop=); the fully-unattended =/schedule= pass is deferred to vNext with its open contract questions named (read-only vs may-mutate, surface-when-away, cross-run dedup state, auth/session). Folded into Summary, Goals, Problem/Context, Scope tiers (vNext), and the Design "Auto inbox zero" subsection; the vNext contract is logged to todo.org. This sequences the unattended pass rather than dropping it, preserving Decision 4's intent.
+The Summary and Goals promise a scheduled unattended inbox check with trigger keywords, but the concrete =auto inbox zero= design is intentionally interactive: it asks for an interval, runs =/loop <interval>= in the live session, and waits for Craig before executing found work. The only unattended behavior is the sentence that =/schedule= remains available for a fully-unattended cloud-cron pass. That leaves an implementer to invent the actual scheduled contract: trigger phrase(s), whether the pass is read-only or may mutate =todo.org= / =~/org/roam/inbox.org=, how findings are surfaced when Craig is away, how dedup state survives across runs, and what auth/session constraints apply. Add a distinct =/schedule= subsection and acceptance criteria for the fully unattended mode, or narrow the Summary/Goals to say v1 ships only the interactive =/loop= mode and log the unattended cron shape as vNext. (blocking)
+
+** DONE Stale-reference verification relies on a checker that does not check workflow links
+Phase 2 and the Risks section rely on the workflow-integrity / INDEX-drift check as the backstop for missed references to =process-inbox.org=, =monitor-inbox.org=, and =inbox-zero.org=. Current =scripts/workflow-integrity.py= checks INDEX coverage, script references, plugin parentage, orientation sections, and duplicate trigger phrases; it does not validate arbitrary =[[file:...org]]= workflow links or prose references in workflows/protocols/rules. That means a deleted-workflow link in =startup.org=, =wrap-it-up.org=, or =protocols.org= can survive the named checker. Keep the grep requirement, but make it an explicit acceptance item with the exact scope: at minimum =rg 'process-inbox|monitor-inbox|inbox-zero' claude-templates/.ai .ai claude-rules= after caller rewrites, allowing only intentional historical/spec/todo mentions. Optionally extend =workflow-integrity.py= to validate local workflow links, but do not imply it already catches this class. (non-blocking)
+
+Disposition: accepted. Added the exact grep as an acceptance item and a Phase 2 step, reworded the Risks "missed caller reference" dodge and the Dev-tooling readiness line so the integrity checker is no longer implied to validate workflow links, and noted the optional =workflow-integrity.py= extension as not-required-for-v1.
+
+* Implementation phases
+
+** Phase 1 — Author the inbox engine
+Write =inbox.org= (canonical =claude-templates/.ai/workflows/=): the shared core (value gate, skeptical review, disposition ladder, reply discipline, capture-guard, priority-scheme check) plus the three mode sections, absorbing the content of the three source files. No caller changes and no deletions yet — the tree still works with the old files in place, the new engine sits alongside for review.
+
+** Phase 2 — Reconcile callers and retire the old files
+Repoint INDEX.org (one =inbox.org= entry, phrases grouped by mode), protocols.org terminology, startup.org Phase C, and wrap-it-up.org Step 3 at =inbox.org= + mode. Delete =process-inbox.org=, =monitor-inbox.org=, =inbox-zero.org=. Grep for stale references — =rg 'process-inbox|monitor-inbox|inbox-zero' claude-templates/.ai .ai claude-rules= — and clear every live caller (the integrity checker covers INDEX coverage and trigger duplication, not workflow links). Run the workflow-integrity / INDEX-drift check. Sync the mirror.
+
+** Phase 3 — Auto inbox zero + scheduled check
+Add the =auto inbox zero= mode to =inbox.org=: ask-for-interval, =/loop <interval>= over roam mode, the empty-cycle acknowledgement, and the find → summarize → file → queue → ask-to-execute flow with cross-cycle dedup against the displayed queue. Document the =/schedule= recipe for the fully-unattended pass alongside it.
+
+** Phase 4 — Verify
+Trigger-phrase coverage (every old phrase resolves to a mode), startup + wrap-up dry-run against the new engine, capture-guard still gates the roam write, INDEX drift clean, mirror in sync.
+
+* Acceptance criteria
+- [ ] Every trigger phrase that today routes to =process-inbox= / =monitor-inbox= / =inbox-zero= resolves to a mode of =inbox.org=.
+- [ ] The three old workflow files are deleted and absent from INDEX; the integrity check reports no orphan or stale entry.
+- [ ] After caller rewrites, =rg 'process-inbox|monitor-inbox|inbox-zero' claude-templates/.ai .ai claude-rules= returns only intentional historical / spec / todo mentions — no live caller reference to a deleted file. (The integrity checker validates INDEX coverage, not arbitrary workflow links, so this grep is the real backstop.)
+- [ ] Startup still processes the local inbox and produces the read-only roam nudge; wrap-up still sweeps the project's roam items.
+- [ ] The capture-guard runs before any roam-inbox disk write in the consolidated engine.
+- [ ] The value gate, disposition ladder, and reply-to-sender discipline are present once and unchanged in behavior.
+- [ ] =auto inbox zero= asks for an interval, then runs roam mode on =/loop <interval>=.
+- [ ] An empty auto cycle emits only a timestamped acknowledgement (ran at HH:MM, nothing found) — no inbox summary.
+- [ ] A find summarizes the items, files them as tasks, appends them to the displayed queue, and asks before executing; "yes" runs the batch; later cycles append only newly-found items, never re-surfacing queued ones.
+- [ ] Canonical and mirror copies are in sync (=sync-check.sh=).
+
+* Readiness dimensions
+- Data model & ownership: the engine reads two files it doesn't own (project =inbox/= dir contents, =~/org/roam/inbox.org=) and writes =todo.org= + the roam file. Ownership unchanged from today; the merge moves no data.
+- Errors, empty states & failure: empty inbox → report and stop (preserved per surface); roam pull blocked or dirty → surface and stop, never auto-stash (preserved); live org-capture on the roam file → capture-guard blocks the write (preserved).
+- Security & privacy: N/A because no credentials or sensitive data; the engine moves task text between local files.
+- Observability: the user sees which mode ran and its disposition summary; INDEX drift check surfaces a mis-wired routing.
+- Performance & scale: N/A because the inputs are small text files with hand-scale item counts.
+- Reuse & lost opportunities: the whole point — the shared core is written once instead of three times; =triage-intake='s plugin pattern is intentionally not reused here (different surface).
+- Architecture fit & weak points: integration points are INDEX.org, protocols.org terminology, startup Phase C, wrap-up Step 3. Weak point: a missed caller reference to an old filename breaks routing — mitigated by the explicit stale-reference grep (acceptance item), since the integrity check covers INDEX coverage, not workflow links.
+- Config surface: trigger phrases (the phrase→mode table) and the scheduled-check keyword set + cron expression.
+- Documentation plan: the engine file is the doc; INDEX entry updated; protocols terminology updated. No separate user doc needed.
+- Dev tooling: the workflow-integrity check + startup INDEX-drift check cover INDEX coverage and trigger-phrase duplication; they do *not* validate workflow file links, so the stale-reference grep (acceptance item) is a manual step. Optionally extend =workflow-integrity.py= to validate local =[[file:...org]]= workflow links — not required for v1.
+- Rollout, compatibility & rollback: the merge lands via the template sync; =rsync --delete= removes the three retired files from every consuming project on its next startup, and the new =inbox.org= arrives the same pass. Rollback = git revert of the rulesets commit, then the next sync restores the old files. Trigger-phrase preservation is the compatibility guarantee.
+- External APIs & deps: N/A because no external API; =/schedule= and =/loop= are harness features, not deps.
+
+* Risks, Rabbit Holes, and Drawbacks
+- *Missed caller reference.* A lingering mention of =process-inbox.org= / =monitor-inbox.org= / =inbox-zero.org= in a workflow, protocol, skill, or the INDEX would break routing after the files are deleted. Dodge: =rg 'process-inbox|monitor-inbox|inbox-zero' claude-templates/.ai .ai claude-rules= and clear every live caller before deleting. The workflow-integrity checker validates INDEX coverage and trigger-phrase duplication, *not* arbitrary =[[file:...]]= links, so the grep — not the checker — is the real backstop here.
+- *Single-file sprawl.* One engine with three modes risks becoming the wall-of-text the workflows were split to avoid. Dodge: the shared-core + thin-mode structure and the terseness pass; if a mode wants real depth, it can become an =inbox.<mode>.org= plugin under the engine namespace (the same pattern =triage-intake= uses) rather than bloating the core.
+- *Sequencing with the agent-neutrality sweep.* If the neutrality sweep runs first, it edits three files about to be deleted. Dodge: this consolidation lands first by construction (it's why the sweep was parked).
+
+* Review and iteration history
+** 2026-06-23 Tue @ 21:51:51 -0400 — Claude — author
+- What: initial draft.
+- Why: Craig chose to spec the inbox-workflow consolidation before building a load-bearing 3-to-1 merge of synced workflows.
+- Artifacts: docs/inbox-workflow-consolidation-spec.org; todo.org "Consolidate inbox/triage workflows + scheduled inbox check".
+** 2026-06-23 Tue @ 22:05:00 -0400 — Craig — decision-maker
+- What: resolved all four decisions; added the =auto inbox zero= scheduled mode (ask-for-interval, empty-cycle acknowledgement only, find → summarize → file → displayed queue → ask-to-execute, cross-cycle dedup). Status → ready for review.
+- Why: chose Option A (4→1 engine) and specified the recurring-check behavior in full.
+- Artifacts: Decisions [4/4]; Design "Auto inbox zero" subsection.
+** 2026-06-23 Tue @ 22:15:58 -0400 — Codex — reviewer
+- What: spec-review pass rated the spec =Not ready= and added two findings: the fully unattended =/schedule= behavior is not specified, and stale-reference verification leans on a checker that does not validate workflow links.
+- Why: the current design is strong enough for the consolidation and interactive =/loop= mode, but the stated scheduled/unattended goal would force implementers to invent behavior before shipping.
+- Artifacts: Review findings [0/2].
+** 2026-06-23 Tue @ 22:28:00 -0400 — Claude — responder
+- What: both findings accepted and folded. Finding 1 (blocking) resolved by narrowing v1 to the interactive =auto inbox zero= (=/loop=) and deferring the fully-unattended =/schedule= contract to vNext with its open questions named — Summary, Goals, Problem/Context, Scope tiers, and Design updated. Finding 2 resolved by adding the exact stale-reference grep as an acceptance item + Phase 2 step and dropping the over-claim that the integrity checker validates workflow links. Findings [2/2], Decisions [4/4]; scope narrowed (not expanded), so no readiness-rubric rerun needed. Status → Ready.
+- Why: the scheduled/unattended promise outran what Craig actually specced (he detailed the interactive loop); sequencing the cron pass to vNext keeps v1 honest. The checker genuinely doesn't catch stale workflow links.
+- Artifacts: Decisions [4/4]; Review findings [2/2]; vNext task to be logged for the unattended cron contract.
diff --git a/docs/specs/wrapup-routing-spec.org b/docs/specs/wrapup-routing-spec.org
new file mode 100644
index 0000000..1a150fc
--- /dev/null
+++ b/docs/specs/wrapup-routing-spec.org
@@ -0,0 +1,225 @@
+#+TITLE: Wrap-Up Inbox/Transcript Routing — Spec
+#+AUTHOR: Craig Jennings
+#+DATE: 2026-06-13
+#+TODO: TODO | DONE
+#+TODO: DRAFT READY DOING | IMPLEMENTED SUPERSEDED CANCELLED
+
+* DOING Wrap-Up Inbox/Transcript Routing — Spec
+:PROPERTIES:
+:ID:       00b47414-2213-4a99-be35-48ceb266fc08
+:END:
+- 2026-07-02 Thu @ 00:17:01 -0400 — retrofitted by spec-sort; status set to DOING (evidence-based, human-confirmed)
+
+* Metadata
+| Status   | doing                                               |
+|----------+-----------------------------------------------------|
+| Owner    | Craig Jennings                                      |
+|----------+-----------------------------------------------------|
+| Reviewer | Codex (spec-review)                                 |
+|----------+-----------------------------------------------------|
+| Related  | [[file:../../todo.org][todo.org: wrap-up routing task]] · [[file:../design/2026-06-13-wrapup-inbox-transcript-routing-proposal.org][archsetup proposal]] |
+|----------+-----------------------------------------------------|
+
+* Summary
+
+At wrap-up, an inbox handoff that belongs to another project, once accepted and filed locally, has no clean home in the current project's =todo.org=. This adds an optional routing step to =wrap-it-up.org=: surface the filed keepers whose home is elsewhere, recommend a destination for each, and on one confirmation deliver each to that project's =inbox/= via =inbox-send= (one handoff per task), removing it from the local =todo.org=. The destination's own next session files it through =process-inbox=, applying that project's value gate, priority scheme, and =todo-format.md=. A parallel step (vNext) files meeting-transcript recordings into the right project's =assets/=.
+
+* Problem / Context
+
+=process-inbox.org= dispositions each handoff as act / fold / file / reject, and "file as TODO" lands the task in the *current* project's =todo.org=. When the real home is a different project, the choices today are: file it locally and let it rot in the wrong tracker, hand-edit two projects' =todo.org= files, or defer it and carry the debt to next session.
+
+The wrap-up's existing Step 3 "Inbox sanity check" only counts unprocessed items and blocks the wrap until they clear. It answers "is the inbox clean?" — it doesn't route anything.
+
+Meeting transcripts have the same homelessness: a recording dropped during a session belongs in some project's =assets/=, but nothing moves it there at wrap.
+
+The friction is small per-item but recurring, and the manual cross-project edit is error-prone (two files, two repos, easy to leave one half-done).
+
+* Goals and Non-Goals
+
+** Goals
+- At wrap-up, surface filed keepers whose home is a different project, with a recommended destination each.
+- Route the whole batch on one confirmation ("go with recommendations") or leave it entirely ("skip"). No per-item triage.
+- Deliver each routable keeper to the destination's =inbox/= via =inbox-send=, one handoff per task, and remove the keeper from the local =todo.org= on send. The destination files it through its own =process-inbox=.
+- Provenance is automatic: =inbox-send= stamps the source project and date on every handoff (the =from-<source>= filename and =#+SOURCE:= line). The delivery shows in the destination inbox; the removal shows in the source's git diff.
+- The destination set is any project with an =inbox/= — reuse =inbox-send='s existing discovery.
+
+** Non-Goals
+- Not a wrap gate. A skip is a clean, complete wrap.
+- Not per-item triage. The interaction is batch-level: go or skip.
+- Not a replacement for =process-inbox.org='s value gate. Routing assumes the item is already an accepted keeper.
+- Not a confidence-free auto-mover. A low-confidence destination recommendation says so, and the batch "go" stays trustworthy because the surfaced list is reviewable before the keystroke.
+
+** Scope tiers
+- v1: task/event routing by =inbox-send= delivery to the destination's =inbox/=. The interaction, the recommendation engine, the candidate-set marker stamped at file time, reusing =inbox-send='s discovery and delivery.
+- Out of scope: per-item destination editing, an interactive correction loop, moving items that aren't accepted keepers, a new cross-repo =todo.org= move primitive (the superseded direct-move design).
+- vNext: meeting-transcript filing (gated on the unresolved source-location decision and the file-vs-file+extract question — see Decisions).
+
+* Design
+
+** User-facing (the wrap interaction)
+
+The router is a new sub-step of =wrap-it-up.org='s Step 3, running after the existing inbox sanity check. Its input is filed keepers, not raw inbox files (decision: Reading B): tasks =process-inbox= accepted and filed into the local =todo.org= this session whose inferred home is a different project. When the router finds such a keeper, it surfaces it in a list, one line each: the task, the recommended destination project, and a confidence marker when the inference is weak. Then two options, batch-level:
+
+1. Go with the recommendations — route every recommended item (inbox-send to the destination + local removal).
+2. Skip — leave the whole batch in place. A skip is a clean wrap.
+
+That is the entire interaction. No per-item walk. The surfaced list is the review surface; the single keystroke is trustworthy because the list was reviewable and low-confidence recommendations flagged themselves.
+
+On "go", each routable keeper is delivered to its recommended destination's =inbox/= via =inbox-send= (one handoff per task) and removed from the local =todo.org=; the destination's own next session files it through =process-inbox=. A skipped or no-match item stays where it is; the existing sanity check still governs whether the wrap is clean.
+
+** Implementer (the mechanics)
+
+*Candidate set (what the router considers).* Reading B means the router does not scan the whole local backlog — it would otherwise suggest moving legitimate local tasks every wrap. The candidate set is keepers =process-inbox= filed this session whose inferred home differs from the current project, identified by a marker stamped at file time (decision D8): =process-inbox='s "file as TODO" step stamps =:ROUTE_CANDIDATE: <inferred-project>= on any keeper whose inferred home is not the current project. At wrap, the router's candidate set is exactly the local tasks carrying that property — never the standing backlog.
+
+*Destination discovery.* Reuse =inbox-send.py='s existing =discover_projects= (a project is a directory with =.ai/= AND =inbox/=). The destination must have an =inbox/= to receive a handoff, so that is the natural destination set — no new discovery code. A project with a =todo.org= but no =inbox/= cannot receive an inbox handoff and must be bootstrapped first; in practice every active project has an =inbox/=.
+
+*Delivery.* For each candidate, on "go": (1) =inbox-send <destination> --file= a one-task handoff into the destination's =inbox/= (one file per task, so the destination's =process-inbox= dispositions it as a single item), then (2) remove the keeper from the local =todo.org=. Step 1 is a cross-project write, but it uses the =cross-project.md=-sanctioned path (dropping a file in another project's inbox needs no confirmation); step 2 is a single-file edit in the current project's own =todo.org=, which the wrap is already committing. No new cross-repo move primitive, no foreign =todo.org= edit.
+
+*Provenance and filing.* =inbox-send= stamps the source and date automatically (=from-<source>= filename + =#+SOURCE:= line), so the destination's session knows where the item came from. That session files it through its own =process-inbox= — value gate, priority scheme, =todo-format.md= — so the task lands per the destination's conventions rather than as an externally-authored insertion.
+
+*Recovery (mis-route).* If the recommendation engine picks a wrong destination, the receiving session rejects it via =process-inbox='s reject-from-another-project flow (write a response, =inbox-send= it back to the source named in the provenance, delete the local copy). The task returns to the source project's inbox; nothing is lost or corrupted. This is why removing the source on send is safe — the reject path is the undo.
+
+*Recommendation engine.* Infer the destination from the item's content — project names, file paths, topic words — matched against the discovered project list, with a confidence tier: *strong* = a destination project's name or path appears literally in the item; *weak* = topic-word overlap only; *none* = no match, the item stays put and is never surfaced as a route. "Go" routes strong and weak items (weak visibly labeled); a no-match item is left in place. Pure function =(item, project-list) → (destination, confidence)=, unit-tested directly. The engine is the interesting, uncertain part; it earns the spec.
+
+* Alternatives Considered
+
+** Per-item triage instead of batch go/skip
+- Good, because it gives precise control over each destination.
+- Bad, because it taxes the common case (a batch that's all-correct, or all-stay) with a walk. Craig explicitly asked for two options, not a triage loop.
+- Neutral, because per-item correction could return as a vNext refinement if batch-only proves too blunt.
+
+** Fold the router into the existing Inbox sanity check step
+- Good, because one inbox step is simpler than two.
+- Bad, because the sanity check *gates* the wrap (blocks until clean) and the router is *optional* (skip is clean). Merging a blocking check with an optional action muddies both.
+- Neutral, because the two share discovery code while staying separate steps. (Resolved: the "Separate router step" decision keeps them separate, with the router acting on filed keepers rather than inbox files.)
+
+** Reuse process-inbox's "file as TODO" with a destination argument
+- Good, because it avoids a second mechanism.
+- Bad, because =process-inbox= runs per-item mid-session against the local project; the router runs at wrap, batch-level, cross-project. Different cadence, different scope.
+- Neutral, because both share the recommendation engine (=process-inbox= calls it for the file-time marker, the router calls it at wrap); the engine is the shared primitive, the two callers stay distinct.
+
+* Decisions [9/9]
+
+** DONE Reuse the Open Work matcher for destination anchoring
+- Context: the move needs a reliable insertion point in the destination =todo.org=; guessing risks corrupting another project's file.
+- Decision: We will reuse =todo-cleanup.el='s =tc--find-section "open work"= matcher, which already handles the unique / missing / ambiguous cases, and skip+surface any destination without a clean Open Work heading.
+- Consequences: easier — no new parser, consistent with =--archive-done=. Harder — destinations must carry the "Open Work" heading convention, so a project with a differently-named section is silently unroutable until it conforms.
+
+** SUPERSEDED Move atomically through a helper, never hand-edit two repos
+Superseded 2026-06-21 by "Deliver via inbox-send" below. The original plan built a new atomic helper to insert a subtree into a foreign =todo.org= and remove the source. The inbox-route delivers the keeper to the destination's inbox instead, so no cross-repo move primitive is built.
+- Context: a move touches two files in two repos; a half-done move loses or duplicates a task.
+- Decision (superseded): route every move through one helper that inserts under the destination's Open Work heading and removes the source as one operation.
+
+** SUPERSEDED Cross-project writes stay visible and carry provenance
+Superseded 2026-06-21 by "Deliver via inbox-send" below. =inbox-send= already stamps provenance (=from-<source>= filename + =#+SOURCE:= line), so the hand-stamped note is unnecessary; the destination files the item through its own gate rather than receiving an externally-authored insertion.
+- Context: writing into another project's =todo.org= crosses the =cross-project.md= scope boundary.
+- Decision (superseded): treat the batch "go" as authorization, leave the move visible in the destination's git diff, and stamp a one-line provenance note on each moved task.
+
+** DONE Separate router step, operating on filed keepers (Reading B)
+- Context: the sanity check gates the wrap on inbox/ contents; the router is optional. The deeper question was the router's input — raw inbox files (Reading A, which overlaps the sanity check) or already-filed keepers that belong elsewhere (Reading B, a todo-routing concern).
+- Decision: We will keep the router a separate optional sub-step after the sanity check, and its input is Reading B: accepted keepers process-inbox filed into the local =todo.org= whose inferred home is another project. The sanity check stays a pure inbox gate; the router is a todo-routing action that shares only the destination-discovery code.
+- Consequences: easier — each step has one job, the gate can't be muddied by an optional action, and the router never competes with the inbox gate over the same files. Harder — the candidate set (which local tasks the router considers) needs a marking mechanism (see the Implementer "candidate set" note); Reading A's "dispose raw inbox files at wrap" convenience is given up.
+
+** DONE Transcript routing deferred to vNext
+- Context: transcripts file as artifacts, not tasks, and a meeting usually produces both a recording to keep and action items to track. Two unknowns block it: where recordings accumulate (a recordings inbox, a downloads dir, wherever the meeting tooling drops them), and whether filing should also extract action items into the destination's =todo.org=.
+- Decision: We will defer transcript routing to vNext. Both the source-location dependency and the file-only-vs-extract-action-items question are deferred with it, to be settled when the vNext work is specced. v1 ships task routing only.
+- Consequences: easier — v1 isn't blocked on the unresolved source location. Harder — until vNext, a meeting recording still has no automatic home; only its action items (if filed as tasks) route through v1.
+
+** DONE Keep defer-and-stage and the router as distinct policies
+- Context: the 2026-06-12 Skeptical Review added a defer-and-stage path in =process-inbox.org= that files a =[#B]= VERIFY for shared-asset proposals parked for review. That also turns an inbox item into a =todo.org= task — overlapping surface with this router.
+- Decision: We will keep them distinct. Defer-and-stage parks a proposal-under-review locally as a VERIFY; the router moves an accepted keeper to its home project as a TODO. They differ on review status (proposal vs accepted) and destination (local vs cross-project), and share only the atomic move helper, not the policy. Reading B makes the split clean: the router acts on accepted keepers, never on proposals under review.
+- Consequences: easier — two clear, non-competing policies on one shared primitive. Harder — the workflow prose must name the boundary so a future reader doesn't collapse them and reintroduce the ambiguity.
+
+** DONE Deliver via inbox-send to the destination's inbox, not a direct todo.org move (supersedes D2/D3)
+- Owner / by-when: Craig / ratified 2026-06-21 (spec-response)
+- Context: D2/D3 built a new atomic helper that edits a foreign =todo.org= and removes the source, with a hand-stamped provenance note. =inbox-send= + =process-inbox= already do cross-project delivery: inbox-send writes the handoff with =from-<source>= provenance, and the destination's process-inbox files it through that project's own gate. =cross-project.md= names the inbox as the sanctioned cross-scope write path. A verified precondition reversed the old assumption — some projects have =inbox/= but no =todo.org=, so direct-move's discovery silently drops keepers headed there while inbox-route delivers.
+- Decision: We will route each keeper by =inbox-send= into the destination's =inbox/= (one handoff per task) and let the destination's own =process-inbox= file it; we will not edit the destination's =todo.org= directly. D2 (atomic move helper) and D3 (hand-stamped provenance) are superseded — the helper isn't built, and provenance is inbox-send's by construction.
+- Consequences: easier — no new cross-repo write primitive, no foreign-tracker corruption risk, provenance and per-project filing for free, graceful when the destination lacks a =todo.org=. Harder — filing is deferred to the destination's next session (self-resolving, since startup auto-runs =process-inbox= on a non-empty inbox), and a project never opened accumulates a visible inbox backlog rather than a silent foreign insertion.
+
+** DONE Candidate-set marking: tag :ROUTE_CANDIDATE: at process-inbox file time (Option A)
+- Owner / by-when: Craig / ratified 2026-06-21 (spec-response)
+- Context: the router must consider only this-session-filed inbox keepers whose home is elsewhere, never the standing backlog. Two options: tag at file time (process-inbox stamps a marker) or infer from a =CREATED=-this-session stamp + content. =process-inbox= does not stamp =:CREATED:= today, so the inference option would need that paired edit anyway, removing its only advantage.
+- Decision: We will tag at file time. =process-inbox='s "file as TODO" step stamps =:ROUTE_CANDIDATE: <inferred-project>= on any keeper whose inferred home differs from the current project; the router's candidate set is the local tasks carrying it.
+- Consequences: easier — precise (zero standing-backlog false positives), the inference happens once where context is richest, and the marker doubles as the router's "go" trigger. Harder — a paired edit to =process-inbox.org= Phase D ships coupled with the router.
+
+** DONE Source removal is a local todo.org edit on send; recovery via the reject flow
+- Owner / by-when: Craig / ratified 2026-06-21 (spec-response)
+- Context: the review left source-handling vague ("leave the source until the destination confirms by filing"), but there is no confirmation callback, so leaving it duplicates the task once the destination files. The keeper was filed into the *current* project this session and doesn't belong there.
+- Decision: On "go" we will remove the routed keeper from the *current* project's =todo.org= (a local single-file edit, not a cross-repo write) right after the =inbox-send=. If the destination rejects the handoff, =process-inbox='s reject-from-another-project flow returns it to the source's inbox, so the removal is reversible.
+- Consequences: easier — no duplication, the only deletion is from a file we own and are already committing, the reject path is the undo. Harder — a brief window exists where the task lives only as an in-flight inbox handoff (between send and the destination's filing); acceptable because the handoff file is durable and the reject path recovers a mis-route.
+
+* Implementation phases
+
+** Phase 1 — Destination discovery (reuse inbox-send)
+Reuse =inbox-send.py='s =discover_projects= (a directory with =.ai/= AND =inbox/=) as the destination set — no new discovery code. Confirm the destination universe: if a real destination has a =todo.org= but no =.ai/+inbox/=, name it and bootstrap its inbox; otherwise the existing filter already covers it. Leaves the tree working.
+
+** Phase 2 — Candidate-set marking in process-inbox
+Extend =process-inbox.org='s "file as TODO" step (Phase D) to stamp =:ROUTE_CANDIDATE: <inferred-project>= on any keeper whose inferred home differs from the current project (decision D8). Sync the =.ai/= mirror. This is the paired workflow edit that lets the wrap-up router find candidates without scanning the standing backlog. (Replaces the superseded atomic-move helper.)
+
+** Phase 3 — Recommendation engine
+Infer destination from item content against the discovered list, with a confidence tier. Pure function =(item, project-list) → (destination, confidence)=. Unit-tested: strong match (destination project named or path present literally → high), weak match (topic-word overlap only → low, still routed but labeled), no match (stays put, never surfaced), two-project tie (lowest-confidence / tie-break), empty project list (all stay put). The engine is shared by process-inbox's file-time marker (Phase 2) and the wrap-up router (Phase 4), so it lives where both can call it.
+
+** Phase 4 — Wrap-up step wiring
+Add the optional router sub-step to =wrap-it-up.org= Step 3, after the inbox sanity check: surface the candidate batch (one line each: task, destination, delivery mode, confidence), the two options (go / skip). On "go", for each candidate, =inbox-send= a one-task handoff to the destination's =inbox/= and remove the keeper from the local =todo.org=. Empty candidate set = zero interaction (silent). Name the gate-vs-optional split in the prose (the sanity check gates; the router is optional). Sync the =.ai/= mirror.
+
+** Phase 5 — Transcript routing (vNext, gated on the transcript decision)
+Only after the transcript-scope decision resolves. File a recording into the destination =assets/= per =working-files.md=, batch go/skip mirroring the task router.
+
+* Acceptance criteria
+- [ ] At wrap, a filed keeper naming another project is surfaced with that project as the recommended destination.
+- [ ] "Go" delivers every recommended item as a one-task =from-<source>= handoff into its destination's =inbox/= and removes it from the local =todo.org=.
+- [ ] "Skip" leaves every item in place and the wrap completes cleanly.
+- [ ] An empty candidate set produces zero interaction (no prompt, no "0 items" line).
+- [ ] A weak (low-confidence) recommendation is visibly labeled in the surfaced list; a no-match item is never surfaced as a route.
+- [ ] A candidate whose destination has an =inbox/= but no =todo.org= still delivers (degrades gracefully).
+- [ ] A mis-routed handoff is recoverable via =process-inbox='s reject-from-another-project flow, returning it to the source's inbox.
+- [ ] The router considers only =:ROUTE_CANDIDATE:=-tagged keepers, never the standing backlog.
+
+* Readiness dimensions
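+- Data model & ownership: items are org subtrees in the local =todo.org=; on send each becomes a one-task handoff file in the destination's =inbox/=, and the destination owns the task once its own =process-inbox= files it (=inbox-send='s =from-<source>= filename and =#+SOURCE:= line record origin). N/A for remote/cached state: all local files.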
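+- Errors, empty states & failure: no-match item → stays put, never surfaced as a route; destination without an =inbox/= → not in the discovered set, bootstrap its inbox first; mis-routed handoff → recovered via =process-inbox='s reject-from-another-project flow; empty routable set → router stays silent (no prompt).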
+- Security & privacy: N/A — local org files, no credentials or external services.
+- Observability: the move shows in the destination's git diff plus the provenance line; the surfaced batch list is the pre-move view.
+- Performance & scale: bounded by inbox size (single digits) and project count (tens); no hot path.
+- Reuse & lost opportunities: reuses =tc--find-section= and todo-cleanup's subtree-move; widens existing discovery rather than adding a parallel one.
+- Architecture fit & weak points: the recommendation engine is the weak point (a wrong-confident destination is the worst failure) — mitigated by the confidence label and reviewable batch list.
+- Config surface: possibly a discovery-root list (defaults to =~/projects/=, =~/code/=, matching =inbox-send.py=). Name it if it needs to be user-visible.
+- Documentation plan: =wrap-it-up.org= step prose; a note in =cross-project.md= that the router is a sanctioned cross-project write path.
+- Dev tooling: ERT for the elisp helper + discovery; the existing =make test= picks up new test files by glob.
+- Rollout, compatibility & rollback: additive workflow step; rollback is removing the sub-step. No persisted-data migration.
+- External APIs & deps: none.
+
+* Risks, Rabbit Holes, and Drawbacks
+- *Recommendation accuracy is the rabbit hole.* A confidently-wrong destination silently files a task in the wrong project. Dodge: keep the engine conservative, label low confidence, and keep the batch list reviewable before the keystroke. Don't chase a clever inference model in v1.
+- *Two inbox-touching steps* (sanity check + router) risk reading as redundant. Dodge: the D1 decision states the gate-vs-optional split in the workflow prose.
+- *Scope creep into transcripts* before the source-location question is answered would stall v1. Dodge: transcripts are explicitly vNext behind decision D4.
+
+* Review dispositions
+
+Everything in the 2026-06-21 review was accepted, with one modify:
+
+- *Modified — H1 source-handling.* The review proposed leaving the source keeper in place "until the destination confirms by filing." There is no confirmation callback, so leaving it would duplicate the task once the destination files. Resolved instead (decision D9) to remove the keeper from the *local* =todo.org= on send — a single-file edit in the project we already own and are committing, with =process-inbox='s reject flow as the undo for a mis-route. Keeps the no-foreign-write safety win without the duplication.
+
+Everything else was accepted as written: H1 (inbox-route supersedes direct-move; D2/D3 superseded), H1a (one handoff per task), H1b (reuse =inbox-send= discovery; Phase 1), H2 (tag at file time; D8), M1 (confidence tiers defined in Phase 3 + acceptance), M2 (empty-set silence; acceptance), M3 (paired =process-inbox= edit; Phase 2), M4 (=cross-project.md= note adjusted to "the router uses the sanctioned inbox path").
+
+* Review and iteration history
+
+** 2026-06-13 Sat @ 01:23:13 -0500 — Claude Code (rulesets) — author
+- What: initial draft. Problem, goals/scope tiers, two-altitude design, alternatives, six decisions (three DONE from grounding, three TODO for Craig), five implementation phases, acceptance criteria, readiness dimensions, risks.
+- Why: the archsetup 2026-06-13 handoff cleared the spec bar in inbox triage and was filed spec-bound rather than applied. This draft turns the proposal into a reviewable design with the open questions isolated as decision tasks.
+- Artifacts: proposal source at =docs/design/2026-06-13-wrapup-inbox-transcript-routing-proposal.org=; grounded against =wrap-it-up.org= Step 3, =todo-cleanup.el= =tc--find-section=, and =inbox-send.py= discovery.
+
+** 2026-06-13 Sat @ 01:36:28 -0500 — Craig Jennings + Claude Code (rulesets) — author
+- What: resolved all three open decisions. The router's input is Reading B (filed keepers that belong elsewhere, not raw inbox files), so D1 keeps it a separate sub-step from the inbox gate and D5 keeps it distinct from the defer-and-stage router; D4 defers transcript routing to vNext. Reworked the design (input definition, a candidate-set note bounding the router to session-filed keepers) and Phase 3 to match. Cookie now [6/6]; Status moved to ready-for-review.
+- Why: Craig chose Reading B after the A-vs-B input ambiguity surfaced as the root under D1 and D5. Reading B keeps the inbox gate, the router, and defer-and-stage each simple instead of entangling three mechanisms.
+- Artifacts: this spec; the candidate-set marking mechanism is the one detail flagged for spec-review to pin.
+
+** 2026-06-21 Sun @ 01:58:41 -0400 — Claude Code (rulesets) — reviewer
+- What: spec-review pass. Rubric *Not ready*, two blocking findings. H1: the inbox-route alternative (inbox-send each routable keeper to the destination's inbox/, let its own process-inbox file it) supersedes the direct-move design — reshape D2, drop Phase 2 and D3's provenance burden. H2: pin the candidate-set marking to Option A (tag =:ROUTE_CANDIDATE:= at process-inbox file time). Four medium findings (M1 confidence tiers, M2 empty-set silence, M3 paired process-inbox edit phase, M4 cross-project.md note). Full review + drop-in implementation tasks in the review file.
+- Why: Craig challenged D2 directly (why edit a foreign todo.org rather than use the sanctioned inbox-send path). The review confirmed it: inbox-send already emits the exact provenance D3 reinvents, process-inbox already files per-item with the destination's own gate, cross-project.md sanctions the inbox path, and a verified precondition reverses the spec's assumption — chime and yt-sync have inbox/ but no todo.org, so direct-move silently drops keepers headed there while inbox-route degrades gracefully.
+- Artifacts: [[file:wrapup-routing-spec-review.org][review file]]. Next: spec-response to disposition H1/H2 (recommend accept both), which moves the rubric to Ready.
+
+** 2026-06-21 Sun @ 02:06:37 -0400 — Craig Jennings + Claude Code (rulesets) — responder
+- What: folded the spec-review in. Accepted H1 (inbox-route) and H2 (tag at file time); superseded D2 and D3; added D7 (deliver via =inbox-send=), D8 (=:ROUTE_CANDIDATE:= marker at file time), D9 (local source removal + reject-flow recovery). Rewrote Summary, Goals, Design mechanics, Implementation phases (dropped the atomic-move helper — Phase 2 is now the =process-inbox= marker edit), and Acceptance criteria for the inbox-route. One modify (D9) refines H1's vague source-handling. Cookie [9/9]; Status → Ready.
+- Why: Craig's inbox-route challenge held up under review — it reuses the sanctioned cross-project path, gets provenance and per-project filing for free, and degrades gracefully where direct-move drops the task. D9 closes the duplication gap the review left open.
+- Artifacts: review file deleted on this pass. Next: Phase 6 implementation-task breakdown into =todo.org= on the author's go.