13 files changed, 408 insertions, 20 deletions
diff --git a/.ai/workflows/INDEX.org b/.ai/workflows/INDEX.org
index a474b29..88721ed 100644
--- a/.ai/workflows/INDEX.org
+++ b/.ai/workflows/INDEX.org
@@ -54,6 +54,11 @@ This index must list every =.org= file in =.ai/workflows/= except this one and e
   - Roam-mode triggers: "inbox zero", "empty the inbox", "process the roam inbox", "triage my roam inbox"
   - Auto-mode trigger: "auto inbox zero" (match before "inbox zero")
 
+- =work-the-backlog.org= — the autonomous task-execution loop, the single home for working a batch of marked tasks unattended: takes an ordered task set (explicit list or tag query) + session mode (=file-only= default / =autonomous-commit= + paging) + a hard run cap; each candidate passes the mechanical eligibility gate (status =TODO= + =:solo:= per the project's scheme header) and the four-item defer checklist, then is implemented to the full quality bar (TDD, =/review-code=, =/voice=) as its own logical commits. Fed by the inbox auto-loop's chain step (yes-gated, file-only, cap 1) and the no-approvals speedrun preset (pre-flight Q&A → autonomous-commit + always-push + end-of-set page over an explicit ordered list).
+  - Speedrun triggers: "speedrun", "no approvals speedrun", "speedrun these: <task set>" — any phrase containing "speedrun" routes here (the preset), never to =no-approvals.org=
+  - Manual triggers: "work the backlog", "work the backlog with <task set>" (file-only defaults)
+  - Synthesis trigger: "synthesize backlog metrics" — read the per-project metrics logs, compute trends + the corrections signal, write one =:agent:metrics:= KB node (personal projects only)
+
 ** Calendar
 
 - =add-calendar-event.org= — create a calendar event.
@@ -117,6 +122,7 @@ This index must list every =.org= file in =.ai/workflows/= except this one and e
   - Triggers: "session harvest", "harvest the sessions", "let's run the session-harvest workflow", "monthly harvest", "mine the sessions"
 - =no-approvals.org= — drop the interaction-level approval gates for a pre-agreed batch while keeping engineering-discipline gates (=/review-code=, =/voice personal=, tests, session-log updates, subagent reviews, destructive-action consent). Mode stays on until Craig turns it off, a real question arises, the queue empties, or the conversation switches topics.
   - Triggers: "no-approvals mode", "no approvals", "no-approval", "no need for approval gates", "stop asking, just keep going", "I'll check back in when you're done or stuck", "do all =<selector>= with no-approval"
+  - Exception: any phrase containing "speedrun" routes to =work-the-backlog.org='s no-approvals speedrun preset instead
 
 * Living Document
 
diff --git a/.ai/workflows/clean-todo.org b/.ai/workflows/clean-todo.org
index dd33056..a1b2af5 100644
--- a/.ai/workflows/clean-todo.org
+++ b/.ai/workflows/clean-todo.org
@@ -27,7 +27,17 @@ Deletes bogus =- State "X" from "X" [date]= log lines (state didn't actually cha
 
 To preview without writing, run =--check= first: =emacs --batch -q -l .ai/scripts/todo-cleanup.el --check todo.org=.
 
-** Step 2: Archive completed work
+** Step 2: Convert done sub-tasks to dated entries
+
+#+begin_src bash
+emacs --batch -q -l .ai/scripts/todo-cleanup.el --convert-subtasks todo.org
+#+end_src
+
+Rewrites every heading at level 3 or deeper whose TODO state is DONE/CANCELLED/FAILED into a dated event-log entry (=<stars> YYYY-MM-DD Day @ HH:MM:SS -ZZZZ <text>=), dropping the keyword, priority cookie, and tags, and removing the =CLOSED:= line. Enforces the depth rule that a completed sub-task becomes dated history — a shape interactive org closes and =--archive-done= (level-2 only) leave unapplied. Timestamp comes from each entry's =CLOSED= cookie; heading text kept verbatim; idempotent; a done sub-task with no parseable =CLOSED= is flagged and left alone. Run before archiving so a parent's sub-tasks are already dated when it moves. Capture the output.
+
+To preview without writing: =emacs --batch -q -l .ai/scripts/todo-cleanup.el --convert-subtasks --check todo.org=.
+
+** Step 3: Archive completed work
 
 #+begin_src bash
 emacs --batch -q -l .ai/scripts/todo-cleanup.el --archive-done todo.org
@@ -37,10 +47,11 @@ Moves every level-2 subtree whose TODO state is DONE or CANCELLED out of the "Op
 
 To preview the moves without writing: =emacs --batch -q -l .ai/scripts/todo-cleanup.el --archive-done --check todo.org=.
 
-** Step 3: Summarize
+** Step 4: Summarize
 
-Report to Craig from the two captured outputs:
+Report to Craig from the three captured outputs:
 - Hygiene: how many bogus state-log lines were deleted; any orphan-planning warnings (file:line + heading), or "none".
+- Convert: how many done sub-tasks were rewritten to dated entries (heading + line), any flagged for no =CLOSED= date, or "nothing to convert".
 - Archive: how many subtrees moved and which (heading + line), or "nothing to move" / the skip reason if a section was missing or ambiguous.
 - If the file changed, note that =todo.org= now has an uncommitted edit — review =git diff -- todo.org= and commit it (in this repo's commit style) if it looks right. If nothing changed, say so and stop.
 
@@ -49,7 +60,7 @@ Don't auto-commit. The summary is the review point; Craig decides whether the di
 * Principles
 
 - *Both passes apply, not just preview.* The workflow is invoked because cleanup is wanted. Use the =--check= variants only when Craig asks for a dry run.
-- *Two passes, two invocations.* =--archive-done= is its own mode and does not run the hygiene pass; run both.
+- *Separate modes, separate invocations.* =--convert-subtasks=, =--archive-done=, and the hygiene pass are each their own mode and don't run the others; run all three.
 - *Never auto-commit todo.org.* Surface the diff and let Craig commit it. The cleanup is a working-tree change, fully reversible until committed.
 - *Trust the script.* It's fast and idempotent; if there's nothing to do, it reports zero and exits clean. No pre-checks.
 
diff --git a/.ai/workflows/inbox.org b/.ai/workflows/inbox.org
index 5fc855f..b28fdaa 100644
--- a/.ai/workflows/inbox.org
+++ b/.ai/workflows/inbox.org
@@ -114,6 +114,14 @@ The item extends a task already filed. Update the parent TODO's body with a date
 ** File as TODO
 Substantive but waits, or needs design/triage before implementation. Add the TODO under =* <Project> Open Work= with priority + tags per the priority-scheme check (core §6). Body summarizes the proposal and links the inbox content if it's been moved to =docs/design/=. Delete the inbox file (or move it to =docs/design/= first if the content survives).
 
+*Route-candidate marking (feeds the wrap-up router).* After filing, check whether the keeper's inferred home is a different project:
+
+#+begin_src bash
+python3 .ai/scripts/route_recommend.py --item "<the keeper's heading + body text>" --exclude "$(basename "$PWD")"
+#+end_src
+
+On a =<destination>\tstrong= or =<destination>\tweak= result, stamp the new TODO's property drawer with =:ROUTE_CANDIDATE: <destination>= (create the drawer if the task has none). A =none= result stamps nothing, and a local keeper stays unstamped. The marker is the wrap-up router's entire candidate set — =wrap-it-up.org= Step 3 surfaces exactly the =:ROUTE_CANDIDATE:=-tagged tasks and offers to deliver each to its destination's inbox, never scanning the standing backlog. Stamping is cheap and reversible (the router's skip leaves the task in place; a wrong marker is one property line to delete), so prefer stamping on any plausible match — the human reviews the batch at wrap time.
+
 *Blocking-dependency handoff.* A special shape: another project sends a note that *this* project's work is blocking one of theirs ("your task X is blocked on us — we need Y"). File or link the owning task, tag it =:blocker:=, and name the requesting project in the body (see the cross-project dependency convention in =todo-format.md=). The =:blocker:= tag makes =open-tasks.org= surface that task *first*, since clearing it unblocks the other project. Dedup against an existing task rather than filing a duplicate. When the work later lands, drop =:blocker:= and notify the waiting project (=inbox-send <their-project> --text "Delivered: <what> — you're unblocked."=) so it can lift its own =:blocked:=.
 
 ** Defer
@@ -453,18 +461,19 @@ Take these up when the single-destination version is in use and the multi-projec
 
 * Mode: auto inbox zero
 
-A recurring, *interactive* roam check. Trigger phrase: "auto inbox zero" (match before "inbox zero" — the longer phrase wins). On invocation, *ask Craig for the interval* (e.g. 30 min, 2 hours), then drive the loop with =/loop <interval>= running roam mode. It is in-session and interactive by design — each cycle reports, and a find waits for Craig's go before any work happens.
+A recurring, *interactive* roam check. Trigger phrase: "auto inbox zero" (match before "inbox zero" — the longer phrase wins). On invocation, *ask Craig for the interval* (e.g. 30 min, 2 hours), then drive the loop with =/loop <interval>= running roam mode. It is in-session and interactive by design — each cycle reports what it found and filed.
 
 ** Per cycle
 
 1. Run roam mode's scan (Phase A local check + Phase B roam scan), read-only — no =git pull=. The capture-guard still gates any write: use =capture-guard --wait= (core §5) so a transient capture clears itself; if it's still open after the wait, *defer this cycle's roam reconcile to the next cycle* rather than surfacing — the loop cadence is the retry, and the filed items get swept next time. The rare write hands its git to =roam-sync= (roam Phase D).
 2. *Nothing found* → no inbox summary. One acknowledgement line: =ran at HH:MM, nothing found=. Nothing else. The acknowledge-only-on-empty rule keeps a quiet inbox quiet.
 3. *Items found* → summarize the found items, file them as tasks (roam Phase C), and *append them to a displayed queue* — the harness task list, via =TaskCreate= — so the queue accumulates across cycles. Then ask: "run this batch next?"
-   - *Yes* → launch into implementing the found items, each through the normal disposition ladder (core §3) + verify flow.
+   - *Yes* → chain into =work-the-backlog.org= as an explicit second step after routing completes: pass it the eligibility query over the queued items (status =TODO= + =:solo:= per the scheme header, priority-ordered), =file-only= mode, paging off, cap 1. The highest-priority eligible candidate runs; the rest wait for the next tick or a later yes.
    - *No* → they stay queued for a later go.
+   This mode never implements anything itself — routing ends here, and the execution loop lives in =work-the-backlog.org=, its one home.
 4. *Cross-cycle dedup.* Subsequent cycles add only *newly-found* items to the same displayed queue, never re-surfacing what's already there. Dedup against the queue (the =TaskCreate= list), not against what's already been implemented — a find that was queued-but-not-yet-run must not reappear, and one already filed into =todo.org= is dropped by roam Phase C's status check.
 
-A find is always surfaced and gated on Craig's yes; a quiet inbox produces only the timestamped acknowledgement. =auto inbox zero= is inherently in-session because its execute step waits for a yes.
+A find is always surfaced and filed; execution happens only through the =work-the-backlog.org= chain and waits for Craig's yes. A quiet inbox produces only the timestamped acknowledgement. =auto inbox zero= is inherently in-session because its chain step waits for that yes.
 
 ** Fully-unattended pass (=/schedule=) — vNext, not v1
 
diff --git a/.ai/workflows/no-approvals.org b/.ai/workflows/no-approvals.org
index 1efce82..9e1c894 100644
--- a/.ai/workflows/no-approvals.org
+++ b/.ai/workflows/no-approvals.org
@@ -22,6 +22,8 @@ Craig activates the mode with any of:
 - Queuing several tasks in =todo.org= followed by any phrase above
 - Any equivalent phrasing that signals he doesn't want to be re-asked between items
 
+*Not this mode:* any phrase containing "speedrun" ("speedrun", "no approvals speedrun") routes to =work-the-backlog.org='s no-approvals speedrun preset — an autonomous batch over an explicit ordered task set, with a pre-flight Q&A, autonomous commits, always-push, and an end-of-set page. This mode is the general interaction-gate suspension for whatever work is already underway; the speedrun is the dedicated backlog-batch workflow.
+
 Mode resets when:
 
 - Craig says approvals are back on
diff --git a/.ai/workflows/open-tasks.org b/.ai/workflows/open-tasks.org
index 4ba29dd..02a0847 100644
--- a/.ai/workflows/open-tasks.org
+++ b/.ai/workflows/open-tasks.org
@@ -23,15 +23,16 @@ Don't route "task review" / "review tasks" here — those trigger the hygiene ha
 
 * Phase A: Data Gathering (both modes)
 
-** Phase A pre-step — archive any freshly-DONE tasks
+** Phase A pre-step — normalize freshly-closed tasks
 
-Before reading =todo.org=, run the cleanup script's archive-done sweep so completed level-2 subtrees move from =* $Project Open Work= to =* $Project Resolved=:
+Before reading =todo.org=, run two cleanup sweeps so the read reflects current state. First convert any done sub-tasks to dated entries, then archive completed level-2 subtrees from =* $Project Open Work= to =* $Project Resolved=:
 
 #+begin_src bash
+emacs --batch -q -l .ai/scripts/todo-cleanup.el --convert-subtasks todo.org
 emacs --batch -q -l .ai/scripts/todo-cleanup.el --archive-done todo.org
 #+end_src
 
-Costs a few hundred milliseconds. Without it, a task that completed earlier in the session sits as =** DONE= under Open Work until the next =clean-todo= or wrap-up pass, and Next Mode would surface it as a "what's next" candidate. The sweep makes Phase A's read of =todo.org= reflect current state.
+Costs a few hundred milliseconds. Without the archive sweep, a task that completed earlier in the session sits as =** DONE= under Open Work until the next =clean-todo= or wrap-up pass, and Next Mode would surface it as a "what's next" candidate. The convert sweep runs first so a completed parent's sub-tasks are already dated when it archives; it also keeps interactive level-3 closes from lingering as DONE keywords. Together they make Phase A's read of =todo.org= reflect current state.
 
 Skip the sweep if the workflow is invoked in an explicit read-only or dry-run context. Default is to run it.
 
diff --git a/.ai/workflows/spec-create.org b/.ai/workflows/spec-create.org
index 508b969..1249181 100644
--- a/.ai/workflows/spec-create.org
+++ b/.ai/workflows/spec-create.org
@@ -82,8 +82,9 @@ This is where the spec earns a "Ready" from review: an engineer must be able to
 
 ** Phase 5 — Wire it up (conventions)
 
-- *Filename + location:* =docs/<problem-slug>-spec.org=. Org-mode. The slug names the *problem/feature*, not a date. Must end in =-spec.org=.
-- *Metadata header:* a small table at the top — Status, Owner, Reviewer(s), Date, Related (link to the task/ticket).
+- *Filename + location:* =docs/specs/YYYY-MM-DD-<problem-slug>-spec.org= — formal specs live in =docs/specs/=, never =docs/design/= (that's for notes, brainstorms, inventories; see =claude-rules/docs-lifecycle.md=). Org-mode. The slug names the *problem/feature*; no status suffixes ever — status lives in the file. Must end in =-spec.org=.
+- *Status heading (first element after the file header):* a top-level heading carrying the lifecycle keyword, stamped =DRAFT= at authoring — spec-create owns this flip. It holds an =:ID:= UUID (generate with =uuidgen=) and dated history lines, newest first. The keyword is authoritative; the Metadata =Status= field mirrors it in lowercase. Transitions are three lines in one file (keyword + history line + mirror): spec-review flips =READY=, spec-response flips =DOING= at decomposition, the final build task flips =IMPLEMENTED=. Terminal states always record a reason.
+- *Metadata header:* a small table at the top — Status (the lowercase mirror), Owner, Reviewer(s), Date, Related (link to the task/ticket).
 - *Review-and-iteration-history stub:* add a =Review and iteration history= section at the bottom and seed it with the author's first entry. =spec-review= and =spec-response= append provenance entries here, so the heading shape is a contract: =YYYY-MM-DD Day @ HH:MM:SS -ZZZZ — Contributor — Role=, body fields What / Why / Artifacts.
 - *Cross-link both ways:* the spec links its task; the task links the spec (replace the task's inline plan with a terse description + a =file:= link to the spec).
 
@@ -103,7 +104,14 @@ Then it's ready for =spec-review.org=. Snapshot-vs-living rule: keep the spec li
 ,#+TITLE: <Feature> — Spec
 ,#+AUTHOR: <author>
 ,#+DATE: <YYYY-MM-DD>
-,#+TODO: TODO | DONE SUPERSEDED CANCELLED
+,#+TODO: TODO | DONE
+,#+TODO: DRAFT READY DOING | IMPLEMENTED SUPERSEDED CANCELLED
+
+,* DRAFT <spec short name>
+:PROPERTIES:
+:ID:       <uuid — generate with uuidgen>
+:END:
+- <YYYY-MM-DD Day @ HH:MM:SS -ZZZZ> — drafted.
 
 ,* Metadata
 | Status   | draft         |
diff --git a/.ai/workflows/spec-response.org b/.ai/workflows/spec-response.org
index de5b1c8..7628e49 100644
--- a/.ai/workflows/spec-response.org
+++ b/.ai/workflows/spec-response.org
@@ -130,9 +130,11 @@ When related specs were reviewed together, two reviews can recommend opposite th
 
 This is the *last* step of the workflow, and it runs *only after the author confirms the spec is Ready* — never during review iterations. A Ready spec nobody can act on is unfinished; this phase turns it into tracked work. It applies to every project type (library, application, service, docs set).
 
-1. *Decide where the tasks live.* If the work is spinning off into its own project/repo, move the parent task into that project's =todo.org= (and relocate the spec with it); otherwise use the current project's =todo.org=. One parent task owns the effort; the phase tasks hang under it.
+*This phase owns the =READY= → =DOING= lifecycle flip* (docs-lifecycle convention): when the decomposition below lands, update the spec's top-level status heading keyword to =DOING=, add a dated history line, and set the Metadata =Status= mirror to =doing= — three lines, one file.
 
-2. *Create one task per implementation phase* from the spec's =Implementation phases=, in dependency order, so the task set as a whole describes the *full* milestone (e.g. v1) with no gaps. Each task body names the deliverable, its tests, and how it is verified. Carry over deferred/vNext work and any publish/release steps as their own tasks.
+1. *Decide where the tasks live.* If the work is spinning off into its own project/repo, move the parent task into that project's =todo.org= (and relocate the spec with it); otherwise use the current project's =todo.org=. One parent task owns the effort; the phase tasks hang under it. *Stamp the binding:* the parent task's =:PROPERTIES:= drawer gets a =:SPEC_ID:= line holding the spec's status-heading UUID. That property is the durable join task-audit uses to police =DOING= specs (a =DOING= spec whose bound parent is closed, archived, or missing gets flagged).
+
+2. *Create one task per implementation phase* from the spec's =Implementation phases=, in dependency order, so the task set as a whole describes the *full* milestone (e.g. v1) with no gaps. Each task body names the deliverable, its tests, and how it is verified. Carry over deferred/vNext work and any publish/release steps as their own tasks. *Always end the set with the flip task:* a final "flip the spec to IMPLEMENTED (+ dated history line + mirror)" task under the same parent — the tracked obligation that closes the lifecycle loop when the build finishes. Never skip it; "a human remembers" is the failure mode this exists to prevent.
 
 3. *Turn a critical eye on completeness.* Re-read the spec — every phase, every acceptance criterion, every named deliverable, every data-safety/principle rule — and confirm each has a home in a task. The work is not done when the tasks merely exist; it is done when nothing in the spec is left untracked. This completeness pass is mandatory regardless of project type.
 
diff --git a/.ai/workflows/spec-review.org b/.ai/workflows/spec-review.org
index 833dfc9..d4998eb 100644
--- a/.ai/workflows/spec-review.org
+++ b/.ai/workflows/spec-review.org
@@ -50,6 +50,11 @@ Run it *early* — design review exists to catch viability problems and costly m
 
 Before Phase 1, verify the file under review ends with =-spec.org=. Every design, decision, or planning document under a project's =docs/= directory carries that suffix as its identifier. The =.org= extension alone is not enough because =docs/= holds non-spec org files too (tutorials, frozen inventories, reference material).
 
+*Location expectation (docs-lifecycle convention).* Formal specs live in =docs/specs/=. Whether that's enforced depends on whether the project has run its one-time =spec-sort= retrofit:
+
+- =:LAST_SPEC_SORT:= present in =.ai/notes.org= Workflow State → the project has sorted; a =-spec.org= file outside =docs/specs/= fails this precondition. Surface it: "this spec sits outside docs/specs/ — move it (and update inbound links) before review."
+- Marker absent → legacy locations (=docs/= root, =docs/design/=) stay reviewable; add one nudge line to the review output ("this project's docs pile has never been spec-sorted — say 'run spec-sort' to sort it") and proceed. No legacy spec is ever unreviewable during the transition.
+
 If the file does not end with =-spec.org=, stop immediately and surface the mismatch:
 
 #+begin_example
@@ -128,6 +133,7 @@ Work the spec against these. Each is a source of concrete findings, not a box to
 - *Performance & scale.* Expected counts (issues/comments/labels/teams/projects/views)? Server-side filtering where possible? Bounded, visible pagination? Cached name→ID lookups? Sync calls in the command path acceptable? Could a save hook or whole-file scan make N network calls? Rendering linear? Full-file rewrites avoided? Long-running operations async/cancellable/observable? Is concurrency/queueing/backpressure defined? Are high-output process filters throttled and cheap? Is progress/ETA exposed only when defensible, and are hung/stalled operations detectable and killable? Identify UI freezes, repeated network calls, unbounded pagination — without premature optimization.
 - *Security & privacy.* API keys safe? Debug logs leaking secrets or private issue text? Confirmations before mutating shared workspace objects? Personal vs shared distinguished? Local files holding sensitive descriptions/comments? Anything to redact from messages/logs? Any work-tracker integration may handle private company data.
 - *UX & accessibility.* Discoverable commands? Recoverable mistakes? Prompts ordered to the task? Safe, useful defaults? Informative-not-noisy status messages? Does the UI avoid implying unsupported actions are supported? Match the upstream product's permissions/concepts? Are customizations named in user language, with clear defaults and docstrings? For Emacs packages, command names, completion candidates, buffer layout, defcustom names, and message wording *are* the UX.
+- *Operational-panel UI traps.* Applies when the spec covers a user-facing panel, dialog, or control surface; skip otherwise. Lists that mix saved, current, and generated items must name each item's source. Refresh or scan actions must not gate data that could be shown immediately. Add-forms must not ask the user to retype values the system already discovered. Destructive confirmations read in future tense before the action and verified-result tense after it. Diagnostics, performance, logging, and repair affordances are reviewed as one coherent flow before extra pages or buttons are added. A popup launched from a bar, tray, or tool surface should visually belong to that launcher. (Promoted from archsetup's Waybar network-panel review, 2026-06-30.)
 - *Test strategy and coverage.* Characterization tests before behavior changes? Pure functions to unit-test? API responses needing fixtures? Command flows needing stubs? Regression tests for prior bugs? Boundary/error cases? What's covered elsewhere and shouldn't be re-tested? Which existing tests must change? How is coverage generated, summarized, and used to find untested/refactor-worthy code? Prefer tests that lock contracts: representation shape, query compilation, sync no-op, conflict refusal, pagination, dirty-buffer protection, log redaction, and long-running/slow-operation behavior via fakes rather than flaky live dependencies.
 - *Observability & operations.* How does a user see what the package is doing? Progress messages for long ops? Useful, safe debug logging? Are logs structured enough to isolate issues from a bug report? Are commands provided to inspect/clear caches, test connectivity, diagnose backends/tools, copy redacted debug info, or reproduce command invocations? How are terminal states discovered: completion, failure, partial success, stalled/hung, cancelled, cleanup-unverified, and "needs user action"? Does the product notify only when useful, avoid noisy success spam, and keep non-success states visible until acknowledged? For generated org files, headers should often carry source, filter/view name, refresh time, count, truncation state.
 - *Comparable-product sentiment.* When there are obvious adjacent products, research what users love and hate about them from official docs plus current community reports. Do not cargo-cult their feature set; translate findings into the spec's scope. For each loved behavior, say whether the spec provides it, intentionally omits it, or defers it. For each hated behavior, say whether the spec avoids, resolves, inherits, or accepts it.
@@ -166,6 +172,8 @@ Assign one label consistently:
 
 The most useful reviews move a spec from =Not ready= to =Ready with caveats= or =Ready= once decisions are captured.
 
+*The =Ready= verdict flips the spec's lifecycle status.* spec-review owns the =DRAFT= → =READY= transition (docs-lifecycle convention): on assigning =Ready= (or =Ready with caveats= the author accepts), update the spec's top-level status heading keyword to =READY=, add a dated history line under it naming the review that passed, and set the Metadata =Status= mirror to =ready= — three lines, one file. Any other rubric label leaves the keyword where it stands (a re-review that finds new blockers on a =READY= spec demotes it back to =DRAFT= the same three-line way, with the reason in the history line).
+
 Finding severity maps to blocking power: *high-priority findings block =Ready=* — they hold the rubric at =Not ready= (or =Ready with caveats= if the author accepts and tracks them) until dispositioned; *medium-priority findings are the author's discretion* and don't block. State the blocking status on each finding so the author running spec-response knows which ones gate the rubric.
 
 Then update the spec's review history. Specs should carry a bottom section named =Review and iteration history= (or the nearest existing equivalent) that tracks each material author/reviewer pass. Add a concise entry for this review even when the spec is ready and no findings are recorded.
diff --git a/.ai/workflows/startup.org b/.ai/workflows/startup.org
index 5e8f61e..9488dd0 100644
--- a/.ai/workflows/startup.org
+++ b/.ai/workflows/startup.org
@@ -151,7 +151,7 @@ These calls have no dependencies on each other. Issue them all together in one m
 8. =[ -f todo.org ] && .ai/scripts/task-review-staleness.sh todo.org 7 || true= — count top-level tasks overdue for review (the daily task-review habit's startup nudge). The =[ -f todo.org ]= guard skips projects without a root todo.org; =|| true= keeps Phase A from failing if the script isn't synced yet. Threshold 7 days is one review cycle of slack — softer than the wrap-up health check's 30-day alarm.
 9. =bash ~/code/rulesets/scripts/sync-language-bundle.sh "$PWD" 2>/dev/null || true= — language-bundle freshness for the current project. Fingerprint-detects which bundle (if any) the project has, auto-fixes drifted rulesets-owned files (=.claude/rules/*.md=, =.claude/hooks/*=, =githooks/*=), and surfaces drift in =settings.json= without writing it (a project may have customized it). =CLAUDE.md= is deliberately left untracked — it's seed-only in =install-lang= and project-owned afterward, mirroring how =diff-lang= skips it. Quiet when there's no bundle or everything's clean. Hardcodes the rulesets path because =languages/= is the canonical source and lives only there — the same absolute-path dependency the rsyncs already carry. =|| true= keeps Phase A from failing on older checkouts where the script isn't present yet. The =.ai/= rsyncs and this call write to disjoint paths (=.ai/= vs =.claude/=/=githooks/=), so the batch stays parallel-safe.
 10. =[ -f "$HOME/org/roam/inbox.org" ] && grep -cE '^\*\* ' "$HOME/org/roam/inbox.org" || true= — count items in the roam global inbox (=~/org/roam/inbox.org=), the roam-mode startup nudge. Silent if the roam clone isn't on this machine. Phase C reads the file when the count is non-zero, splits total vs items related to this project, and surfaces the offer (see =inbox.org= roam mode). Read-only; never files at startup.
-11. KB surface prep (the read + contribute startup nudges; see =docs/design/2026-06-16-encourage-kb-contribution-spec.org=). Gated on the agent KB clone. Counts =:agent:= nodes, lists up to 5 whose content matches the current project basename (titles only; a few most-recent nodes as a fallback when nothing matches), and resolves the best-practices node path. Read-only; silent when the clone is absent. Phase C surfaces the relevant titles (consult) and the best-practices link (contribute).
+11. KB surface prep (the read + contribute startup nudges; see =docs/specs/2026-06-16-encourage-kb-contribution-spec.org=). Gated on the agent KB clone. Counts =:agent:= nodes, lists up to 5 whose content matches the current project basename (titles only; a few most-recent nodes as a fallback when nothing matches), and resolves the best-practices node path. Read-only; silent when the clone is absent. Phase C surfaces the relevant titles (consult) and the best-practices link (contribute).
 
     #+begin_src bash
     ra="$HOME/org/roam/agents"
@@ -166,6 +166,16 @@ These calls have no dependencies on each other. Issue them all together in one m
     fi
     #+end_src
 
+12. Spec-sort probe (the docs-lifecycle retrofit nudge; see the docs-lifecycle spec in =docs/specs/=). Read-only; prints one line when the project has an unsorted docs pile — a =docs/design/= directory or stray =docs/*-spec.org= root files — and no =:LAST_SPEC_SORT:= marker in =.ai/notes.org=. Silent for projects with nothing to sort or an already-stamped marker (the marker permanently clears it).
+
+    #+begin_src bash
+    { [ -d docs/design ] || [ -n "$(find docs -maxdepth 1 -name '*-spec.org' -print -quit 2>/dev/null)" ]; } \
+        && ! grep -qs ':LAST_SPEC_SORT:' .ai/notes.org \
+        && echo "spec-sort: unsorted docs present" || true
+    #+end_src
+
+    The stray-root check uses =find= rather than a glob so the probe behaves identically under bash and zsh (=compgen= is bash-only, and zsh aborts on an unmatched glob).
+
 Notes on the rsync commands:
 - Trailing slashes on both source and destination matter — they tell rsync to sync /contents/ rather than nest a directory inside.
 - =--delete= on the directory syncs lets retired template files actually disappear from each project on next startup.
@@ -199,6 +209,7 @@ This phase touches the user and runs sequentially:
    - *Roam inbox nudge.* If the Phase A roam-inbox count is greater than zero, read =~/org/roam/inbox.org=, split total vs items related to this project (claimed by the =<project>:= prefix, plus any unprefixed item whose topic plainly concerns this project), and surface one line: "Roam inbox: =<N>= total, =<M>= appear related to this project — say 'inbox zero' to file them." Offer it as a priority option; never auto-file. If the count is zero or the file is absent, say nothing. See =inbox.org= roam mode.
    - *KB consult nudge (read side).* If the Phase A KB-surface prep returned any =kb-relevant-titles=, surface one line listing them (capped 5): "KB lessons that may be relevant: =<title>=; =<title>=… — open the node before related work." The titles are declarative, so the list alone tells you whether to open one. Gated on the roam clone; silent when the clone is absent or nothing relevant surfaced. See the best-practices node and =knowledge-base.md=.
    - *KB contribute nudge (write side).* Once per session, surface one line pointing at the best-practices node (the =kb-bestpractices= path from Phase A): "Learned something durable? See =<path>= for how to write a KB node — contributing cross-project facts is welcome (personal projects only; work/unknown projects never write per =knowledge-base.md=)." Light encouragement, never a gate. Gated on the roam clone; silent when absent.
+   - *Spec-sort nudge.* If the Phase A spec-sort probe printed =spec-sort: unsorted docs present=, surface one line: "this project's docs pile has never been spec-sorted — say 'run spec-sort' to sort it." If the probe was silent, say nothing. A project with nothing to sort never sees the line; a stamped =:LAST_SPEC_SORT:= marker permanently clears it. See the docs-lifecycle rule and the spec in =docs/specs/=.
    - *Language-bundle sync.* If the Phase A step-12 call (=sync-language-bundle.sh=) printed anything, surface it. =fixed= lines are informational — the drift was already repaired (note that =.claude/= is now dirty if the project commits it). A =drift= line on =settings.json= is surface-only and needs the printed =make install-<lang> PROJECT=.= to reconcile; flag it so the user can decide. If the call was silent, say nothing.
    - *Newly-installed symlinks.* If the Phase A.0 =make install= step printed any =link= / =relink= / =WARN= line, surface it. A =link= line means a skill, rule, hook, or script added to rulesets is now linked into =~/.claude= for the first time on this machine. For a newly-linked *skill*, check the agent's available-skills list: if the harness already registered it mid-session, note it's available and move on; if it's absent, stop and tell Craig to restart the agent so it loads (whether a mid-session reload works is harness-version-dependent). For a newly-linked *hook*, note that the harness reads hooks at session start — it fires from the next session (or after Craig opens =/hooks= once); its settings.json wiring travels with the tracked file, so the link is usually the only missing piece. A =WARN ... not a symlink= line is a real collision at the target path — surface it; it needs a human. If the step printed only "nothing new to link", say nothing.
    - *Template-sync churn (safety net).* Check whether Phase A's rsync left uncommitted churn in the synced =.ai/= paths — accumulated from a prior session that crashed before wrap-up, or freshly added this session when rulesets advanced. Without surfacing, it builds up silently until it blocks Phase A.0's auto-ff (git won't ff a dirty tree). Skip in the rulesets repo itself (there =.ai/= is a committed mirror, kept honest by the pre-commit hook). The check is sequential here, after the rsync has finished — not a Phase A step, to keep that batch race-free.
diff --git a/.ai/workflows/task-audit.org b/.ai/workflows/task-audit.org
index 94b99da..7d2b758 100644
--- a/.ai/workflows/task-audit.org
+++ b/.ai/workflows/task-audit.org
@@ -61,6 +61,8 @@ For each open task, read its body and cross-check its claims against the actual
 - *Calendar* — did a scheduled event happen; is a SCHEDULED/DEADLINE date now past.
 - *Meeting recordings* — if a task hinges on "did this conversation happen / what was said," check the recording queue (e.g. =~/sync/recordings/=) and transcribe via =process-meeting-transcript.org= if the answer lives in an un-transcribed recording. (This is exactly how a "did the interview happen?" task gets resolved instead of guessed.)
 
+*Spec lifecycle reconcile (docs-lifecycle convention).* If the project has a =docs/specs/=, run the =:SPEC_ID:= query as part of this phase: for each spec whose top-level status heading reads =DOING=, find the =todo.org= task whose =:SPEC_ID:= property matches the spec's =:ID:=. Flag the spec NEEDS-USER when that bound parent is =DONE=/=CANCELLED=, archived, or missing — the build finished (or evaporated) without the =IMPLEMENTED= flip, exactly the drift this check exists to catch. Check the parent's own keyword, not its children (completed children become dated entries and the final flip task is a child, so child-counting misleads).
+
 Assign each task a bucket (CURRENT / STALE / NEEDS-USER) and, for STALE, the specific factual update.
 
 *Scale tactic.* For a large open-task set, dispatch read-only investigation sub-agents over batches of tasks (parallel-safe per =subagents.md= — independent read-only domains). Each returns a per-task bucket + suggested update. *Never* let sub-agents write to =todo.org= concurrently — apply all edits serially in the main thread (concurrent writes to one file race and lose work).
@@ -79,7 +81,7 @@ For every STALE task, edit it in the main thread:
 - *Ensure priority is set per the project scheme.* The top of the project's =todo.org= should carry the priority legend (=[#A]= through =[#D]=). Every task should carry an explicit priority cookie. If a cookie is missing, or no longer matches the reconciled facts, assign the right level per the legend. If the level is unambiguous from the body, do it autonomously; if it's a judgment call (especially the [#A] / [#B] line for important-but-not-urgent work), flag NEEDS-USER. Also enforce the [#A]-discipline rule from the legend — an [#A] task without a =SCHEDULED:= or =DEADLINE:= line is mis-graded and is either down-graded to [#B] (when reconciled facts say "important but not urgent") or surfaced as NEEDS-USER for the user to date.
 - *Ensure a type tag is set.* Every task carries one type tag from the project's tag legend (typically =:feature:= / =:chore:= / =:spec:= / =:bug:=). If missing or wrong, assign or correct it from the body when the type is unambiguous. If two tags fit (a refactor that also fixes a bug; a spec that's also a chore), flag NEEDS-USER rather than picking one silently.
 - *Enforce the project's declared tag vocabulary.* If the project's tag legend declares an *exhaustive* set of allowed tags, strip from each task any tag outside that set — the heading and parent section already carry topic/scope context, so ad-hoc tags only fragment the vocabulary and defeat tag-based filtering. Normalize near-duplicate spellings to the canonical tag (a plural to its singular, say). Where the legend does not declare the set closed, leave existing tags alone; this step applies only where the allowed set is exhaustive by design.
-- *Re-assess the =:quick:= and =:solo:= tags* — reconciliation can change a task's effort or autonomy: a resolved dependency may make a stuck task =:solo:=, a scope cut may make it =:quick:=, and new complexity surfaced by the sources can invalidate either. Add or remove the tags per the definitions in the project's tag legend (and [[file:task-review.org][task-review.org]]) when the reconciled facts make the call clear. When they don't — an effort estimate you can't pin down, a =:solo:= gate you can't confirm — it's a NEEDS-USER flag, not a guess.
+- *Re-assess the =:quick:= and =:solo:= tags (mandatory — an audit that skips this is incomplete).* Reconciliation can change a task's effort or autonomy: a resolved dependency may make a stuck task =:solo:=, a scope cut may make it =:quick:=, and new complexity surfaced by the sources can invalidate either. Add or remove the tags per the hard definitions in [[file:../../claude-rules/todo-format.md][todo-format.md]] ("Hard definitions: :solo: and :quick:"; task-review carries the same three-gate walk). Autonomous execution reads =:solo:= as its eligibility gate and trusts the tag, so a stale one is a run-time hazard, not cosmetic drift. When the call isn't clear — an effort estimate you can't pin down, a =:solo:= gate you can't confirm — it's a NEEDS-USER flag, not a guess.
 - Bump =:LAST_REVIEWED:= on each edited task.
 
 Follow =todo-format.md= for completion mechanics (depth-based DONE vs dated-rewrite) and the working-files / link-hygiene rules when moving artifacts.
@@ -99,6 +101,21 @@ Never merge or re-parent autonomously — which tasks belong together, and wheth
 
 When no clear cluster exists, say so in one line and move on — most audits won't find one, and forcing a merge fragments worse than it consolidates.
 
+** Phase C.6 — Retire completed parents and promote stragglers (interactive)
+
+Phase C.5 consolidates related *open* tasks. This step retires parent tasks whose work is *finished*, so completed containers don't linger in Open Work as scaffolding.
+
+Run =todo-cleanup.el --convert-subtasks= first (it's part of the =clean-todo= / wrap-up cleanup, and =open-tasks.org= runs it too) so every completed sub-task is a dated event-log entry rather than a lingering =DONE= keyword. The closure logic below reads "open child" as a child heading still carrying a task keyword (=TODO=/=DOING=/=WAITING=/=VERIFY=/=NEXT=/=PROJECT=/=STALLED=/=DELEGATED=); a dated entry is correctly not open.
+
+Two shapes, both proposed to Craig (inline numbered options per =interaction.md=, no popup) before applying:
+
+- *Zero open children → close the parent.* A parent whose child *tasks* are all resolved (now dated) and that carries no open child task is finished: close it per =todo-format.md= (=**= parent → =DONE=/=CANCELLED= + =CLOSED:=), and it moves to Resolved on the next =--archive-done=. If the work resurfaces later, a fresh task is created then; a completed container shouldn't sit open as a placeholder.
+- *One or two open children → promote, then close.* When a parent has only one or two open children, pull them out and rewrite them as standalone =**= level-2 tasks — give each a priority per the project scheme, and make the heading stand alone without the parent's context — then close the now-childless parent and let it move to Resolved. The former children become first-class Open Work tasks; the retired parent stops being scaffolding for one or two stragglers.
+
+*The leaf-with-notes carve-out (important).* "Zero open children" is not the same as "done." A =**= leaf task whose only descendants are dated *notes* — a captured "Ideas", "Goals", or "Current State" entry, not a real completed sub-task — is unstarted work with a note attached, not a finished container. Do not close it. Tell the two apart by intent: a container reads as a grouping (a =PROJECT= keyword, an explicit "parent grouping ..." line, or several dated entries that were genuinely separate sub-tasks that shipped); a leaf-with-notes is a single feature/bug task whose title names unstarted work and whose lone dated child is a design note. When the call is ambiguous, flag it NEEDS-USER rather than closing.
+
+Never close or promote autonomously past the ambiguous line — surface the candidates with a recommendation and let Craig ratify, the same interactive stance as Phase C.5. Clear container completions (a =PROJECT= whose every child is dated) can be proposed as a batch; leaf-with-notes ambiguities are flagged individually. Verify open-vs-done counts against the actual headings (a real scan of the subtree), not a fragile regex that a shell's =\b= support can silently break — a miscount here closes live work.
+
 ** Phase D — Flag the judgment calls (interactive)
 
 Present the NEEDS-USER bucket as a short, scannable list — one line per task, naming the decision or the fact required. Adjudicate with the user one item at a time (inline numbered options per =interaction.md=, no popup). Apply the user's calls as they come (which may itself produce more autonomous updates, or new tasks).
@@ -147,3 +164,7 @@ Two Phase C behaviors added, both surfaced by an Emacs-config =todo.org= audit:
 
 - *Tag-vocabulary enforcement.* That project declares a closed tag set (=bug=, =feature=, =refactor=, =test=, =quick=, =solo=); the audit had to strip ~44 ad-hoc tags that had accumulated across the file. The prior workflow only checked that a type tag was *present* — it had no concept of an exhaustive allowed set. The new bullet enforces a declared closed vocabulary and leaves open-vocabulary projects untouched.
 - *Code-complete-but-unverified closing.* Many tasks had shipped (tests green, live in the daemon) but stayed open awaiting a manual or visual verification, so they accumulated as half-open. Leaving them open is noise; auto-closing them would violate "never claim a fix verified before the user confirms." The fix routes the pending human check into the project's =Manual testing and validation= parent (dedup-checked) per =verification.md='s manual-verification hand-off, then closes the implementation task. The work is done and the check is tracked; a failed check promotes to a bug.
+
+** 2026-07-01 — Retire completed parents (Phase C.6)
+
+Added Phase C.6: retire a parent task once its child *tasks* are all done. Zero open children → close the parent; one or two open children → promote them to standalone level-2 tasks, then close. Surfaced by an Emacs-config =todo.org= audit where several PROJECT containers had all children complete. Depends on =todo-cleanup.el --convert-subtasks= running first so completed sub-tasks are dated (not lingering =DONE= keywords) and the open-child count is accurate. Carries a leaf-with-notes carve-out: a =**= leaf task whose only descendant is a dated design note ("Ideas"/"Goals") is unstarted work, not a finished container, and must not be closed — the ambiguous case is flagged NEEDS-USER. The step also warns against counting open-vs-done with a fragile regex (a =\b= that a given shell/awk silently drops miscounts and closes live work).
diff --git a/.ai/workflows/task-review.org b/.ai/workflows/task-review.org
index 69e172d..ba1571a 100644
--- a/.ai/workflows/task-review.org
+++ b/.ai/workflows/task-review.org
@@ -57,7 +57,9 @@ Keep is the common case — most tasks are still right and just need re-stamping
 
 *** Tagging =:quick:= — small tasks
 
-While reviewing each task, estimate its effort. If you judge it *30 minutes or less* and it doesn't already carry =:quick:=, add the tag to the heading line. If the heading and body don't tell you how long it'll take, *ask Craig* — don't guess. A wrong =:quick:= is worse than none: the tag exists so Craig can grab a genuinely small task in a spare moment, and a mislabeled one wastes that moment.
+The =:quick:= and =:solo:= assessments (this section and the next) are *mandatory* for every reviewed task except a Kill — a review that skips them is incomplete. The hard definitions live in [[file:../../claude-rules/todo-format.md][todo-format.md]] ("Hard definitions: :solo: and :quick:"); autonomous execution (work-the-backlog / the no-approvals speedrun) reads =:solo:= as its eligibility gate and trusts the author's tag, so the run-time gate is only as trustworthy as this pass.
+
+While reviewing each task, estimate its effort. If you judge it *30 minutes or less* and it doesn't already carry =:quick:=, add the tag to the heading line. If the heading and body don't tell you how long it'll take, *ask Craig* — don't guess. A wrong =:quick:= is worse than none: the tag exists so Craig can grab a genuinely small task in a spare moment, and a mislabeled one wastes that moment. =:quick:= is an effort hint only, never an eligibility gate — size does not decide what runs autonomously.
 
 This is orthogonal to the action chosen — a task can be kept (or re-graded, or marked DOING) *and* tagged =:quick:= in the same pass. Skip the assessment on a Kill, since it's leaving the pool. Tags go on the heading line per [[file:../../claude-rules/todo-format.md][todo-format.md]], sharing one =:tag1:tag2:= cluster.
 
@@ -67,7 +69,7 @@ While reviewing each task, judge whether Claude could build *and* verify it with
 
 1. *Buildable* — Claude has the capability and access to do the work.
 2. *Verifiable by Claude* — an objective or local check exists that Claude can run itself. Craig's routine spot-checking does not count against this, and neither does handing off a residual human-in-the-loop confirmation as a structured manual-testing reminder (the =verification.md= "Handing Off Manual Verification" pattern). The disqualifier is having no verification path of Claude's own at all — when the success criterion is only judgeable by Craig's eyes or subjective taste.
-3. *No upfront decision* — no design or preference call Craig must make before Claude can begin.
+3. *No deliberation* — no open design question and no "weigh these approaches" with real tradeoffs. At most one or two *quick, upfront-answerable* factual decisions are allowed — the speedrun preset batches those into its pre-flight Q&A, so they don't break the hands-off run. A genuine design or preference call disqualifies.
 
 If any gate is shaky, leave the tag off. Like =:quick:=, a wrong =:solo:= is worse than none — it tells Craig he can hand the task off and walk away, so a mislabeled one wastes that trust. When the heading and body don't make all three gates clear, ask Craig instead of guessing.
 
@@ -96,6 +98,8 @@ The exact date string matters: =task-review-staleness.sh= and the wrap-up health
 
 Follow the completion rules in [[file:../../claude-rules/todo-format.md][todo-format.md]]. A killed top-level =**= task stays task-shaped: change the keyword to =CANCELLED=, add a =CLOSED: [YYYY-MM-DD Day]= line under the heading (generate with =date "+%Y-%m-%d %a"=), and leave the priority and tags intact. It's then a candidate for =--archive-done= at the next cleanup. Don't stamp =:LAST_REVIEWED:= on a kill — it's leaving the review pool anyway.
 
+A killed *sub-task* (=***= or deeper, under a parent task) instead becomes a dated event-log entry per the depth rule — but you don't have to hand-format it here. =todo-cleanup.el --convert-subtasks= (run in the =clean-todo= and wrap-up cleanup passes) rewrites any level-3+ DONE/CANCELLED/FAILED heading into its dated form mechanically from the =CLOSED= cookie, so a keyword-plus-=CLOSED= close at depth gets normalized on the next cleanup rather than lingering. =lint-org.el= flags any that slip through (checker =subtask-done-not-dated=).
+
 * Phase D: Close out
 
 When the batch is done (or Craig calls it early):
diff --git a/.ai/workflows/work-the-backlog.org b/.ai/workflows/work-the-backlog.org
new file mode 100644
index 0000000..642162d
--- /dev/null
+++ b/.ai/workflows/work-the-backlog.org
@@ -0,0 +1,263 @@
+#+TITLE: Work the Backlog
+#+AUTHOR: Craig Jennings & Claude
+#+DATE: 2026-07-02
+
+* Overview
+
+The single home for the autonomous task-execution loop: take a set of marked, solo-doable tasks from the project's =todo.org= and work them unattended, each held to the full quality bar, under a fixed safety contract. Spec: =rulesets/docs/specs/2026-06-16-autonomous-batch-execution-spec.org=.
+
+Two callers feed it, differing only in how they build the task set and which session mode they pass:
+
+- The *inbox auto-loop* (=inbox.org= auto mode) chains here after its routing completes, with a tag/priority query, file-only mode, cap 1.
+- The *no-approvals speedrun* preset feeds an explicit ordered list with autonomous-commit + always-push + paging-on, after a pre-flight Q&A that front-loads every decision.
+
+This workflow owns the execution logic — eligibility gate, defer checklist, quality bar, run cap. Callers own input assembly and mode selection. Capture-routing (inbox surfaces) stays entirely in =inbox.org=; this file never reads an inbox.
+
+* When to Use This Workflow
+
+Invoked by its two callers, or directly by phrase:
+
+- *Speedrun triggers:* "speedrun", "no approvals speedrun", "speedrun these: <task set>" — run the no-approvals speedrun preset (below). The word "speedrun" always routes here, even when the phrase also says "no approvals": plain =no-approvals.org= is the general session mode; the speedrun is this workflow's preset over an explicit task set.
+- *Loop caller:* =inbox.org= auto mode chains here after its routing (below). Not phrase-triggered.
+
+Manual fallback: "work the backlog" / "work the backlog with <task set>" — gather the three inputs below (ask for whichever are missing, defaulting to file-only mode; default cap is the list length for an explicit set, 1 for a query) and run the loop.
+
+* Inputs — the caller contract
+
+A caller hands this workflow three things:
+
+1. *A task set* — an ordered list of candidate task headings from the project's =todo.org=. Either an explicit ordered list (speedrun) or the result of a tag/priority query (the loop). The loop does not care how the set was assembled; it receives an ordered list of candidates.
+2. *A session mode* — two orthogonal flags:
+   - *Commit autonomy:* =file-only= (default) or =autonomous-commit=. See "Commit autonomy" below.
+   - *Paging:* on or off. End-of-set only.
+3. *A run cap* — the hard maximum number of tasks to complete this run.
+
+It returns a per-task outcome and a run summary.
+
+* Outcomes — the per-task vocabulary
+
+Every task in the set ends in exactly one of:
+
+- =implemented-committed= — implemented, committed (and pushed per the project's flow) under =autonomous-commit=.
+- =implemented-diff-surfaced= — implemented, diff surfaced, *not* committed (=file-only=).
+- =deferred-VERIFY= — a defer-checklist hit; a =VERIFY= filed naming what's missing or risky.
+- =dropped-by-craig= — removed from the run at the speedrun pre-flight Q&A ("skip this").
+- =skipped-ineligible= — failed the mechanical eligibility gate.
+- =failed= — implementation was attempted and abandoned: the tree is left working (never commit a broken state), the failure is surfaced in the run summary, and the run continues to the next task.
+
+The run summary lists each task with its outcome, plus the remaining set when the cap stopped the run.
+
+* The loop
+
+For the task set, in order, until the run cap is hit:
+
+1. *Eligibility gate* (below). Ineligible → record =skipped-ineligible=, next task.
+2. *Scope read* of the relevant code. Cheap; just enough to run the defer checklist.
+3. *Defer checklist* (below). Any hit → defer: file the =VERIFY= naming the gap and record =deferred-VERIFY= (or, under the speedrun preset, route a quick-question gap to the pre-flight Q&A), next task.
+4. *Implement* under the project's commit discipline: TDD red→green→refactor, then =/review-code --staged=, fix all Critical/Important findings, then close the task per =todo-format.md='s completion rules. Decompose into as many logical commits as the change needs — size is not capped. If implementation fails partway, leave the tree working, record =failed=, surface it, and continue to the next task.
+5. *Commit autonomy branch:*
+   - =file-only= → surface the diff, do *not* commit. Record =implemented-diff-surfaced=.
+   - =autonomous-commit= → =/voice personal= on the message, commit individually, push per the project's flow. Record =implemented-committed=.
+6. *Record metrics* for the task (the JSONL append — see Metrics below).
+7. Decrement the cap. At zero, stop.
+
+After the set: if the paging flag is set, fire the end-of-set page (below). Surface the run summary either way.
+
+* Eligibility gate — mechanical, no judgment
+
+A task is autonomous-safe when *both* hold. This layer is a lookup, not a judgment; all the judgment lives in the defer checklist.
+
+1. *Status is =TODO=* — never =VERIFY=, =DOING=, =DONE=, or =CANCELLED=. =VERIFY= marks "awaiting Craig's input"; auto-implementing one defeats the check it represents. The do-not-implement set is safe-by-omission: anything not plainly =TODO= (plus any project-declared "hold" marker) is out.
+2. *Tagged =:solo:=* — the autonomy tag, resolved against the project's priority/tag scheme header in =todo.org= (never hardcoded). =:solo:= carries the hard definition in =todo-format.md=: completable and verifiable without Craig beyond at most one or two quick decisions answerable up front, no design deliberation. A project whose scheme declares a different autonomous-safe tag set overrides the default.
+
+Priority and =:next:= drive *ordering* within the eligible set, not eligibility ([#A] before [#B] before [#C], then the author's ordering). =:quick:= is an effort hint for batching and duration estimates — never a gate.
+
+Task *size* is deliberately absent from this gate. A large but well-specified, decision-free task is in scope and gets decomposed into per-logical-commit chunks during implementation. Size never sends a task away; only *deliberation* or *risk* does (the checklist below).
+
+*No scheme header → don't run.* The gate reads =:solo:= semantics from the project's scheme header; a =todo.org= without one leaves the tag undefined (=todo-format.md= makes the header mandatory). Surface that the header is missing and stop rather than guessing eligibility.
+
+* The defer checklist — act vs file
+
+After the scope read, run each eligible candidate through the checklist. Each item is a concrete, answerable question, not an adjective. *Any* hit — or any "unsure" — defers the task. Only a task that clears every item is implemented.
+
+1. *Test-writability (the keystone).* Can I write the failing test from the task text — plus any decisions gathered up front — without inventing a requirement? *No / unsure* → underspecified. Under the speedrun preset, if the gap is one or two quick answerable questions, route it to the pre-flight Q&A; otherwise file a =VERIFY= noting what's missing. Under the unattended loop, file the =VERIFY= (no one to ask).
+2. *Data-loss / irreversible / external operation.* Does implementing it require any of: =rm= of non-scratch data, =git reset --hard= / force-push, =DROP= / =DELETE= / =TRUNCATE=, file truncate/overwrite of persisted content, a schema or data migration, any external or shared-state mutation, any credential touch? *Yes* → do NOT implement; file a =VERIFY= naming the risk. This is the hard safety gate; an upfront answer never overrides it without an explicit checkpoint.
+3. *Already-satisfied.* Does the scope read show the desired end-state already holds? *Yes* → file a =VERIFY= noting it and move on. Don't make a no-op change.
+4. *Design deliberation.* Does the task carry an unresolved design question, a "weigh these approaches" with real tradeoffs, or a TBD that isn't a quick factual answer? *Yes* → under the speedrun preset, if it collapses to one or two quick questions, route to the pre-flight Q&A; otherwise file and surface as a =/start-work= candidate. Under the loop, file. The discriminator is *quick-answerable question* vs *deliberation* — never task size.
+
+When genuinely unsure which side a task falls on, defer — a wrong auto-implement costs a revert *and* the next-session correction.
+
+** Filing the deferral =VERIFY=
+
+Every checklist hit files a =VERIFY= in the project's =todo.org=, per =todo-format.md='s VERIFY rules:
+
+- *Dedup first.* If a =VERIFY= sibling for this deferral already exists (a prior run filed it), don't file another — record the outcome as =deferred-VERIFY= with a "previously filed" note and move on. The deferred task keeps its =TODO= status and tags, so without this check every subsequent run would re-defer and re-file.
+- *Placement:* sibling of the deferred task (the deferred task is the trigger) — a =**= task gets its =VERIFY= at =**=, a =***= sub-task gets it at =***= under the same parent, never deeper.
+- *Heading:* carries the question or risk on its own ("VERIFY <topic> — migration touches persisted rows").
+- *Body:* which checklist item hit, what's missing or risky, and what answer or action would make the task runnable. For an already-satisfied hit, the evidence that the end-state already holds.
+
+** Routing a quick-question gap (speedrun only)
+
+Under the speedrun preset, a checklist-1 or checklist-4 hit that collapses to one or two quick answerable questions routes to the pre-flight Q&A instead of deferring (see the preset section below). The discriminator: a *quick question* is a factual or preference pick answerable in one line without weighing tradeoffs ("cap at 5 or 8?", "which config key name?"); *deliberation* is anything that needs tradeoffs weighed, options explored, or code read by Craig. A task needing three or more questions isn't quick-question-gapped — it's underspecified; file the =VERIFY=. Checklist item 2 (data-loss / irreversible) never routes to the Q&A: an upfront answer doesn't override the hard safety gate.
+
+The unattended loop has no one to ask — every hit defers there.
+
+* Per-task quality bar
+
+Autonomy changes who approves, not what quality means. Per task, non-negotiable:
+
+- *TDD* per =testing.md=: red first, green, refactor. The keystone checklist item already proved the failing test is writable.
+- *Verification* per =verification.md=: fresh evidence, full suite green before any commit.
+- *=/review-code --staged=* before every commit; Critical and Important findings block until fixed.
+- *=/voice personal=* on every commit message on the =autonomous-commit= path (or the patterns walked inline if the skill is unavailable), message printed inline so the log shows what landed.
+- *Task closure* per =todo-format.md=: depth-based completion (keyword + =CLOSED:= at level 2, dated rewrite at level 3+).
+- *One logical change per commit.* A large task becomes several commits, not one omnibus.
+
+* Commit autonomy
+
+=file-only= is the default: surface the diff, never commit. =autonomous-commit= is honored only when the project carries the commit-autonomy waiver, read fresh each run — never from memory of past runs or "this project usually allows it."
+
+The waiver lives in the project's =.ai/notes.org= *Workflow State* section as marker lines, the same shape as the workflow markers already there:
+
+#+begin_example
+:COMMIT_AUTONOMY: yes
+:LOOP_MAY_COMMIT: yes
+#+end_example
+
+- =:COMMIT_AUTONOMY: yes= — the project has the waiver. An =autonomous-commit= request (the speedrun preset, or a manual run asking for it) is honored.
+- =:LOOP_MAY_COMMIT: yes= — the *unattended loop caller* may also commit. It requires =:COMMIT_AUTONOMY:= alongside it; the split exists because "Craig-initiated speedrun may commit" and "the recurring loop may commit unattended" are different levels of trust. Without this flag the loop stays =file-only= even when the project holds the waiver.
+
+An absent marker means no. Anything other than a plain =yes= value also means no. The read is one grep of the Workflow State section — a lookup, not a judgment.
+
+*The degrade contract.* When a caller requests =autonomous-commit= and the required marker is missing, degrade to =file-only= and surface it in both the run intro and the run summary: "autonomous-commit requested, no :COMMIT_AUTONOMY: waiver in notes.org — running file-only." Never honor the request without the marker, and never drop to file-only silently — the first commits into a project that didn't opt in, the second hides why nothing got committed.
+
+* Bounding the run
+
+The cap is a hard per-run task ceiling passed by the caller — the kill switch a runaway can't exceed:
+
+- *Loop caller default: 1.* Implement the highest-priority eligible candidate, record, stop; the next tick continues.
+- *Speedrun: the length of the explicit list*, capped at a ceiling — the human bounded the set by naming it.
+
+Even the speedrun stops at the cap and surfaces (and, with paging on, pages) the remainder. The cap bounds task *count*, not cost; a token budget is logged as vNext.
+
+* Context hygiene — auto-flush between tasks
+
+Task boundaries are clean boundaries by construction: the previous task is closed and committed (or filed), nothing is half-edited. When the context window grows heavy mid-run, run the flush skill's *auto mode* between tasks: checkpoint the session anchor with the remaining task set, session mode, and cap in Next Steps (so the resumed context continues the run blind), arm the self-injection (=.ai/scripts/self-inject.sh= via =tmux run-shell -b=), and end the turn. The fresh context resumes from the anchor and works on. Unattended runs only — the keystroke-collision hazard and the full mechanism live in the flush skill.
+
+* End-of-set page
+
+With paging on, fire one page when the set is done or the cap is hit — end-of-set only, never per-task:
+
+#+begin_src sh
+notify alarm "Page" "<project>: <N> done, <M> remaining — <one-line summary>" --persist
+#+end_src
+
+=--persist= keeps it on screen until dismissed (the page-me convention). The page fires when the set completes *or* the cap stops the run — either way exactly once. The message carries the project name, the completed count, and the remaining count (with skipped tasks noted in the run summary) so Craig can confirm ready and name the next project in one reply. There is no separate page-signal call — =notify= is the paging surface.
+
+* Metrics
+
+Each task outcome appends one JSON line to the project's =.ai/metrics/work-the-backlog.jsonl= — git-tracked, append-only, =jq=-queryable. Create the directory and file on the first append. Logging is a side effect only: a failed append surfaces a warning in the run summary but never blocks, reorders, or aborts execution.
+
+One record per task, written at the moment its outcome is decided:
+
+| Field              | Meaning                                                                                         |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =ts=               | ISO-8601 timestamp of the task outcome                                                          |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =run_id=           | UUID shared by every record in one run (=uuidgen= at run start)                                 |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =project=          | project basename                                                                                |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =caller=           | =loop= / =speedrun= / =manual=                                                                  |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =task=             | the task heading (slug)                                                                         |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =outcome=          | =implemented-committed= / =implemented-diff= / =deferred-verify= / =skipped-ineligible= /       |
+|                    | =dropped-by-craig= / =failed=                                                                   |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =defer_reason=     | =underspecified= / =data-loss= / =already-satisfied= / =needs-deliberation= — set on            |
+|                    | =deferred-verify= records only                                                                  |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =upfront_decision= | =true= when a pre-flight answer was recorded and used for this task                             |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =wall_clock_s=     | seconds from task start to outcome                                                              |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =commit_sha=       | committed tasks: the commit SHA (comma-separated when the task decomposed into several); empty  |
+|                    | otherwise                                                                                       |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =review_findings=  | count of =/review-code= Critical + Important findings on this task                              |
+|--------------------+-------------------------------------------------------------------------------------------------|
+
+The =outcome= slugs map one-to-one onto the outcome vocabulary above (=implemented-diff= is =implemented-diff-surfaced=; =deferred-verify= is =deferred-VERIFY=). Per-run rollups (attempted / completed / deferred / dropped, wall-clock total, findings per commit) are computed at synthesis, not stored per record. The =commit_sha= field is what the synthesis step's corrections signal keys on — whether a later commit reverted or hand-fixed an autonomous one — so never omit it on a committed task.
+
+* Caller: the inbox auto-loop
+
+=inbox.org= auto mode chains here as an explicit second step *after* its routing completes — never as a phase inside inbox processing. When a cycle files new items and Craig answers "run this batch next?" with yes, auto mode invokes this workflow with:
+
+- *Task set:* the eligibility query over the queued/filed items — status =TODO= + =:solo:= per the scheme header, priority-ordered.
+- *Session mode:* =file-only=, paging off. (A project carrying both =:COMMIT_AUTONOMY:= and =:LOOP_MAY_COMMIT:= markers opts the loop into commits — see Commit autonomy above.)
+- *Cap: 1.* The highest-priority eligible candidate runs, gets recorded, and the loop's next tick (or the next yes) continues from there.
+
+The loop has no human at kickoff of each task, so a needs-quick-decisions task defers with a =VERIFY= — the pre-flight Q&A is a speedrun capability, not a loop one. Startup and wrap-up never invoke this workflow.
+
+* Preset: the no-approvals speedrun
+
+The named preset is a label for one flag combination, not a second code path: *explicit ordered list + =autonomous-commit= + always-push + paging-on*, with every approval front-loaded into a single pre-flight step. "No approvals" means all input first, then hands-off — not no input ever. =autonomous-commit= still requires the =:COMMIT_AUTONOMY:= waiver (Commit autonomy above); without it the preset degrades to =file-only= and says so in the pre-flight intro.
+
+When Craig names a task set and says "speedrun":
+
+1. *Gather* the named task set.
+2. *Scope-read and classify* each task against the eligibility gate + defer checklist: *ready* (clears everything), *needs-quick-decisions* (one or two upfront-answerable questions — checklist item 1 or 4), or *drop* (data-loss/irreversible, or deliberation that isn't a quick question).
+3. *Order* the list — priority, then the author's ordering / =:next:=.
+4. *Intro the work* — present the ordered plan: what will run, what was dropped and why, and the batched questions for the needs-quick-decisions tasks.
+5. *Craig answers each question or says "skip this"* — a skip removes the task (recorded =dropped-by-craig=; the task itself stays =TODO=); an answer is recorded so implementation works from the decision, not a guess.
+6. *Run the finalized list autonomously* — no further approvals until done. Cap = the list length (the human bounded the set by naming it), still one commit per logical change, always-push per the project's flow, auto-flushing between tasks when the context grows heavy (see Context hygiene above).
+7. *End-of-set page* with completed + remaining + skipped.
+
+The batch-ask (step 4-5) is one message: each question names its task, puts the recommended answer at item 1 when there is one (per =interaction.md= — inline numbered, no popup), and offers "skip this" as the last option. Before the run starts, write each answer into its task's body in =todo.org= as a dated line — the implementation works from the recorded decision, and the record survives the session. The Q&A fires only under this preset; the loop caller never asks (its decision-needing tasks defer).
+
+*** Per-item disposition rule
+
+For every item the run picks up (this holds for any executing caller, including an auto-inbox-zero run given a standing yes):
+
+- *Feature-level task* → write a spec first (=spec-create=), don't implement directly. The spec is the run's deliverable for that item.
+- *Needs decisions you can't confidently guess* → file it as a =VERIFY= carrying the question (under this preset, one or two quick questions route to the pre-flight Q&A instead).
+- *Well-defined* → implement it, taking the time it needs.
+
+This extends the defer checklist: the checklist decides *act vs file*; this rule decides the *shape* of the act.
+
+* Synthesis: metrics → org-roam KB
+
+Trigger: "synthesize backlog metrics" (optionally a weekly scheduled run). This is the read side of the metrics log — Craig's ask was "gather data and create org-roam articles we can look at later," and this step is the second half. It is read-only over the logs plus exactly one KB write.
+
+1. *Gather the JSONL union.* Discover =.ai/metrics/work-the-backlog.jsonl= across the project roots (dirs carrying =.ai/protocols.org= under =~/code=, =~/projects=, =~/.emacs.d=). Classify each project per =knowledge-base.md= (work-root denylist, never inference) before reading it into the union.
+2. *Enforce personal-only.* A work-classified or unknown project's metrics never enter the KB write — they stay in that project's own log. Report the exclusion per the KB refusal contract: the classification, a one-line redacted summary, and where the data stayed.
+3. *Compute the rollups and trends.* Per run: attempted / completed / deferred (by reason) / dropped / failed, wall-clock total, commits landed, review findings per commit. Trends across runs: completion rate over time, defer-reason distribution, findings-per-commit trend.
+4. *Compute the corrections signal* — the key metric. For each =commit_sha= in the window, check that project's history for a later commit (within ~14 days) that reverts it or carries a fix touching the same files. A clean run is one whose autonomous commits survive untouched; a flagged run is what Craig reviews by hand. This is a cheap proxy, not proof — it flags candidates, it doesn't convict.
+5. *Write one KB node* at =~/org/roam/agents/YYYYMMDDHHMMSS-backlog-metrics-<window>.org= per =knowledge-base.md=: =:agent:metrics:= filetags, a concise title, the rollup table, the trend narrative, and =[[id:...]]= links to prior synthesis nodes so the series is traceable. Pull before writing, commit and push after — the normal KB session discipline.
+
+The KB node is the artifact Craig reads later: "are the runs completing more and getting corrected less?" should read off the trend table without touching raw logs. Synthesis never mutates the JSONL, todo.org, or any project tree.
+
+* Common Mistakes
+
+1. *Implementing a =VERIFY= or =DOING= task.* The gate is status =TODO= only — a =VERIFY= exists precisely because Craig's input is pending.
+2. *Treating =:quick:= as eligibility.* It's an effort hint. =:solo:= is the gate.
+3. *Deferring on size.* A large, well-specified, decision-free task runs — decomposed into logical commits. Size is not a checklist item.
+4. *Guessing past the keystone.* If the failing test isn't writable from the task text, the task isn't ready. Inventing the requirement is the failure the checklist exists to stop.
+5. *Rationalizing through the data-loss list.* "The migration is small" doesn't clear checklist item 2. Enumerated operations defer, full stop.
+6. *Committing in =file-only= mode.* The diff is the deliverable; the commit is Craig's.
+7. *One omnibus commit for the whole run.* Every logical change is its own reviewed commit.
+8. *Skipping =/review-code= or =/voice= because nobody's watching.* Autonomy removes interaction gates, never engineering-discipline gates (same contract as =no-approvals.org=).
+9. *Running past the cap.* The cap is the kill switch; hitting it means stop and surface, even mid-set.
+10. *Paging per-task.* One page, end of set.
+11. *Honoring =autonomous-commit= from memory.* The waiver is the marker line in =notes.org=, read fresh each run. "This project usually allows it" isn't a read.
+12. *Re-filing the same deferral =VERIFY= every run.* The deferred task stays =TODO=, so a run that skips the existing-sibling check spams =todo.org= with duplicates.
+13. *Routing a data-loss hit to the pre-flight Q&A.* Checklist item 2 is the hard gate — an upfront answer never clears it without an explicit checkpoint.
+
+* Living Document
+
+Refine as the dogfooding signal arrives — the metrics log and the corrections-in-next-session signal are the feedback loop. Fold recurring adjustments in rather than accumulating caller-side workarounds.
+
+* History
+
+Created 2026-07-02 as Phase 1 of the autonomous-batch execution spec, reconciling the inbox-zero "Phase E" proposal and the =.emacs.d= speedrun proposal into one execution loop. The auto-inbox-zero execute step in =inbox.org= reverted to routing-only in the same change so this file is the loop's only home. Phases 2-6 (same day) wired both callers, pinned the commit-autonomy waiver markers, fleshed the defer/Q&A/page mechanics, and added the metrics record + KB synthesis step.
diff --git a/.ai/workflows/wrap-it-up.org b/.ai/workflows/wrap-it-up.org
index 5d2cdd2..d0c4e75 100644
--- a/.ai/workflows/wrap-it-up.org
+++ b/.ai/workflows/wrap-it-up.org
@@ -137,6 +137,22 @@ Run the report-only variant first if you want to see what would change without w
 emacs --batch -q -l .ai/scripts/todo-cleanup.el --check todo.org
 #+end_src
 
+*** Convert done sub-tasks to dated entries
+
+#+begin_src bash
+[ -f todo.org ] && emacs --batch -q -l .ai/scripts/todo-cleanup.el --convert-subtasks todo.org
+#+end_src
+
+=--convert-subtasks= rewrites every heading at level 3 or deeper whose TODO state is DONE/CANCELLED/FAILED into a dated event-log entry (=<stars> YYYY-MM-DD Day @ HH:MM:SS -ZZZZ <text>=), dropping the keyword, priority cookie, and tags, and removing the now-redundant =CLOSED:= line. This enforces the =todo-format.md= depth rule that a completed *sub-task* (a heading under a parent task) becomes dated history, not a lingering DONE keyword — a shape an interactive org close (=org-log-done= → DONE + CLOSED) never applies and =--archive-done= (level-2 only) never reaches. The timestamp comes from each entry's own =CLOSED= cookie; a date-only close yields =00:00:00=. Heading text is kept verbatim. Idempotent (an already-dated heading has no keyword to match), and a done sub-task with no parseable =CLOSED= is flagged and left alone rather than stamped with a fabricated date.
+
+Run this *before* =--archive-done= so that when a completed level-2 parent is archived, its sub-tasks already carry their dated form. Any rewrites show up in the wrap-up commit's diff for review before push.
+
+Preview without writing:
+
+#+begin_src bash
+emacs --batch -q -l .ai/scripts/todo-cleanup.el --convert-subtasks --check todo.org
+#+end_src
+
 *** Archive completed work
 
 #+begin_src bash
@@ -244,6 +260,32 @@ The check exempts =lint-followups.org= explicitly because lint-org runs earlier
 
 This integrates with =inbox.org= process mode, which stamps =:LAST_INBOX_PROCESS:= in =notes.org='s *Workflow State* section on completion. Wrap-up doesn't double-stamp. It only ensures the inbox carries nothing but the expected pipeline artifacts at session end.
 
+*** Cross-project router (optional — route filed keepers to their home projects)
+
+Runs directly after the inbox sanity check. The split between the two: the sanity check *gates* the wrap (a dirty inbox blocks until resolved); the router is *optional* (skipping it never blocks anything — the candidates just stay local until a future wrap). Spec: =docs/specs/wrapup-routing-spec.org= (D7/D8/D9).
+
+The candidate set is exactly the local tasks carrying a =:ROUTE_CANDIDATE:= property — keepers that inbox process mode filed this session whose inferred home is another project. Never scan the standing backlog.
+
+#+begin_src bash
+.ai/scripts/route-batch --list
+#+end_src
+
+*Empty set = zero interaction.* =--list= prints nothing when there are no candidates; continue the wrap silently — no prompt, no "0 items" line.
+
+When candidates exist, surface the batch as one line per task — the task heading, the destination project, the delivery mode (=inbox-send= file handoff), and the engine's confidence — then offer exactly two options: *go* (route the whole batch) or *skip* (leave everything local). Derive each confidence label by running the engine on the task's heading + body (=python3 .ai/scripts/route_recommend.py --item "..." --exclude "$(basename "$PWD")"=); label weak matches visibly ("weak — verify the destination") so a low-confidence route gets a human glance before the keystroke.
+
+On *go*:
+
+#+begin_src bash
+.ai/scripts/route-batch --go
+#+end_src
+
+Per candidate, the helper writes the task's subtree (children ride along; =:ROUTE_CANDIDATE:= stripped, headings promoted to top level) to a one-task handoff, delivers it via =inbox-send <destination> --file= (so the =from-<this-project>= provenance is stamped and the destination's inbox process mode dispositions it as a single item), and only after a successful send removes the subtree from the local =todo.org= — a single-file local edit the wrap is already committing. A failed send leaves that task in place and exits non-zero; report it and continue the wrap. Never write the destination's =todo.org= directly; its own inbox processing files the task per its conventions.
+
+On *skip*, leave every candidate in place, marker included — they resurface next wrap.
+
+Mis-routes are recoverable: the receiving project rejects via inbox process mode's reject-from-another-project flow, which returns the item to this project's inbox with the rationale. That reject path is why removing the local source on send is safe.
+
 *** Review-habit health check (surface a slipped daily task-review)
 
 The daily task-review habit walks the open top-level tasks on a rotating cycle, stamping =:LAST_REVIEWED:= as it goes (see =task-review.org=). This check is the watchdog for that habit. When tasks have gone too long unreviewed, the habit has slipped, and the wrap-up says so in one line — it does not re-list the tasks.
@@ -536,7 +578,7 @@ Before considering wrap-up complete:
 - [ ] The Summary ends with the =KB: promoted N / consulted yes-no= line (promotion check ran)
 - [ ] File renamed to =.ai/sessions/YYYY-MM-DD-HH-MM-description.org=
 - [ ] =.ai/session-context.org= no longer exists
-- [ ] =todo-cleanup.el= ran — hygiene pass + =--archive-done= + =--sync-child-priority= (if =todo.org= exists at project root)
+- [ ] =todo-cleanup.el= ran — hygiene pass + =--convert-subtasks= + =--archive-done= + =--sync-child-priority= (if =todo.org= exists at project root)
 - [ ] =lint-org.el= ran on =todo.org= — mechanical fixes applied, judgments appended to follow-ups file (if =todo.org= exists)
 - [ ] Any orphan-planning-line warnings reviewed (fix or accept)
 - [ ] Inbox carries nothing but expected pipeline artifacts (=.gitkeep=, =lint-followups.org=, =PROCESSED-*= prefixes), OR each remaining handoff has an explicit deferral logged in the valediction